How to extract data from a Google Doc

cedricdebono · April 15, 2019, 8:07am

Can I run a snippet inside another snippet?

Let's say I have a snippet (called snippet1) that I want to use at a later date.
I put in a number of text strings that I want to eventually replace, but I won't have all the info available at the same time.
So I want to be able to run a second snippet (called snippet2), which will bring up a form that I can fill in to find/replace those text strings INSIDE snippet1.
Then on a subsequent day (once I get the info to fill in the remaining fields), I wanna be able to run snippet2 inside snippet1 again to update the remaining strings.

Here's a practical example.

I wanna write an announcement for a community event. I would use these strings to find/replace:

[startdate]
[starttime]
[enddate]
[endtime]
[discountpercentage]
[giftnames]
[wideimage]
[newsimage]
[alertimage]
[animatedgif]
[landingpageurl]

Those are just a few of them. I would write my announcement for various platforms, and then use the above as placeholders e.g.

[img=[wideimage]] (in this case, this would be for a forum post using bbcode)

Hey! Tomorrow ([startdate] at [starttime]), we'll be having a [discountpercentage] discount on all shop items plus [giftnames]

I might not have all the data in hand at the same time, plus those fields are going to be used multiple times for various announcements.

So, if the content above were part of snippet1, is there a way for me to run snippet2 as described earlier?

Thank you in advance, and my apologies if I couldn't make this scenario clear enough. I know it sounds pretty convoluted.

Regards,

Cedric

scott · April 15, 2019, 10:44am

The {import} command lets you include one snippet within another:

https://blaze.today/commands/import

My understanding of your question, though, is that you want to change the actual source contents of a snippet, which is not possible.

You could create multiple snippets, one for each of your placeholders so you have

/startdate -> [startdate]
/starttime -> [starttime]
...

And then in your snippets you use them like: "start date is {import:/startdate} and start time is {import:/starttime}".

You can then go in and update the snippets individually when you find out the necessary data.

You may also want to look at {urlload} and {urlsend} to dynamically load and save data.
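
The fill-in-later workflow asked about in this thread can be illustrated outside Text Blaze. The Python sketch below is only an illustration, not Text Blaze functionality: the `fill_placeholders` helper and the sample values are hypothetical. It replaces the `[placeholder]` strings whose values are already known and leaves the others untouched, so a later pass can fill in the rest.

```python
import re

def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace [name] placeholders that have a known value; leave the rest as-is."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Unknown placeholders are returned unchanged for a later pass.
        return values.get(name, match.group(0))
    return re.sub(r"\[(\w+)\]", substitute, template)

template = "Tomorrow ([startdate] at [starttime]) we'll have a [discountpercentage] discount."

# First pass: only the date is known (sample value, hypothetical).
first_pass = fill_placeholders(template, {"startdate": "April 20"})

# Second pass, on a later day: fill in the remaining fields.
second_pass = fill_placeholders(first_pass, {"starttime": "6pm", "discountpercentage": "20%"})

print(first_pass)
print(second_pass)
```

The key design choice is that an unknown placeholder is passed through unchanged rather than replaced with an empty string, which is what makes a second pass possible.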

cedricdebono · April 15, 2019, 7:09pm

Hi,

Yes in fact that's a method I used for a short while but I didn't find it practical for my needs.

As for {urlload} and {urlsend}, I'm not sure how they might help me

Thanks for answering though. Really appreciate it.

Cedric

scott · April 17, 2019, 1:43pm

{urlsend} and {urlload} let you persist information from your snippet and reload. They can be complex to use though.

Here is a brief guide on how to use them to save data to a Google Spreadsheet:

cedricdebono · April 17, 2019, 3:40pm

Ah neat. That opens up a lot of doors!

Thank you!!!

cedricdebono · April 17, 2019, 7:45pm

Ok so a step further now. Can {urlsend} and {urlload} be used with Google Docs too?

scott · April 18, 2019, 6:40am

What are you attempting to do with Google Docs? They can be used anywhere you want to trigger a URL so if there is a Google Docs URL you want to trigger you should be able to use them there.

Note we are working on developing integrations with the Google APIs (like the Sheets or Docs API) that would allow many more possibilities. I would love to hear about the different use cases you or others have for these types of integrations.

Cedric_Debono · April 18, 2019, 7:31am

It's a very specific scenario and pretty hard to explain in writing because I would have to show you multiple examples. If you want, we can jump on a Skype call and I can use screen sharing to walk you through the whole process. It could actually make for a very interesting case study for the community too.

Shall we do that?

scott · April 18, 2019, 10:55am

Sounds good, let me follow up with you by email.

cedricdebono · April 22, 2019, 2:35pm

Hi Scott,

So, after our exchange, I tried to apply the extractregex function in Text Blaze. But I'm faced with a little problem.

I'm using this:

{formtext: name=URL}

{urlload:{=URL}; done=(res) -> [
"date"=extractregex(res, "Date: (\w+)"),
"time"=extractregex(res, "Time: (\w+)")
]}

Form result after loading:
Found field 1: **{=Date}**
Found field 2: **{=Time}**

The data to extract is the following:

Date: datetest
Time: timetest1timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest 1

The problem is that the command is only returning up to the first space, so in this case:

timetest1timetest

I tried googling other parameters so as to have it stop at the line break, but couldn't quite understand how to apply them.

So, once again, I need your expert guidance.

P.S. Maybe it would be worth renaming this thread to "How to extract data from a google doc"?

scott · April 22, 2019, 3:46pm

Something like the following should work. The [] mean "match any one of the characters listed inside these brackets". So "[\w ]" matches any letter, number, underscore, or space.

Since you might have other things such as commas, dashes, question marks, or periods, it's probably best to just use "." which will match any character except a newline.

{res="Date: datetest
Time: timetest1timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest 1
Another line....."}
Date is {=extractregex(res, "Date: (\w+)")}
Time v1 is {=extractregex(res, "Time: ([\w ]+)")}
Time v2 is {=extractregex(res, "Time: (.+)")}

I've renamed the thread as you suggest.
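
For readers who want to try the three patterns outside Text Blaze, here is a minimal Python sketch. It assumes that extractregex returns the first capture group of the first match, which is what the examples in this thread suggest; the `extract` helper and the shortened sample text are hypothetical.

```python
import re

# Shortened sample with a comma added so the three patterns give different results.
res = "Date: datetest\nTime: timetest1 timetest, timetest 1\nAnother line....."

def extract(text: str, pattern: str) -> str:
    """Return the first capture group, or an empty string when nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else ""

date = extract(res, r"Date: (\w+)")        # word characters only
time_v0 = extract(res, r"Time: (\w+)")     # stops at the first space
time_v1 = extract(res, r"Time: ([\w ]+)")  # word characters and spaces; stops at the comma
time_v2 = extract(res, r"Time: (.+)")      # any character up to the end of the line

print(date, time_v0, time_v1, time_v2, sep="\n")
```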

cedricdebono · April 22, 2019, 4:36pm

Yep, works a charm.

Version 1 gave me what I wanted. Version 2 gave me everything till the very end (which I don't need).

This leads me to another question. From my research I found that you can specify a character or string that will make the regex stop; something with

?:

How would I apply that in conjunction with version 2?

In other words: "Give me all the text after the specified string and up to the second specified string."

scott · April 22, 2019, 5:40pm

If you wanted everything up to " END", you could do something like:

{res="Date: datetest
Time: timetest1timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest timetest before the END timetest 1
Another line....."}

Time is {=extractregex(res, "Time: (.+) END")}
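
One detail worth knowing about the pattern above: "(.+)" is greedy, so if " END" appears more than once on the line, the match runs to the last one. Adding "?" after the "+" makes it lazy, so it stops at the first " END". Note also that "(?:...)" is a non-capturing group, not a stop marker. The Python sketch below illustrates this with a hypothetical sample string; Text Blaze's regex engine is assumed, not confirmed, to treat these constructs the same way.

```python
import re

# Hypothetical sample text containing the " END" marker twice.
res = "Time: first part END second part END trailing text"

greedy = re.search(r"Time: (.+) END", res)   # runs to the LAST " END"
lazy = re.search(r"Time: (.+?) END", res)    # stops at the FIRST " END"

greedy_text = greedy.group(1) if greedy else ""
lazy_text = lazy.group(1) if lazy else ""

# (?:...) groups alternatives without capturing them, so group(1) is the (\w+) part.
grouped = re.search(r"Time: (?:first|second) (\w+)", res)
grouped_text = grouped.group(1) if grouped else ""

print(greedy_text)
print(lazy_text)
print(grouped_text)
```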

Cedric_Debono · April 23, 2019, 8:10am

Ok so that worked, but now I've stumbled into another issue. If I try to extract multiple paragraphs up until END it says nothing found.

I tried extracting everything and it looks like regex stops reading after a set number of characters.

scott · April 23, 2019, 9:16am

The "." doesn't match new lines. You can match new lines by using "[\s\S]" instead of a period.

See here for more info:
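
As an illustration of the difference between "." and "[\s\S]", here is a minimal Python sketch with a hypothetical multi-paragraph sample. The `re.DOTALL` flag shown at the end is a Python feature; whether Text Blaze offers an equivalent is not covered in this thread.

```python
import re

# Hypothetical sample: the text before " END" spans two paragraphs.
res = "Time: first paragraph\n\nsecond paragraph END\nAnother line"

# "." cannot cross the line break, so " END" is never reached and nothing is found.
dot_match = re.search(r"Time: (.+) END", res)

# [\s\S] means "whitespace or non-whitespace", i.e. any character including newlines.
any_match = re.search(r"Time: ([\s\S]+) END", res)
any_text = any_match.group(1) if any_match else ""

# Python-only alternative: DOTALL makes "." match newlines as well.
dotall_match = re.search(r"Time: (.+) END", res, re.DOTALL)
dotall_text = dotall_match.group(1) if dotall_match else ""

print(dot_match)
print(repr(any_text))
```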

cedricdebono · April 23, 2019, 1:33pm

Ok, sorted!

Here's the finished snippet:

{formtext: name=URL; default=https://docs.google.com/document/d/1aZH6K3iLK3NdmG5wQpyimgvpiGRxi7GGMU3iEw7tLis/edit}
{urlload:{=URL}; done=(res) -> [
"date"=extractregex(res, "Date: (\w+)"),
"time1"=extractregex(res, "Time: ([\w ]+)"),
"time2"=extractregex(res, "Time: ([\s\S].+) END")
]}

Form result after loading:
Date: {=date}
Time: {=time1}
Time: {=time2}

The only minor issue is that line breaks get replaced with

\n\n

But I guess a simple find/replace should do the trick (unless you have a simpler solution).

scott · April 23, 2019, 2:01pm

I'm not seeing the "\n\n" issue. If I edit the example I wrote above to include newlines and use [\s\S], it looks like everything is working correctly to me.

Could you post an example? (Your Google Doc isn't accessible to me.)

cedricdebono · April 23, 2019, 2:17pm

Doc is publicly available and with open commenting rights.

scott · April 25, 2019, 9:32am

I checked your doc and it looks like the "\n\n" are in the source itself, so yes you would need to find and replace them.
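
The find and replace itself is simple. The Python sketch below assumes, as observed above, that the fetched text contains a literal backslash followed by "n" rather than real line breaks; the sample string is hypothetical.

```python
# Raw string: contains the two characters "\" and "n", not actual line breaks.
extracted = r"First paragraph\n\nSecond paragraph"

# Replace each literal backslash-n sequence with a real newline character.
cleaned = extracted.replace("\\n", "\n")

print(cleaned)
```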

cedricdebono · April 25, 2019, 12:55pm

Weird. They're just regular line breaks in Google Docs. Anyway, thanks a ton for the help.

I hope my turmoils will help someone else along the road lol