Regular Expression Help

Danielle_Carollo · July 12, 2021, 7:39pm

I am trying to create a snippet that reads specific information from another website and returns it.
I think using extractregex is the best way to do this, but I am getting a "No Match Found" error.
The website has drop-downs where you have to select "show all" to see everything. Maybe that is the issue, because the info is inside that "show all" view.

Here is an example of what I'm trying to do.....

{urlload: https://xyz.co/index/payment#/query/PBhUZMBTumlTxQI1Ga0C7qQuvaB; done=res -> ["Transaction_ID" =extractregex(res, "\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d\d"), "Auth_Code"=extractregex(res, "\d\d\d\d\d\d"), "Last_Debit"=extractregex(res, "\d\d\d\d\d\d\w\w\w\w\w\w\d\d\d\d"]}

Transaction ID {=Transaction_ID}
Authorization Code: {=Auth_Code}
Last 4 of Card Number: {=Last_Debit}

Cedric_Debono_Blaze · July 14, 2021, 10:45am

Hi @Danielle_Carollo,

Let's start one step at a time. Unfortunately, I can't verify whether this will work as the website is behind a login and I can't access that data. This might actually be an issue for you too (@scott - can you confirm please?).

First off, I noticed a mistake in your regex syntax, so I've fixed it below. Additionally, I changed the multiple \d and \w to \d+ and \w+, which is basically telling the command to keep matching digits/word characters until they run out.

Can you test this please?

{urlload: https://xyz.co/index/payment#/query/PBhUZMBTumlTxQI1Ga0C7qQuvaB; done=res -> ["Transaction_ID" =extractregex(res, "\d+"), "Auth_Code"=extractregex(res, "\d+"), "Last_Debit"=extractregex(res, "\d+\w+\d+")]}
Transaction ID {=Transaction_ID}
Authorization Code: {=Auth_Code}
Last 4 of Card Number: {=Last_Debit}
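
One thing to watch with the simplified patterns: a bare \d+ returns the first run of digits in the response, so Transaction_ID and Auth_Code would both end up with the same value. The sketch below shows this using Python's re module purely as an illustration. The sample text and its "label: value" layout are made up, and Text Blaze's regex engine and the way extractregex treats capture groups should be checked against its documentation.

```python
import re

# Hypothetical response text; the real page layout is unknown.
sample = "Transaction_ID: 12345678901234567 Auth_Code: 654321"

# A bare \d+ matches the first run of digits it finds,
# so every field looked up this way gets the transaction ID.
first_digits = re.search(r"\d+", sample).group(0)

# Anchoring each pattern to its label and capturing only the digits
# keeps the two fields apart.
transaction_id = re.search(r"Transaction_ID\D*(\d+)", sample).group(1)
auth_code = re.search(r"Auth_Code\D*(\d+)", sample).group(1)

print(first_digits, transaction_id, auth_code)
```

The same idea (put something distinctive before the digits, then capture the digits) should carry over to the patterns passed to extractregex, assuming the response contains such labels.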

Cedric_Debono_Blaze · July 14, 2021, 1:45pm

@Danielle_Carollo,

The best way for me to work out a solution is if you can provide some dummy text to test the regex on. Please make sure it contains no private information, since this is a public forum.

Christian_Adams · July 21, 2021, 12:35pm

Hey Cedric! Thanks for your help. I'm a teammate of Danielle. If we wanted to use the regex on some JSON data, how would we go about doing that? Adding some dummy JSON below:
{
"Transaction_ID":"1234567890123456",
"Auth_Code":"123456",
"Last_Debit":"0000"
}

scott · July 21, 2021, 1:34pm

You should use the fromJSON function to parse JSON:

{data=fromJSON("{
\"Transaction_ID\":\"1234567890123456\",
\"Auth_Code\":\"123456\",
\"Last_Debit\":\"0000\"
}")}

Transaction ID: {=data.Transaction_ID}

Christian_Adams · July 21, 2021, 1:46pm

Awesome! Thanks Scott! I'll give that a go.

Christian_Adams · July 21, 2021, 4:08pm

So if I wanted to combine urlload, extractregex, and fromJSON, would the syntax look like this?

{urlload: https://xyz.co/index/payment#/query/PBhUZMBTumlTxQI1Ga0C7qQuvaB; done=res -> {data=fromJSON("{\"Transaction_ID\": =extractregex(res, "\d{16}"))}")}}

for JSON example like:
{
"Transaction_ID":"1234567890123456"
}

scott · July 22, 2021, 12:12pm

No, most likely it would look something like this:

{urlload: https://xyz.co/index/payment#/query/PBhUZMBTumlTxQI1Ga0C7qQuvaB; done=res -> ["data"=fromJSON(res)]}

Transaction ID: {=data.Transaction_ID}

If your URL is returning JSON structured data, you won't need extractRegex.
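
As a rough illustration of why no regex is needed in that case, here is the same idea in Python, used only because it can be run standalone. The response body is the dummy JSON from earlier in the thread, and json.loads plays the role that fromJSON plays in the snippet.

```python
import json

# Dummy response body, copied from the example earlier in the thread.
res = '{"Transaction_ID":"1234567890123456","Auth_Code":"123456","Last_Debit":"0000"}'

# Parsing once gives access to every field by name.
data = json.loads(res)

print(data["Transaction_ID"], data["Auth_Code"], data["Last_Debit"])
```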

Christian_Adams · July 22, 2021, 4:25pm

Thanks Scott! I'm continuing to dig into this more and I'm finding out that this could get a bit more complex since there's a layer of XML on top of the JSON. My thoughts are to use the {site} function to load in the XML and then use the fromJSON function to pull the data underneath the XML. Think I'm taking a good approach here? It's using Angular as a framework so it's a bit more complex than we first anticipated.

scott · July 23, 2021, 7:50am

Unfortunately, the {site} command can only read contents from the page you are currently on. It can't be used to parse the response of {urlload}.

So if your URL returns XML that embeds JSON, there are two approaches you might take:

1. Try to extract the field values of the XML-embedded JSON directly, with a single "extractregex()" call per field.
2. Use an "extractregex()" call to get the JSON out of the XML, then use "fromJSON()" to parse it.

If there is only a single field you are interested in, I would suggest (1).

If there are many fields on the JSON object you need, then I would suggest (2).
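
Both approaches can be sketched as follows, again in Python only so that the example is runnable. The XML wrapper and its <payload> element are invented for illustration, since the real response format is not shown in this thread. If the real XML escapes the quotes (for example as &quot;), the text would need unescaping before either step.

```python
import json
import re

# Hypothetical response: the <response>/<payload> element names are assumptions.
res = (
    "<response><payload>"
    '{"Transaction_ID":"1234567890123456","Auth_Code":"123456","Last_Debit":"0000"}'
    "</payload></response>"
)

# Approach (1): pull a single field value straight out of the raw text.
transaction_id = re.search(r'"Transaction_ID"\s*:\s*"(\d+)"', res).group(1)

# Approach (2): cut the JSON object out of the XML, then parse it once
# and read as many fields as needed.
json_text = re.search(r"<payload>(\{.*?\})</payload>", res, re.DOTALL).group(1)
data = json.loads(json_text)

print(transaction_id, data["Auth_Code"], data["Last_Debit"])
```

Approach (2) costs one extra step, but after that every field is available by name without writing a separate pattern for each one.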