Non-scary regex

Hi all,

Some time ago, @scott tasked me with making a video about the use of regex in Text Blaze. Now I love exploring all the cool stuff that Text Blaze can do, but I've always looked at regex with fearful apprehension lol.

That said, I'm glad I made the video, because it helped me understand how regex works in Text Blaze, and how incredibly useful it is. So, I'm writing this thread to share the video with you, as well as give you some cool regex examples that you can swipe for your own snippets.

First, here's the video:

Before I jump into the examples, here's a list of regex characters for easy reference.

. matches any single character
\s matches any whitespace character
\d matches any digit
\w matches any word character (a-z, A-Z, 0-9 and _)
? matches zero or one of the preceding character or bracketed set
* matches zero or more of the preceding character or bracketed set
+ matches one or more of the preceding character or bracketed set
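
Here's a quick sketch of some of those characters in action. The sample text is made up purely for illustration:

{=extractregex("Order 66 shipped to Bay_7", "\d+")}
{=extractregex("Order 66 shipped to Bay_7", "\w+\s\d+")}
{=extractregex("Order 66 shipped to Bay_7", "B.y")}
{=extractregex("Order 66 shipped to Bay_7", "Bay_?\d")}

The first line should return 66 (one or more digits), the second Order 66 (word characters, one whitespace character, then digits), the third Bay (the period stands in for the "a"), and the fourth Bay_7 (the ? makes the underscore optional, so Bay7 would match too).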

Now, here are some examples:

Extracting a phone number in the following format: 555-1842

{formtext: cols=50; name=text; default=The phone number is 555-1842. I want to match it.}

Option 1: {=extractregex(text, "\d\d\d-\d\d\d\d")}
Option 2: {=extractregex(text, "\d{3}-\d{4}")}
Option 3: {=extractregex(text, "\d+-\d+")}

In options 1 and 2, I'm telling Text Blaze to match three digits (specified by the "\d"), followed by a dash and four more digits.

Option 2 is more economical because the number inside the curly brackets tells the regex exactly how many times the preceding "\d" should be matched.

In option 3, instead of specifying the number of digits to match, I'm asking the regex to keep matching as many digits as it can find.

These three options might be useful in different scenarios. For instance, options 1 and 2 will strictly match a phone number in the format I specified, whereas option 3 will allow for any number of digits (as long as there's at least one) before and after the dash.

However, in all cases, the dash is required.

Try the following:

  1. Change the numbers in the text field.
  2. Change the number of digits before and after the dash in the text field.
  3. Change the dash character.
  4. Add another phone number (using the same format) at the beginning and at the end of the text. You'll notice that regex will give you the first phone number it encounters.
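
Going back to the point that the dash is required in all three options: if some numbers arrive without it, the ? from the reference list can make it optional. This is only a sketch, and the field name phonetext is just a placeholder made up for this example:

{formtext: cols=50; name=phonetext; default=The phone number is 5551842. I want to match it.}

{=extractregex(phonetext, "\d{3}-?\d{4}")}

This should return 5551842 for the default text, and it would still return 555-1842 if you put the dash back in.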

Extracting an email address

{formtext: cols=50; name=text; default=The email address is test123@test.co.uk. I want to match it.}

{=extractregex(text, "\w+@[.\w]+\b")}

This one's a little bit scary, but I'll try to explain it as best I can. If anything is unclear, please ask me in the comments and I'll be happy to clarify.

Here's what's happening:

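  1. \w+@ is matching one or more word characters immediately before the @ symbol, plus the @ itself.
  2. [.\w]+\b is matching as many periods and word characters as it can, and then making sure the match ends at a word boundary (a spot where a word character sits next to a non-word character, like the "k" in .co.uk followed by a period or a space).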

Some clarifications:

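  • the period character has a special meaning in regex (see the list I provided earlier). Outside square brackets, a literal period needs a backslash in front of it (\.). Inside square brackets, as it is here, the period is already treated literally, so the backslash is optional.
  • the square brackets mean "any one of the included characters", and the + after them repeats that choice, so together they match any combination of periods and word characters.
  • the \b represents the word boundary. It stops the match from ending on a period, which is why the period that closes the sentence isn't included.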

Try the following:

  1. Change the email address in the text field.
  2. Try removing the space after the .co.uk.
  3. Try changing .co.uk to ..com
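
To see what the \b is doing in the email pattern, it can help to run the pattern with and without it on the same text. A small sketch, using a made-up sentence:

Without \b: {=extractregex("Write to test123@test.co.uk. Thanks!", "\w+@[.\w]+")}
With \b: {=extractregex("Write to test123@test.co.uk. Thanks!", "\w+@[.\w]+\b")}

The first line should return test123@test.co.uk. with the sentence's closing period attached, because [.\w]+ happily grabs it. The second should return test123@test.co.uk, because \b forces the match to end right after a word character.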

You can see both of these examples at work in the video, as well as how they're used in conjunction with extractregex, testregex and replaceregex.

Of course, this is just the tip of the iceberg.

Now here's a challenge: hit me with all you've got! Come up with scenarios where we can use regex in our snippets, and tell me in the comments. I want to rack my brain and help you build them.

COME AT ME REGEX! I DO NOT FEAR YOU!!! :crazy_face:


Thank you so much for this article! It's extremely helpful in understanding how to format regex formulas.

Hey @Peter_Monterubio, I'm so glad you found it helpful. Regex has long been the bane of my existence lol. But I'm finally warming up to it.

If you have any questions about how to use regex, drop them here and I'll be happy to help. Where I fail, I'm sure Scott's expertise will be able to fill in :crazy_face:

Here's a cool little variant that uses testregex with a formtext command, together with an error command to check whether the user's input matches the pattern of a valid email address:

{formtext: name=email}
{if: not testregex(email, "\w+@\w+\.[\w.]+")}{error: Please enter a valid email address; block=yes}{endif}
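
One detail worth checking in an email pattern like this is the period that separates the domain from its ending. Written as a plain ., it matches any character, while \. matches only a literal period. A quick sketch with a deliberately bad address (a@bcd is made up for the test):

Plain period: {=testregex("a@bcd", "\w+@\w+.[\w.]+")}
Escaped period: {=testregex("a@bcd", "\w+@\w+\.[\w.]+")}

I'd expect the first test to pass, since the plain . is satisfied by the "c", and the second to fail, since there's no actual period anywhere in the text.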