Removing bullet points with regex

jheilmann · October 17, 2024, 4:27pm

I often copy large blocks of text from PDFs and remove the pesky line breaks inside paragraphs using

{=replaceregex({clipboard}, "\r\n", " ", "sg")}

but sometimes I copy lists and I don't know how to easily remove the bullet points.

example text:

paragraph text
• major list item
o minor list item

Ideally, the snippet would replace every • and o bullet marker with a line break.

Gaurang_Tandon · October 18, 2024, 6:15am

Hey @jheilmann, you can nest the replace calls like so:

{example_text="paragraph text
• major list item
• major list item 2
o minor list item 1
o minor list item 2"}
{=replaceregex(replaceregex(replaceregex(example_text, "\r\n", " ", "sg"), "\s*•\s*", "\n", "g"), "\s*o\b", "\n", "g")}

I used the regex \s*•\s* to also delete (unlimited) whitespace characters (\s) before and after the bullet.
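To see what that pattern does in isolation, here is a rough sketch in Python rather than Text Blaze. It assumes Python's `re` module treats `\s` the same way for this input, and the sample string is made up for illustration.

```python
import re

# Hypothetical sample; "\r\n" mimics the line endings in text copied from a PDF.
text = "paragraph text\r\n  •  major list item\r\n• major list item 2"

# \s* on both sides swallows the line break and any spaces around the bullet,
# so each bullet (plus its surrounding whitespace) collapses into one "\n".
cleaned = re.sub(r"\s*•\s*", "\n", text)

print(repr(cleaned))
```

Each item ends up on its own line with no leading spaces, no matter how much whitespace surrounded the bullet.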

In the case of the minor bullet point, I used a word boundary (\b) to avoid matching o inside words.

Let me know if it works for you.

jheilmann · October 18, 2024, 1:33pm

That's awesome! I changed it slightly because, as written, it caught the o's at the end of words such as "to", so I added a \b before the o as well. I also moved the "\r\n", " ", "sg" replacement to the last position so it would only remove the new lines between bullets.

{=replaceregex(replaceregex(replaceregex({clipboard}, "\s*•\s*", "\n", "g"), "\s*\bo\b\s*", "\n", "g"), "\r\n", " ", "sg")}

Thanks so much for getting me on the right track!
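For anyone comparing the two "o" patterns, the difference can be sketched in Python. This is not Text Blaze syntax: it assumes Python's `re` handles `\b` and `\s` the same way for this input, and the helper `strip_minor_bullets` and the sample text are hypothetical.

```python
import re

# Hypothetical sample: a wrapped paragraph followed by two "o" sub-bullets.
text = "go to the\r\nlist\r\no minor list item 1\r\no minor list item 2"

trailing_only = r"\s*o\b"     # boundary only after the o
both_sides = r"\s*\bo\b\s*"   # boundary before and after the o


def strip_minor_bullets(pattern, s):
    """Replace 'o' bullets with a newline, then join the remaining wrapped lines."""
    s = re.sub(pattern, "\n", s)
    # Run last, mirroring the order in the snippet above: only the "\r\n"
    # breaks that were not consumed by the bullet pattern are left to join.
    return s.replace("\r\n", " ")


broken = strip_minor_bullets(trailing_only, text)
fixed = strip_minor_bullets(both_sides, text)

print(repr(broken))
print(repr(fixed))
```

With the trailing-only boundary, the final "o" of "go" and "to" is replaced as well. With a boundary on both sides, only a standalone "o" matches. A standalone "o" that is a real word in the text would still be treated as a bullet.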