Search and Delete text between two different search words

Post by cmaxnavy » Thu May 21, 2009 10:54 am

I'm new to UltraEdit and I'm using v14.20. I have been through the many help files and forum comments and cannot solve my search problem. I'm trying to search a large text doc and delete text between word1 and word2. In other words, I want to highlight and delete all of the text starting at word1 and ending at word2, including the search words (word1, word2). I have to do this many times in the document. So, I think a recursive routine would be helpful if that's doable. Otherwise, I think a macro that I invoke multiple times to complete the tasks would help. Any ideas?
Max

Re: Search and Delete text between two different search words

Post by pietzcker » Thu May 21, 2009 3:11 pm

This sounds like you don't need a macro at all - just one single search and replace routine, using a Perl regular expression.

Open the "Replace" dialog, check the checkbox "Regular Expressions" and set the radio button "Perl regular expression" in the "Advanced" section of that dialog. Then search for

(?s)\bword1\b.*?\bword2\b

and replace all with nothing.

Caution: This fails if word1/word2 pairs can be nested (e.g., "word1 text text word1 text text word2 text text word2").
Also be careful if your word1 or word2 contains characters that are special to regular expressions, like .*[]\+? and a few others. Those have to be escaped with a backslash. In that case, please be more specific about your exact words.
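To illustrate the point about special characters outside of UltraEdit, here is a minimal Python sketch. It assumes Python's `re` module, whose syntax is close to Perl's for these constructs. The helper name `between_pattern` and the delimiters "[begin]" and "end?" are made up for this example. `re.escape` adds the backslashes automatically; in UltraEdit's Replace dialog you would type them yourself. The `\b` anchors are left out here because the made-up delimiters start or end with punctuation.

```python
import re

def between_pattern(start, end):
    """Build a pattern that matches from a literal start text to a literal end text.

    Hypothetical helper for illustration only. re.escape() backslash-escapes
    characters such as [ ] ? so they are matched literally.
    """
    return re.compile("(?s)" + re.escape(start) + ".*?" + re.escape(end))

pattern = between_pattern("[begin]", "end?")
result = pattern.sub("", "keep [begin] drop this end? keep")
print(repr(result))
```

Without the escaping, "[begin]" would be read as a character class and "end?" as "en" followed by an optional "d".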

Explanation:
(?s) allows searches to span multiple lines
\b matches a word boundary, so if your word1 is "cat", only "cat" will match and not "advocate"
. matches any character (including newlines, thanks to (?s) above)
* allows the preceding element (here the .) to repeat any number of times (including zero)
? makes the * lazy so that it will only match as much as is absolutely necessary. This is mandatory because otherwise, in the text "word1 deletethis word2 dontdeletethis word1 deletethis word2" the regular expression would match from the very first word1 to the very last word2, deleting everything in-between.
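The difference between the lazy and the greedy form can be checked with a small Python sketch. It assumes Python's `re` module, which handles `(?s)`, `\b` and `.*?` the same way as the Perl engine for this pattern. The sample text is made up.

```python
import re

text = "keep1 word1 deletethis word2 dontdeletethis word1 deletethis word2 keep2"

# Lazy quantifier: each word1...word2 pair is matched on its own.
lazy = re.compile(r"(?s)\bword1\b.*?\bword2\b")
# Greedy quantifier: a single match from the first word1 to the last word2.
greedy = re.compile(r"(?s)\bword1\b.*\bword2\b")

lazy_result = lazy.sub("", text)
greedy_result = greedy.sub("", text)

print(repr(lazy_result))    # "dontdeletethis" survives
print(repr(greedy_result))  # "dontdeletethis" is deleted as well
```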

HTH,
Tim

Re: Search and Delete text between two different search words

Post by ridgerunner » Fri Jun 12, 2009 5:50 pm

By adding a negative lookahead to the ".*?" portion of pietzcker's Perl-style regex, you can match the innermost pair in nested instances such as "word1 word1 blah blah word2 word2":

(?s)\bword1\b(?:(?!\bword1\b).)*?\bword2\b

Then you can run this regex repeatedly, until it no longer finds a match, to remove nested word1-word2 instances from the inside out.
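As a rough sketch of the inside-out removal, here is the same regex run in a loop in Python. This assumes Python's `re` module rather than UltraEdit, and the function name `remove_nested` and the sample text are made up. In UltraEdit the equivalent would be pressing "Replace All" again until nothing is found.

```python
import re

# Matches a word1...word2 pair that contains no further word1 inside it.
inner = re.compile(r"(?s)\bword1\b(?:(?!\bword1\b).)*?\bword2\b")

def remove_nested(text):
    """Delete innermost word1...word2 pairs until no pair is left."""
    while True:
        text, count = inner.subn("", text)
        if count == 0:
            return text

sample = "a word1 x word1 y word2 z word2 b"
cleaned = remove_nested(sample)
print(repr(cleaned))
```

The first pass removes only the inner "word1 y word2". The second pass removes the outer pair that is left over.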

Re: Search and Delete text between two different search words

Post by pietzcker » Sat Jun 13, 2009 4:26 am

Cool.

Re: Search and Delete text between two different search words

Post by zrob » Fri Oct 22, 2010 10:31 am

Hi,

I would like to delete everything between word1 and a semicolon. I've modified the Perl regular expression
(?s)\bword1\b.*?\bword2\b
which works fine with two 'real' words, into:
(?s)\bword1\b.*?\b;\b

But that doesn't work. What do I have to add to make it accept the semicolon?

Thanks

Rob

Re: Search and Delete text between two different search words

Post by Bracket » Fri Oct 22, 2010 11:32 am

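The reason your modification isn't working is that "\b" matches only at a position between a word character and a non-word character, and a semicolon is itself a non-word character. So "\b;\b" requires a word character directly on both sides of the semicolon, which fails when the semicolon is followed by a space or a line break. If you want to make this work, you need to remove the "\b" on either side of the semicolon.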
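A short Python sketch of the semicolon case, assuming Python's `re` module (which treats `\b` like the Perl engine does here) and made-up sample texts:

```python
import re

text = "before word1 some text; after"

with_boundaries = re.compile(r"(?s)\bword1\b.*?\b;\b")
without_boundaries = re.compile(r"(?s)\bword1\b.*?;")

# The semicolon is followed by a space, so the second \b cannot match.
no_match = with_boundaries.search(text)

# With word characters on both sides of the semicolon, \b;\b does match.
tight_match = with_boundaries.search("word1 a;b")

# Without the \b around the semicolon the deletion works as intended.
fixed = without_boundaries.sub("", text)

print(no_match, tight_match is not None, repr(fixed))
```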

