Find string B somewhere before string A


Postby jan78 » Thu Jan 27, 2005 7:42 pm

Perhaps someone can help.

In a group of 15,000 html files, a few contain a certain error I need to fix.

Every file contains string A and string B.

String B should appear only after string A, never before it.

What I want to do is this:

Find in Files, all instances where string B appears before string A. (That means anywhere before... it could be immediately before, or several lines before).

UltraEdit will then generate a list of instances. (I can fix them manually since there won't be many.)

I can't seem to figure out what expression will find this. Do concepts like "somewhere before" and "somewhere after" (maybe spanning multiple lines) translate into any regexp syntax?

Any insight is appreciated.
jan78
Newbie
 
Posts: 6
Joined: Thu Jan 27, 2005 12:00 am
Location: California, USA

Re: Find string B somewhere before string A

Postby Mofi » Fri Jan 28, 2005 2:25 am

Try Find In Files with Results to Edit Window and the following regular expression in UltraEdit style:

string B[~|]+string A

The character | is just an example of a character that does not exist in any of your files.
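
For anyone who wants to try the same idea outside UltraEdit, here is a minimal Python sketch of that expression. It is not UltraEdit syntax: `[^|]+` is the Python spelling of `[~|]+`, and the function name `b_before_a` and its `absent` parameter are made up for illustration. It assumes, as above, that the excluded character never occurs in the files.

```python
import re


def b_before_a(text: str, a: str, b: str, absent: str = "|") -> bool:
    """Return True if b occurs and a follows it at least one character later."""
    # A negated character class matches every character except the listed one,
    # so it also matches \r and \n and can span any number of lines.
    pattern = re.escape(b) + "[^" + re.escape(absent) + "]+" + re.escape(a)
    return re.search(pattern, text) is not None


# Wrong order (B first): reported as an error.
print(b_before_a("<p>string B</p>\r\n<p>string A</p>", "string A", "string B"))
# Correct order (A first): not reported.
print(b_before_a("<p>string A</p>\r\n<p>string B</p>", "string A", "string B"))
```

Like the UltraEdit expression, this needs at least one character between the two strings, which should always be the case in real HTML.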
Mofi
Grand Master
 
Posts: 4051
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Find string B somewhere before string A

Postby palou » Fri Jan 28, 2005 2:31 am

The following macro lets you test whether StringA comes before StringB in a file:
Code:
InsertMode
ColumnModeOff
HexOff
UnixReOn
EndSelect
Top
Find "StringA"
IfFound
StartSelect
SelectToBottom
Key Ctrl+END
Find  "StringB"
Replace All SelectText "StringB"
IfFound
EndSelect
Do what you want, you know StringA is before StringB
EndIf
EndIf


HTH (Hope This Helps)
palou
Basic User
 
Posts: 46
Joined: Fri Dec 17, 2004 12:00 am
Location: Geneva / Switzerland

Re: Find string B somewhere before string A

Postby jan78 » Sun Jan 30, 2005 9:16 pm

OK.. success! Thank you both for the replies.

How funny that while looking for an arcane syntax, I didn't think of the most basic concept... use Find. Find string B, then find string A. If found, it means *error*!

So, I wrote a macro that does this. It works off of a list of the html filenames, opens each file, searches for error, then if found puts it in a list of only the files w/ the error.

May not be the most efficient thing to do, but it worked!
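
In case it helps anyone reading later, the same "find B, then find A after it" logic can be sketched in Python instead of an UltraEdit macro. This is only an illustration, not the macro I actually wrote: the function names, the UTF-8 assumption, and the way the file list is passed in are all hypothetical.

```python
from pathlib import Path


def has_b_before_a(text, a, b):
    """Find the first b, then look for a anywhere after it."""
    pos_b = text.find(b)
    if pos_b == -1:
        return False
    # Searching from the end of b: any hit means a appears after b, i.e. an error.
    return text.find(a, pos_b + len(b)) != -1


def files_with_error(paths, a, b):
    """Return the subset of paths whose contents have b before a."""
    bad = []
    for path in paths:
        # Assumption: files are readable as UTF-8; undecodable bytes are replaced.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        if has_b_before_a(text, a, b):
            bad.append(str(path))
    return bad
```

It could be called with something like `files_with_error(Path("site").rglob("*.html"), "string A", "string B")`, where `site` is a placeholder folder name. It relies on each file containing each string only once, which was true in my case.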

I tried to use your regexp Mofi because it would be much simpler to execute than what I did. But no success. Entering it just as you wrote (substituting my strings) came up with nothing found. That was with UE regexp (and I tried Unix regexp just to be sure I wasn't missing something).

What does the "~" character mean? I could not find this anywhere. If you would not mind would you elaborate in English what the expression does? Thanks!

Re: Find string B somewhere before string A

Postby Mofi » Mon Jan 31, 2005 2:46 am

"~" is described in the help under regular expressions in UltraEdit style.

[~|]+ means: Find one or more occurrences of any character (including \r\n) except |.

The problem with string B[~|]+string A is that UltraEdit sometimes has trouble identifying where [~|]+ should stop. It should stop at the first occurrence of "string A", but this does not always work.
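
To illustrate what "where to stop" means, here is a small Python sketch. It only shows how Python's own regex engine treats a negated class, not how UltraEdit behaves internally; the sample text is invented. In Python a plain `+` is greedy and runs to the last "string A", while `+?` stops at the first one.

```python
import re

# Invented sample: one "string B" followed by two occurrences of "string A".
text = "string B\r\nmiddle\r\nstring A\r\nmore\r\nstring A\r\n"

# Greedy: [^|]+ takes as much as it can, so the match ends at the LAST "string A".
greedy = re.search(r"string B[^|]+string A", text)

# Lazy: [^|]+? takes as little as it can, so the match ends at the FIRST "string A".
lazy = re.search(r"string B[^|]+?string A", text)

print("greedy match ends at offset", greedy.end())
print("lazy match ends at offset", lazy.end())
```

For a pure yes/no check of whether string B comes before string A, either variant gives the same answer; the difference only matters when the matched block itself is selected or replaced.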

Re: Find string B somewhere before string A

Postby jan78 » Mon Jan 31, 2005 6:33 pm

Ah, you're right it IS in the UE help. It didn't turn up on my search for "~". This time, I went down the list of UltraEdit syntax regular expressions & eyeballed them one by one. Yes it was there.

I see many uses for this expression if only I could get it to work! (In my case it doesn't have a problem where to stop, it has a problem finding anything at all.)

It must be something I'm doing wrong... which is impossible for you to see, of course :). I'll keep at it, something will come.

--

By the way, maybe this is dense of me but why is the "except |" needed? As I understand it, the goal is to find String B and String A connected by one or more occurrences of any characters including line breaks. That in itself would accomplish the purpose. I don't quite see what the "except |" is for.

Re: Find string B somewhere before string A

Postby palou » Tue Feb 01, 2005 1:56 am

It is because * (which matches any character) does not match a line break. So an expression like this:
"StringA*++StringB"
only finds a match on the same line.
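
The same effect can be seen in other regex engines. A short Python sketch, given only as an analogy and not as UltraEdit syntax (the sample text is invented): there `.` plays the role of "any character" and also refuses to cross a line break unless told otherwise, while a negated class crosses it without any extra flag.

```python
import re

# Invented sample with the two strings on different lines.
text = "StringA\r\nsome markup\r\nStringB"

# "." does not match \n by default, so this finds nothing across lines.
same_line_only = re.search(r"StringA.*StringB", text)

# With re.DOTALL, "." matches line breaks as well.
with_dotall = re.search(r"StringA.*StringB", text, re.DOTALL)

# A negated class such as [^|] matches line breaks without any flag.
with_negated_class = re.search(r"StringA[^|]*StringB", text)

print(same_line_only is None, with_dotall is not None, with_negated_class is not None)
```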

Regards,
Alain

Re: Find string B somewhere before string A

Postby jan78 » Thu Feb 03, 2005 4:46 am

Very interesting.

So if I understand correctly, you're saying that the "except" expression is a workaround for the pain of specifying all the different possible characters that might be included in the range.... & that it's much easier to turn it on its head and say, "Any character that isn't expressly excluded is included!" That way, the line breaks and other various characters are automatically included without having to specify them.

Is that the reason for it? Just guessing but it seems to make sense.

Re: Find string B somewhere before string A

Postby jan78 » Thu Feb 03, 2005 5:01 am

Also I am thinking that it solves a particular dilemma around line breaks which you seemed to be getting at.... that is, of having to specify line breaks when the number of them is unknown. For some reason, this can't be expressed directly in a regexp, or at least not in this implementation of it. Do I understand that correctly?

Re: Find string B somewhere before string A

Postby Mofi » Thu Feb 03, 2005 11:48 am

Yes, you understand it correctly. [~c]+, with c being a character that surely does not exist in the text, is the expression to select a whole block without knowing the number of line breaks. But as I already mentioned, UltraEdit sometimes does not stop selecting at the correct position defined by the string after [~c]+.

Re: Find string B somewhere before string A

Postby jan78 » Wed Feb 09, 2005 3:41 am

Thank you so much for the help.

