Tagged expression with an OR

Postby SAbboushi » Fri Nov 09, 2007 5:53 pm

Have been through FAQ, Powertips, and searched this find forum but not able to find what I am looking for...

I am using UE 11.20+4 with Unix regexp

I have several text files that I need to manipulate. Here is a bogus sample for discussion purposes:

abc bbb [abcdefg:hijk] qwe bbb ccc
asd
aaa bbb THEWORD aaa bbb
qwe
aaa bbb [bbcdefg:hijk] aaa bbb ccc
asd
aaa bbb THEWORD aaa bbb THEWORD
qwe THEWORD


I want the resulting file to look like this:

abcdefg:hijk,THEWORD
bbcdefg:hijk,THEWORD THEWORD THEWORD

I am basically looking for any text within the square brackets and I am also looking for a specific string ("THEWORD" in this example)

I wanted to use tagged expression search/replace but couldn't get past the search part:

(\[.+\]|THEWORD)

To test the search component, I planned to use "List Lines Containing Strings", which should show me a list of all lines that contain either a string within square brackets or "THEWORD". BUT it only shows lines that contain "THEWORD". "\[.+\]" works fine by itself, but not when I add it to the OR search expression.

I want to delete all text from the beginning of the line to the "[" and to delete all characters from the "]" to (and INCLUDING) the end of line EXCEPT for any occurrences of "THEWORD".

I also want to make sure that on those lines that do NOT have square brackets but do have "THEWORD", all characters OTHER THAN occurrences of "THEWORD" will be deleted.

I figured I could use tagged expressions (e.g. \1) to help, although I guess I would need to strip out the square brackets afterwards if I did it that way.
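
For readers who want to check the intended transformation outside UltraEdit, here is a minimal Python sketch of the rules described above. It is not UltraEdit syntax, and the names `TOKEN` and `extract` are made up for illustration. It assumes every "THEWORD" belongs to the most recent bracketed block and that a "THEWORD" before the first bracket is dropped.

```python
import re

SAMPLE = """\
abc bbb [abcdefg:hijk] qwe bbb ccc
asd
aaa bbb THEWORD aaa bbb
qwe
aaa bbb [bbcdefg:hijk] aaa bbb ccc
asd
aaa bbb THEWORD aaa bbb THEWORD
qwe THEWORD
"""

# One alternation: group 1 captures the text inside [...],
# the second branch matches the literal word.
TOKEN = re.compile(r"\[([^\]]+)\]|THEWORD")


def extract(text):
    """Return one 'bracket-text,THEWORD THEWORD ...' line per [...] block."""
    rows = []  # each entry is [bracket_text, number_of_words_seen]
    for m in TOKEN.finditer(text):
        if m.group(1) is not None:
            rows.append([m.group(1), 0])   # a bracketed block starts a new output line
        elif rows:
            rows[-1][1] += 1               # attach the word to the latest block
        # assumption: a word seen before any [...] block is ignored
    return [key + "," + " ".join(["THEWORD"] * n) for key, n in rows]


for line in extract(SAMPLE):
    print(line)
```

On the sample data this prints the two lines shown in the desired result above.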


Would appreciate some help...!

With Regards-
Sam
SAbboushi
Basic User
Posts: 37
Joined: Sun Nov 07, 2004 12:00 am

Re: Tagged expression with an OR

Postby jorrasdk » Fri Nov 09, 2007 10:33 pm

There's always a risk of getting the wrong feedback when using bogus data. I'm not sure I understand how your "before" example ends up as the shown result. The macro below will produce the result you want, but I'm not sure it will necessarily work on your "live" data.

InsertMode
ColumnModeOff
HexOff
UnixReOn
Top
Find "^p"
Replace All " "
Find "THEWORD"
Replace All "ÿ "
Find RegExp "^[^ÿ\[]*(.)"
Replace All "\1"
Find "["
Replace All "^p["
Find RegExp "\][^ÿ\r\n]*"
Replace All "] "
Find RegExp "ÿ[^ÿ\r\n]*"
Replace All "ÿ "
Find "ÿ"
Replace All "THEWORD"
Find "["
Replace All ""
Find "]"
Replace All ","


Notes: "THEWORD" is replaced temporarily with a non-occurring character "ÿ". If this character can occur in your data or is problematic in your code page, then choose another one instead.
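
One way to check whether a placeholder is safe is to test the data for it before running the replacements. The following Python sketch shows the idea. The function name `pick_placeholder` and the candidate list are assumptions for illustration and have nothing to do with UltraEdit itself.

```python
def pick_placeholder(text, candidates="\u00ff\u00a4\u00a7\u0001"):
    """Return the first candidate character that does not occur in text."""
    for ch in candidates:
        if ch not in text:
            return ch
    raise ValueError("every candidate placeholder occurs in the data")


data = "aaa bbb THEWORD aaa bbb"
mark = pick_placeholder(data)              # "\u00ff" (ÿ) for this data
masked = data.replace("THEWORD", mark)     # temporary substitution
restored = masked.replace(mark, "THEWORD") # round trip back to the original
print(restored == data)
```

If the data already contains ÿ, the function falls through to the next candidate instead of silently corrupting the text on the way back.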

I repeat the macro with comments. Each line in parentheses explains the Find/Replace pair below it and is not a macro command:

InsertMode
ColumnModeOff
HexOff
UnixReOn
Top
(Join all text onto one line, assuming DOS line endings)
Find "^p"
Replace All " "
(Temporarily replace THEWORD with the placeholder)
Find "THEWORD"
Replace All "ÿ "
(Remove everything from the start up to the first ÿ or [)
Find RegExp "^[^ÿ\[]*(.)"
Replace All "\1"
(Insert a line break before each [)
Find "["
Replace All "^p["
(Remove text after ] up to the first ÿ or the end of the line)
Find RegExp "\][^ÿ\r\n]*"
Replace All "] "
(Remove text after ÿ up to the next ÿ or the end of the line)
Find RegExp "ÿ[^ÿ\r\n]*"
Replace All "ÿ "
(Change the placeholder back to THEWORD)
Find "ÿ"
Replace All "THEWORD"
(Remove [)
Find "["
Replace All ""
(Replace ] with a comma)
Find "]"
Replace All ","
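
To follow the steps outside the editor, here is a rough Python emulation of the same sequence of replacements. It is only a sketch: Python's `re` module is not UltraEdit's Unix regexp engine, so details such as whitespace and end-of-file handling may differ from what the macro does in UltraEdit. The function name `emulate_macro` is made up for this example.

```python
import re

MARK = "\u00ff"  # the placeholder character ÿ used by the macro


def emulate_macro(text, word="THEWORD"):
    """Apply the macro's Find/Replace steps in order (DOS line endings assumed)."""
    text = text.replace("\r\n", " ")                  # join everything onto one line
    text = text.replace(word, MARK + " ")             # temporarily mask the word
    # drop everything from the start up to the first placeholder or [
    text = re.sub("^[^" + MARK + r"\[]*(.)", r"\1", text, count=1)
    text = text.replace("[", "\r\n[")                 # line break before each [
    # remove text after ] up to the next placeholder or line end
    text = re.sub(r"\][^" + MARK + r"\r\n]*", "] ", text)
    # remove text after a placeholder up to the next placeholder or line end
    text = re.sub(MARK + "[^" + MARK + r"\r\n]*", MARK + " ", text)
    text = text.replace(MARK, word)                   # restore the word
    text = text.replace("[", "")                      # remove [
    text = text.replace("]", ",")                     # ] becomes a comma
    return text


SAMPLE = "\r\n".join([
    "abc bbb [abcdefg:hijk] qwe bbb ccc",
    "asd",
    "aaa bbb THEWORD aaa bbb",
    "qwe",
    "aaa bbb [bbcdefg:hijk] aaa bbb ccc",
    "asd",
    "aaa bbb THEWORD aaa bbb THEWORD",
    "qwe THEWORD",
])

for line in emulate_macro(SAMPLE).split("\r\n"):
    print(repr(line))
```

In this emulation the result starts with an empty line, and each data line keeps a space after the comma and a trailing space, so some cleanup is still needed afterwards.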


If this was not exactly what you wanted, either change the macro yourself or post some real examples of your data. Enclose it in [code][/code] tags.
jorrasdk
Master
Posts: 275
Joined: Mon Mar 19, 2007 11:00 pm
Location: Denmark

Re: Tagged expression with an OR

Postby SAbboushi » Sat Nov 10, 2007 7:59 pm

Thank you very much for your time. I have never done macros before so will need to do some learning so I can understand what you have done.

As an aside, I am curious: if you or someone else could help me understand why my search string did not work as I expected, I would also be grateful!

i.e. (\[.+\]|THEWORD)

Re: Tagged expression with an OR

Postby jorrasdk » Sat Nov 10, 2007 9:23 pm

Because OR expressions support only literal text (A|B), no wildcards etc.

Re: tagged expression with an OR

Postby Jane » Sat Nov 10, 2007 11:18 pm

It's interesting that when I tried version 11.20b, the expression (\[.+\]|THEWORD) would not find text matching \[.+\], but when I tried it in version 12.20 with regexp set to Perl compatible, it matched both terms, i.e. it also matched any text for \[.+\].

Jane
Jane
Basic User
Posts: 22
Joined: Sat Aug 05, 2006 11:00 pm
Location: Canada

Re: Tagged expression with an OR

Postby jorrasdk » Sat Nov 10, 2007 11:58 pm

Yes, Perl regexp is more powerful than the legacy Unix-style regexp.
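
As an illustration of alternation with a wildcard branch in a Perl-like engine, here is a small Python sketch that imitates listing the lines containing either term. Python's `re` module stands in for the Perl-compatible engine here, which is an assumption, and `lines_containing` is a made-up helper name.

```python
import re

LINES = [
    "abc bbb [abcdefg:hijk] qwe bbb ccc",
    "asd",
    "aaa bbb THEWORD aaa bbb",
    "qwe",
    "aaa bbb [bbcdefg:hijk] aaa bbb ccc",
    "asd",
    "aaa bbb THEWORD aaa bbb THEWORD",
    "qwe THEWORD",
]

# The expression from the original question: bracketed text OR the literal word.
PATTERN = re.compile(r"(\[.+\]|THEWORD)")


def lines_containing(lines, pattern=PATTERN):
    """Return (line_number, line) pairs for every line the pattern matches."""
    return [(n, line) for n, line in enumerate(lines, start=1)
            if pattern.search(line)]


for n, line in lines_containing(LINES):
    print(n, line)
```

Note that `.+` is greedy, so on a line with two bracketed blocks it would match from the first `[` to the last `]`. Using `[^\]]+` inside the brackets avoids that.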

Re: Tagged expression with an OR

Postby SAbboushi » Sun Nov 11, 2007 4:20 am

So (\[.+\]|THEWORD) search WILL work with 12.20 perl regexp?

Thanks for your posts!

Re: Tagged expression with an OR

Postby SAbboushi » Mon Nov 12, 2007 1:55 pm

jorrasdk - EXCELLENT! FANTASTIC. Macros are my new friend! I just have to figure out what that code means! (thanks for your comments). EXCELLENT!

Your code works perfectly for me except for the last line of my text documents: it does not delete the text to the right of the last occurrence of "THEWORD".

Your point is well taken re: bogus data... my example did not CONTAIN any text to the right of the last occurrence of "THEWORD"... but assuming that it did, does anyone have an easy fix to delete that text too?

Thanks again-
Sam

Re: Tagged expression with an OR

Postby Mofi » Mon Nov 12, 2007 2:35 pm

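Try this macro. The new code (shown in red in the original post) is the block from Bottom up to the first Top, and the lines from Key END to the end.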

InsertMode
ColumnModeOff
HexOff
UnixReOn
Bottom
Find Up "THEWORD"
IfFound
Key LEFT ARROW
Key RIGHT ARROW
SelectToBottom
IfSel
Find RegExp "(\[.*\])"
Replace All SelectText "\1"
IfNotFound
Delete
EndIf
EndSelect
EndIf
EndIf

Top
Find "^p"
Replace All " "
Find "THEWORD"
Replace All "ÿ "
Find RegExp "^[^ÿ\[]*(.)"
Replace All "\1"
Find "["
Replace All "^p["
Find RegExp "\][^ÿ\r\n]*"
Replace All "] "
Find RegExp "ÿ[^ÿ\r\n]*"
Replace All "ÿ "
Find "ÿ"
Replace All "THEWORD"
Find "["
Replace All ""
Find "]"
Replace All ","
Key END
IfColNum 1
DeleteLine
EndIf
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
Mofi
Grand Master
Posts: 4039
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Tagged expression with an OR

Postby SAbboushi » Tue Nov 13, 2007 2:32 pm

Thanks Mofi! PERFECT!

Re: Tagged expression with an OR

Postby SAbboushi » Tue Nov 13, 2007 6:01 pm

Hi folks - instead of hardcoding "THEWORD", I would instead like to highlight a word within the text file before running the macro. I've been messing around with ^s and ^c, but don't seem to be making progress. I've downloaded Mofi's macro examples, and been looking at help files and searches on this forum, but spending a LOT of time and not feeling any closer. I found a comment that Mofi had made back in 2004

"I noticed, that ^c and ^s are simply not working in a regular expression search with Unix style. " - but wasn't sure if this had any relevance.

Can anyone please help me convert Mofi's macro above so that instead of hardcoding "THEWORD", the macro will instead use the highlighted text?

With Regards-
Sam

Re: Tagged expression with an OR

Postby Mofi » Wed Nov 14, 2007 8:58 am

Yes, ^c works only in UltraEdit-style regular expressions or in non-regular-expression finds/replaces. In your macro THEWORD is used 3 times in non-regular-expression finds/replaces, so you only have to replace THEWORD with ^c and, of course, copy the selected text to a clipboard, preferably not the Windows clipboard (the macro below uses clipboard 9). The macro now runs only if something is selected when it starts.

IfSel
Clipboard 9
Copy

InsertMode
ColumnModeOff
HexOff
UnixReOn
Bottom
Find Up "^c"
IfFound
Key LEFT ARROW
Key RIGHT ARROW
SelectToBottom
IfSel
Find RegExp "(\[.*\])"
Replace All SelectText "\1"
IfNotFound
Delete
EndIf
EndSelect
EndIf
EndIf
Top
Find "^p"
Replace All " "
Find "^c"
Replace All "ÿ "
Find RegExp "^[^ÿ\[]*(.)"
Replace All "\1"
Find "["
Replace All "^p["
Find RegExp "\][^ÿ\r\n]*"
Replace All "] "
Find RegExp "ÿ[^ÿ\r\n]*"
Replace All "ÿ "
Find "ÿ"
Replace All "^c"
Find "["
Replace All ""
Find "]"
Replace All ","
Key END
IfColNum 1
DeleteLine
EndIf
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
ClearClipboard
Clipboard 0
EndIf
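
The same distinction between a literal search and a regular expression search exists in other tools. As a loose analogy only (this is Python, not UltraEdit, and the function names are hypothetical), user-supplied text can be substituted literally, or it has to be escaped first when it goes through a regex engine:

```python
import re

MARK = "\u00ff"  # placeholder character ÿ, as in the macro


def mask_word(text, word, mark=MARK):
    """Literal replacement: no character of word is treated as a regex operator."""
    return text.replace(word, mark + " ")


def mask_word_regex(text, word, mark=MARK):
    """Same result through the regex engine; re.escape neutralises metacharacters."""
    return re.sub(re.escape(word), lambda m: mark + " ", text)


selected = "C++ (beta)"                      # stands in for the highlighted text
text = "notes C++ (beta) more C+ (beta)"

print(mask_word(text, selected) == mask_word_regex(text, selected))
```

Without `re.escape`, the `+` and the parentheses in the selected text would be read as regex operators, which is one reason to keep user-selected text in the literal (non-regex) steps.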

Re: Tagged expression with an OR

Postby SAbboushi » Wed Nov 14, 2007 1:44 pm

Thanks Mofi! Absolutely WONDERFUL! I don't yet understand it, but it's WONDERFUL!!

I am a little embarrassed to ask the forum for more help on this macro - I have been reading up on macros to try to get up to speed, but I am having a tough time finding the pieces I need to understand to do what I want to do...

After the macro runs, I will end up with something like this:

abcdefg:hijk,THEWORD
bbcdefg:hijk,THEWORD THEWORD THEWORD
qweqwe:qwer, THEWORD THEWORD

What I REALLY need (which I have tried to do manually AFTER running this amazing macro) is a result that looks like this:

abcdefg:hijk,THEWORD
bbcdefg:hijk,THEWORD
bbcdefg:hijk,THEWORD(2)
bbcdefg:hijk,THEWORD(3)
qweqwe:qwer, THEWORD
qweqwe:qwer, THEWORD(2)

With large files, I have found how "human" I am in making errors as I am copying and pasting and typing... trying to get the above result.

I spent some time looking over posts having to do with duplicate values and duplicate lines hoping to get a clue (most posts had to do with deleting) - but this is still way beyond me... so all help is appreciated
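
The numbering step described above can be sketched in a few lines of Python, in case a script is an acceptable fallback to a macro. This is only a sketch under assumptions: each input line has the form key, comma, then the repeated word separated by spaces, and lines without any word after the comma are dropped. The function name `number_repeats` is made up.

```python
def number_repeats(lines):
    """Split 'key,WORD WORD WORD' into one line per word, numbering repeats."""
    out = []
    for line in lines:
        key, _, rest = line.partition(",")
        words = rest.split()
        # keep whatever whitespace followed the comma in the input
        pad = rest[: len(rest) - len(rest.lstrip())]
        for i, word in enumerate(words, start=1):
            suffix = "" if i == 1 else "({})".format(i)
            out.append(key + "," + pad + word + suffix)
    return out


macro_output = [
    "abcdefg:hijk,THEWORD",
    "bbcdefg:hijk,THEWORD THEWORD THEWORD",
    "qweqwe:qwer, THEWORD THEWORD",
]

for line in number_repeats(macro_output):
    print(line)
```

For the three lines shown as the macro result, this prints the six numbered lines listed above.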


The rest I can do manually if need be, but if anyone insists on helping me even further... :D

I need the file sorted with some more slicing and dicing for THIS final result:

"THEWORD","
abcdefg:hijk
bbcdefg:hijk
qweqwe:qwer
"
"THEWORD(2)","
bbcdefg:hijk
qweqwe:qwer
"
"THEWORD(3)","
bbcdefg:hijk
"

Mofi - your macro alone has saved me a tremendous amount of time - even if someone does not help me further, I am still way ahead here. Thanks to all for your posts.
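
The grouped final result requested above could be produced with a short script as well. The Python sketch below assumes the input is the numbered list (one key,label pair per line), keeps the labels in order of first appearance, and sorts the keys inside each group. The function name `group_by_label` is hypothetical.

```python
def group_by_label(lines):
    """Group 'key,label' lines by label and emit the quoted block layout."""
    groups = {}
    for line in lines:
        key, _, label = line.partition(",")
        groups.setdefault(label.strip(), []).append(key)

    out = []
    for label, keys in groups.items():   # labels in order of first appearance
        out.append('"{}","'.format(label))
        out.extend(sorted(keys))         # keys sorted within each group
        out.append('"')
    return out


numbered = [
    "abcdefg:hijk,THEWORD",
    "bbcdefg:hijk,THEWORD",
    "bbcdefg:hijk,THEWORD(2)",
    "bbcdefg:hijk,THEWORD(3)",
    "qweqwe:qwer, THEWORD",
    "qweqwe:qwer, THEWORD(2)",
]

print("\n".join(group_by_label(numbered)))
```

Labels are deliberately not sorted as strings, because "THEWORD(10)" would then sort before "THEWORD(2)".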

And what an amazing product!

With Regards-
Sam