Finding vals in delimited text, copy line, replace some text

Help with writing and playing macros

Finding vals in delimited text, copy line, replace some text

Postby cfay » Wed Sep 12, 2007 7:49 pm

I am new to macros and have been reading up on how they work. Because I need to get this project done, I was thinking I should post this question and see if someone might be able to point me in the right direction. Once I get into these macros, I am sure I will be able to build and learn.

Anyway, here is the deal:

I have 100's of files. They have comma delimited text and the beginning of each row looks like this:
Code: Select all
0,0,"DOC","014","000031108.doc","","","",
0,0,"DOC","014","000031109.doc","","","",
0,0,"DOC","014","000031110.doc","","","",
0,0,"DOC","014","000031111.doc","","","",
28,0,"ZIP","014","000031112.zip","","",""
28,0,"SCC","014","000031113.scc","","",""
28,0,"TSV","014","000031114.TSV","","",""
28,0,"TXT","014","000031115.TXT","","",""
28,0,"TSV","014","000031116.TSV","","",""
0,0,"IDX","014","000031117.idx","","","",
29,0,"ZIP","014","000031118.zip","","",""
29,0,"MF","014","000031119.MF","","","","
29,0,"CLASS","014","000031120.class","","
29,0,"CLASS","014","000031121.class","","


What I need to do is search the third column for certain extensions, being ones that I want to exclude. I have a list of these extensions. Examples from above are IDX, CLASS, and SCC.

Once I find a row that has one of those extensions, I want to copy that whole line to a new file (all rows into 1 file)
Then I want to replace the third, fourth and fifth columns with replacement text - always the same.

So this:
29,0,"CLASS","014","000031121.class","","

would become this:
29,0,"TIF","","PlaceHolder.TIF","","


Does this make sense? I have been reading a bunch of your posts and I am guessing this will be a breeze for all you experts. Since I have never made a (successful) macro in ue, I have not gotten there yet.

Any help is greatly appreciated!
User avatar
cfay
Newbie
 
Posts: 6
Joined: Tue Sep 11, 2007 11:00 pm

Re: Finding vals in delimited text, copy line, replace some text

Postby cfay » Thu Sep 13, 2007 1:31 am

Well, I am not sure if I just need to be patient for someone to be able to reply or I just explained poorly. Anyway, I have a macro that partly works. If anyone can help with the issues, I would love it.

Here is what I have:

Code: Select all
InsertMode
ColumnModeOn
HexOff
UnixReOff
Find "TIFF"^p"
Replace All ""
Find "Hold"^p
Replace All ""
Find "CollectionID=0"
Replace All "CollectionID=1234"
Find "DocumentTextFile"^p
Replace All ""
Find "ExternalID=31"
Replace All "ExternalID=15"
Find "mar"
Key HOME
Key SHIFT
StartSelect
Key END
Copy
Open "\\Path1\Path2\Vol007\Copy of Vol007-ExcludedFiles_IncludedRecords.txt"
Bottom
IfColNum 1
Else
"
"
EndIf
Paste
"
"
CloseFile Save
ClearClipboard
Clipboard 0
EndIf
Find "scr"
Key HOME
Key SHIFT
StartSelect
Key END
Copy
Open "\\Path1\Path2\Vol007\Copy of Vol007-ExcludedFiles_IncludedRecords.txt"
Bottom
IfColNum 1
Else
"
"
EndIf
Paste
"
"
CloseFile Save
ClearClipboard
Clipboard 0
EndIf
Find "dms"
Key HOME
Key SHIFT
StartSelect
Key END
Copy
Open "\\Path1\Path2\Vol007\Copy of Vol007-ExcludedFiles_IncludedRecords.txt"
Bottom
IfColNum 1
Else
"
"
EndIf
Paste
"
"
CloseFile Save
ClearClipboard
Clipboard 0
EndIf
Find "mid"
Key HOME
Key SHIFT
StartSelect
Key END
Copy
Open "\\Path1\Path2\Vol007\Copy of Vol007-ExcludedFiles_IncludedRecords.txt"
Bottom
IfColNum 1
Else
"
"
EndIf
Paste
"
"
CloseFile Save
ClearClipboard
Clipboard 0
EndIf
Find RegExp "0,"scr","*","*.*""
Replace All "0,"TIF","","ExclusionPlaceHolder.TIF"
Find RegExp "0,"mar","*","*.*""
Replace All "0,"TIF","","ExclusionPlaceHolder.TIF"
Find RegExp "0,"dms","*","*.*""
Replace All "0,"TIF","","ExclusionPlaceHolder.TIF"
Find RegExp "0,"mid","*","*.*""
Replace All "0,"TIF","","ExclusionPlaceHolder.TIF"


The only problem that I have is that it is copying lines that are not to be excluded. It also leaves an extra line between each exclusion line that it copies out.

Anyone? Anyone? Bueller? :D

lol. Thank you,
User avatar
cfay
Newbie
 
Posts: 6
Joined: Tue Sep 11, 2007 11:00 pm

Re: Finding vals in delimited text, copy line, replace some

Postby pietzcker » Thu Sep 13, 2007 6:43 am

Hi, I'm no macros expert, but I can give you a (Perl style) regex that should find only the lines you want.

Code: Select all
^(\d+,\d+,)(?:"(?:IDX|SCC|CLASS)","[^"]+","[^"]+")(.*)$

should work. It will capture the first two columns into backreference \1, ensure that the columns 3-5 match and capture the rest of the line into \2.

So you could search for the above regex and replace with
Code: Select all
\1"TIF","","ExclusionPlaceHolder.TIF"\2


If I do this search/replace manually in your example code, it works without creating extra line feeds.

HTH,
Tim
User avatar
pietzcker
Master
Master
 
Posts: 241
Joined: Sun Aug 22, 2004 11:00 pm

Re: Finding vals in delimited text, copy line, replace some

Postby cfay » Thu Sep 13, 2007 12:04 pm

Thanks, I will look at this, although I have never gotten into perl, even a little. Will this copy the line to another file? It looks like perl will let me specify all extensions with a pipe delimiter. I think I can do this, but wanted to understand the part that copies the lines to the other file. Do I put this in as a new macro in ue?

Thank you, :lol:
User avatar
cfay
Newbie
 
Posts: 6
Joined: Tue Sep 11, 2007 11:00 pm

Re: Finding vals in delimited text, copy line, replace some

Postby jorrasdk » Thu Sep 13, 2007 4:02 pm

It's not Perl, it a regular expression as implemented in Perl. Nothing here copies anything anywhere. You still have to work out a macro that does this.

Using pietzcker's regexp's I have this suggestion for a macro:

Code: Select all
InsertMode
ColumnModeOff
HexOff
PerlReOn
Clipboard 8
ClearClipboard
Bottom
IfColNumGt 1
"
"
EndIf
Top
Loop
Find RegExp "^(\d+,\d+,)(?:"(?:IDX|SCC|CLASS)","[^"]+","[^"]+")(.*)\r\n"
IfFound
CutAppend
Else
ExitLoop
EndIf
EndLoop
Open "C:\temp\Copy of Vol007-ExcludedFiles_IncludedRecords.txt"
Paste
ClearClipboard
Clipboard 0
Top
Find RegExp "^(\d+,\d+,)(?:"(?:IDX|SCC|CLASS)","[^"]+","[^"]+")(.*)$"
Replace All "\1"TIF","","ExclusionPlaceHolder.TIF"\2"
User avatar
jorrasdk
Master
Master
 
Posts: 275
Joined: Mon Mar 19, 2007 11:00 pm
Location: Denmark


Return to Macros