Move Data to Delimited Format

Help with writing and playing macros

Move Data to Delimited Format

Postby umayxa3 » Wed Feb 13, 2013 2:57 pm

I know this has come up over and over (sorry MOFI) but I can't seem to find what I need in the forums (or in my notes).

I need to run a macro to rip through some files to convert ROW DATA to a DELIMITED FORMAT data.
I have about 30 files containing 100000 records. They need to be converted to a delimited format to import into SQL Server.

DATA TO CONVERT (REPEATED 100000 TIMES FOR EACH RECORD IN THE FILE)
-----------------------------------
...
BEGIN:>>DocTypeName: CS - ZZ Archive Results
[6]Last Name: SHMOE
[5]First Name: JOE
[1]P#: ZHELDHL22344
[9]Date: 02/13/2004
[2]Account/Client #: ASD2342113423docdate: 04/12/2004
>>FileTypeNum: 1
>>FullPath: w:\wil\ber\knuck\les.txt
...

DESTINATION FORMAT:
-----------------------------------
[1]|[2]|[BLANK]|[BLANK]|[5]|[6]|[BLANK]|[BLANK]|[9]

MOFI: Do you have a macro you could copy-and-paste for me? I promise I'll put it in my notes and not lose it =:)

Thanks,

-Allen
umayxa3
Newbie
 
Posts: 6
Joined: Sat Aug 15, 2009 8:33 am

Re: Move Data to Delimited Format

Postby Mofi » Thu Feb 14, 2013 3:01 am

Try this macro:

Code: Select all
InsertMode
ColumnModeOff
HexOff
UltraEditReOn
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
TrimTrailingSpaces
Find MatchCase RegExp "%Last Name: +^(*^)^pFirst Name: +^(*^)^pP#: +^(*^)^pDate: +^(*^)^pAccount/Client #: +^(*^)$"
Replace All "!^3|^5|||^2|^1|||^4"
Find MatchCase RegExp "%[~!]*^p"
Replace All ""
Find MatchCase RegExp "%!"
Replace All ""

It converts

Code: Select all
BEGIN:>>DocTypeName: CS - ZZ Archive Results
Last Name: SHMOE
First Name: JOE
P#: ZHELDHL22344
Date: 02/13/2004
Account/Client #: ASD2342113423docdate: 04/12/2004
>>FileTypeNum: 1
>>FullPath: w:\wil\ber\knuck\les.txt

to

Code: Select all
ZHELDHL22344|ASD2342113423docdate: 04/12/2004|||JOE|SHMOE|||02/13/2004
User avatar
Mofi
Grand Master
Grand Master
 
Posts: 4049
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Move Data to Delimited Format

Postby umayxa3 » Fri Feb 15, 2013 4:53 pm

PERFECT!!
Thank you Mofi!

I've tucked that safely into my notes for the next time I need to do this (in 2yrs) =:)

-Allen
umayxa3
Newbie
 
Posts: 6
Joined: Sat Aug 15, 2009 8:33 am

Re: Move Data to Delimited Format

Postby umayxa3 » Mon Feb 18, 2013 11:22 am

Mofi,

I have 253 files to run this against.. (ugh).
What would be the best way to wrap this into a FindInFiles loop?

Thanks again,

-Allen
umayxa3
Newbie
 
Posts: 6
Joined: Sat Aug 15, 2009 8:33 am

Re: Move Data to Delimited Format

Postby Mofi » Tue Feb 19, 2013 1:16 am

If you are not sure that last line of every file has a DOS line terminator, run a Perl regular expression Replace in Files with search string (.)$(?!\r\n) and replace string \1\r\n

That appends a DOS line terminator at end of those files not already ending with a DOS line terminator.

To trim all spaces/tabs at end of the lines use an UltraEdit regular expression Replace in Files with search string [ ^t]+$ and an empty replace string. With the Perl regexp engine the search string would be [ \t]+$

Next run simply 3 UltraEdit regular expression Replace in Files using the search and replace strings as in the macro.

As macro:

PerlReOn
ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "(.)$(?!\r\n)" "\1\r\n"
UltraEditReOn
ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "[ ^t]+$" ""
ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "%Last Name: +^(*^)^pFirst Name: +^(*^)^pP#: +^(*^)^pDate: +^(*^)^pAccount/Client #: +^(*^)$" "!^3|^5|||^2|^1|||^4"
ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "%[~!]*^p" ""
ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "%!" ""
User avatar
Mofi
Grand Master
Grand Master
 
Posts: 4049
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Move Data to Delimited Format

Postby umayxa3 » Tue Feb 19, 2013 2:33 pm

Again.. thank you.

This is AWESOME!
umayxa3
Newbie
 
Posts: 6
Joined: Sat Aug 15, 2009 8:33 am

Re: Move Data to Delimited Format

Postby umayxa3 » Tue Feb 19, 2013 5:14 pm

I'm sorry...
I've hit another road block and I can't slice this at the source and I can't get my mind around doing this in UltraEdit..

The data I'm getting contains rows of similar data. I want to "filter" the useless data but can't figure out how to do this in a macro.

EXAMPLE OF THE DATA I'M LEFT WITH:
---------------------------------------
Last Name: SHMOE|First Name: Joe|P#: ZHELDHL22344|Data1: GOOD DATA|Data2: GOOD DATA
Last Name: SHMOE|First Name: Joe|P#: ZHELDHL22344|Data1: BAD DATA|Data2: BAD DATA
Last Name: QUEUE|First Name: Suzy|P#: BELDDHL2454544|Data1: GOOD DATA|Data2: GOOD DATA
Last Name: QUEUE|First Name: Suzy|P#: BELDDHL2454544|Data1: BAD DATA|Data2: BAD DATA
...
I want to keep the first record and toss out the second.

If I could just use the tag expression in the SEARCH I would resolve this.
Example of what I would like to do (but UE won't let me):

WHAT I WOULD LIKE TO DO:
----------------------------------
Find MatchCase RegExp "Last Name: ^(*^)|First Name: ^(*^)|P#: ^(*^)Last Name: +^1First Name: +^2P#: +^3"
Replace All "Last Name: ^1|First Name: ^2|P#: ^3"
...
The macro searches twice but only returns the first data found.

Am I just making this too hard?

How would you come at this?

Thanks again,

-Allen
umayxa3
Newbie
 
Posts: 6
Joined: Sat Aug 15, 2009 8:33 am

Re: Move Data to Delimited Format

Postby Mofi » Wed Feb 20, 2013 1:36 am

This feature of using a tagged string already in search string is called back referencing. Only the most powerful Perl regular expression engine supports back referencing. The legacy UltraEdit and Unix regular expression engines do not support referencing a tagged string already in search string.

PerlReOn
Find MatchCase RegExp "^(.*Last Name: )(.*)(\|First Name: )(.*)(\|P#: )(.*)(\|.*)\r\n.*Last Name: \2\|First Name: \4\|P#: \6\|.*$"
Replace All "\1\2\3\4\5\6\7"

The backslash left to every | is necessary as character | in a Perl regular expression means OR and therefore this character must be escaped to find simply character |.

.* is in Perl syntax the same as just * in UltraEdit syntax.

Depending on your data the regular expression search and replace string could be less complex:

PerlReOn
Find MatchCase RegExp "^(.*Last Name: .*\|First Name: .*\|P#: .*\|)(.*)\r\n\1.*$"
Replace All "\1\2"
User avatar
Mofi
Grand Master
Grand Master
 
Posts: 4049
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna


Return to Macros