Match data from 2 different structures see example

Postby LiveAd » Fri Dec 22, 2006 5:04 am

Below are snippets of data from the 2 file structures I have.
The first structure is simply a telephone number. The 2nd is structured differently: it also contains a phone number, but with a couple of characters in front of it and a domain name after it. Please look at the samples below. I need to flag the numbers in the first file (which contains only phone numbers) that also occur in the 2nd file, and append the matching domain name from the 2nd file to them.
I have 86 files of the 2nd kind, each almost 200 MB with about 3 million lines.

Any and all help would be greatly appreciated. Please provide as many detailed instructions as possible.

Thanks!!!!!!


File 1:

2012001988
2012221122
2012222222
2012222222
2012243310
2012259297
2012259553
2012323494
2012407625
2012436385
2012436579
2012440499
2012445058
2012616081
2012621297
2012647883


File 2:

AA2013162981,myboostmobile.com
AA2013162982,myboostmobile.com
AA2013162983,myboostmobile.com
AA2013162984,myboostmobile.com
AA2013162985,myboostmobile.com
AA2013162986,myboostmobile.com
AA2013162987,myboostmobile.com
AA2013162988,myboostmobile.com
AA2013162989,myboostmobile.com
AA2013162990,myboostmobile.com
AA2013162991,myboostmobile.com
AA2013162992,myboostmobile.com
AA2013162993,myboostmobile.com
AA2012221122,myboostmobile.com
AA2013162995,myboostmobile.com
AA2013162996,myboostmobile.com
AA2013162997,myboostmobile.com
AA2013162998,myboostmobile.com
AA2013162999,myboostmobile.com
AA2013163000,vtext.com
AA2013163001,vtext.com
AA2013163002,vtext.com
AA2013163003,vtext.com
AA2013163004,vtext.com
AA2013163005,vtext.com
AA2013163006,vtext.com
AA2013163007,vtext.com
AA2013163008,vtext.com
LiveAd
Newbie
 
Posts: 3
Joined: Mon Dec 18, 2006 12:00 am

Re: Match data from 2 different structures see example

Postby Mofi » Fri Dec 22, 2006 10:34 am

Okay, here is a macro which should do this job on your large files. It is good that you mentioned the file sizes, because I would have written the macro differently if the files were not so big.

File 1, which contains only the phone numbers, must have the focus. File 2 must also already be open. No other file should be open.

First the macro trims all trailing spaces in file 1 (to be safe) and inserts a new line at the end of the file with the special marker character ». Then it copies the whole content of file 1 to user clipboard 9 and pastes it at the top of file 2. This is done to avoid window switching, which normally decreases macro execution speed. Hopefully file 1 is not too big for clipboard copying.

Now in file 2 a loop is executed until the marker character is found at the start of a line.

In the loop, the current line is first marked with the character # and the phone number is selected and copied to user clipboard 9. Next, a find for this phone number followed by a comma is executed.

If this search string is found, the comma and the following domain name are copied to user clipboard 8. With clipboard 9 reactivated, a find upwards moves the cursor back to the current phone number from file 1. There the comma and the domain name are appended to the line, and the line mark remains.

If the search for the phone number followed by a comma was not successful, the line mark is removed.

The last command in the loop moves the cursor down to the next phone number from file 1 or to the special marker character.

After the loop has finished, the macro selects the possibly modified content of file 1 in file 2, cuts it to user clipboard 9 and pastes it over the still existing selection in file 1. So file 2 is temporarily modified, but afterwards contains the same content as before the macro started.

In file 1 the last line with the special marker character is deleted, the cursor is positioned back at the top and the 2 used clipboards are cleared to free memory.

The macro property Continue if a Find with Replace not found must be checked for this macro.

InsertMode
ColumnModeOff
HexOff
TrimTrailingSpaces
Bottom
IfColNum 1
"»
"
Else
"
»
"
EndIf
SelectAll
Clipboard 9
Copy
NextWindow
Top
Paste
Top
Loop
IfCharIs "»"
ExitLoop
EndIf
"#"
StartSelect
Key END
Copy
EndSelect
Find "^c,"
IfFound
Key LEFT ARROW
StartSelect
Key END
Clipboard 8
Copy
EndSelect
Clipboard 9
Find Up "#^c"
Key LEFT ARROW
Key RIGHT ARROW
Clipboard 8
Paste
Clipboard 9
Key HOME
Else
Key HOME
Key DEL
EndIf
Key DOWN ARROW
EndLoop
Key DOWN ARROW
SelectToTop
Cut
PreviousWindow
Paste
Key UP ARROW
DeleteLine
Top
ClearClipboard
Clipboard 8
ClearClipboard
Clipboard 0
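
As a cross-check outside UltraEdit, the logic of this first macro can be sketched in Python. This is only a minimal sketch under assumptions: the function name flag_and_append is made up for illustration, both files are given as plain lists of lines, and the linear search per number mirrors the macro's Find command rather than being efficient.

```python
def flag_and_append(phone_numbers, file2_lines):
    """Mark every matched number with '#' and append ',domain' to it.

    Hypothetical helper mirroring the macro loop; not part of UltraEdit.
    """
    result = []
    for number in phone_numbers:
        number = number.rstrip()  # comparable to TrimTrailingSpaces
        if not number:
            # An empty line would match every comma, so leave it untouched.
            result.append(number)
            continue
        appended = None
        for line in file2_lines:
            # The macro searches for the number followed by a comma.
            pos = line.find(number + ",")
            if pos != -1:
                # Keep the comma and everything after it (the domain name).
                appended = line[pos + len(number):].rstrip("\r\n")
                break
        if appended is not None:
            result.append("#" + number + appended)  # line mark remains
        else:
            result.append(number)  # line mark removed again
    return result
```

Like the macro, the sketch stops at the first matching line of file 2 for each number.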

Because Christmas is coming soon, here is a second macro which does the same as the first macro but uses window switching. You can use this one if file 1 is also very large. I would be really interested in which version is faster. Can you run both macros on the same files, measure the execution times and post them?

The macro property Continue if a Find with Replace not found must be checked for this macro too.

InsertMode
ColumnModeOff
HexOff
TrimTrailingSpaces
Bottom
IfColNum 1
"»
"
Else
"
»
"
EndIf
Top
NextWindow
Top
PreviousWindow
Loop
IfCharIs "»"
ExitLoop
EndIf
"#"
StartSelect
Key END
Copy
EndSelect
NextWindow
Find "^c,"
IfFound
Key LEFT ARROW
StartSelect
Key END
Copy
EndSelect
Top
PreviousWindow
Key LEFT ARROW
Key RIGHT ARROW
Paste
Key HOME
Else
PreviousWindow
Key HOME
Key DEL
EndIf
Key DOWN ARROW
EndLoop
DeleteLine
Top
ClearClipboard
Clipboard 0

And here is the second macro again, but now without the special marker character because IfEof is used instead. This will not work if file 1 is a Unicode file, because UltraEdit v12.20b does not correctly identify the end of file on Unicode files. So the marker character on a new line at the end of the file, as used in the two macros above, is my workaround for running a macro to the end of a file when I don't know the file format (Unicode or ASCII).

The macro property Continue if a Find with Replace not found must be checked for this macro too.

InsertMode
ColumnModeOff
HexOff
TrimTrailingSpaces
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
NextWindow
Top
PreviousWindow
Loop
IfEof
ExitLoop
EndIf
"#"
StartSelect
Key END
Copy
EndSelect
NextWindow
Find "^c,"
IfFound
Key LEFT ARROW
StartSelect
Key END
Copy
EndSelect
Top
PreviousWindow
Key LEFT ARROW
Key RIGHT ARROW
Paste
Key HOME
Else
PreviousWindow
Key HOME
Key DEL
EndIf
Key DOWN ARROW
EndLoop
Top
ClearClipboard
Clipboard 0
Mofi
Grand Master
 
Posts: 4039
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Match data from 2 different structures see example

Postby LiveAd » Fri Dec 22, 2006 6:26 pm

It seems like there has got to be a different, simpler way.
I do not necessarily need to append the data from the 2nd set of files to the ACTUAL 1st file. I simply need to end up with a list of the records from list one that appear somewhere in list 2 (list 2 is 86 files), with the extra data from list 2 appended to the matching records of list one... even if that means making a list 3.

Is there a way I can take list one and go line by line: take line one from list one, search list 2 (86 LARGE files) for any match of that string, and if there is a match append it to a new file, then search again for the next line of list one inside list 2, and so on?
I want the least amount of user action.
I do not mind if it even takes a few days, so long as it gives me the end result I am looking for.

Your replies (all of you!) are welcome and suggestions/help TOTALLY appreciated!!!
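
One way such a search could look outside UltraEdit, sketched in Python under assumptions: instead of searching all 86 files once per number, the sketch loads list one into a set and reads every list 2 file exactly once, writing each match to a new file (the "list 3"). The function name find_matches and the file names in the usage comment are hypothetical, and the sketch assumes that the phone number always follows exactly two leading characters and ends at the first comma.

```python
def find_matches(phone_numbers, file2_paths, output_path):
    """Write 'number,domain' for every list 2 line whose number is in list 1.

    Returns the number of lines written. Hypothetical helper for illustration.
    """
    wanted = {n.strip() for n in phone_numbers if n.strip()}
    found = 0
    # latin-1 maps every byte to a character, so odd bytes cannot raise errors.
    with open(output_path, "w", encoding="latin-1", newline="\n") as out:
        for path in file2_paths:
            with open(path, "r", encoding="latin-1") as source:
                for line in source:  # streams the file line by line
                    key, sep, domain = line.rstrip("\r\n").partition(",")
                    number = key[2:]  # assumption: two characters precede it
                    if sep and number in wanted:
                        out.write(f"{number},{domain}\n")
                        found += 1
    return found


# Hypothetical usage:
# with open("list1.txt") as f:
#     find_matches(f, ["list2_01.txt", "list2_02.txt"], "list3.txt")
```

Because each large file is only streamed once and list one is held in memory as a set, the run time depends mainly on how fast the 86 files can be read.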
LiveAd
Newbie
 
Posts: 3
Joined: Mon Dec 18, 2006 12:00 am

Re: Match data from 2 different structures see example

Postby hveld » Sat Dec 23, 2006 10:50 am

This sounds more like a job for a database, especially when the files are big and if you need to do this regularly.
As the data is one entry per line, you can easily import it into any database, e.g. all data of type 1 into table1 and all data of type 2 into table2. Then it's a matter of a simple select statement in a loop to generate a 3rd table with the data of types 1 and 2 combined as needed. I guess any database engine will do this much faster than UE.
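
A minimal sketch of this idea, using Python's built-in sqlite3 module as the database engine. The table names table1 and table2 follow the description above; the column names, the function name match_with_sqlite and the use of a single JOIN instead of a select statement in a loop are assumptions for illustration, as is the rule that the phone number starts after the first two characters of each type 2 line.

```python
import sqlite3


def match_with_sqlite(phone_numbers, file2_lines):
    """Return sorted (phone, domain) pairs present in both data sets."""
    con = sqlite3.connect(":memory:")  # use a file path for really big data
    con.execute("CREATE TABLE table1 (phone TEXT)")
    con.execute("CREATE TABLE table2 (phone TEXT, domain TEXT)")

    # Type 1 data: one phone number per line.
    con.executemany(
        "INSERT INTO table1 VALUES (?)",
        ((n.strip(),) for n in phone_numbers if n.strip()),
    )

    def type2_rows():
        for line in file2_lines:
            key, sep, domain = line.strip().partition(",")
            if sep:
                yield key[2:], domain  # assumption: drop two leading chars

    con.executemany("INSERT INTO table2 VALUES (?, ?)", type2_rows())
    # The index lets the join look up phone numbers without a full scan.
    con.execute("CREATE INDEX idx_table2_phone ON table2 (phone)")

    rows = con.execute(
        "SELECT DISTINCT t1.phone, t2.domain "
        "FROM table1 AS t1 JOIN table2 AS t2 ON t1.phone = t2.phone "
        "ORDER BY t1.phone, t2.domain"
    ).fetchall()
    con.close()
    return rows
```

The DISTINCT keyword collapses duplicate numbers in the type 1 data, such as the repeated 2012222222 in the sample.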
hveld
Basic User
 
Posts: 42
Joined: Tue Nov 16, 2004 12:00 am

Re: Match data from 2 different structures see example

Postby Mofi » Sat Dec 23, 2006 1:45 pm

A database program would really do this job much faster than UltraEdit.

Have you understood and tried my macros? Which one is faster?

Because of your very large files it is probably better to use macro 2 or 3. The modified first file is not saved at the end, so you can delete all remaining lines not marked with a # at the start of the line, then remove the # character itself from the start of all lines and save the file with a new name.
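
This clean-up step could also be done outside UltraEdit. A minimal Python sketch, assuming the macro result is available as a list of lines and that a # at the start of a line only ever is the line mark (keep_marked is a hypothetical helper name):

```python
def keep_marked(lines):
    """Keep only lines marked with a leading '#', with the mark removed."""
    return [line[1:] for line in lines if line.startswith("#")]
```

For example, keep_marked(["2012001988", "#2012221122,myboostmobile.com"]) keeps only the second entry and strips its # character.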

List 2 consists of 86 files: no problem. Use the old DOS command copy to combine all 86 files into one REALLY big single file (see the topics Appending text files and Want to combine many files into one file) and then use one of my macros. You should also read the page titled Large File Handling in the help of UltraEdit and configure UltraEdit accordingly.
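
If the DOS copy command is not at hand, the combining step could be sketched in Python as follows. The function name combine_files and the file names in the usage comment are assumptions for illustration, and the sketch assumes single-byte (ASCII) files. It also adds a line break where a source file does not end with one, so that the last line of one file is not glued to the first line of the next.

```python
import shutil


def combine_files(source_paths, target_path):
    """Append all source files, in the given order, to one target file."""
    with open(target_path, "wb") as target:
        for path in source_paths:
            with open(path, "rb") as source:
                # Copies in chunks, so a 200 MB file is never fully in memory.
                shutil.copyfileobj(source, target)
                if source.tell() > 0:
                    source.seek(-1, 2)  # step back to the last byte
                    if source.read(1) != b"\n":
                        target.write(b"\n")


# Hypothetical usage:
# import glob
# combine_files(sorted(glob.glob("list2_*.txt")), "list2_all.txt")
```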
Mofi
Grand Master
 
Posts: 4039
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Match data from 2 different structures see example

Postby Cher17 » Fri Jan 26, 2007 6:03 pm

Thank you!!! The 2nd one worked great for me and very quickly. I have similar large files and it is beautiful!
Cher17
Newbie
 
Posts: 1
Joined: Wed Nov 29, 2006 12:00 am

