Delete data from one file that exists in another

Ben Schwenk
Operations Manager

Here's the scenario. File A contains a list of strings, one per line. File B is a larger "master" list. You need to quickly delete from the master list in File B every line that also exists in File A. You know this has to be possible with UltraEdit, but what's the best way of getting there?

This is a request that our support team receives frequently. Let's take a look at how we can do this with a quick macro.

Step 1: Break it down logically into simple components

Any time you're writing a macro, it's best to start by breaking the task down to the most basic level possible. Writing a macro to accomplish something complex may seem overwhelming and time-consuming, but once the task is broken into simpler parts, it often turns out to be quite simple!

So, without worrying about specific macro commands just yet, let's think about how this should work. Assuming File A will be the active file at the time of macro execution, this is what needs to get done:

  1. Select line in File A
  2. Copy selection in File A
  3. Switch to File B
  4. Search for copied text in File B
  5. If copied text is found, delete found line
  6. Go back to File A

That really doesn't look too overwhelming, does it?
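
For readers who think more easily in a general-purpose language, the six steps above can be sketched in Python. This is only an illustration of the logic, not UltraEdit macro code. The function name and the choice to compare whole lines are assumptions made for the sketch.

```python
def delete_listed_lines(file_a_text: str, file_b_text: str) -> str:
    """Return File B's text without any line that also appears in File A."""
    remaining = file_b_text.splitlines()
    for wanted in file_a_text.splitlines():   # steps 1-2: take one line from File A
        # steps 3-5: scan File B and drop every line equal to the copied one
        remaining = [line for line in remaining if line != wanted]
        # step 6: go back to File A and continue with its next line
    # Rebuild the text, ending the last line with a terminator if any line is left
    return "\n".join(remaining) + ("\n" if remaining else "")


if __name__ == "__main__":
    print(delete_listed_lines("b\nd\n", "a\nb\nc\nd\n"), end="")
```

Calling it with File A containing `b` and `d` and File B containing `a`, `b`, `c`, `d` leaves only `a` and `c` in the result.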

Step 2: Begin writing macro commands

Now that we know what we need to do, it's time to start writing specific macro commands. Since we want to check every line in File A against File B, we want to start at the very top of File A. Keeping that in mind, here are our simple components translated into working macro commands:

Top
SelectLine
Copy
NextDocument
Find "^c"
Replace All ""
PreviousDocument

(The macro commands are exhaustively documented in the Help documentation.)

This is good, but this will only run once. There are still a couple of things we need to do to make this macro truly great:

  1. Loop for every line in File A, and
  2. Always start each Find/Replace at the top of File B, and
  3. Exit/end the loop and macro when the end of File A is reached.

Step 3: Properly loop the macro

When you're looping a macro, you want to identify the following components of the loop:

  1. What commands should be looped
  2. What condition should be met to end the loop

Identifying the above makes it obvious where to place your (starting) "Loop 0" and (ending) "ExitLoop" commands within the macro. It will also reveal whether or not additional commands are needed to check the "ExitLoop" condition. We want to loop everything except the "Top" command for File A, and we want to exit the loop only when we reach the end of File A. So we'll implement our loop logic in the following manner:

Top
Loop 0
SelectLine
Copy
NextDocument
Top
Find "^c"
Replace All ""
PreviousDocument
Key END
IfEof
ExitLoop
EndIf
EndLoop

*Note: Because "SelectLine" includes the selected line's terminator (new line character) as part of the selection, this causes the caret to reposition to the beginning of the next line. Therefore, there is no need to use a "Key DOWN ARROW" command to go to the next line in the file. However, if you were using some other method of selection instead of "SelectLine", and this method did not include the line terminator, you would need to use "Key DOWN ARROW" to avoid an infinite loop.
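
The loop structure and the point made in the note can be sketched in Python as well. This is a rough model only: the `pos` variable stands in for the caret, and the function name is made up for the illustration. Because each "selection" includes the line terminator, the position always moves forward, so the loop is guaranteed to reach the end of the text.

```python
def visit_lines(text: str) -> list[str]:
    """Walk through text one line at a time, selecting each line with its terminator."""
    visited = []
    pos = 0                                  # caret offset, comparable to "Top"
    while True:                              # comparable to "Loop 0"
        if pos >= len(text):                 # exit condition, comparable to "IfEof"
            break                            # comparable to "ExitLoop"
        end = text.find("\n", pos)
        # The selection runs up to and including the terminator (or to the end
        # of the text if the last line has no terminator).
        stop = len(text) if end == -1 else end + 1
        visited.append(text[pos:stop])
        pos = stop                           # the caret now sits on the next line
    return visited


if __name__ == "__main__":
    print(visit_lines("a\nb\nc\n"))
```

If `pos` were left unchanged at the end of each pass, the exit condition would never become true, which is the infinite loop the note warns about.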
 

Step 4: Accommodating the last line in File A

You may find that this macro doesn't properly handle the last line of File A. That's because the macro expects the last line with real data in File A to end with a line terminator, which leaves a blank, empty line at the very end of the file. To accommodate this, we check whether the last line in File A is empty, and if it isn't, we add a line terminator:

Bottom
IfColNumGt 1
"
"
EndIf
Top
Loop 0
SelectLine
Copy
NextDocument
Top
Find "^c"
Replace All ""
PreviousDocument
Key END
IfEof
ExitLoop
EndIf
EndLoop

That's it! We now have a macro that does exactly what we want it to do: delete from File B all strings that exist in File A. Note that when you play this macro, File A must be the active file, and File B should be the very next file in the file tab order.
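
The check that Step 4 adds at the top of the macro can be sketched in Python like this. The function name and the default `"\n"` terminator are assumptions for the illustration; a real file might use a different line ending.

```python
def ensure_final_terminator(text: str, newline: str = "\n") -> str:
    """Append a line terminator if the last line of text does not end with one."""
    # Comparable to "Bottom" followed by "IfColNumGt 1": only act when the
    # last line is not already empty.
    if text and not text.endswith(("\n", "\r")):
        return text + newline
    return text


if __name__ == "__main__":
    print(repr(ensure_final_terminator("first\nlast")))
```

Text that already ends with a terminator, and completely empty text, are returned unchanged.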

Can you think of any ways to make this macro even better? Please share in the comments!

Update: Feedback from a power user

One of our users, Mofi (who has probably helped some of you in the forum), has sent us another macro for this task which takes into account nonstandard configurations and file contents. He has also kindly commented his macro with explanations of the commands. All lines starting with "//" are comments and must be removed before the macro code can be copied into the Edit/Create Macro dialog.
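
If you would rather not remove the comment lines by hand, a small script can do it. The following Python sketch is one possible approach, with a made-up function name. It assumes that every line starting with "//" really is a comment, so it would wrongly drop such a line if it were part of a multi-line string in the macro.

```python
def strip_comment_lines(macro_source: str) -> str:
    """Return the macro source without the lines that start with //."""
    kept = [line for line in macro_source.splitlines()
            if not line.startswith("//")]
    # Rebuild the text with one terminator per remaining line
    return "".join(line + "\n" for line in kept)


if __name__ == "__main__":
    sample = "InsertMode\n// explanatory comment\nClipboard 9\n"
    print(strip_comment_lines(sample), end="")
```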

Here is Mofi's much improved macro:

InsertMode
ColumnModeOff
UnixReOff
// Copy content of File A to clipboard 9. A macro should never destroy
// the content of the Windows clipboard, which is the one users use most often.
Clipboard 9
SelectAll
Copy
// Disable selection mode and move to top of file to discard the selection
// in active File A. That is not really necessary, but looks better.
EndSelect
Top
// Switch to the other document and check the last line for a line
// termination. If the last line has none but starts with whitespace and
// the auto-indent feature is enabled, UltraEdit repeats that leading
// whitespace after the inserted line termination, so the last byte(s)
// of the file are again not the line ending character(s). Therefore
// check once more after inserting the line termination and delete
// any such whitespace.
NextWindow
Bottom
IfColNumGt 1
InsertLine
IfColNumGt 1
DeleteToStartofLine
EndIf
EndIf
// Go to top of file and paste there the list from File A.
Top
Paste
// Check now if last line of list has a line termination and insert
// a line to mark end of list in File B. The marker string must be
// a string which surely does not exist ever in one of the 2 files.
IfColNumGt 1
"
EnD_Of_LiSt
"
Else
"EnD_Of_LiSt
"
EndIf
Top
// Back at top of file use a regular expression search to insert
// at beginning of every line a special "start of line string".
// It would be also possible to do this with ColumnInsert command
// in column mode, but that requires 3 commands and is slower.
Find RegExp "%"
Replace All "#!#"
// Replace the marker line by a single character different to first
// character inserted on every line with the replace above. This is
// a single replace, but a Replace All is used to keep position at
// top of the file and avoid usually two display updates.
Find MatchCase RegExp "%#!#EnD_Of_LiSt"
Replace All "!"
// Now it is time to run the loop which searches in File B for the
// lines listed in File A to delete them in File B. The loop is exited
// when the line with the exclamation mark is reached which marks end of list.
Loop 0
IfCharIs "!"
ExitLoop
EndIf
// Command SelectLine is usually used to select an entire line. But that
// command selects just the displayed line which can be just a part of a
// real line if soft word-wrap is enabled for File B. Therefore a regular
// expression find is used to select the line. The expression as is works
// for DOS, UNIX and MAC/UNIX files temporarily converted to DOS, but not
// for MAC files not converted to DOS. The selected line from File A and
// all occurrences of that line in File B are next removed with a replace
// all command. The inserted special string at start of every line avoids
// deleting just a substring of a line. The macro should not convert a
// line sequence with the 3 words
//
// like
// dislike
// unlike
//
// into
//
// disun
//
// when just the line with "like" should be deleted from File B.
Find RegExp "%*^r++^n"
Find "^s"
Replace All ""
EndLoop
// The macro is nearly finished. Now only the line with the exclamation
// mark and the inserted strings at beginning of every line must be deleted.
DeleteLine
Find MatchCase RegExp "%#!#"
Replace All ""
// Some users use the setting "Automatically copy to clipboard when
// selection is made" and therefore it was good to keep clipboard 9
// all the time active while running this macro. Now it is time to
// clear the content in this clipboard to free memory and switch
// back to the usually used Windows clipboard.
ClearClipboard
Clipboard 0
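
The reason for the "#!#" prefix in Mofi's macro is easiest to see side by side with a plain replace. The Python sketch below is a simplified model of that one idea, not a translation of the macro. The function names are invented, and the sketch assumes that every line ends with a line terminator and that the marker string never occurs in the data.

```python
MARKER = "#!#"   # start-of-line marker, same string as in the macro


def naive_delete(list_text: str, master_text: str) -> str:
    """Delete each listed line wherever its text occurs, even inside longer lines."""
    for entry in list_text.splitlines(keepends=True):
        master_text = master_text.replace(entry, "")
    return master_text


def anchored_delete(list_text: str, master_text: str) -> str:
    """Delete only whole lines by matching the marker plus the line text."""
    prefixed = "".join(MARKER + line
                       for line in master_text.splitlines(keepends=True))
    for entry in list_text.splitlines(keepends=True):
        prefixed = prefixed.replace(MARKER + entry, "")
    # Remove the marker from the start of every remaining line
    return "".join(line[len(MARKER):] if line.startswith(MARKER) else line
                   for line in prefixed.splitlines(keepends=True))


if __name__ == "__main__":
    master = "like\ndislike\nunlike\n"
    print(repr(naive_delete("like\n", master)))      # 'disun'
    print(repr(anchored_delete("like\n", master)))   # 'dislike\nunlike\n'
```

With the marker in place, "like" followed by its terminator can only match at the start of a line, so "dislike" and "unlike" survive.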

 


Todd Gee
Guest post
Comment
Comments not allowed in macros?
Reply #1 on : Thu June 16, 2011, 10:32:37
Why wouldn't comments be allowed in macros? (I'm referring to the line above which says that the comment lines must be removed from the macro before cutting/pasting into the macro edit window.)

Seems like it'd be easy to make the macro parser skip those lines.

just thinkin'.....

Mofi
Guest post
Comment
Re: Why wouldn't comments be allowed in macros?
Reply #2 on : Thu June 16, 2011, 11:45:03
Let me answer that question. It is not as easy as it looks, because lines starting with // can also occur inside a multi-line string. It would nevertheless be possible, as my macro "Copy Macro Code" in the Mofis_Macro_Examples package (see sticky topic "Macro examples and reference for beginners and experts" in the Macros forum) demonstrates.

However, UltraEdit macros are compiled to binary, like C/C++/C# source code, and are not stored as ASCII text the way Visual Basic macros, JavaScript, or PHP scripts are, which are interpreted on execution. The comments would therefore be lost after compilation, and opening the macro again in the Edit/Create Macro dialog, which decompiles the binary macro code, would show only the bare source code without comments.

In general, for any executable module that is compiled from source code to binary data, it is advisable to also keep the source code as a text file so that the compiled module can be recreated at any time. A single bit failure in a compiled binary module can make the module unreadable, while a bit failure in a text file is no problem when opening the text file. For each of my *.mac macro files I keep a *.uem file with the same name, containing the source code (with comments and indentations) and additional information like macro properties and a description.

Rick
Guest post
Comment
Slight Modification
Reply #3 on : Tue June 21, 2011, 12:38:52
Thanks, I will definitely use this macro and it will save me a lot of time. I did have to slightly modify it to work on my PC at work. We use UltraEdit 32 Pro V10.10a. I had to replace the following:
NextDocument
PreviousDocument

with:
NextWindow
PreviousWindow

otherwise it just deleted all of the contents of my "A" file (it never went to the next document tab without the change.)

Thanks Again!!

Regards,
Rick E.
