Advanced file sort
UltraEdit has a powerful sort engine that allows you to sort file large amounts of file data based on several "keys" that may be set. By default, UltraEdit will sort a file alphabetically by looking at the first character of the line. In the tutorial below, we will take a look at some of the advanced sort options that allow you to sort data based on a specific column or block of data.
Advanced File Sort: Column-Based Sort
UltraEdit can perform sorting tasks on large files, but for the purpose of this tutorial we will be using a small amount of sample data. As you can see below, this sample data includes a name, a monetary amount, and a tax amount in fixed columns.
For advanced sort options, it will be necessary to convert your data to fixed column widths. If your data is not organized in fixed columns, you can got to Column : Convert to Fixed Column to achieve this.
For more information on converting data fields to fixed width, visit our Power Tip Working with CSV Files.
You can access the sort options under File : Advanced Sort/Options.
The default values for the Advanced Sort/Options are shown below. Within this dialog, you can configure the sort to be ascending or descending. You can also set options to remove duplicates, ignore case, or perform a numeric sort.
Sort Order (Ascending/descending)
This sets the sort for ascending or descending order. "Ascending" is the default.
- Ascending - This sets the sort for ascending order ('a' to 'z', 1-9).
- Descending - This sets the sort for descending order ('z' to 'a', 9-1).
See Remove Duplicates section below.
This option should be checked if you want the search to be insensitive to case, i.e. "CAT" is considered the same as "cat". Do not check this option if the sort should consider "CAT" different from "cat".
If this is checked, the sort will perform a numeric sort on the selected columns. Typically a sort is non-numeric and the values 1,11,111 would be sorted before 2,22,222 as 1 comes before 2. If a numeric sort is selected, the values would be sorted in numeric order (1,2,11,12,111,222). The numeric sort is not possible with a locale specific sort or when using the alternate sort method.
Click the "Sort" button to sort your file based on the configurations in the Advanced Sort/Options dialog. Below is a screenshot of our sample data after a sort with UltraEdit's default settings.
As an example, we would like to sort our file based on one of the numberical values - not the individual's name. In order to do this we will need to access the Advanced Sort/Options (in File : Sort : Advanced Sort/Options) and set a key.
A key is a a range of columns that will be used to sort the file. UltraEdit will only look at the Key (column range) and will sort the entire record (line), not just the data in the column. You can specify up to 4 hierarchical keys for a single sort. UltraEdit will look at Key 1 first. If the data between two lines is identical, UltraEdit will then look at Key 2, then Key 3, and finally Key 4.
In our example, the column data for our key exists between columns 20 and 28. Therefore we will enter "20" for the Start Column and "28" for the End Column.
Clicking the "Sort" button gives us the output shown below.
Finally, we'd like to not only sort our sample data based on the main monetary amount, but on the tax amount as well. This is easy to do - we will enter the Start Column "32" (where our sample tax data begins) and the End Column "-1". Setting a value of "-1" will cause UltraEdit to look to the absolute end of the line.
After clicking "Sort", our output is shown below.
The remove duplicates allows you to delete duplicate records based on the sort keys. When you check the Remove Duplicates option, you will then configure the delete behavior using the "Where All Selected Keys Match" and "Where Any Selected Keys Match" options.
Note: The remove duplicate will only search the keys which are enabled (checked) in the "RD" column.
To understand the difference between "Where All Selected Keys Match" and "Where Any Selected Keys Match", consider you are sorting columns "One" and "Two" on the following example:
Where All Selected Keys Match
If you select the "Where All Selected Keys Match" option, the sort routine deletes the record when the data in BOTH (all) columns match - ie: it is recoginized as a duplicate if the data for the record (line) in both column One and Two match.
From the example above, the only duplicate record would be the 4th record because there is a "1" in column One and Two. So, the result would be:
Where Any Selected Keys Match
If you select the "Where Any Selected Keys Match" option, the sort routine deletes the record when the data in any of the columns match - ie: it is recoginized as a duplicate if the data for the record has a match in any other key on any other record.
From the example above, the duplicate records would be the 2nd and the 4th. The 4th record is a duplicate because column One has a 1 in both the 1st and 4th records. The 2nd record is a duplicate because there is a 1 in column Two in both the 1st and 2nd record. So, the result would be:
That's all there is to sorting columns - a very powerful and easy way of sorting your text data.