Getting started with Perl regex in UltraEdit and UEStudio
IDM PowerTips
Perl regular expressions are one of the most powerful parts of the find/replace functionality in UltraEdit and UEStudio. They let you reformat large amounts of nonuniform data in a single replace, saving you minutes or even hours of manual text editing.
To search for a string, go to Search -> Find (or press Ctrl+F). To perform a Perl regular expression search, check the “Regular Expressions” option and make sure the regular expression engine is set to “Perl” in the advanced options of the Find dialog. With Perl regex enabled, type your expression and click Next. It’s that easy!
For example, the expression <td align="right".* finds all <td> elements that have the attribute align="right". Using Highlight All, you can see that the Perl regex matches all of these elements, even though they contain different data after the align="right" attribute.
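To try the same expression outside the editor, here is a minimal sketch in Python. Python's `re` module is only a stand-in for the Perl engine in UltraEdit/UEStudio, and the sample markup is hypothetical.

```python
import re

# Hypothetical sample markup, made up for illustration.
html = '''<tr>
<td align="right">42</td>
<td align="left">name</td>
<td align="right">1,300.50</td>
</tr>'''

# The expression from the text, written with straight quotes.
# "." does not match a new line, so each match stops at the end of its line.
pattern = re.compile(r'<td align="right".*')

matches = pattern.findall(html)
for m in matches:
    print(m)
```

Only the two right-aligned cells are matched, regardless of the data that follows the attribute.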
Perl regular expressions use a very simple and understandable syntax. Let’s take a look at this syntax below.
Note: Also be sure to read our advanced Perl regular expressions tutorial.
Perl regex syntax quick reference
. | Any character, except for new line |
^ | Start of line position (does not actually match any characters) |
$ | End of line position (does not actually match any characters) |
* | Matches the preceding character (or group) 0 or more times |
+ | Matches the preceding character (or group) 1 or more times |
? | Matches the preceding character (or group) 0 or 1 time |
a{n} | Matches "a" repeated exactly n times |
a{n,} | Matches "a" repeated n or more times |
a{n,m} | Matches "a" between n and m times |
[a-d0-3] | Matches a character in the set. This example matches a, b, c, d, 0, 1, 2, or 3 |
[^a-d0-3] | Matches any character NOT in the set. This example will match anything except for a, b, c, d, 0, 1, 2, and 3 (even new lines) |
(foo) | Matches "foo" and allows it to be backreferenced in Find and/or Replace string via \1 |
(?:foo) | Matches "foo" but does not store it in memory for backreferencing. This technique can be useful for grouping purposes |
| | "Or" operator. For example, "foo|bar" will match either "foo" OR "bar" |
Use “\” to escape the special meaning of special characters. For example, “\*” will match a literal “*”.
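The quick-reference entries above can be checked with a short script. This is a sketch using Python's `re` module as a stand-in for the editor's Perl engine; the sample strings are made up for illustration.

```python
import re

# "?" makes the preceding character optional.
samples = ["color", "colour", "colouur"]
optional_u = [s for s in samples if re.fullmatch(r"colou?r", s)]

# {n,m} bounds the number of repetitions.
bounded = [s for s in ["a", "aa", "aaa", "aaaa"] if re.fullmatch(r"a{2,3}", s)]

# Character sets and negated sets.
set_hits = re.findall(r"[a-d0-3]", "a9z3e-d")
negated_hits = re.findall(r"[^a-d0-3]", "a9z3e-d")

# A capturing group can be backreferenced with \1 in the Find string.
doubled = re.search(r"(foo) \1", "foo foo bar")

# A non-capturing group only groups; here it scopes the "or" operator.
alternatives = re.findall(r"(?:foo|bar)s", "foos bars bazs")

# "\" removes the special meaning of "*".
literal_stars = re.findall(r"\*", "2*3*4")

print(optional_u, bounded, set_hits, negated_hits)
print(doubled.group(0), alternatives, literal_stars)
```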
Perl regex built-in character classes and positions
\w | Any word character |
\W | Any character that is NOT a word character |
\d | Any digit character |
\D | Any character that is NOT a digit character |
\s | Any whitespace character (including new line characters) |
\S | Any character that is NOT a whitespace character |
\< | Matches start of word position (does not actually match any characters) |
\> | Matches end of word position (does not actually match any characters) |
\b | Matches beginning or end of word position (does not actually match any characters) |
\t | Tab |
\r | Carriage return (CR) |
\n | Line feed (LF) |
\xHH | Character with hex value HH |
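A few of these classes and positions in action, again sketched with Python's `re` module as a stand-in. Note that Python does not support `\<` and `\>`, so the sketch uses `\b` for word boundaries; the sample text is hypothetical.

```python
import re

text = "Order 66:\tcat concat\r\n"

digits = re.findall(r"\d+", text)    # runs of digit characters
words = re.findall(r"\w+", text)     # runs of word characters
spaces = re.findall(r"\s", text)     # space, tab, CR, and LF all count

# \b matches a position, not a character: "cat" inside "concat" is skipped.
whole_cat = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \xHH matches the character with hex value HH (0x41 is "A").
hex_hits = re.findall(r"\x41", "ABA")

print(digits, words, spaces, whole_cat, hex_hits)
```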
Perl regex output modifiers
Unless otherwise specified, these modifiers can be used in Replace With strings only.
\u | Outputs uppercase version of the next character |
\U...\E | Outputs uppercase version of all characters until \E |
\l | Outputs lowercase version of next character |
\L...\E | Outputs lowercase version of all characters until \E |
\1, \2, ... | Outputs the data matched by the corresponding "()" group in the Find string. "\1" outputs the first group, "\2" outputs the second group, etc. (It is also possible to use a backreference in the Find string) |
$& | Outputs all text that matched the entire Find regex |
$` | Outputs the text between the end of the last match found (or the start of the text if no previous match was found), and the start of the current match |
$' | Outputs all text following the end of the current match |
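The effect of these modifiers can be approximated outside the editor. The sketch below uses Python's `re` module, which supports numbered backreferences in replacement strings but has no `\u`, `\U...\E`, or `$&`; the callback and `\g<0>` are Python substitutes, not the editor's syntax. The helper name and sample text are hypothetical.

```python
import re

text = "john smith"

# \1 and \2 in the replacement output the first and second groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2, \1", text)

# Approximates a Replace With string of "\u\1 \U\2\E":
# uppercase the first character of group 1 and all of group 2.
def upper_first_then_all(match):
    first, last = match.group(1), match.group(2)
    return first[:1].upper() + first[1:] + " " + last.upper()

cased = re.sub(r"(\w+) (\w+)", upper_first_then_all, text)

# Approximates "$&": in Python, \g<0> is the text of the entire match.
bracketed = re.sub(r"\w+", r"[\g<0>]", text)

print(swapped)
print(cased)
print(bracketed)
```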
Perl regex techniques and samples
.* | Match zero or more of any character (except new line). Note: Perl regex quantifiers are greedy by default, meaning they will match as much data as possible. For more information, see the non-greedy regex tutorial. |
.+ | Match one or more of any character (except new line) |
[\r\n]+ | Match new line character(s) regardless of terminator format |
<a href="(.+?)">(.+?)</a> | Match hyperlinked text and preserve both the URL and link text for backreferencing |
^(.+)$[\r\n]+\1$ | Match two duplicate consecutive lines |
^.*[ \t]+$ | Match lines ending in one or more spaces or tabs |
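Several of the techniques above can be tried with the following sketch, which uses Python's `re` module as a stand-in for the editor's Perl engine. In Python, `re.MULTILINE` is needed for ^ and $ to match at line boundaries; the sample strings are made up, and the duplicate-line sample uses LF terminators only.

```python
import re

# Greedy vs. non-greedy: ".*" grabs as much as possible, ".*?" as little.
line = "<b>one</b> and <b>two</b>"
greedy = re.findall(r"<b>.*</b>", line)
lazy = re.findall(r"<b>.*?</b>", line)

# [\r\n]+ matches line terminators regardless of format (CRLF, LF, or CR).
pieces = re.split(r"[\r\n]+", "a\r\nb\nc\rd")

# Two duplicate consecutive lines: \1 must repeat what group 1 captured.
text = "alpha\nbeta\nbeta\ngamma\n"
dupes = re.findall(r"^(.+)$[\r\n]+\1$", text, flags=re.MULTILINE)

# Lines ending in one or more spaces or tabs.
lines = "keep\ntrim \t\nalso trim  \n"
trailing = re.findall(r"^.*[ \t]+$", lines, flags=re.MULTILINE)

print(greedy, lazy, pieces, dupes, trailing)
```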
Now that you are familiar with regular expressions, dig a little deeper in our advanced Perl regex tutorial.