Making regular expression for finding IP addresses


Postby Leeuwarden » Tue Nov 09, 2010 9:39 am

Hello all,

I am quite new to UltraEdit and have a big database. I want to find, within that database, a range of IP addresses, let's say 21.111.22.1 to 21.111.27.8. How can I make a regular expression for this? I have tried a lot already, but it doesn't seem to work. So I start with 'Search', 'Find in Files', and then what do I type?

Sorry if this question is already posted on here: I couldn't find it.

Thanks so much in advance for your help.

With kind regards,
Leeuwarden
Leeuwarden
Newbie
 
Posts: 3
Joined: Tue Nov 09, 2010 9:33 am

Re: Making regular expression for finding IP addresses

Postby Mofi » Tue Nov 09, 2010 12:03 pm

With the UltraEdit regexp engine, search for [0-9]+.[0-9]+.[0-9]+.[0-9]+ to find any "Number.Number.Number.Number" string (in the UltraEdit syntax a dot is a literal character, so it needs no escaping). With the Unix/Perl regexp engine, the same search can be done with \d+\.\d+\.\d+\.\d+

You can replace parts of the general expression with fixed parts. For example, for your address range use 21.111.2[2-7].[0-9]+ (UE engine) or 21\.111\.2[2-7]\.\d+ (Unix/Perl engine). This finds every address from 21.111.22.x through 21.111.27.x, whatever the last number is.
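
For readers who want to try this outside UltraEdit, here is a minimal Python sketch of the Unix/Perl-style pattern. Python's `re` module is an assumption here (it is not the UltraEdit engine, although it accepts this pattern unchanged), and the sample text and the `in_range` helper are made up for illustration. Since the pattern accepts any last number, the sketch adds an exact check against the bounds from the question using the standard `ipaddress` module.

```python
import ipaddress
import re

# Perl-style pattern from the post: third octet 22 to 27, any last number
PATTERN = re.compile(r"21\.111\.2[2-7]\.\d+")

# Exact bounds taken from the original question
LOW = ipaddress.IPv4Address("21.111.22.1")
HIGH = ipaddress.IPv4Address("21.111.27.8")


def in_range(candidate: str) -> bool:
    """Return True if candidate is a valid IPv4 address between LOW and HIGH."""
    try:
        return LOW <= ipaddress.IPv4Address(candidate) <= HIGH
    except ipaddress.AddressValueError:
        # Not a valid address, e.g. an octet above 255
        return False


# Hypothetical sample data
sample = "a 21.111.22.1 b 21.111.27.8 c 21.111.27.9 d 21.111.28.1 e 21.111.21.5"

candidates = PATTERN.findall(sample)               # everything the regex finds
exact = [c for c in candidates if in_range(c)]     # only addresses inside the range

print(candidates)
print(exact)
```

The regex alone also reports 21.111.27.9, which lies just past the requested upper bound; the numeric comparison filters it out.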
Mofi
Grand Master
 
Posts: 4049
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Making regular expression for finding IP addresses

Postby bulgrien » Tue Nov 09, 2010 1:40 pm

If you want each of the four octets limited to at most 3 digits, you can use this Perl regular expression:

(\d{1,3}\.){3}\d{1,3}

\d{1,3}\. matches 1 to 3 numeric digits (one octet) followed by a period
Surrounding the whole thing in parentheses and adding {3} repeats the pattern 3 times (three octets followed by periods)
The \d{1,3} at the end adds the final 1 to 3 digit octet after the last period
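
A small Python sketch of this pattern, assuming Python's `re` module as a stand-in for a Perl-compatible engine. The `find_loose` helper and the sample strings are invented for illustration. It shows that the pattern limits the length of each octet but not its value.

```python
import re

# Three "1 to 3 digits plus period" groups, then a final 1 to 3 digit octet
LOOSE = re.compile(r"(\d{1,3}\.){3}\d{1,3}")


def find_loose(text):
    """Return the first substring matching the loose pattern, or None."""
    match = LOOSE.search(text)
    return match.group(0) if match else None


for sample in ["192.168.0.1", "999.999.999.999", "1234.1.1.1", "1.2.3"]:
    print(sample, "->", find_loose(sample))
```

Note that 999.999.999.999 is accepted as is, and for 1234.1.1.1 the search simply skips the first digit and reports 234.1.1.1.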



If you want to validate each octet to ensure that they fall between 0 and 255, you can do that as well...although it makes for a pretty long regular expression:

(([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}([01]?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])

[01]?[0-9]?[0-9] validates the numbers 0 through 199 with optional leading zeros
2[0-4][0-9] validates the numbers 200 through 249
25[0-5] validates the numbers 250 through 255
The | character in between them acts as an OR

Replace each [0-9] with \d to snug things up a little more:

(([01]?\d?\d|2[0-4]\d|25[0-5])\.){3}([01]?\d?\d|2[0-4]\d|25[0-5])
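
As a rough illustration, the validating expression can be tried in Python. This is a sketch under assumptions: Python's `re` module stands in for the Perl engine, the groups are written as non-capturing `(?:...)` for convenience (the expression above uses plain parentheses), and `is_valid_ipv4` is a hypothetical helper name.

```python
import re

# One octet: 0-199 with optional leading zeros, 200-249, or 250-255
OCTET = r"(?:[01]?\d?\d|2[0-4]\d|25[0-5])"

# Three "octet plus period" groups followed by a final octet
STRICT = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")


def is_valid_ipv4(text: str) -> bool:
    """True only if the whole string is four octets in the range 0-255."""
    return STRICT.fullmatch(text) is not None


for sample in ["192.168.0.1", "255.255.255.255", "001.002.003.004", "256.1.1.1", "1.2.3"]:
    print(sample, "->", is_valid_ipv4(sample))

# Without anchoring, a plain search still finds a valid-looking tail
print(STRICT.search("999.1.1.1").group(0))
```

`fullmatch` is used on purpose: a plain search on 999.1.1.1 reports 99.1.1.1, so the expression only validates when the whole candidate string has to match.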
bulgrien
Master
 
Posts: 92
Joined: Fri Dec 11, 2009 1:02 am
Location: Pennsylvania, USA

Re: Making regular expression for finding IP addresses

Postby Leeuwarden » Wed Nov 10, 2010 3:45 am

Thank you so much for your quick and very helpful responses! I tried both suggestions, but stumbled upon another (small) problem. When I use, for example, 21.111.2[2-7].[0-9]+, UltraEdit also finds 255.21.111.22.2. How can I make the software find only the addresses that BEGIN with 21.111.xxx? I tried it with %, so that would make %(21.111.2[2-7].[0-9]+) or %21.111.2[2-7].[0-9]+, right? But that doesn't work!

Again, thanks a lot for your help!

Kind regards,
Leeuwarden

EDIT: I thought about it a bit longer, and understand it better now. I searched for the IP address at the start of a line, but in this data the addresses are not at the beginning of a line. So it should become something like: (NO digit or point directly before it, but an empty space)21.111.2[2-7].[0-9]+. Am I on the right track? ;)

Re: Making regular expression for finding IP addresses

Postby Mofi » Wed Nov 10, 2010 6:43 am

Don't make it more complicated than necessary. If there is always a space character to the left of the IP addresses you want to find, just use a space character as the first character of the regular expression search string.

Or, with the UltraEdit regexp engine, search for [~.0-9]21.111.2[2-7].[0-9]+ which is a simplified version of what you want: the character to the left of the fixed string 21.111.2 must be any character except a dot or a digit. Of course, this character is also selected. A negative look-behind is not possible with the UltraEdit regexp engine.
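
To see the "this character is also selected" effect, here is a minimal Python sketch. It is an approximation, not the UltraEdit engine: in Perl/Python syntax the negated set is written [^.0-9] instead of [~.0-9] and the periods are escaped. The capturing group and the sample text are additions for illustration.

```python
import re

# [^.0-9] plays the role of UltraEdit's [~.0-9]; the parentheses capture the address itself
PREFIXED = re.compile(r"[^.0-9](21\.111\.2[2-7]\.[0-9]+)")

text = "x 255.21.111.22.2 y 21.111.23.4"
match = PREFIXED.search(text)

whole = match.group(0)     # includes the character in front of the address
address = match.group(1)   # the address without that character

print(repr(whole))
print(address)

# An address at the very start of the text has no preceding character, so it is missed
print(PREFIXED.search("21.111.23.4"))
```

The address inside 255.21.111.22.2 is skipped because a dot precedes it. The trade-off is that one character in front of the address is required, so an address at the very beginning of the data is not found.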

Re: Making regular expression for finding IP addresses

Postby bulgrien » Wed Nov 10, 2010 7:35 am

Yes, you are on the right track. My previous Perl examples were more general in nature (validating any IP address). Here is another Perl regular expression for you to consider:

(?<![\d\.])21\.111\.2[2-7]\.\d{1,3}(?![\d\.])

If you don't know what will come before or after your IP address, then you can specify what should not be there using negative look-arounds:

(?<![\d\.]) is a negative look-behind that avoids strings preceded by a numeric digit or a period
(?![\d\.]) is a negative look-ahead that avoids strings followed by a numeric digit or a period
The ! character indicates that the look-around is negative (looking for the absence of the character rather than the presence of the character)
The only difference between the two is the < character which indicates that one of them is looking behind while the other is looking ahead
\d{1,3} for the last octet looks for any number up to 3 digits long. The [0-9]+ in your regular expression will match numbers of any length

If you want to ensure that your last octet is a valid number between 0 and 255, you can take that part from yesterday's post.
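
The look-arounds can be tried in Python, whose `re` module supports the same syntax (an assumption for illustration; the thread is about UltraEdit's Perl engine). The sample text is invented, and the pattern is the one described above with the periods inside the character classes left unescaped, which is equivalent.

```python
import re

# Negative look-behind and look-ahead: no digit or period directly before or after
LOOKAROUND = re.compile(r"(?<![\d.])21\.111\.2[2-7]\.\d{1,3}(?![\d.])")

text = "255.21.111.22.2 21.111.23.4 21.111.24.1234 21.111.25.6.7 (21.111.26.9)"
found = LOOKAROUND.findall(text)
print(found)

# Unlike a pattern that consumes a leading character, this also works at the start of the text
print(LOOKAROUND.findall("21.111.22.1"))
```

Only 21.111.23.4 and 21.111.26.9 are reported, and nothing around them is selected. One side effect to keep in mind: an address directly followed by a period, such as at the end of a sentence, is rejected by the look-ahead.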

Re: Making regular expression for finding IP addresses

Postby Bracket » Wed Nov 10, 2010 8:03 am

Leeuwarden wrote:I tried it with %, so that would make %(21.111.2[2-7].[0-9]+) or %21.111.2[2-7].[0-9]+, right? But that doesn't work!

EDIT: So it should become something like: (NO digit or point directly before it, but an empty space)21.111.2[2-7].[0-9]+. Am I on the right track?

There are a few problems with the RegEx you tried. The first problem is that you are using the "%" character - I'm not sure what the purpose of that is, but what you want to use is "\b" to indicate a word boundary. That's all you need on either side of the expression.

The second problem is that you're not escaping your periods. That's why you're finding IPs different than the pattern you're trying to get.


In addition, the problem with some of the other examples that have been given in this thread is that they will catch octets like "999", which would be an invalid IP address.


If you want IP addresses that always start with "21.111", then use this (Perl):

\b21\.111\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b


If you want to grab any IP address, use this (Perl):

\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
Bracket
Basic User
 
Posts: 35
Joined: Fri Oct 26, 2007 11:00 pm

Re: Making regular expression for finding IP addresses

Postby bulgrien » Wed Nov 10, 2010 10:51 am

Bracket wrote:The second problem is that you're not escaping your periods. That's why you're finding IPs different than the pattern you're trying to get.

Not true, Bracket. The reason this user is not escaping his periods is because he is using the UltraEdit form of regular expressions instead of Perl regular expressions. By the way, the original poster has already been given Perl regular expressions to validate an entire IP address (shorter ones) and your addition of \b boundary markers does not fix the last issue raised by the user... namely the regular expression matching what appears to be a valid IP address immediately following "255." This is why Mofi suggested preceding the expression with a space and I instructed the user on the use of negative look-arounds.
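
The difference can be checked with a short Python sketch (Python's `re` is assumed here as a stand-in for the Perl engine; both patterns are shortened to the 21.111.22-27 case for the comparison).

```python
import re

# Word boundaries only: a period counts as a non-word character,
# so there is a boundary between "255." and "21"
WORD_BOUNDARY = re.compile(r"\b21\.111\.2[2-7]\.\d{1,3}\b")

# Negative look-arounds: no digit or period directly before or after
LOOKAROUND = re.compile(r"(?<![\d.])21\.111\.2[2-7]\.\d{1,3}(?![\d.])")

text = "255.21.111.22.2"

with_boundary = WORD_BOUNDARY.findall(text)
with_lookaround = LOOKAROUND.findall(text)

print(with_boundary)
print(with_lookaround)
```

The \b version still reports 21.111.22.2 inside 255.21.111.22.2, while the look-around version reports nothing.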

Re: Making regular expression for finding IP addresses

Postby Leeuwarden » Thu Nov 11, 2010 4:10 am

Again, thank you all for the replies! I agree with Bracket that you should not count on my intelligence, especially not in this particular situation ;) However, all replies were helpful in understanding UE a little bit better. Funny how the simplest reply fixed my last problem: just add a space before the string.

This is the winning string: ' 21.111.2[2-7].[0-9]+'

By the way: in this set of data it's impossible to find invalid IP addresses, because the data only has valid addresses. When I have new questions, I will post them!

Re: Making regular expression for finding IP addresses

Postby bulgrien » Thu Nov 11, 2010 8:46 am

I'm glad Mofi's solution worked for you. ;) As he so well put it, no need to make it more complicated than necessary.

