How to match exactly one newline?

Postby chfleischer » Thu Jan 05, 2006 12:40 pm

Hi forum,

I have a logfile with time stamps and messages in between:

M [Thr 5580] Thu Jan 05 05:46:46 2006
M [Thr 4548] Thu Jan 05 05:46:46 2006
M [Thr 3464] Thu Jan 05 05:46:46 2006
M  [Thr 3464] message1
M [Thr 3464] Thu Jan 05 05:47:46 2006
M  [Thr 3464] message2
M [Thr 3464] Thu Jan 05 05:48:46 2006
M  [Thr 3464] message3
M [Thr 4548] Thu Jan 05 05:48:47 2006
M [Thr 5580] Thu Jan 05 05:48:47 2006
M [Thr 4548] Thu Jan 05 05:48:48 2006
M [Thr 5580] Thu Jan 05 05:48:48 2006
M  [Thr 3464] message4



Each line begins with a region code ('M'). Time stamp lines have exactly one space between the region code and the opening '['; message lines have two spaces.
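The one-space versus two-space distinction described above can be checked mechanically. The following is only a minimal Python sketch, not UltraEdit syntax; the names `TIMESTAMP`, `MESSAGE`, and `classify` are made up for illustration, and it assumes the region code is always 'M', as in the excerpt.

```python
import re

# Assumed line shapes, taken from the description above:
#   time stamp line: 'M', exactly ONE space, then '['
#   message line:    'M', exactly TWO spaces, then '['
TIMESTAMP = re.compile(r"^M \[")
MESSAGE = re.compile(r"^M  \[")


def classify(line: str) -> str:
    """Return 'timestamp', 'message', or 'other' for a single log line."""
    if TIMESTAMP.match(line):
        return "timestamp"
    if MESSAGE.match(line):
        return "message"
    return "other"


sample = [
    "M [Thr 3464] Thu Jan 05 05:46:46 2006",
    "M  [Thr 3464] message1",
]
for line in sample:
    print(classify(line), "|", line)
```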

I'm trying to delete sequences of time stamps without message lines in between using "find and replace with regexp":

(Unix-style): "^M \[.*\pM \["
(UE-Style): "%M [[]*^pM [[]"

I first tested the regexp with a "find with regexp", and for both styles of regexp the search result is not as expected: the match spans the message lines.

>>ok>M [Thr 4548] Thu Jan 05 05:46:46 2006
M [<ok<<Thr 5580] Thu Jan 05 05:46:46 2006
M [Thr 4548] Thu Jan 05 05:46:46 2006
M  [Thr 3464] message1
>>err>M [Thr 3464] Thu Jan 05 05:47:46 2006
M  [Thr 3464] message2
M [<err<<Thr 3464] Thu Jan 05 05:48:46 2006
M  [Thr 3464] message3
M [Thr 4548] Thu Jan 05 05:48:47 2006
M [Thr 5580] Thu Jan 05 05:48:47 2006
M [Thr 4548] Thu Jan 05 05:48:48 2006
M [Thr 5580] Thu Jan 05 05:48:48 2006
M  [Thr 3464] message4



If there is no message line, the find result is as expected (from ">>ok>" to "<ok<<"), but if there are message lines in between, the find spans multiple lines (e.g. from ">>err>" to "<err<<").

How can I enforce that the "\p" (Unix style) or "^p" (UE style) matches exactly one CR/LF pair?

Any help is greatly welcome...
chfleischer
Newbie
 
Posts: 4
Joined: Mon Aug 15, 2005 11:00 pm

Re: How to match exactly one newline?

Postby edd__ » Thu Jan 05, 2006 5:12 pm

Try this (Unix Syntax):
^M \[.*\pM( )+\[

Is this what you want? If not please provide more examples.

ed.
edd__
Newbie
 
Posts: 5
Joined: Mon Jan 02, 2006 12:00 am

Re: How to match exactly one newline?

Postby chfleischer » Thu Jan 05, 2006 5:30 pm

Hello Ed,

thank you for the fast answer. Unfortunately, this didn't do the trick - UE still selects more than one line with "\p".

The example code above is a real excerpt from the log file, and it shows the problem when you copy and paste it into an empty edit buffer. There should be no match spanning three lines...

Chris

Re: How to match exactly one newline?

Postby mrainey56 » Thu Jan 05, 2006 7:46 pm

I think I know what you're after.

Search with Unix Regex: ^(M )([[].+)(\p)(M )([[].+)(\p)(M )([[].+)(\p)
Replace All: \4\5\6\7\8\9


The Search string could be a lot more concise - I was trying everything under the sun to get it to find the first line, but to no avail. Anyway, I think it works.

It finds two time stamps followed by a message, then deletes the first time stamp. I had to run it three times to get them all. I tried it in a macro, but UE wouldn't accept Replace All inside a loop (?).
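The idea described above (find two time stamps followed by a message, drop the first time stamp, repeat until nothing changes) can be sketched in Python. This is not UltraEdit's engine: "\n" stands in for "\p", the pattern is a condensed form rather than the nine-group search, and it assumes message lines carry two spaces after the region code, as in the original log excerpt. The names `PATTERN` and `drop_leading_timestamps` are hypothetical.

```python
import re

# Condensed Python analogue, with "\n" in place of UE's "\p".
# Assumption: time stamp lines are "M [" (one space), message lines "M  [" (two).
# Match: time stamp line, then (time stamp line + message line); keep the group.
PATTERN = re.compile(r"^M \[.+\n(M \[.+\nM  \[.+\n)", re.MULTILINE)


def drop_leading_timestamps(text):
    """Repeat the replace until no hit is left.

    Returns the cleaned text and the number of passes that changed something.
    """
    passes = 0
    while True:
        text, hits = PATTERN.subn(r"\1", text)
        if hits == 0:
            return text, passes
        passes += 1


log = "\n".join([
    "M [Thr 5580] Thu Jan 05 05:46:46 2006",
    "M [Thr 4548] Thu Jan 05 05:46:46 2006",
    "M [Thr 3464] Thu Jan 05 05:46:46 2006",
    "M  [Thr 3464] message1",
    "M [Thr 3464] Thu Jan 05 05:47:46 2006",
    "M  [Thr 3464] message2",
    "M [Thr 3464] Thu Jan 05 05:48:46 2006",
    "M  [Thr 3464] message3",
    "M [Thr 4548] Thu Jan 05 05:48:47 2006",
    "M [Thr 5580] Thu Jan 05 05:48:47 2006",
    "M [Thr 4548] Thu Jan 05 05:48:48 2006",
    "M [Thr 5580] Thu Jan 05 05:48:48 2006",
    "M  [Thr 3464] message4",
]) + "\n"

cleaned, passes = drop_leading_timestamps(log)
print("passes with changes:", passes)
print(cleaned, end="")
```

Each pass removes only one time stamp per run, so a run of n idle time stamps needs about n - 1 passes. On this excerpt the loop makes three passes that change something.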


Before:

M [Thr 5580] Thu Jan 05 05:46:46 2006
M [Thr 4548] Thu Jan 05 05:46:46 2006
M [Thr 3464] Thu Jan 05 05:46:46 2006
M [Thr 3464] message1
M [Thr 3465] Thu Jan 05 05:47:46 2006
M [Thr 3465] message2
M [Thr 3466] Thu Jan 05 05:48:46 2006
M [Thr 3466] message3
M [Thr 4548] Thu Jan 05 05:48:47 2006
M [Thr 5580] Thu Jan 05 05:48:47 2006
M [Thr 4548] Thu Jan 05 05:48:48 2006
M [Thr 3467] Thu Jan 05 05:48:48 2006
M [Thr 3467] message4
M [Thr 5580] Thu Jan 05 05:46:46 2006
M [Thr 4548] Thu Jan 05 05:46:46 2006
M [Thr 3468] Thu Jan 05 05:46:46 2006
M [Thr 3468] message1
M [Thr 3469] Thu Jan 05 05:47:46 2006
M [Thr 3469] message2
M [Thr 3470] Thu Jan 05 05:48:46 2006
M [Thr 3470] message3
M [Thr 4548] Thu Jan 05 05:48:47 2006
M [Thr 5580] Thu Jan 05 05:48:47 2006
M [Thr 4548] Thu Jan 05 05:48:48 2006
M [Thr 3471] Thu Jan 05 05:48:48 2006
M [Thr 3471] message4
M [Thr 5580] Thu Jan 05 05:46:46 2006
M [Thr 4548] Thu Jan 05 05:46:46 2006
M [Thr 3472] Thu Jan 05 05:46:46 2006
M [Thr 3472] message1
M [Thr 3473] Thu Jan 05 05:47:46 2006
M [Thr 3473] message2
M [Thr 3474] Thu Jan 05 05:48:46 2006
M [Thr 3474] message3
M [Thr 4548] Thu Jan 05 05:48:47 2006
M [Thr 5580] Thu Jan 05 05:48:47 2006
M [Thr 4548] Thu Jan 05 05:48:48 2006
M [Thr 3475] Thu Jan 05 05:48:48 2006
M [Thr 3475] message4



After:

M [Thr 3464] Thu Jan 05 05:46:46 2006
M [Thr 3464] message1
M [Thr 3465] Thu Jan 05 05:47:46 2006
M [Thr 3465] message2
M [Thr 3466] Thu Jan 05 05:48:46 2006
M [Thr 3466] message3
M [Thr 3467] Thu Jan 05 05:48:48 2006
M [Thr 3467] message4
M [Thr 3468] Thu Jan 05 05:46:46 2006
M [Thr 3468] message1
M [Thr 3469] Thu Jan 05 05:47:46 2006
M [Thr 3469] message2
M [Thr 3470] Thu Jan 05 05:48:46 2006
M [Thr 3470] message3
M [Thr 3471] Thu Jan 05 05:48:48 2006
M [Thr 3471] message4
M [Thr 3472] Thu Jan 05 05:46:46 2006
M [Thr 3472] message1
M [Thr 3473] Thu Jan 05 05:47:46 2006
M [Thr 3473] message2
M [Thr 3474] Thu Jan 05 05:48:46 2006
M [Thr 3474] message3
M [Thr 3475] Thu Jan 05 05:48:48 2006
M [Thr 3475] message4
mrainey56
Master
 
Posts: 212
Joined: Tue Jul 27, 2004 11:00 pm
Location: Spartanburg, South Carolina

Re: How to match exactly one newline?

Postby chfleischer » Wed Jan 18, 2006 2:40 pm

Hello MRainey,

(sorry for the late response - I've been skiing for some days) :D

Your pattern helps to reduce the work from days to hours - it removes the first of two successive timestamp lines.

I've shortened the pattern a little:

Search with Unix Regex: ^M [[].+\p(M [[].+\pM [[].+\p)
Replace All: \1

but now, I cannot match an arbitrary number of lines with e.g. "^(M [[].+\p)+(M [[].+\pM [[].+\p)". Is there any other trick to achieve this?

Best regards, Christian
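For comparison only: in an engine that supports lookahead, a run of any length can be collapsed in a single pass by deleting every time stamp line that is directly followed by another time stamp line. The sketch below uses Python's `re` module, not UltraEdit; whether UE's Unix-style engine offers an equivalent is not established in this thread, so treat the lookahead as an assumption. `REDUNDANT` and `collapse_timestamp_runs` are made-up names.

```python
import re

# Python's re module, with "\n" in place of UE's "\p".
# Delete a time stamp line (one space before '[') only when the NEXT line
# is also a time stamp line. The lookahead (?=...) checks the next line
# without consuming it, so that line can itself be tested and removed.
REDUNDANT = re.compile(r"^M \[.*\n(?=M \[)", re.MULTILINE)


def collapse_timestamp_runs(text: str) -> str:
    """Keep only the last time stamp of each run of consecutive time stamps."""
    return REDUNDANT.sub("", text)


log = "\n".join([
    "M [Thr 5580] Thu Jan 05 05:46:46 2006",
    "M [Thr 4548] Thu Jan 05 05:46:46 2006",
    "M [Thr 3464] Thu Jan 05 05:46:46 2006",
    "M  [Thr 3464] message1",
    "M [Thr 3464] Thu Jan 05 05:47:46 2006",
    "M  [Thr 3464] message2",
]) + "\n"

result = collapse_timestamp_runs(log)
print(result, end="")
```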

Re: How to match exactly one newline?

Postby chfleischer » Wed Jan 18, 2006 2:53 pm

Hi MRainey,

Inverting the operation makes things much faster: just searching for two successive timestamps and removing the first one (without looking for a message line at all) performs even better, because each replace halves the number of hits...

I'm now using
Search with Unix Regex: ^[A-Z] [[].+\p([A-Z] [[].+\p)
Replace All: \1

because the timestamp lines may begin with an arbitrary uppercase letter.
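The halving effect can be illustrated with a rough Python analogue of this search and replace. It is a sketch under assumptions: Python's `re` stands in for UltraEdit's Unix-style engine, "\n" replaces "\p", and the run of eight time stamps is made-up sample data. `PAIR` and `collapse` are hypothetical names.

```python
import re

# Two successive time stamp lines (any uppercase region code, ONE space, '[').
# The second line is captured and kept; the first one is dropped.
PAIR = re.compile(r"^[A-Z] \[.+\n([A-Z] \[.+\n)", re.MULTILINE)


def collapse(text):
    """Run the replace repeatedly; return the text and the hit count of each pass."""
    hits_per_pass = []
    while True:
        text, hits = PAIR.subn(r"\1", text)
        if hits == 0:
            return text, hits_per_pass
        hits_per_pass.append(hits)


# Made-up sample: eight consecutive time stamps followed by one message line.
stamps = [f"M [Thr 5580] Thu Jan 05 05:46:{sec:02d} 2006" for sec in range(40, 48)]
log = "\n".join(stamps + ["M  [Thr 3464] message1"]) + "\n"

cleaned, hits_per_pass = collapse(log)
print("hits per pass:", hits_per_pass)
print(cleaned, end="")
```

Because each match consumes a pair of lines and keeps one, a run of n time stamps shrinks to roughly n/2 per pass, so the number of passes grows with log2(n) rather than with n.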

Thanks a lot,

Christian

Re: How to match exactly one newline?

Postby mrainey56 » Wed Jan 18, 2006 5:02 pm

I appreciate the feedback - glad you worked out a good solution.