Search for lines not of a certain length?


Postby sfryatt24 » Thu Sep 30, 2004 12:44 pm

Hoping someone can help me:

We have a Unix file, 10 million records, record length = 147. Normally, we'll see the Line Feed at position 148 for all records. There are a number of records that are sporadically affected throughout the file where the Line Feed is in another column besides 148. This has affected positioning of fields in subsequent records.

How do I search using UE to determine the count of records affected?

How do I identify those records to the client to show them that they've set the Line Feeds incorrectly?

Thanks in advance.
sfryatt24
Newbie
 
Posts: 2
Joined: Wed Sep 29, 2004 11:00 pm

Re: Search for lines not of a certain length?

Postby Mofi » Fri Oct 01, 2004 2:07 am

Make a copy of the file. Open the copy and run an UltraEdit-style regular expression replace with

Find: %???...???^p
Replace:

???...??? stands for 147 '?' characters, each matching any single character except a newline.
Execute Replace All. Repeat Replace All until nothing more is found.

The result is a file that contains only the lines that are shorter or longer than 147 characters.
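
For anyone who wants to see what that replace does outside the editor, here is a minimal Python sketch of the same idea. It is only an illustration, not an UltraEdit feature: it assumes Unix line ends (a bare line feed), uses a record length of 5 instead of 147 so the sample stays readable, and the function name `drop_exact_length_lines` is made up for this example.

```python
import re

def drop_exact_length_lines(text: str, length: int) -> str:
    """Remove every line that is exactly `length` characters long.

    Mirrors the editor replace: start of line, `length` non-newline
    characters, then the line feed, all replaced with nothing.
    """
    pattern = re.compile(r"^[^\n]{%d}\n" % length, re.MULTILINE)
    return pattern.sub("", text)

# Three good records (5 characters) and two bad ones (3 and 7 characters).
sample = "abcde\nabc\nabcde\nabcdefg\nabcde\n"
leftover = drop_exact_length_lines(sample, 5)
print(repr(leftover))  # only the wrong-length lines remain
```

What is left over is exactly the set of lines whose length is not the expected one, which is what the emptied copy of the file shows in the editor.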
Mofi
Grand Master
 
Posts: 4054
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Search for lines not of a certain length?

Postby sfryatt24 » Sun Oct 03, 2004 7:59 pm

Thanks for the info.
I don't want to seem lazy, but as a novice UE user, what are the specific steps to run when searching for line feeds in positions other than 148? Don't forget, some line feeds appear later than position 148.
Thanks again! :oops:
sfryatt24
Newbie
 
Posts: 2
Joined: Wed Sep 29, 2004 11:00 pm

Re: Search for lines not of a certain length?

Postby Mofi » Mon Oct 04, 2004 2:23 am

OK, step by step:

Make a copy of your file with Windows Explorer.

Open the copy with UltraEdit.

Press Ctrl+R to open the replace dialog.

Enter in the field "Find What:"

  • 1 percent character '%'
  • 147 question mark characters '?' (enter 10, copy and append it 13 times, add 7)
  • 1 '^' followed by 1 'p' for the newline character (DOS)

Clear the field "Replace With:".

Activate "Regular Expressions".

Check that "Current File" is selected and "Close after replace" is not selected.

Press button "Replace All" a few times, until you get the message "Search string not found".

Now you have a file that contains only the lines where the linefeed is not in column 148. You can now search for those lines in the huge file. You may be able to automate this with a macro.
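
If a script is an option, the count and the line numbers of the affected records can also be collected in one pass. The following is a rough Python sketch under stated assumptions: Unix line ends, length counted in bytes without the line feed, and a helper name (`find_bad_records`) invented for this example. The demo uses a record length of 5 in place of 147.

```python
import io
from typing import Iterable, Iterator, Tuple

def find_bad_records(lines: Iterable[bytes], expected: int = 147) -> Iterator[Tuple[int, int]]:
    """Yield (line_number, length) for each record whose length differs from `expected`.

    Length is counted in bytes, excluding the trailing line feed.
    """
    for number, raw in enumerate(lines, start=1):
        record = raw[:-1] if raw.endswith(b"\n") else raw
        if len(record) != expected:
            yield number, len(record)

# Tiny in-memory stand-in for the real file (record length 5 instead of 147).
demo = io.BytesIO(b"abcde\nabc\nabcde\nabcdefg\n")
bad = list(find_bad_records(demo, expected=5))
print(len(bad), "records affected")
for number, length in bad:
    print(f"line {number}: length {length}")
```

For the real data, the `io.BytesIO` object would be replaced by a file opened in binary mode, which is read line by line, so the 10 million records never have to fit in memory at once.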
Mofi
Grand Master
 
Posts: 4054
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Search for lines not of a certain length?

Postby jbeck » Thu Apr 12, 2007 10:18 am

Hello

We have to make statistical returns to government bodies in huge fixed length text files. Sometimes these files can have structural errors, where some lines are not the correct length. We then edit these lines manually to correct them.

Problem is, finding these incorrect lines in huge files is difficult! Does anyone know of a way of searching for a line that is either greater or less than a certain length?

We have UE version 13.
jbeck
Newbie
 
Posts: 6
Joined: Wed Apr 11, 2007 11:00 pm

Re: Search for lines not of a certain length?

Postby pietzcker » Thu Apr 12, 2007 11:09 am

How about the regex (New UltraEdit style)
Code:
^(.{0,30}|.{50,100})$


meaning "match a line that either is less than 31 characters or 50-100 characters long"?
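
To check which lengths this pattern accepts, here is a small sketch using Python's `re` module. That is an assumption on my part: Python's engine is not the one built into UltraEdit, but the quantifier syntax used here behaves the same way in both.

```python
import re

# The pattern from this post; MULTILINE makes ^ and $ work line by line.
pattern = re.compile(r"^(.{0,30}|.{50,100})$", re.MULTILINE)

# Test lines of various lengths around the two boundaries.
lengths = [10, 30, 31, 49, 50, 100, 101]
text = "\n".join("x" * n for n in lengths)

matched_lengths = [len(m.group()) for m in pattern.finditer(text)]
print(matched_lengths)  # [10, 30, 50, 100]
```

Lines of 31 to 49 characters and lines longer than 100 are skipped, so the two ranges have to be adjusted to bracket the record length you expect.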

HTH,
Tim
pietzcker
Master
 
Posts: 241
Joined: Sun Aug 22, 2004 11:00 pm

Re: Search for lines not of a certain length?

Postby jbeck » Thu Apr 12, 2007 11:38 am

pietzcker, thanks, but I cannot get your regex suggestion to work. To be more specific, I need to search for lines, whatever text they contain, that are longer than 723 characters.

I need to document this solution for other UE users here, and it would be a neater solution than Mofi's.

Regards,
Jon
jbeck
Newbie
 
Posts: 6
Joined: Wed Apr 11, 2007 11:00 pm

Re: Search for lines not of a certain length?

Postby jorrasdk » Thu Apr 12, 2007 11:57 am

To follow up on pietzcker's suggestion:

From the Advanced menu, open "Configuration". Find "Search" and then "Regular expression engine".

Choose "Perl compatible regular expressions".

Return to your file.

Open Find (Ctrl+F), check "Regular expression", and search for:

Code:
^.{724,}$
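
As a quick sanity check of that expression, here is a sketch in Python's `re` module (an assumption: it is not UltraEdit's engine, though this syntax is common to both). The second pattern, with a negative lookahead, is my own addition for the "any length except N" case and has not been tested in UltraEdit.

```python
import re

LIMIT = 723  # maximum allowed line length, from the question above

# Same as ^.{724,}$ : lines longer than the limit.
too_long = re.compile(r"^.{%d,}$" % (LIMIT + 1), re.MULTILINE)
# Hypothetical variant: lines of any length except exactly LIMIT.
wrong_length = re.compile(r"^(?!.{%d}$).*$" % LIMIT, re.MULTILINE)

text = "\n".join(["a" * 722, "b" * 723, "c" * 724])

long_hits = [len(m.group()) for m in too_long.finditer(text)]
wrong_hits = [len(m.group()) for m in wrong_length.finditer(text)]
print(long_hits)   # [724]
print(wrong_hits)  # [722, 724]
```

The first pattern only flags the over-long line, while the lookahead variant also catches the line that is one character short.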
jorrasdk
Master
 
Posts: 275
Joined: Mon Mar 19, 2007 11:00 pm
Location: Denmark

Re: Search for lines not of a certain length?

Postby jbeck » Thu Apr 12, 2007 12:03 pm

Thank you, gentlemen, that works a treat. Turning on Perl expressions does help!
Regards,
Jon
jbeck
Newbie
 
Posts: 6
Joined: Wed Apr 11, 2007 11:00 pm

