Please help in searching for extended ascii characters

Post by bab00shka » Mon Nov 08, 2004 10:44 am

Hi,

I would like to search in files for anything that is not in the standard UK ASCII set, i.e. anything that is not in the range from 'space' (ASCII 32) to '~' tilde (ASCII 126).

In UNIX I can do a grep as follows:

grep [^\ -\~] <filename>

I tried using UltraEdit's UNIX-style regular expression search, but could not get it to accept [^\ -\~] as a range. Is there any other way I can search for non-standard ASCII characters?

Many thanks
Helen :?:

Re: Please help in searching for extended ascii characters

Post by Mofi » Mon Nov 08, 2004 11:23 am

To find (and delete) all non-ASCII characters, use a regular expression find (or replace).

With the UltraEdit regular expression engine, use the search string [~^t^r^n -~]+

With the Unix/Perl regular expression engine, the search string would be [^\t\r\n -~]+

The Perl regular expression engine also supports hexadecimal notation, so the search string [^\t\r\n\x20-\x7E]+ could also be used with the Perl engine.

Tabs are ignored by these search strings. If tabs should also be found, remove ^t or \t, respectively.
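
For anyone who wants to try the Perl-style search string outside of UltraEdit, here is a minimal sketch in Python. It assumes Python's re module treats this particular character class the same way as the Perl engine; the function name find_non_ascii and the sample text are made up for illustration only.

```python
import re

# Same character class as the Perl-style search string above:
# anything that is not tab, CR, LF, or printable ASCII (0x20-0x7E).
NON_ASCII_RUN = re.compile(r"[^\t\r\n\x20-\x7E]+")

def find_non_ascii(text):
    """Return (offset, matched_run) pairs for every run of unexpected characters."""
    return [(m.start(), m.group()) for m in NON_ASCII_RUN.finditer(text)]

# Hypothetical sample: e-acute, a non-breaking space, a euro sign and a BEL control character.
sample = "plain line\r\ncaf\u00e9 costs 5\u00a0\u20ac\tok\x07"

for offset, run in find_non_ascii(sample):
    # Show the code points in hex, because some of these characters are invisible.
    print(offset, [hex(ord(ch)) for ch in run])
```

The tab and the line break in the sample are not reported. Dropping \t from the class would report the tab as well.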

Re: Please help in searching for extended ascii characters

Post by bab00shka » Mon Nov 08, 2004 11:42 am

Hi Mofi,

This has worked. Fantastic, thank you!

Greetings to Austria!
Cheers
Helen :D

Re: Please help in searching for extended ascii characters

Post by ThWiedmann » Wed Nov 07, 2007 7:12 am

Hello,

If a text file contains special characters, e.g. a non-breaking space (ASCII code 160), and I'd like to find every occurrence of such a character, which menu or function can I use to do this?

Example: how do I find all occurrences of the character with ASCII code 160?

What must the find expression in UltraEdit look like in order to find, for example, the character with ASCII code 160?
I'm not very familiar with regular expressions.

Thanks for all good hints.

Thomas Wiedmann

Re: Please help in searching for extended ascii characters

Post by jorrasdk » Mon Nov 12, 2007 8:08 am

Hi Thomas

You did not say which version of UE you use. If you use version 12 or above, you can use the Perl regular expression engine. (Read the announcement for this forum to find out how to switch it on.)

The regular expression for ASCII code 160 (= A0 hexadecimal) is: \xA0
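
To illustrate what \xA0 matches, here is a small Python sketch that lists where the non-breaking space occurs in a piece of text. It assumes the text has already been decoded, so that character code 160 is the single character U+00A0. The function name nbsp_positions and the sample text are hypothetical.

```python
import re

# \xA0 is character code 160 (hex A0), the non-breaking space.
NBSP = re.compile(r"\xA0")

def nbsp_positions(text):
    """Return 1-based (line, column) pairs for each non-breaking space."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in NBSP.finditer(line):
            hits.append((line_no, m.start() + 1))
    return hits

# Hypothetical sample: one NBSP on line 1, none on line 2, two on line 3.
sample = "10\u00a0kg\nno match here\n\u00a0x\u00a0"
print(nbsp_positions(sample))
```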

Re: Please help in searching for extended ascii characters

Post by Mofi » Mon Nov 12, 2007 8:18 am

The regex search in my first post finds all characters with a code value greater than 127 (= all ANSI characters) and all control characters with a code value smaller than 32 (except TAB, CR and LF). So it also finds the non-breaking space (160).
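
As a quick check of which single-byte code values that character class covers, the following Python sketch tests every code from 0 to 255 against the Perl-style class. It assumes Python's re module behaves like the Perl engine for this pattern; the variable names are chosen for illustration only.

```python
import re

# Single-character version of the Perl-style class from the first post.
PATTERN = re.compile(r"[^\t\r\n\x20-\x7E]")

# Collect every code value 0..255 whose character is matched by the class.
matched = [code for code in range(256) if PATTERN.match(chr(code))]

print(160 in matched)   # non-breaking space
print(9 in matched)     # TAB
print(len(matched))     # how many of the 256 code values are matched
```

Besides the control characters below 32 and everything from 128 upwards, the list also contains 127 (DEL), because the range in the class ends at the tilde (126).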

If you want to search only for the non-breaking space, do the following:

  • Set cursor to top of the file with Ctrl+Home.
  • Click on View - ASCII Table.
  • Scroll down to decimal value 160 (hex A0).
  • Press button Insert Char and close the dialog.
  • Select this single character just inserted at top of the file.
  • Cut it with Ctrl+X.
  • Press Ctrl+F or Ctrl+R to open the find or replace dialog.
  • Press Ctrl+V to insert the non-breaking space character into the find field.
  • Uncheck all options because this is a normal find/replace. There is no need for a regular expression search.
  • Run the find/replace.