Complex character replacement (Unicode -> ASCII)

Post by c4p0ne » Tue Nov 25, 2008 9:54 am

I would like to replace multiple Unicode characters (more than 30) throughout a file with their ASCII equivalents. I need to do this because the conversion options in the UltraEdit menus do not work properly. For example, the following word

αθήνα (Unicode {Greek}; HEX: 03b1 03b8 03b7 03bd 03b1)

should be converted to its ASCII Greek equivalent:

áèçíá (ASCII; HEX: 00e1 00e8 00e7 00ed 00e1)

So, in other words, I want to construct (if possible) an expression that will replace the entire Unicode Greek range with its ASCII equivalents, letter by letter. Another example:

Α (0391) becomes Á (00C1)
Β (0392) becomes Â (00C2)
Γ (0393) becomes Ã (00C3)
Δ (0394) becomes Ä (00C4)
etc.. etc..

For the life of me I cannot find a utility that will do this, online or offline. However, I can find millions of online/offline utilities that convert the OTHER way around (ASCII to Unicode). :|

Re: Complex character replacement (Unicode -> ASCII)

Post by Mofi » Tue Nov 25, 2008 10:26 am

If I copy the Unicode string "αθήνα" (03b1 03b8 03ae 03bd 03b1) into a Unicode file, select code page 1253 (ANSI - Greek) at View - Set Code Page and the font Courier New with the script Greek at View - Set Font, and then run File - Conversions - Unicode to ASCII, I get the string "αθήνα" (00e1 00e8 00de 00ed 00e1).

Next I select code page 1252 (ANSI - Latin I) at View - Set Code Page and the font Courier New with the script Western at View - Set Font, and see the same 5 bytes now as "áèÞíá", which I think is absolutely correct.

"αθηνα" (3rd character is 03b7) is converted with code page 1253 to "αθηνα" (3rd character is 00e7), which is displayed with the Latin I code page as "áèçíá".
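For anyone who wants to reproduce this outside UltraEdit, the same two steps can be sketched in Python. This is only an illustration, not what UltraEdit does internally: it assumes Python's built-in "cp1253" and "cp1252" codecs match the code pages named above, and the function name greek_to_latin_lookalike is made up for this example.

```python
def greek_to_latin_lookalike(text: str) -> str:
    """Encode Greek text as code page 1253, then view the same bytes as code page 1252.

    Hypothetical helper for illustration only. Characters that do not exist in
    code page 1253 raise UnicodeEncodeError instead of being silently replaced.
    """
    raw = text.encode("cp1253")   # Unicode -> one byte per character (ANSI - Greek)
    return raw.decode("cp1252")   # same bytes, interpreted as ANSI - Latin I


def cp1253_hex(text: str) -> str:
    """Return the code page 1253 bytes of text as space-separated hex values."""
    return " ".join(f"{byte:02x}" for byte in text.encode("cp1253"))
```

With this sketch, cp1253_hex("αθήνα") gives "e1 e8 de ed e1" and greek_to_latin_lookalike("αθήνα") gives "áèÞíá", while "αθηνα" (without the accent) gives "áèçíá", matching the results above.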

By the way: What is ASCII Greek?

Re: Complex character replacement (Unicode -> ASCII)

Post by c4p0ne » Tue Nov 25, 2008 11:09 am

Works, thanks.