Unicode and UTF-8 Support


UltraEdit supports Unicode (16-bit wide character) files and UTF-8 files. Both can be edited directly, and conversion routines are provided between ASCII/ANSI and Unicode or UTF-8.  UltraEdit attempts to detect the file type when the file is loaded. For Unicode files it looks for the FF FE byte order mark at the start of the file. For UTF-8 it looks for one of the following three indicators:

 

1) The UTF-8 byte order mark (BOM): the bytes EF BB BF at the start of the file.

2) The string "charset=utf-8" or "encoding=utf-8".

3) Valid UTF-8 multi-byte characters in the first 64KB of the file.
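
The detection rules above can be sketched in Python. This is only an illustration of the described checks, not UltraEdit's actual implementation. The function names, the order of the checks, the case-insensitive string match, and the choice to limit the string search to the same 64KB window are all assumptions.

```python
import codecs
import re

UTF16_LE_BOM = b"\xff\xfe"       # the FF FE marker for Unicode files
UTF8_BOM = b"\xef\xbb\xbf"       # the EF BB BF marker for UTF-8 files
SCAN_LIMIT = 64 * 1024           # only the first 64KB is examined
# Assumption: the declaration strings are matched case-insensitively.
DECLARATION = re.compile(rb"(?:charset|encoding)=utf-8", re.IGNORECASE)


def has_utf8_multibyte(head: bytes, truncated: bool) -> bool:
    """True if head contains non-ASCII bytes and all of it is valid UTF-8."""
    if head.isascii():
        return False  # no multi-byte characters at all
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # When the window cuts the file short, tolerate a sequence
        # that is split at the 64KB boundary.
        decoder.decode(head, final=not truncated)
    except UnicodeDecodeError:
        return False
    return True


def detect_encoding(data: bytes) -> str:
    """Return the status-bar style label for the given file contents."""
    if data.startswith(UTF16_LE_BOM):
        return "UTF-16"
    if data.startswith(UTF8_BOM):
        return "UTF-8"
    head = data[:SCAN_LIMIT]
    if DECLARATION.search(head):
        return "UTF-8"
    if has_utf8_multibyte(head, truncated=len(data) > SCAN_LIMIT):
        return "UTF-8"
    return "ASCII/ANSI"
```

Note that rule 3 is a heuristic: a file in an ANSI code page whose bytes happen to form valid UTF-8 sequences would also be reported as UTF-8 by this sketch.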

 

If the file is detected as Unicode, it is treated as such and the status bar indicates this.  A Unicode file saved in big-endian format is indicated with "UTF-16BE".  A Unicode file saved in little-endian format is indicated with "UTF-16".

 

If the file is detected as UTF-8, it is treated as such and converted internally to Unicode (16-bit) for editing. The status bar indicates this with "UTF-8". When the file is saved, it is converted back from Unicode to UTF-8 and saved in that format.
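
The load and save round trip for a UTF-8 file can be illustrated with a short Python sketch. The function names are hypothetical, and the little-endian byte order of the internal 16-bit buffer and the optional BOM on save are assumptions made for the example; the text above only states that the file is held as 16-bit Unicode while editing.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def load_utf8_for_editing(raw: bytes) -> bytes:
    """Convert UTF-8 file bytes to a 16-bit Unicode editing buffer."""
    # "utf-8-sig" drops a leading EF BB BF marker if one is present.
    text = raw.decode("utf-8-sig")
    # Assumption: the buffer uses little-endian 16-bit code units, no BOM.
    return text.encode("utf-16-le")


def save_as_utf8(buffer: bytes, write_bom: bool = False) -> bytes:
    """Convert the 16-bit editing buffer back to UTF-8 bytes for saving."""
    data = buffer.decode("utf-16-le").encode("utf-8")
    return UTF8_BOM + data if write_bom else data


original = "na\u00efve \u20ac".encode("utf-8")   # 10 bytes of UTF-8
buffer = load_utf8_for_editing(original)         # 7 characters, 2 bytes each
restored = save_as_utf8(buffer)                  # identical to the original
```

Because every valid UTF-8 sequence maps to a Unicode character and back, the round trip is lossless for well-formed input.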

 

The following conversions to and from Unicode/UTF-8 are available:

 

ASCII to Unicode
    Converts from ASCII to Unicode.

UTF-8 to Unicode
    Converts from UTF-8 to Unicode (16-bit).

Unicode to ASCII
    Converts from Unicode to ASCII.

UTF-8 to ASCII
    Converts from UTF-8 to ASCII.

ASCII to UTF-8 (Unicode Editing)
    Converts from ASCII to UTF-8, with the file held internally in Unicode format for editing.

Unicode/UTF-8 to UTF-8 (Unicode)
    Converts the file from either Unicode or UTF-8 (held internally in non-Unicode format) to UTF-8, with the file held internally in Unicode format for editing.

Unicode/ASCII/UTF-8 to UTF-8 (ASCII)
    Converts from Unicode, ASCII or UTF-8 (held internally in Unicode format) to UTF-8, but leaves the file in non-Unicode (ASCII display) format for editing.
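
Conceptually, each of these conversions decodes the bytes from one encoding and re-encodes them in another. The Python sketch below shows that idea only; it is not how UltraEdit performs the conversions. The `convert` helper is hypothetical, "ASCII" is assumed to mean the Windows-1252 ANSI code page, "Unicode" is assumed to mean little-endian UTF-16, and replacing unrepresentable characters with "?" is an assumption as well.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data from the source encoding to the target encoding."""
    text = data.decode(source)
    # Characters the target cannot represent are replaced with "?".
    return text.encode(target, errors="replace")


ansi = b"caf\xe9"                                # "café" in Windows-1252

# ASCII to UTF-8: the accented letter becomes a two-byte sequence.
as_utf8 = convert(ansi, "cp1252", "utf-8")

# UTF-8 to Unicode: every character becomes one 16-bit code unit.
as_utf16 = convert(as_utf8, "utf-8", "utf-16-le")

# Unicode to ASCII: lossless here, because the code page contains "é".
back_to_ansi = convert(as_utf16, "utf-16-le", "cp1252")
```

Conversions toward ASCII/ANSI can lose information: a character that has no equivalent in the target code page cannot be preserved, whereas conversions between Unicode and UTF-8 are lossless in both directions.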

 



Article Number: 1248
Posted: Thu, Oct 13, 2011 6:11 PM
Last Updated: Tue, Jan 21, 2014 4:34 PM

Online URL: http://www.ultraedit.com/help/article/unicode-and-utf-8-support-1248.html