Function IsUnicode to detect Unicode file after opening it


Post by Mofi » Wed Oct 24, 2007 6:34 am

Hello script writers!

As discussed in the topic "Selection only returns first character", there is currently (see post date) a problem with Unicode files.

The workaround posted by in0de using

UltraEdit.activeDocument.ASCIIToUnicode();
UltraEdit.activeDocument.unicodeToASCII();

does not work on Windows 98. I don't know why. In any case, it is not an elegant solution.

Here is a much better one: a function which detects whether UltraEdit or UEStudio has loaded a file as Unicode or not. By calling this function after opening a file, the conversion to ASCII can be limited to the case where it is really required, that is, when the file is in fact a Unicode file.

if (IsUnicode()) UltraEdit.activeDocument.unicodeToASCII();

If you want to report mistakes or have suggestions for further enhancements, post a message here.

The script file IsUnicode.js with the code can be viewed or downloaded from the scripts section of the Extra Downloads page.

Re: Function IsUnicode to detect Unicode file after opening it

Post by Mofi » Wed Feb 11, 2009 11:24 am

The script file with the function IsUnicode was updated on 2009-02-10.

I have slightly changed all variable names by adding a prefix letter for the type of the variable. Advantages of using a type prefix letter:

  1. The type of a variable is always visible, which is a great help.
  2. No more problems with variable names that are similar to common words used in comments or strings, especially when searching for or replacing such variables. For example, sDirectory is a much better variable name than just Directory because the word Directory can also occur in comments and strings.
  3. It is easy to search for all string, number or boolean variables if a type prefix letter (always lowercase) is used and the rest of the variable name starts with an uppercase character. For example, the case-sensitive regular expression search string s[A-Z][A-Za-z]+ with the option Match Whole Word Only finds all string variables in the script file.
  4. Type prefix letters make it easier to select an existing variable of type string, number or boolean in the auto-complete dialog.

Additionally, the global variable used is now named g_nDebugMessage. The additional prefix g_ marks this variable as a global variable that is not defined inside the function. Variables with a non-standard type don't have a prefix, but have a very distinctive name like WorkingFile.
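
The search for string variables can also be tried outside of UltraEdit with a few lines of plain JavaScript. This is only a sketch: the script text being searched is made up, and \b on both sides of the expression is my stand-in for the option Match Whole Word Only.

```javascript
// Hypothetical script text, used only as input for the search.
var sScriptText = [
  "var sDirectory = 'Temp';   // Directory with the files",
  "var nLineCount = 0;",
  "var bFound = false;",
  "var sFileName = sDirectory + '/test.txt';"
].join("\n");

// \b on both sides stands in for the option Match Whole Word Only.
// No "i" flag is set, so the search is case sensitive.
var StringVariablePattern = /\bs[A-Z][A-Za-z]+\b/g;

// All hits in order of appearance, including repeated names.
var asHits = sScriptText.match(StringVariablePattern) || [];

// Keep each variable name only once.
var asStringVariables = asHits.filter(function (sName, nIndex) {
  return asHits.indexOf(sName) === nIndex;
});

console.log(asStringVariables.join(", "));
```

The search finds sDirectory and sFileName, but not the word Directory in the comment and not the number or boolean variables.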

The standard prefixes I use are:

an ... array of numbers (doesn't exist in IsUnicode.js)
as ... array of strings (doesn't exist in IsUnicode.js)
b ... boolean
n ... number
s ... string
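
To show the prefixes side by side, here is a small sketch. The function buildFileList and all values in it are invented for illustration and are not taken from IsUnicode.js; treating 0 as "no debug messages" is also just an assumption.

```javascript
// Global variable: defined outside of any function, therefore the g_ prefix.
// Treating 0 as "no debug messages" is an assumption for this sketch.
var g_nDebugMessage = 0;

// Hypothetical helper that only exists to show the prefixes in use.
function buildFileList(sDirectory, asFileNames) {
  var anNameLengths = [];   // an ... array of numbers
  var asFullNames = [];     // as ... array of strings
  var bAnyFile = false;     // b  ... boolean

  for (var nIndex = 0; nIndex < asFileNames.length; nIndex++) {  // n ... number
    var sFullName = sDirectory + asFileNames[nIndex];            // s ... string
    asFullNames.push(sFullName);
    anNameLengths.push(sFullName.length);
    bAnyFile = true;
  }

  if (g_nDebugMessage) {
    console.log(asFullNames.length + " file name(s) built");
  }
  return {
    asFullNames: asFullNames,
    anNameLengths: anNameLengths,
    bAnyFile: bAnyFile
  };
}

// Non-standard type (an object), so no prefix but a distinctive name.
var FileListResult = buildFileList("C:/Temp/", ["a.txt", "notes.txt"]);
```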


Furthermore, as a safety measure I have added a check whether the current file is opened/viewed in hex edit mode. The function is not designed for determining the file encoding in hex edit mode. In this case it now always returns false, and an appropriate warning is displayed if debug messages are enabled.
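
The idea of such a guard can be sketched as follows. This is not the code from IsUnicode.js: the function name, the two callbacks and the warning text are hypothetical, and I assume that the document object offers isHexModeOn() for querying hex edit mode. The document is passed in as a parameter so that the sketch also runs outside of UltraEdit.

```javascript
// Assumption for this sketch: any value other than 0 enables debug messages.
var g_nDebugMessage = 1;

// Document is expected to behave like an UltraEdit document object.
// detectEncoding and showWarning are hypothetical callbacks and not
// part of IsUnicode.js.
function isUnicodeUnlessHexMode(Document, detectEncoding, showWarning) {
  // isHexModeOn() is assumed to report whether the file is shown in hex edit mode.
  if (typeof Document.isHexModeOn === "function" && Document.isHexModeOn()) {
    if (g_nDebugMessage) {
      showWarning("File is in hex edit mode, encoding not determined.");
    }
    return false;  // always false in hex edit mode
  }
  // Normal case: leave the decision to the real detection.
  return detectEncoding(Document) === true;
}
```

Returning false in hex edit mode means a caller like the one-line example in the first post simply skips the conversion for such a file.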

Finally, I have added code demonstrating the usage of the function by simply running it on all open files and telling the user whether each file is an ASCII/ANSI file or a Unicode file. The current cursor position in every file is saved before running IsUnicode and restored afterwards.
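
A minimal sketch of such a loop, not the demonstration code from the script file itself. The name reportEncodings is hypothetical, and the detection function is passed in as a callback because how the real IsUnicode selects the file is not shown in this topic. I assume document objects with the properties currentLineNum, currentColumnNum and path and the command gotoLine(line, column).

```javascript
// OpenDocuments is expected to behave like the UltraEdit.document array.
// isUnicode is passed in so the sketch does not depend on the real function.
function reportEncodings(OpenDocuments, isUnicode) {
  var asReport = [];
  for (var nDoc = 0; nDoc < OpenDocuments.length; nDoc++) {
    var Doc = OpenDocuments[nDoc];

    // Remember the cursor position before the detection can move it.
    var nLine = Doc.currentLineNum;
    var nColumn = Doc.currentColumnNum;

    var bUnicode = isUnicode(Doc);

    // Restore the position. Depending on the UltraEdit version the column
    // may need an offset of 1 here; that is not handled in this sketch.
    Doc.gotoLine(nLine, nColumn);

    asReport.push(Doc.path + " is " +
                  (bUnicode ? "a Unicode" : "an ASCII/ANSI") + " file.");
  }
  return asReport;
}
```

The function returns the messages as an array of strings instead of displaying them, so the caller can decide whether to show them in a message box or write them to the output window.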

