Wrong letter displayed for Slavic character

Post by Nufacik » Mon Apr 28, 2008 12:57 am

Hi all.

I have a problem. I am trying UE v13.10a+1 (I found an older version on my disk) for editing SQL source and text files. UE has a problem with my native language. My language has the letter "Ľ", which is ˇ and L combined. UE does not display this letter correctly; it shows it as "ź". But if I copy this letter from UE, where it looks like ź, and paste it into another editor, for example Notepad, or into the textarea on your website, it is pasted correctly as "Ľ". I don't know what is wrong. Can you give me some advice?
Thanks.

Peter.

Re: Wrong letter displayed for Slavic character

Post by Mofi » Mon Apr 28, 2008 2:54 am

First, I guess the file you have open in UltraEdit is an ASCII/ANSI file and not a Unicode file. In that case Unicode characters like yours (coded with more than 1 byte) must be converted to ANSI (only 1 byte per character), if this is possible. The conversion from Unicode to ANSI is done with one of these code pages: the one currently selected at Advanced - Set Code Page/Locale; the one declared in the document, which UltraEdit detects and applies automatically (unless the configuration setting for automatic code page detection is disabled); or the one applied manually to the current document at View - Set Code Page. HTML and XML documents should always contain the code page/character set declaration. For example, in the <head> section of an HTML document

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">

is the right one for most Western European and North American languages.

I use the German page SELFHTML - Zeichenkodierungen, which contains images of some character tables, to answer questions like yours. On Wikipedia you can also find many articles about code pages and character sets. I can see your character in the code table of ISO-8859-2 (Latin-2), which is the code table for most Central European and Slavic languages. It has the decimal code 165. Place the cursor to the left of your special character and click on Search - Character Properties. If it shows you the decimal code 165 for this character, I'm right.
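
The decimal code can also be cross-checked outside UltraEdit. The following is only a minimal sketch, assuming Python 3 is at hand (Python is not part of the original discussion); its standard `iso8859_2` codec stands in for the code table image:

```python
# Look up the byte value that ISO-8859-2 (Latin-2) assigns to the letter Ľ.
letter = "\u013d"  # Ľ, LATIN CAPITAL LETTER L WITH CARON

encoded = letter.encode("iso8859_2")  # one byte per character in this code page
code = encoded[0]

print(code, hex(code))  # prints: 165 0xa5
```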

Select ISO-8859-2 (Latin-2) at Advanced - Set Code Page/Locale or View - Set Code Page. In addition, at View - Set Font, View - Set HEX/Column Mode Font and View - Set Printer Font, you have to choose a font which supports the code page you need and has the correct glyph for your character. You must also specify the correct Script in the font settings dialog. If I'm right, the script you need is Central European.

The format of the current file is displayed in the status bar at the bottom of the UltraEdit window. See my readme topic, where I have explained the abbreviations for the file format and the line terminations. They are shown in the status bar field to the right of the field with the line/column/clipboard numbers.

Re: Wrong letter displayed for Slavic character

Post by Nufacik » Mon Apr 28, 2008 6:00 am

Hi Mofi.

I still have this problem. I have already set code page 1250 - ANSI Central European, and also ISO-8859-2 Central Europe, but the letter Ľ is always displayed as ź. I restarted UE many times and nothing changed. If I place the cursor to the left of this letter and use Search -> Character Properties, I see this in the dialog:
Decimal value: 188
Hexadecimal value: 0xbc
Display as: Ľ (this will vary based on font and script)

Offset of Character (Decimal): 19512
Offset of Character (HEX): 0x4c38.

I'm using the font Courier New, style: Regular, size: 10, script: Central European.

UE still displays Ľ as ź.

Nuf

Re: Wrong letter displayed for Slavic character

Post by Mofi » Mon Apr 28, 2008 10:06 am

I don't understand why this character has the decimal code 188 instead of 165. ź is the correct glyph for the character with code 188 in ISO-8859-2 (Latin-2).
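
One possible explanation, offered as an assumption rather than something confirmed in this thread: the file may have been saved with Windows code page 1250 instead of ISO-8859-2. The two code pages place Ľ at different byte values, which Python's standard codecs can show in a short sketch (Python is not mentioned in the original posts):

```python
# Decode the byte reported by Character Properties with both code pages.
raw = bytes([188])  # decimal 188 = 0xBC

as_cp1250 = raw.decode("cp1250")     # Windows "ANSI Central European"
as_latin2 = raw.decode("iso8859_2")  # ISO-8859-2 (Latin-2)

print(as_cp1250, as_latin2)  # prints: Ľ ź
```

If that assumption holds, the byte in the file is fine and only the code page used to display it differs from the one used to save it.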

I suggest doing what other programs like Word and Notepad also do: use Unicode. Convert your file into any Unicode format and you will no longer have any problem with byte <-> glyph conversion.
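
Outside the editor, such a conversion could look roughly like the sketch below. It assumes the existing bytes are Windows-1250 (a guess, not something the thread confirms) and picks UTF-8 as the Unicode format; `to_utf8` is a hypothetical helper name, not an UltraEdit function.

```python
def to_utf8(raw: bytes, source_encoding: str = "cp1250") -> bytes:
    """Re-encode single-byte text as UTF-8.

    source_encoding is an assumption about how the file was saved;
    a wrong guess silently produces the wrong letters.
    """
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


# Byte 0xBC is Ľ in Windows-1250; in UTF-8 it becomes the two bytes C4 BD.
sample = b"SELECT '\xbc';"
converted = to_utf8(sample)
print(converted.decode("utf-8"))  # prints: SELECT 'Ľ';
```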

I selected the same font and script as you and created an ANSI file with the characters 160 to 169. Comparing how these characters look with the image for ISO-8859-2 (Latin-2), I can see that most are displayed identically, but not all. It looks like Microsoft has partly ignored the international standard ISO-8859-2, or the script Central European is not the same as ISO-8859-2. Maybe you additionally have to set the correct country and language in the Regional and Language Settings of Windows. I'm not really an expert in non-Western European languages because I can't read anything other than German and English, so I have not played with all those font/code page/country and language settings of Windows.
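
The same comparison can be sketched without fonts. This assumes the Central European script corresponds to Windows code page 1250, which is not confirmed in this thread; `differing_bytes` is a hypothetical helper written only for this illustration.

```python
def differing_bytes(start, stop, first="cp1250", second="iso8859_2"):
    """Return (value, char_in_first, char_in_second) for bytes that differ."""
    diffs = []
    for value in range(start, stop + 1):
        raw = bytes([value])
        a = raw.decode(first)
        b = raw.decode(second)
        if a != b:
            diffs.append((value, a, b))
    return diffs


# Compare the range 160 to 169 used in the test file described above.
diffs = differing_bytes(160, 169)
for value, a, b in diffs:
    print(value, a, b)
```

Under that assumption, four of the ten positions (161, 165, 166 and 169) hold different characters in the two tables, and the other six match.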

By the way: if I start Notepad, paste Ľ and ź into a new file, click on Save As, choose the encoding ANSI and click on the Save button, I get the message:

This file contains characters in Unicode format which will be lost if you save this file as an ANSI encoded text file. To keep the Unicode information, click Cancel below and then select one of the Unicode options from the Encoding drop down list. Continue?

And if I continue, the 2 special characters are saved as normal L and z in the ANSI file. Maybe a Unicode file is required for Ľ, I don't know.
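
A rough way to see why this happens, assuming the ANSI code page of that Windows installation was Windows-1252 (Western European), which the post does not state: neither letter exists in that code page, while both exist in Windows-1250. The sketch below uses Python, and its accent stripping only approximates what Windows does when it falls back to a similar-looking letter. `fits` and `strip_marks` are hypothetical helper names.

```python
import unicodedata

text = "\u013d\u017a"  # Ľ and ź


def fits(text: str, encoding: str) -> bool:
    """Report whether every character can be stored in the given code page."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def strip_marks(text: str) -> str:
    """Drop combining accents after decomposition (approximation only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fits(text, "cp1252"))  # Western European ANSI
print(fits(text, "cp1250"))  # Central European ANSI
print(strip_marks(text))     # base letters left after removing the accents
```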