Problems with special characters like ü, ö

Post by cocoonclubber » Sat Aug 06, 2005 4:05 pm

Hi all,

I'm trying to do a search/replace on special characters.
I have some HTML files in which I want to replace "ü" with "&uuml;", but UltraEdit doesn't find any "ü" in the files.

As a test I tried to replace "ü" in JPEG files, and there it found some without a problem, but not in the HTML files.

Where could the problem be?
cocoonclubber
Newbie
 
Posts: 6
Joined: Fri Aug 05, 2005 11:00 pm

Post by Mofi » Sun Aug 07, 2005 11:18 am

I guess the German umlauts are already coded as HTML entities in the HTML source text. Look for &ouml; (= ö), &uuml; (= ü), &Ouml; (= Ö), ...

If the umlauts are part of a URI, they are coded differently. For example, Ü = %C3%9C and ü = %C3%BC. You should never use German umlauts or ß in file names - NEVER!!!

Also check whether the HTML files are in Unicode (see the status bar at the bottom of the UltraEdit window), or whether the umlauts are coded as numeric character references in decimal or hex form (for example &#252; or &#xFC;), which is also possible.
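
To make the two notations above concrete, here is a minimal sketch in Python (my choice for illustration, it has nothing to do with UltraEdit itself). It prints the HTML entity and the percent-encoded URI form of each special German character, using only the standard library. The helper name `describe` is made up for this example.

```python
from html.entities import codepoint2name
from urllib.parse import quote


def describe(ch: str) -> tuple[str, str]:
    """Return (HTML entity, percent-encoded UTF-8 form) for one character."""
    entity = "&" + codepoint2name[ord(ch)] + ";"
    # quote() percent-encodes the UTF-8 bytes of the character by default.
    return entity, quote(ch)


for ch in "äÄöÖüÜß":
    entity, uri = describe(ch)
    print(ch, entity, uri)
```

Note that the percent-encoded form depends on the encoding used for the URI; the values in the post correspond to UTF-8.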
Mofi
Grand Master
 
Posts: 4042
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Post by cocoonclubber » Sun Aug 07, 2005 12:53 pm

Hi to Austria,

thanks for the reply.
The picture pages are generated with Arles Image Web Page Creator; the other pages are designed with FrontPage. When I type e.g. a "ü" in FrontPage, it is coded correctly as "&uuml;". Arles, however, writes the literal "ü" into the code, and that's the problem. I didn't find any option in Arles that makes it code this in Unicode.

So I have to change all the umlauts, and now I'm trying to do that with UltraEdit, but with no success.

I never had such problems before. I now have a vServer at server4you.de, and this problem appeared with the pages there. With all the other web hosting services I never had such problems.

Post by cocoonclubber » Sun Aug 07, 2005 12:58 pm

A part of the code, e.g.:

<div align="center">
<a href="../index2.html"><img src="../images/cimg0319.jpg" alt="<- zurück" title="<- zurück" width="800" height="600" border="0"></a>
</div>

Here Arles Web Page Creator has written "zurück", but UltraEdit doesn't find this "ü".

Greetings from Germany

Post by Mofi » Sun Aug 07, 2005 1:34 pm

Which hex code does the "ü" character in "zurück" have?

Set the cursor on the "ü" and switch to hex mode to see its hex code.

If it is an ANSI "ü", it should have the hex code FC, or 00 FC if it is Unicode. UltraEdit should find this without any problem. An OEM "ü" (= old DOS) has the hex code 81.
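
The same check can be done outside the hex view. The following Python sketch looks for the byte patterns of "ü" in raw data and reports which storage form it found. It is only a rough heuristic under my own assumptions: the function name `guess_u_umlaut_encoding` is hypothetical, UTF-8 (C3 BC) and little-endian UTF-16 (FC 00) are included in addition to the codes named above, and single bytes such as FC or 81 can of course also occur with another meaning.

```python
def guess_u_umlaut_encoding(data: bytes) -> str:
    """Guess how "ü" is stored in raw bytes; a heuristic, not a real detector."""
    # Check the multi-byte patterns first, so that a UTF-8 or UTF-16 "ü"
    # is not mistaken for a single-byte encoding.
    if b"\xc3\xbc" in data:
        return "UTF-8 (C3 BC)"
    if b"\x00\xfc" in data or b"\xfc\x00" in data:
        return "UTF-16 (00 FC big-endian or FC 00 little-endian)"
    if b"\xfc" in data:
        return "ANSI / Windows-1252 (FC)"
    if b"\x81" in data:
        return "OEM / DOS code page (81)"
    return "no raw ü found (maybe an entity such as &uuml;)"


print(guess_u_umlaut_encoding("zurück".encode("utf-8")))
print(guess_u_umlaut_encoding("zurück".encode("cp1252")))
print(guess_u_umlaut_encoding("zurück".encode("cp850")))
```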

However, to replace all these umlauts from Arles Web Page Creator, do the following:

Select one "ü" that is not found.

Open Replace (if all HTML files are open in UE) or Replace In Files.

The selected string, ü, is automatically entered in the find field, independent of its hex code.

Now specify the replace string, set the other options correctly and run the replace.

Repeat this procedure for all other umlauts.

This should work, hopefully.

Post by cocoonclubber » Sun Aug 07, 2005 2:03 pm

And here's my problem:

Search/replace for "ü" doesn't give me any results.

I've got about 8000 HTML files, so I'm not able to open them all in UltraEdit.

Post by Mofi » Mon Aug 08, 2005 11:58 am

Is one of these files anywhere on the WWW, or can you zip one and upload it somewhere? I have done replaces for umlauts in HTML very often, the last time 2 weeks ago, and it always worked (standard DOS or UNIX HTML files, no Unicode, no UTF-8). So it must be a special file format problem, and I can only help further if I can look into an unmodified original source file.

Post by cocoonclubber » Mon Aug 08, 2005 12:04 pm

They're normal HTML files, but temporarily not online. I will upload them later when I'm back at home.

I have now replaced the umlauts with Dreamweaver. I didn't know that it has a good search/replace tool built in. Now all the pages are all right.

Post by cocoonclubber » Mon Aug 08, 2005 3:52 pm

Hi, it's me again.

I uploaded a part of the site; you can find it at:
http://www.danijel-brncic.de/test/

There's an index.html, and in the folder
http://www.danijel-brncic.de/test/pages/
you can find image1.html to image30.html.

In the pages you'll find e.g. the word "zurück".

Thanks, Danijel

Post by Mofi » Tue Aug 09, 2005 3:05 pm

OK. After looking into the code I could see the problem, and it was as expected. Your files are coded in UTF-8 format with Unix line endings, which is also correctly specified in your files with
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

So when you open your HTML files and auto-detection of UTF-8 is enabled in the general configuration dialog, you will see the "ü" in "zurück" correctly in UltraEdit, although it is coded in the file as a two-byte character with hex code C3 BC.

But to see that it is coded with these hex codes, you first have to disable the auto-detect UTF-8 feature, then open the HTML file, go to "zurück" and switch to hex mode.

Because "Replace In Files" does not detect the file format, it will never find a "ü" entered in the find field (which has hex code FC). To do the replace with UltraEdit, you would have to search for "Ã¼" (the bytes C3 BC displayed as ANSI) instead of "ü".
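
For anyone who wants to do such a replace without an editor, here is a minimal Python sketch of a byte-level replace. It works on raw bytes in the same way as a replace without file format detection, so it looks for the UTF-8 byte sequences (C3 BC for "ü" and so on) and writes HTML entities instead. The function names, the entity mapping and the `*.html` pattern are my own assumptions for illustration; try it on a copy of the files first.

```python
from pathlib import Path

# UTF-8 byte sequences of the special German characters -> HTML entities.
UTF8_TO_ENTITY = {
    ch.encode("utf-8"): entity.encode("ascii")
    for ch, entity in {
        "ä": "&auml;", "Ä": "&Auml;", "ö": "&ouml;", "Ö": "&Ouml;",
        "ü": "&uuml;", "Ü": "&Uuml;", "ß": "&szlig;",
    }.items()
}


def replace_umlaut_bytes(data: bytes) -> bytes:
    """Replace UTF-8 encoded umlauts in raw bytes by HTML entities."""
    for raw, entity in UTF8_TO_ENTITY.items():
        data = data.replace(raw, entity)
    return data


def replace_in_files(root: str) -> int:
    """Rewrite all *.html files below root; return the number of changed files."""
    changed = 0
    for path in Path(root).rglob("*.html"):
        old = path.read_bytes()  # bytes, so line endings stay untouched
        new = replace_umlaut_bytes(old)
        if new != old:
            path.write_bytes(new)
            changed += 1
    return changed


# The UTF-8 bytes of "ü" (C3 BC), viewed as ANSI, show up as two characters.
print("ü".encode("utf-8").decode("cp1252"))  # prints Ã¼
```

A call such as `replace_in_files("pages")` would then process a whole folder, where "pages" is just a placeholder for the real directory.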

However, you have now done the replaces with Dreamweaver, so this is just for your information, and maybe for other users with the same problem in the future.

For all users, here is a small UTF-8 / ANSI / OEM conversion table for the special German characters. The characters in the OEM row are the OEM bytes as they are displayed in ANSI; byte 81 has no printable ANSI character, so its cell is empty.

ANSI:  ä | Ä | ö | Ö | ü | Ü | ß --- hex: E4 | C4 | F6 | D6 | FC | DC | DF
OEM:   „ | Ž | ” | ™ |   | š | á --- hex: 84 | 8E | 94 | 99 | 81 | 9A | E1
UTF-8: ä | Ä | ö | Ö | ü | Ü | ß --- hex: C3 A4 | C3 84 | C3 B6 | C3 96 | C3 BC | C3 9C | C3 9F
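
The hex values in the table can be reproduced with a few lines of Python, which may be handy for other characters as well. This is only a sketch under the assumption that "ANSI" means Windows-1252 (`cp1252`) and "OEM" means the Western European DOS code page 850 (`cp850`); the helper name `hex_row` is made up.

```python
CHARS = "äÄöÖüÜß"
# Assumed code pages: Windows-1252 for ANSI, DOS code page 850 for OEM.
ENCODINGS = {"ANSI": "cp1252", "OEM": "cp850", "UTF-8": "utf-8"}


def hex_row(encoding: str) -> list[str]:
    """Return the hex bytes of each special character in the given encoding."""
    return [ch.encode(encoding).hex(" ").upper() for ch in CHARS]


for label, encoding in ENCODINGS.items():
    print(f"{label:6}", " | ".join(hex_row(encoding)))
```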