Remove all spaces within an xml file except for within tags

Post by lostinUE » Mon Jul 16, 2007 7:34 pm

Hi everyone,

I hope you can help. I have been through most of the postings and couldn't find anything that helps with this problem. I'm not sure whether it can be done, so any help would be greatly appreciated.

I need to clean up a large number of XML files and combine them all into one big file. I can create the macro that goes through all the files, but I need some help with the regexp.

I want to eliminate all whitespace characters (tabs, spaces and carriage returns) between tags, but keep any spaces in the text between a start tag and its end tag.
For example:
<starttag>this is a test</starttag>
<start2>helpappreciated</start2>

Should become:
<starttag>this is a test</starttag><start2>helpappreciated</start2>

Thanks for your help.
I'm using UE 12.00+3 and UltraEdit-style regular expressions.

Post by jorrasdk » Mon Jul 16, 2007 8:39 pm

Since version 12.10, UE comes with the XML parser XML Lint integrated. Of course, that does not help you directly.

But as a version 12.00+3 user you could download XML Lint yourself and install it as an external tool.

See this post for further details: XML Lint - how to install

I ran xmllint with the option --noblanks ("drop ignorable blank spaces") and it did exactly what you want: it dropped all tabs, spaces and carriage returns between tags and between comments and tags. "Inline" spaces survive.
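
For readers without xmllint at hand, the effect described above can be approximated with a few lines of Python. This is only a sketch under assumptions, not how xmllint itself works: the function name strip_ignorable_whitespace is made up for illustration, the input must be well-formed XML with a single root element (the two-line sample from the question would need a wrapper element), and the standard-library parser used here discards comments. It also empties elements that contain nothing but whitespace.

```python
import xml.etree.ElementTree as ET


def strip_ignorable_whitespace(xml_text: str) -> str:
    """Drop whitespace-only text between tags, keep real text untouched.

    Hypothetical helper for illustration; it only approximates the
    result of a whitespace-dropping XML formatter.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # elem.text is the text right after the element's start tag.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        # elem.tail is the text right after the element's end tag.
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<root>\n"
        "\t<starttag>this is a test</starttag>\r\n"
        "  <start2>helpappreciated</start2>\n"
        "</root>"
    )
    print(strip_ignorable_whitespace(sample))
```

Because the text is parsed rather than pattern-matched, spaces inside "this is a test" cannot be touched by accident, at the cost of requiring well-formed input.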

Post by jorrasdk » Mon Jul 16, 2007 9:05 pm

Re-edited between 00:00 and 00:15 CET.

Back again. Of course, with UE-style regex a simpler approach could be taken:

Search for:
>[^t^p ]++<

Replace with:
><

(This has only very minor disadvantages. For example, sloppy empty tags that contain only blanks, such as <tag1> </tag1>, are collapsed to <tag1></tag1>.)
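
For anyone who wants to try the same replacement outside UltraEdit, here is a rough Python equivalent. It is a sketch that assumes the UE pattern is read as "a >, then any run of tabs, line breaks and spaces, then a <"; the name collapse_between_tags is invented for this example. Like the UE search, it knows nothing about XML structure, so it would also collapse whitespace between a > and a < that happen to sit inside a comment or CDATA section, and it leaves whitespace before the first tag and after the last tag alone.

```python
import re

# ">" followed by any run of tabs, CR/LF characters and spaces, then "<".
_BETWEEN_TAGS = re.compile(r">[\t\r\n ]*<")


def collapse_between_tags(text: str) -> str:
    """Remove whitespace that sits directly between a '>' and a '<'.

    Hypothetical helper for illustration; text such as 'this is a test'
    is left alone because it is not made up of whitespace only.
    """
    return _BETWEEN_TAGS.sub("><", text)


if __name__ == "__main__":
    sample = (
        "<starttag>this is a test</starttag>\r\n"
        "<start2>helpappreciated</start2>"
    )
    print(collapse_between_tags(sample))
    # The side effect mentioned above: a blank-only element is emptied.
    print(collapse_between_tags("<tag1> </tag1>"))
```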

