delete text between two words


Post by EllE » Sat Jul 15, 2006 7:57 pm

I need to remove the text between [lang_en] and [/lang_en], including the tags.

Example:
Code:
blabla [lang_en]Genre: Sci-Fi / Action / Adventure[/lang_en] blabla
[lang_en]Country:[/lang_en] USA / Canada


I am testing this macro:
Code:
InsertMode
ColumnModeOff
HexOff
UnixReOff
Top
Loop
Find RegExp "lang_en[~,]+/lang_en"
IfFound
Delete
Else
ExitLoop
EndIf
EndLoop


but it produces this result :(
Code:
blabla [] USA / Canada
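
The result above can be reproduced outside UltraEdit. The following is only a sketch in Python's re module, not UltraEdit syntax, and it assumes that the UltraEdit expression [~,]+ behaves like Python's greedy [^,]+. Because the sample text contains no comma, the negated class also matches ], [ and the line break, so one match runs from the first lang_en to the last /lang_en.

```python
import re

# Sample text from the post above.
text = (
    "blabla [lang_en]Genre: Sci-Fi / Action / Adventure[/lang_en] blabla\n"
    "[lang_en]Country:[/lang_en] USA / Canada"
)

# Assumed Python analogue of "lang_en[~,]+/lang_en":
# [^,] matches any character except a comma (brackets and newlines included),
# and + is greedy, so it extends to the LAST "/lang_en" in the text.
greedy = re.compile(r"lang_en[^,]+/lang_en")

result = greedy.sub("", text)
print(result)
```

The unescaped pattern also leaves the surrounding [ and ] in place, which is why an empty [] remains.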

Re: delete text between two words

Post by EllE » Sat Jul 15, 2006 8:28 pm

I found something here: viewtopic.php?t=2716

and adapted it into this:
Code:
InsertMode
ColumnModeOff
HexOff
Top
Loop
Find "[lang_en]"
IfNotFound
ExitLoop
EndIf
EndIf
StartSelect
Find Select "[/lang_en]"
IfSel
EndSelect
Delete
Else
EndSelect
ExitLoop
EndIf
EndLoop
Find "[lang_sk]"
Replace All ""
Find "[/lang_sk]"
Replace All ""


It works great... but it is too complicated.

Re: delete text between two words

Post by Mofi » Sun Jul 16, 2006 3:05 pm

Fine, you are now searching, thinking and trying before asking, so I will help you a little.

First, your macro contains one EndIf too many.

Second, you can speed up your macro by first deleting every [lang_en]...[/lang_en] string that fits on a single line with one regular expression Replace All (in UltraEdit style).

Find RegExp "^[lang_en^]*^[/lang_en^]"
Replace All ""

This regex will delete occurrences like:

[lang_en]normal single line string[/lang_en]
[lang_en]single line string with [ inside[/lang_en]
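
For readers more used to Perl-style syntax, here is a rough Python sketch of the same single-line replace. It is an analogue, not UltraEdit syntax: the names single_line, sample and cleaned are made up for the illustration, and the lazy .*? is an assumption chosen so that two tag pairs on the same line are removed separately.

```python
import re

# Rough analogue of the UltraEdit expression "^[lang_en^]*^[/lang_en^]".
# "." does not match a newline by default, so a match never leaves its line.
single_line = re.compile(r"\[lang_en\].*?\[/lang_en\]")

sample = (
    "[lang_en]normal single line string[/lang_en]\n"
    "keep [lang_en]single line string with [ inside[/lang_en] this\n"
    "[lang_en]multi-line\nstring[/lang_en]"
)

# The first two blocks are deleted; the block spanning two lines is not.
cleaned = single_line.sub("", sample)
print(cleaned)
```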


You can also remove strings between [lang_en]...[/lang_en] that span multiple lines with a single regular expression, but only if there is no [ character within the string.

Find RegExp "^[lang_en^][~^[]+^[/lang_en^]"
Replace All ""

This regex will delete occurrences like:

[lang_en]line 1 of a multi-line string
line 2 of a multi-line string
line 3 of a multi-line string
[/lang_en]
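
The same idea as a Python sketch (again only an analogue of the UltraEdit expression, with made-up variable names): a negated character class matches line breaks too, so the match can span lines, but it stops at the first [ it meets.

```python
import re

# Rough analogue of "^[lang_en^][~^[]+^[/lang_en^]":
# [^\[]+ means one or more characters that are not "[", newlines included.
multi_line = re.compile(r"\[lang_en\][^\[]+\[/lang_en\]")

block = "[lang_en]line 1\nline 2\nline 3\n[/lang_en]"
nested = "[lang_en]line 1\nwith a [tag]...[/tag] block\n[/lang_en]"

removed = multi_line.sub("", "before " + block + " after")
untouched = multi_line.sub("", nested)  # the inner "[" prevents a match
print(removed)
print(untouched == nested)
```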


But to make sure that really every string between [lang_en]...[/lang_en] is deleted, the loop must still be run at the end to also delete strings like:

[lang_en]line 1 of a multi-line string
line 2 of a multi-line string with a [ character inside
because of a nested other [tag]...[/tag] block
[/lang_en]
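
The loop does a plain (non-regex) search for the start tag, selects up to the next end tag and deletes the selection. A minimal Python sketch of that logic follows; delete_blocks and its parameters are hypothetical names for this illustration, not UltraEdit commands.

```python
def delete_blocks(text: str, start: str = "[lang_en]", end: str = "[/lang_en]") -> str:
    """Delete every start...end block, tags included, using plain string searches."""
    pos = 0
    while True:
        begin = text.find(start, pos)        # like: Find "[lang_en]"
        if begin == -1:
            break                            # like: IfNotFound / ExitLoop
        finish = text.find(end, begin + len(start))  # like: Find Select "[/lang_en]"
        if finish == -1:
            break                            # no closing tag: leave the rest alone
        text = text[:begin] + text[finish + len(end):]  # like: Delete
        pos = begin                          # continue from the deletion point
    return text


nested = "a [lang_en]line 1\nwith a [tag]...[/tag] block\n[/lang_en] b"
print(delete_blocks(nested))
```

Because no character is excluded between the two tags, this also handles blocks containing [ characters, at the cost of one search pair per block.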


The whole macro, with the macro property "Continue if a Find with Replace not found" enabled, looks as follows:

InsertMode
ColumnModeOff
HexOff
UnixReOff
Top
Find RegExp "^[lang_en^]*^[/lang_en^]"
Replace All ""
Find RegExp "^[lang_en^][~^[]+^[/lang_en^]"
Replace All ""
Loop
Find "[lang_en]"
IfNotFound
ExitLoop
EndIf
StartSelect
Find Select "[/lang_en]"
IfFound
EndSelect
Delete
Else
EndSelect
ExitLoop
EndIf
EndLoop
Top

Add UnixReOn or PerlReOn (UE v12+) at the end of the macro if you do not use UltraEdit style regular expressions by default (see the search configuration). The macro command UnixReOff sets the regular expression option to UltraEdit style.

I also described the Find Select method yesterday in the topic "how to select two predefined text in different line".



Special note:
I once wrote that IfFound is not possible after a Find Select "" command. This is no longer true; IDM seems to have fixed this bug, though I don't know in which version. For this example it is important to use IfFound instead of IfSel, because the first search string [lang_en] is still selected even if the second search string is not found. If you have a version of UltraEdit where IfFound does not work after Find Select "", you can first unselect the first search string and move the cursor back to its start. For this example the loop with a possible workaround would look like this:

Loop
Find "[lang_en]"
IfNotFound
ExitLoop
EndIf
EndSelect
Key LEFT ARROW
Find Up "["
EndSelect
Key LEFT ARROW

StartSelect
Find Select "[/lang_en]"
IfSel
EndSelect
Delete
Else
EndSelect
ExitLoop
EndIf
EndLoop

A second solution for the UNSELECT AND MOVE BACK code is:

EndSelect
Key LEFT ARROW
Key Ctrl+LEFT ARROW
Key LEFT ARROW
Key Ctrl+LEFT ARROW
Key LEFT ARROW

Re: delete text between two words

Post by cwq2008119 » Fri Jul 21, 2006 6:53 am

Mofi,
hello!
I am a new learner.
I don't clearly understand what "^[lang_en^]*^[/lang_en^]" and "^[lang_en^][~^[]+^[/lang_en^]" mean.
Can you explain them in a little detail?

Re: delete text between two words

Post by cwq2008119 » Fri Jul 21, 2006 7:14 am

I tried what you suggested
(Find RegExp "^[lang_en^]*^[/lang_en^]"
Replace All "")
but [lang_en]and[/lang_en] is replaced with ""

Re: delete text between two words

Post by Mofi » Fri Jul 21, 2006 8:11 am

cwq2008119 wrote: Can you explain them in a little detail?


You should read the articles about the Find command and about Regular Expressions in the UltraEdit help.

The UltraEdit style regular expression search string "^[lang_en^]*^[/lang_en^]" means the following:

Find within a single line a string which starts with the string [lang_en] and ends with the string [/lang_en]. The asterisk between these 2 strings means 0 or more occurrences of any character except a new line character (CR or LF). The characters [ and ] are regular expression characters, but in this regular expression they should be interpreted as normal characters, so they must be escaped with the ^ character.


The UltraEdit style regular expression search string "^[lang_en^][~^[]+^[/lang_en^]" means the following:

Find within the whole text a block which starts with the string [lang_en] and ends with the string [/lang_en]. The [~^[]+ expression between the two strings means 1 or more occurrences of any character from the set specified within the brackets. The expression within [] means: all characters except the character [, because it is the first character of the end string. This set also includes the new line characters CR and LF, which is the reason why the expression finds a block and not only a string within a line.


cwq2008119 wrote:[lang_en]and[/lang_en] is replaced with ""


That's exactly what the replace should do, delete the whole string that starts with [lang_en] and ends with [/lang_en]. If you want to delete only the characters between the two strings, you only have to write Replace All "[lang_en][/lang_en]".
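
As a Python sketch of that variant (an analogue only, with illustrative names, and a lazy .*? as an assumption), the replacement string simply puts the two tags back:

```python
import re

# Same single-line pattern as before, written in Python syntax.
pattern = re.compile(r"\[lang_en\].*?\[/lang_en\]")

text = "[lang_en]Country:[/lang_en] USA / Canada"

# Replace each whole block with just the two tags, so that only the
# characters between them are deleted.
emptied = pattern.sub("[lang_en][/lang_en]", text)
print(emptied)
```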

Re: delete text between two words

Post by cwq2008119 » Sat Jul 22, 2006 2:02 am

Thank you!
Something more about "^[lang_en^]*^[/lang_en^]" and
"^[lang_en^][~^[]+^[/lang_en^]":
In ^[lang_en^], does the first ^ go in front of the first character of
the start string, and the second ^ in front of the last character?

"^[lang_en^][~^[]+^[/lang_en^]".
In [~^[], why does the expression within [] exclude only [ (the first character of the start string)?

Can you recommend a programming language to learn?
How about Perl (Practical Extraction and Report Language)?

Re: delete text between two words

Post by Mofi » Sat Jul 22, 2006 11:51 am

You are driving me crazy and you have not read the help articles!

The ^ character is an escape character for UltraEdit style regular expression characters like the \ is for Unix/Perl style regex or C/C++.

^ in an UltraEdit style regular expression simply means that the next character in the string should be interpreted as normal character and not with its regular expression meaning. Sorry, but that should not be too difficult to understand.

[lang_en] in a regular expression search would search for a single character which is either l, a, n, g, _ or e (and their uppercase equivalents too if Match Case is not used). ^[lang_en^] in an UltraEdit style regular expression search string simply means search for the string [lang_en] and not for a single character.
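
The same difference can be seen in Perl-style syntax, where \ plays the role of ^. A small Python sketch (illustration only; the variable names are made up):

```python
import re

# Unescaped: a character class matching ONE character out of l, a, n, g, _, e.
unescaped = re.compile("[lang_en]")

# Escaped: re.escape() puts a backslash in front of the special characters,
# so the pattern matches the literal string "[lang_en]".
escaped = re.compile(re.escape("[lang_en]"))

as_class = unescaped.findall("[lang_en]")
as_literal = escaped.findall("[lang_en]")
print(as_class)
print(as_literal)
```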

Once again. Read the help and play a little to understand. Learning by doing is the best method to learn something. I did not have a teacher or a book. I just learned it by trial and error.

Re: delete text between two words

Post by mrainey56 » Sat Jul 22, 2006 4:32 pm

You are driving me crazy and you have not read the help articles!
:D

Re: delete text between two words

Post by cwq2008119 » Sun Jul 23, 2006 2:15 am

Mofi,
thank you!
I see.
Find RegExp ^[abc^] searches for [abc],
and Find RegExp ^abc^ searches for abc.