Extract some lines to new file to build new, sorted list with placeholders for missing lines?

Postby mrducnt » Thu Apr 07, 2011 8:48 am

I'm working with a list of about 5 million lines in this format.
Sample:
----------
1.Ticker: AAA
2.Company name: abcde
3.Current Price: 12345
4.EPS: 23456
5.P/E: 45678
6.Adjusted P/E: 23124
7.Book Value: 56789
8.ROA: 67890
9.ROE: 78901
10.RSI: 89.12
11.MA50: 80.890
12.MA100: 89.879
----------
1.Ticker: CCC
2.Company name: adsad bcde corp.
3.Current Price: 1234563
4.EPS:
5.P/E: 458
6.Adjusted P/E: 89852
8.ROA:
9.ROE: 896785
11.MA50: 87764
12.MA100: 12386
----------
....so on

The information for each ticker is listed between two "----------" lines. Now I want to extract some of these lines and copy them to a new document while keeping the format. If any information is missing, N/A should be returned instead.

Say I need to extract Ticker, Company name, Current Price, Book Value, RSI, then the final result would look like this:
Result
----------
1.Ticker: AAA
2.Company name: abcde
3.Current Price: 12345
7.Book Value: 56789
10.RSI: 89.12
----------
1.Ticker: CCC
2.Company name: adsad bcde corp.
3.Current Price: 1234563
N/A
N/A
----------
....so on

What is the best way to handle this case? Thanks in advance.

Re: Extract some lines to new file to build new, sorted list with placeholders for missing lines?

Postby Mofi » Thu Apr 07, 2011 10:10 am

If you need this only once, run a Perl regular expression Find searching for

Ticker|Company name|Current Price|Book Value|RSI|----------

with the advanced find option List Lines Containing String enabled before pressing the Next button. Then press the Clipboard button, close the dialog, open a new file and paste the copied lines. Press Ctrl+Home to move the caret to the top of the file.

Now only the lines with N/A need to be added. That can be done with a few additional Perl regular expression Replace All commands.

Search for ^(--.*\r\n)([^1]) and use \11.N/A\r\n\2 as replace string.
Search for ^(1\..*\r\n)([^2]) and use \12.N/A\r\n\2 as replace string.
Search for ^(2\..*\r\n)([^3]) and use \13.N/A\r\n\2 as replace string.
Search for ^(3\..*\r\n)([^7]) and use \17.N/A\r\n\2 as replace string.
Search for ^(7\..*\r\n)([^1]) and use \1N/A\r\n\2 as replace string.

And finally search for ^[1237]\.N/A and use N/A as replace string.
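
For anyone who prefers a script over repeated Replace All runs, the same extraction can be sketched in Python. This is only an illustration and not part of the steps above: the names extract_fields, SEPARATOR and WANTED are made up, and the sketch assumes that every record starts with a line beginning with "----------" and that every field line starts with its number followed by a dot.

```python
SEPARATOR = "----------"
WANTED = [1, 2, 3, 7, 10]  # Ticker, Company name, Current Price, Book Value, RSI


def extract_fields(lines, wanted=WANTED):
    """Yield the wanted field lines of each record, or 'N/A' for a missing one."""
    record = None  # field number -> line; None until the first separator is seen
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(SEPARATOR):
            if record is not None:
                # A separator closes the previous record: emit it in wanted order.
                for number in wanted:
                    yield record.get(number, "N/A")
            yield line
            record = {}
        elif record is not None:
            number, dot, _ = line.partition(".")
            if dot and number.isdigit():
                record[int(number)] = line
    if record:
        # The input ended without a closing separator line.
        for number in wanted:
            yield record.get(number, "N/A")


sample = """----------
1.Ticker: CCC
2.Company name: adsad bcde corp.
3.Current Price: 1234563
4.EPS:
8.ROA:
11.MA50: 87764
----------""".splitlines()

print("\n".join(extract_fields(sample)))
```

Because the function reads its input line by line, it can be given an open file object instead of a list, which matters for a file with millions of lines. It also does not depend on the order of the lines within a record.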

Re: Extract some lines to new file to build new, sorted list with placeholders for missing lines?

Postby mrducnt » Fri Apr 08, 2011 5:36 am

Thanks Mofi! But I still have a question.
What if there are about 20 lines between two "----------" lines and they are not in the same order, like this:

----------
1.Ticker: AAA
3.Current Price: 12345
11.MA50: 80.890
5.P/E: 45678
6.Adjusted P/E: 23124
7.Book Value: 56789
8.ROA: 67890
16.MACD: 89
4.EPS: 23456
2.Company name: abcde
10.RSI: 89.12
12.MA100: 89.879
14.UpBB: 768
9.ROE: 78901
15.BoBB: 657
13.ADX: 23
----------
1.Ticker: CCC
2.Company name: adsad bcde corp.
3.Current Price: 1234563
4.EPS: 3434
5.P/E: 458
6.Adjusted P/E: 89852
8.ROA: 42342
9.ROE: 896785
11.MA50: 87764
12.MA100: 12386
----------

Say I want to extract lines 1 to 10. Is there any way to do these two things:
+ Sort these lines based on the number at the start of each line, and
+ Return N/A if any line is missing
Thanks in advance.

Re: Extract some lines to new file to build new, sorted list with placeholders for missing lines?

Postby Mofi » Fri Apr 08, 2011 11:02 am

This new requirement really calls for a macro or a script. A well-written script would definitely do the entire job faster, but writing macros is faster for a macro expert like me. The macro property Continue if search string not found must be checked for this macro. It does the entire job, including copying the lines starting with -- or 1. or 2. or 3. ... or 10.

InsertMode
ColumnModeOff
HexOff
PerlReOn
Bottom
IfColNumGt 1
InsertLine
EndIf
Top
Clipboard 9
ClearClipboard
Loop 0
Find RegExp "^(--|1\.|2\.|3\.|4\.|5\.|6\.|7\.|8\.|9\.|10\.).*\r\n"
IfNotFound
ExitLoop
EndIf
CopyAppend
EndLoop
NewFile
Paste
ClearClipboard
Clipboard 0
Key UP ARROW
Key END
IfColNum 1
ExitMacro
EndIf
Top
Loop 0
Find RegExp "^(?:\d+\..*\r\n)+"
IfNotFound
ExitLoop
EndIf
SortAsc Numeric 1 -1 0 0 0 0 0 0
Find "----------"
EndLoop
Bottom
Key BACKSPACE
Top
Find RegExp "^(--.*\r\n)(?!1\.)"
Replace All "\11.N/A\r\n"
Find RegExp "^(1\..*\r\n)(?!2\.)"
Replace All "\12.N/A\r\n"
Find RegExp "^(2\..*\r\n)(?!3\.)"
Replace All "\13.N/A\r\n"
Find RegExp "^(3\..*\r\n)(?!4\.)"
Replace All "\14.N/A\r\n"
Find RegExp "^(4\..*\r\n)(?!5\.)"
Replace All "\15.N/A\r\n"
Find RegExp "^(5\..*\r\n)(?!6\.)"
Replace All "\16.N/A\r\n"
Find RegExp "^(6\..*\r\n)(?!7\.)"
Replace All "\17.N/A\r\n"
Find RegExp "^(7\..*\r\n)(?!8\.)"
Replace All "\18.N/A\r\n"
Find RegExp "^(8\..*\r\n)(?!9\.)"
Replace All "\19.N/A\r\n"
Find RegExp "^(9\..*\r\n)(?!10\.)"
Replace All "\110.N/A\r\n"
Find RegExp "^\d+\.(?=N/A)"
Replace All ""
Bottom
InsertLine
Top
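
The chain of Replace All commands at the end of the macro may be easier to follow when written as a loop. The following Python sketch is only an illustration of that technique, not a replacement for the macro: fill_missing is a hypothetical helper, it uses \n instead of \r\n, and like the macro it assumes that the blocks are already sorted and that the last separator line has no line terminator (which is what Bottom and Key BACKSPACE take care of above).

```python
import re


def fill_missing(text, last=10):
    """Insert an 'N/A' line for every missing number 1..last in sorted blocks."""
    # A separator line that is not followed by "1." gets a temporary "1.N/A" line.
    text = re.sub(r"^(--.*\n)(?!1\.)", r"\g<1>1.N/A\n", text, flags=re.M)
    for n in range(1, last):
        # A line numbered n that is not followed by n+1 gets "<n+1>.N/A" after it.
        # Temporary lines inserted by an earlier pass are matched here as well,
        # so several missing numbers in a row are filled one after the other.
        pattern = rf"^({n}\..*\n)(?!{n + 1}\.)"
        text = re.sub(pattern, rf"\g<1>{n + 1}.N/A\n", text, flags=re.M)
    # Finally remove the temporary numbers in front of N/A.
    return re.sub(r"^\d+\.(?=N/A)", "", text, flags=re.M)


block = (
    "----------\n"
    "1.Ticker: CCC\n"
    "2.Company name: adsad bcde corp.\n"
    "3.Current Price: 1234563\n"
    "4.EPS: 3434\n"
    "5.P/E: 458\n"
    "6.Adjusted P/E: 89852\n"
    "8.ROA: 42342\n"
    "9.ROE: 896785\n"
    "----------"
)
print(fill_missing(block))
```

The temporary numbers are what make the chain work: each pass only has to look at the line directly after line n, and the negative lookahead (?!...) checks that line without consuming it.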

Re: Extract some lines to new file to build new, sorted list with placeholders for missing lines?

Postby Mofi » Fri Apr 08, 2011 12:03 pm

I just modified the second loop in the above macro to make the macro a little faster. Instead of using

Loop 0
Find "----------^p"
IfNotFound
ExitLoop
EndIf
Key HOME
StartSelect
Find Select "----------"
IfSel
Key HOME
SortAsc Numeric 1 -1 0 0 0 0 0 0
EndSelect
Else
EndSelect
ExitLoop
EndIf
EndLoop


the second loop is now

Loop 0
Find RegExp "^(?:\d+\..*\r\n)+"
IfNotFound
ExitLoop
EndIf
SortAsc Numeric 1 -1 0 0 0 0 0 0
Find "----------"
EndLoop
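
What this loop does can be sketched outside of UltraEdit as well: the regular expression selects one whole run of numbered lines, and the run is then sorted numerically. The Python below is only an assumed equivalent for illustration (sort_blocks is a made-up name, and \n is used instead of \r\n); it is not what the macro executes.

```python
import re


def sort_blocks(text):
    """Sort every run of consecutive 'number.text' lines by its leading number."""

    def sort_run(match):
        lines = match.group(0).splitlines(keepends=True)
        # Compare the numbers as integers so that 10 sorts after 9, not after 1.
        lines.sort(key=lambda line: int(line.split(".", 1)[0]))
        return "".join(lines)

    # Same idea as the Find in the loop: one match covers a whole run of lines.
    return re.sub(r"^(?:\d+\..*\n)+", sort_run, text, flags=re.M)


unsorted = (
    "----------\n"
    "1.Ticker: AAA\n"
    "3.Current Price: 12345\n"
    "11.MA50: 80.890\n"
    "2.Company name: abcde\n"
    "10.RSI: 89.12\n"
    "----------\n"
)
print(sort_blocks(unsorted), end="")
```

The separator lines are never part of a match, so they stay where they are and each block is sorted on its own.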


Then I realized that collecting all lines of interest in the clipboard and copying them to a new file is horribly slow because of all the display updates. So why not do it the other way round: copy all lines to a new file and then delete the lines which are of no interest. That is also possible and avoids all the display updates. Here is the macro working with this method. The changed part, marked in blue in the original post, is the first block up to the blank line.

InsertMode
ColumnModeOff
HexOff
PerlReOn
Clipboard 9
SelectAll
Copy
NewFile
Paste
ClearClipboard
Clipboard 0
IfColNumGt 1
InsertLine
EndIf
Top
Find RegExp "^(?!(--|1\.|2\.|3\.|4\.|5\.|6\.|7\.|8\.|9\.|10\.)).*\r\n"
Replace All ""
Key END
IfColNum 1
ExitMacro
EndIf

Top
Loop 0
Find RegExp "^(?:\d+\..*\r\n)+"
IfNotFound
ExitLoop
EndIf
SortAsc Numeric 1 -1 0 0 0 0 0 0
Find "----------"
EndLoop
Bottom
Key BACKSPACE
Top
Find RegExp "^(--.*\r\n)(?!1\.)"
Replace All "\11.N/A\r\n"
Find RegExp "^(1\..*\r\n)(?!2\.)"
Replace All "\12.N/A\r\n"
Find RegExp "^(2\..*\r\n)(?!3\.)"
Replace All "\13.N/A\r\n"
Find RegExp "^(3\..*\r\n)(?!4\.)"
Replace All "\14.N/A\r\n"
Find RegExp "^(4\..*\r\n)(?!5\.)"
Replace All "\15.N/A\r\n"
Find RegExp "^(5\..*\r\n)(?!6\.)"
Replace All "\16.N/A\r\n"
Find RegExp "^(6\..*\r\n)(?!7\.)"
Replace All "\17.N/A\r\n"
Find RegExp "^(7\..*\r\n)(?!8\.)"
Replace All "\18.N/A\r\n"
Find RegExp "^(8\..*\r\n)(?!9\.)"
Replace All "\19.N/A\r\n"
Find RegExp "^(9\..*\r\n)(?!10\.)"
Replace All "\110.N/A\r\n"
Find RegExp "^\d+\.(?=N/A)"
Replace All ""
Bottom
InsertLine
Top
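
The key step of this version is the Replace All with the negative lookahead, which deletes every line that does not start with -- or with one of the numbers 1. to 10. As a rough illustration only, the same filter could look like this in Python; drop_other_lines and KEEP are invented names, the alternation is shortened to (?:[1-9]|10), and \n is used instead of \r\n.

```python
import re

# Prefixes of the lines to keep: separator lines and the fields 1. to 10.
KEEP = r"--|(?:[1-9]|10)\."


def drop_other_lines(text):
    """Delete every line that does not start with one of the KEEP prefixes."""
    # (?!...) only checks the start of the line; .*\n then removes the whole line.
    return re.sub(rf"^(?!{KEEP}).*\n", "", text, flags=re.M)


data = (
    "----------\n"
    "1.Ticker: AAA\n"
    "11.MA50: 80.890\n"
    "10.RSI: 89.12\n"
    "16.MACD: 89\n"
    "----------\n"
)
print(drop_other_lines(data), end="")
```

Note that 11. and 16. are removed although they start with the digit 1, because the dot has to follow the number directly.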

Re: Extract some lines to new file to build new, sorted list with placeholders for missing lines?

Postby mrducnt » Fri Apr 08, 2011 1:17 pm

Mofi, that macro solved all my problems. That's great. Many thanks!