Parse and process tab-delimited text file

Post by Kulstad » Thu Feb 14, 2013 11:48 am

Disclaimer
Please accept my apologies if this is a question that has been asked before. I've checked the forums, but could not find a solution or thread related to my issue.

I have a text file in the following format:
    table_name_01 [tab] column_name_01
    table_name_01 [tab] column_name_02
    table_name_01 [tab] column_name_03
    ...
    ...
    ...
    table_name_99 [tab] column_name_99

Here is what I am attempting to do:
    read each tab-delimited line
    create a new document based on the table_name
    put the column_name in the table_name document
    read the next line
    if a document of table_name already exists, append the column_name
    if a document of table_name does not exist, create a new document based on the table_name
    put the column_name in the table_name document
do this until the end of file

I am brand new to UE scripting, and not very familiar with JS scripting. Any and all help is greatly appreciated.

Thank you in advance.
Kulstad
Newbie
 
Posts: 3
Joined: Thu Feb 14, 2013 11:38 am

Re: Parse and process tab-delimited text file

Post by Mofi » Fri Feb 15, 2013 2:04 am

Which file extensions should the files have?

After the script has finished, should the file table_name_01 contain

column_name_01[tab]column_name_02[tab]column_name_03

or

column_name_01
column_name_02
column_name_03


Can it be assumed that no table name contains a character that is not valid in a file name, like : / \ < > etc., or must the script check for such characters and replace them with an underscore or a different character?
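
If such characters can occur, a replacement step along the following lines would probably be enough. This is only a sketch in plain JavaScript: the function name sanitizeFileName is hypothetical, and the character list is the usual set that Windows does not allow in file names.

```javascript
// Hypothetical helper, not part of the UltraEdit API: replace every
// character that Windows does not allow in a file name
// ( \ / : * ? " < > | ) with an underscore.
function sanitizeFileName(sName) {
   return sName.replace(/[\\\/:*?"<>|]/g, "_");
}
```

The result could then be used in place of the raw table name when building the file name.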

Last, how large is the input file? Some KB, a few MB, several hundred MB or even GB? That makes a big difference in implementation.
Mofi
Grand Master
 
Posts: 4049
Joined: Thu Jul 29, 2004 11:00 pm
Location: Vienna

Re: Parse and process tab-delimited text file

Post by Jaretin » Fri Feb 15, 2013 3:13 am

For small files, use this script:
if (UltraEdit.document.length > 0) {
   // Define the environment for the script.
   UltraEdit.insertMode();
   UltraEdit.columnModeOff();
   UltraEdit.activeDocument.hexOff();
   // Select all and load the file contents into an array of lines.
   UltraEdit.activeDocument.selectAll();
   var allTables = {};
   if (UltraEdit.activeDocument.isSel()) {
      var asLines = UltraEdit.activeDocument.selection.split("\r\n");
      UltraEdit.activeDocument.top();  // Discards the selection.
      for (var nLineNum = 0; nLineNum < asLines.length; nLineNum++) {
         var asLineVals = asLines[ nLineNum ].split( "\t" );
         //ignore other lines
         if ( asLineVals.length == 2 ) {
            if ( allTables[ asLineVals[0] ] == null ) {
               //new tablename
               allTables[ asLineVals[0] ] = [];
            }
            allTables[ asLineVals[0] ][ allTables[ asLineVals[0] ].length ] = asLineVals[ 1 ];
         }
      }
      // Now write one file for each table.
      for ( var _table in allTables ) {
         UltraEdit.newFile();
         var colSep = "\t"; // May be changed to "\r\n" for one column per line.
         for ( var _tableCol = 0; _tableCol < allTables[_table].length - 1; _tableCol++ ) {
            UltraEdit.activeDocument.write( allTables[_table][_tableCol] + colSep );
         }
         // Write the last column without a trailing separator.
         UltraEdit.activeDocument.write( allTables[_table][_tableCol] );
         UltraEdit.saveAs( "c:\\temp\\" + _table + ".txt" );
         UltraEdit.closeFile( "c:\\temp\\" + _table + ".txt", 2 );
      }
   }
}
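
The grouping step can also be tried outside UltraEdit. The sketch below is plain JavaScript without any UltraEdit calls. The function name groupColumnsByTable is made up for illustration, and splitting on /\r?\n/ instead of "\r\n" is an assumption so that input with Unix line endings is handled too.

```javascript
// Hypothetical helper, not part of the UltraEdit API: group column names
// by table name from tab-delimited text.
function groupColumnsByTable(sText) {
   var allTables = {};
   // Accept both DOS (\r\n) and Unix (\n) line endings.
   var asLines = sText.split(/\r?\n/);
   for (var nLineNum = 0; nLineNum < asLines.length; nLineNum++) {
      var asLineVals = asLines[nLineNum].split("\t");
      // Ignore lines that do not have exactly two fields.
      if (asLineVals.length !== 2) continue;
      var sTable = asLineVals[0];
      if (!Object.prototype.hasOwnProperty.call(allTables, sTable)) {
         allTables[sTable] = [];   // First time this table name is seen.
      }
      allTables[sTable].push(asLineVals[1]);
   }
   return allTables;
}
```

Each key of the returned object is a table name, and each value is the list of its column names in input order.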
Jaretin
Basic User
 
Posts: 19
Joined: Sun Mar 25, 2007 11:00 pm

Re: Parse and process tab-delimited text file

Post by Kulstad » Fri Feb 15, 2013 11:36 am

Thank you for your reply.

Mofi wrote: Which file extensions should the files have?

A simple .TXT or .SQL extension would suffice.

Mofi wrote: After the script has finished, should the file table_name_01 contain

column_name_01[tab]column_name_02[tab]column_name_03

or

column_name_01
column_name_02
column_name_03



It should contain the following:
column_name_01
column_name_02
column_name_03


Mofi wrote: Can it be assumed that no table name contains a character that is not valid in a file name, like : / \ < > etc., or must the script check for such characters and replace them with an underscore or a different character?

This would be a correct assumption, for both table names and field names.

Mofi wrote: Last, how large is the input file? Some KB, a few MB, several hundred MB or even GB? That makes a big difference in implementation.

I can't envision the input file ever being more than 5MB in size. The input file is manually created (for now) by copying specific columns from an Excel spreadsheet and dropping them in a text file.

Re: Parse and process tab-delimited text file

Post by Kulstad » Fri Feb 15, 2013 11:52 am

Thank you so very much for this, Jaretin. With the slight change of var colSep = "\t"; to var colSep = "\r\n"; it is exactly what I was looking for.
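
For anyone reading later: as far as I can tell, the effect of that separator is the same as joining the column names of one table with it. A minimal sketch in plain JavaScript, where the function name buildFileContent is hypothetical and not taken from Jaretin's script:

```javascript
// Hypothetical helper: build the text of one table file by joining its
// column names with the chosen separator ("\t" or "\r\n").
function buildFileContent(asColumns, sColSep) {
   return asColumns.join(sColSep);
}
```

With "\r\n" as the separator, each column name ends up on its own line and there is no trailing line break after the last one.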

