Removing quotes within quotes (CSV)

Post by kizzy » Mon Jun 25, 2007 6:43 pm

Hello all..

I am using UE version 12.20.

I am trying to remove quotes within quotes on a rather large data file that we need to process into our DB.

For some reason this one is giving me a run for my money. I am sure it's an easy one for the experts here!

Sample text:

,"003","N041100","","LASER WELD WIRE 10""LG","350906",

In this line, the doubled quotes in 10""LG sit inside another set of quotes, which is bad. I would like to search for and remove this condition from my data. The quotes will always occur within another set of quotes located between two commas.

OK condition: ,"",
This is acceptable and should not be removed.

Any ideas? Thanks for any help you can provide!!
Kizzy

Re: Removing quotes within quotes (CSV)

Post by scallanh » Mon Jun 25, 2007 9:47 pm

My first thought is, why try to remove the quotes and mangle your data in the process? Your sample text is valid CSV (quotes within each value are escaped by doubling them), so any tools designed to work with CSV should parse it correctly. For example:

Code:
Plain text:  LASER WELD WIRE 10"LG
CSV encoded:  "LASER WELD WIRE 10""LG"


If you really must remove the quotes for some reason, I think it would be easier to use a tool designed to work with CSV files. For example, open the CSV file in Excel, search+replace the quotes, and re-save as CSV.
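
The point that the sample is already valid CSV can be checked with any standards-aware parser. Below is a minimal sketch using Python's standard csv module; Python is an assumption made here for illustration and was not mentioned in the thread.

```python
import csv
import io

# The sample line from the first post, exactly as it appears in the data file.
sample = ',"003","N041100","","LASER WELD WIRE 10""LG","350906",\n'

# csv.reader treats a doubled quote inside a quoted field as one literal quote
# (this is the default doublequote=True behaviour).
fields = next(csv.reader(io.StringIO(sample)))

print(fields[4])  # LASER WELD WIRE 10"LG
```

The leading and trailing commas in the sample produce empty first and last fields, so the description ends up at index 4.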

Re: Removing quotes within quotes (CSV)

Post by kizzy » Tue Jun 26, 2007 1:35 am

scallanh, I am with you 100%. We are importing about 2.2 GB of data from AS400 land into a SQL Server 2007 DB via an extracted CSV file. For some crazy reason SQL Server 2007 throws up all over the quotes within the quotes. Researching online suggests this is a known bug. /sigh

The replace idea has potential; I will give it a go!

Re: Removing quotes within quotes (CSV)

Post by Bego » Tue Jun 26, 2007 6:58 am

hi dudes,

if I understood this correctly, a replace like "no comma, double quotes, no comma" should solve the problem.
Perl regexp:
find:
Code:
([^,])\"\"([^,])

replace with:
Code:
$1$2


so you get:
Code:
,"003","N041100","","LASER WELD WIRE 10LG","350906",


Is that OK?

rds Bego
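
For anyone who wants to try this replace outside the editor, here is a minimal sketch of the same find/replace in Python's re module. Python is an assumption for illustration only; note that Python writes the backreferences as \1 and \2 where the pattern above uses $1 and $2.

```python
import re

# The sample line from the first post.
sample = ',"003","N041100","","LASER WELD WIRE 10""LG","350906",'

# A doubled quote with a non-comma character on each side.
# The quotes do not need escaping inside a raw string.
pattern = re.compile(r'([^,])""([^,])')

# Keep the two neighbouring characters, drop the doubled quote between them.
cleaned = pattern.sub(r'\1\2', sample)

print(cleaned)  # ,"003","N041100","","LASER WELD WIRE 10LG","350906",
```

The empty field `,"",` is left alone because its doubled quote has a comma on both sides.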

Re: Removing quotes within quotes (CSV)

Post by pietzcker » Tue Jun 26, 2007 9:06 am

Hi bego,

clever idea :)

However, you don't need to escape the quotes.
Code:
([^,])""([^,])

should work too...

And of course if you use negative lookaround, you can search for
Code:
(?<!,)""(?!,)

and replace with nothing (or something like "inch" etc.) which is a LOT faster.

Cheers,
Tim
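
Apart from speed, the two patterns can differ in what they match. The sketch below compares them in Python's re module (an assumption for illustration; the thread itself only uses the editor's Perl regex engine). The `tricky` line is a hypothetical example, not taken from the original data.

```python
import re

capture = re.compile(r'([^,])""([^,])')    # consumes one character on each side
lookaround = re.compile(r'(?<!,)""(?!,)')  # only checks the neighbours, consumes nothing

sample = ',"003","N041100","","LASER WELD WIRE 10""LG","350906",'

# Hypothetical line: two doubled quotes separated by a single character.
tricky = ',"A""B""C","",'

# On the sample from the thread both approaches give the same result.
print(lookaround.sub('', sample))  # ,"003","N041100","","LASER WELD WIRE 10LG","350906",

# The capture-group version consumes the B as part of the first match,
# so the second doubled quote is missed in a single pass.
print(capture.sub(r'\1\2', tricky))  # ,"AB""C","",

# The lookaround version removes both.
print(lookaround.sub('', tricky))  # ,"ABC","",
```

Neither pattern touches a doubled quote that sits directly next to a comma inside a field, so data of that shape would still need separate handling.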

Re: Removing quotes within quotes (CSV)

Post by kizzy » Tue Jun 26, 2007 11:02 am

You guys rock! I really need to start learning the Perl regular expression side of things. I mostly do very basic stuff using the Unix regexp style. Thanks for all your help!

Kizzy

Re: Removing quotes within quotes (CSV)

Post by Bego » Tue Jun 26, 2007 2:45 pm

Hi Tim,

you're right. Thx for the tip.

kizzy, you can do the same with UE regexp or Unix regexp. Perl is just my personal favorite style. Glad to hear it works.

Bego