You should read the Wikipedia article about Comma-separated values. Your specific problem is that you have cells with line breaks. CSV files are allowed to contain embedded line breaks. In such cases a table row spreads over multiple lines in the CSV file. But text editors always interpret a line break as the end of a line. A text editor does not know that a line break within "..." belongs to the text inside a cell, is not the line break after the last cell of a row, and should therefore be ignored when displaying the CSV file.
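To illustrate the difference between physical lines and table rows, here is a minimal Python sketch. The sample data is made up for this example; it only assumes a CSV-aware reader such as Python's csv module, which keeps a quoted line break inside the cell, while a plain split into lines does what a text editor does.

```python
import csv
import io

# Hypothetical sample: a header row plus one data row whose second cell
# contains a line break inside double quotes.
sample = 'id,comment\r\n1,"first line\nsecond line"\r\n'

# newline='' disables newline translation so the csv module sees the
# original line endings and can handle quoted line breaks itself.
rows = list(csv.reader(io.StringIO(sample, newline='')))

# A naive split into physical lines, similar to what a text editor shows.
physical_lines = sample.splitlines()

print(len(rows))            # number of table rows
print(len(physical_lines))  # number of physical lines
print(repr(rows[1][1]))     # the cell text with its embedded line break
```

The CSV reader reports two table rows, while the naive split produces three physical lines.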
However, you can view CSV files with embedded line breaks in a text cell correctly if you know some more details. MS Excel always saves spreadsheets as CSV with 0D 0A (carriage return + line feed), also called DOS line endings, as the terminator of a table row (= line). But MS Excel saves line breaks inside a cell with 0A (line feed) only, which is normally used for Unix files. So there is a difference between the end of a table row and the end of a line within a cell's text. Normally such files are interpreted as mixed DOS/Unix files, and most programs, such as UltraEdit or a browser downloading a CSV file, automatically correct the 0A without the 0D by inserting a 0D so that the whole file contains only DOS line endings. Such automatic corrections are also the reason why sample text files should always be packed with ZIP, RAR, etc., and the archive should be uploaded for others instead of the plain text file.
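If you want to check whether a file really has this mix of line endings, you can count them at byte level. The following is only a sketch with made-up sample bytes; in practice you would read the file in binary mode so that nothing converts the line endings on the way in.

```python
# Hypothetical bytes as Excel would write them according to the
# description above: rows end with 0D 0A, the cell break is a lone 0A.
data = b'id,comment\r\n1,"first line\nsecond line"\r\n'

crlf_count = data.count(b'\r\n')  # row terminators (0D 0A)
lf_count = data.count(b'\n')      # every 0A, including those in 0D 0A
bare_lf_count = lf_count - crlf_count  # 0A without a preceding 0D

print(crlf_count, bare_lf_count)
```

A non-zero count of bare line feeds next to DOS row terminators suggests line breaks inside cells.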
Okay, how can this difference between row terminators and line breaks inside cells help you? UltraEdit has at Advanced - Configuration - File Handling - DOS/UNIX/MAC Handling the setting Only recognize DOS terminated lines (CR/LF) as new lines for editing. If you enable this setting, you will now see the first table row as a single line in UltraEdit, too. You will notice that the single 0A is displayed in the text as a rectangle. You can now use a regular expression replace to convert those line feeds without a preceding carriage return to normal text. I use for example the following:
This regular expression in UltraEdit syntax replaces the single line feed (hex code 0A) with the text \n. When I now see a \n in the text, I know there is a line break in the table cell.
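The same idea can be sketched outside of UltraEdit. The following Python snippet is not the UltraEdit expression from above, just an equivalent written with Python's re module; the function name mark_cell_breaks is made up for this example. It replaces every line feed that is not preceded by a carriage return with the two characters \n and leaves the DOS row terminators untouched.

```python
import re

# A line feed (0A) that is not preceded by a carriage return (0D).
BARE_LF = re.compile(rb'(?<!\r)\n')

def mark_cell_breaks(data: bytes) -> bytes:
    """Replace bare line feeds with the literal text \\n (backslash + n)."""
    # A function is used as replacement so that the backslash is not
    # interpreted as an escape sequence of the replacement template.
    return BARE_LF.sub(lambda match: b'\\n', data)

# Hypothetical sample row with one line break inside a quoted cell.
sample = b'1,"first line\nsecond line"\r\n'
print(mark_cell_breaks(sample))
```

Work on bytes read in binary mode here, because opening the file in text mode with default settings would already convert the line endings.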