This article presents CSV (Comma Separated Values) Excel files. It is part of Excel file types.
It explains how Excel displays text file content: how it converts it, how it distributes data in cells (or not).
Table of contents
How to create a file from outside Excel
How does Excel recognize text files
When Excel opens a file, it identifies whether it is a native Excel file or a text file (and its encoding):
- If the file starts with the bytes D0 CF 11 E0 A1 B1 1A E1 (source MSDN: Developing a tool to recognise MS Office file types (.doc, .xls, .mdb, .ppt)) + some other unknown conditions, it is identified as being an Excel native document
- Else, if the file starts with the bytes EF BB BF (UTF-8 Byte Order Mark, aka UTF-8 BOM), it is identified as being a text file in UTF-8 encoding
- Else, if the file starts with the bytes FF FE (UTF-16LE BOM), it is identified as being a text file in UTF-16LE (UTF-16 little endian) encoding
- Else, if the file starts with the bytes FE FF (UTF-16BE BOM), it is identified as being a text file in UTF-16BE (UTF-16 big endian) encoding
- Else, it is processed as if being a text file in current PC locale encoding (Windows control panel, Regional and language options, choose Country).
If the text file starts with characters <?xml version="1.0"?>, it is identified as being an XML text file.
Otherwise it will be identified as being a non-XML text file.
How to format text file, and how it is displayed in Excel
There is only one valid format: create the text file using UTF-16LE encoding, with Byte Order Mark, and with horizontal tabulation character as field separator.
For more information, see Note 1151258 - Error when sending Excel attachments, and also CSV tests of encoding and column separator which shows Excel screen captures of results according to Excel release, file encoding, Windows and user system locales, and column separator.
Data |
Valid anytime |
Valid sometimes (avoid) |
How it is displayed in Excel and other comments |
---|---|---|---|
Number |
=99999/100 |
999.99 or 999,99 |
The problem with using , or . is that it may not be interpreted correctly according to user's locale. A workaround is to define a formula with division by 100 to have 2 decimals. Negative numbers must be preceded by minus sign. If decimal part is zero, Excel doesn't display it (there is no workaround) |
Code with only digits |
="000099999" |
|
Considered as a text. All leading zeroes are displayed |
Date |
YYYY-MM-DD |
MM.DD.YYYY, DD/MM/YYYY, etc. |
will be displayed in the user date format |
Time |
HH:MM:SS |
|
will be displayed in the user time format |
Text with several lines |
(no universal solution) |
"line 1<NL>line 2" |
Only works with Excel 2007. <NL> is the new line character, see below; unfortunately excel does not autofit rows so we may not see immediately the remaining lines |
Field separator |
Horizontal tabulation character U+0009 |
Commas or Semicolons |
Depends on user's locale |
Comma or semicolon |
Whole cell content containing comma or semicolon must be delimited by 2 double quotes; example with 3 cells where first contains comma: "Amount, currency",10,EUR |
|
Avoid using comma or semicolon as field separator |
New line |
Either character U+000A, or character U+000D followed by U+000A |
|
|
What about the "Text Import Wizard"
You may have seen or heard of the Text Import Wizard. It is a tool for choosing manually the encoding, the column separator, etc., when you import a text file.
When Excel opens a file and it identifies it as being a text file, it may or may not display the Text Import Wizard, depending from where you start the file:
- if you open the file from an external application (File Explorer, Outlook, etc.), the Text Import Wizard is not displayed (though Microsoft states that it sometimes does)
- if you open it using the Excel File Open menu, the Text Import Wizard is displayed
Links
- MS KB 214295 - How to bypass the Text Import Wizard when you import files in Excel
- MS - Excel formatting and features that are not transferred to other file formats
From Excel, saving files as CSV
Why CSV file is not recognized correctly by SAP
Excel (2003 and after) adds 2 leading bytes FF FE for identifying the file as having encoding UTF-16LE (these leading bytes are called BOM). ABAP statement OPEN DATASET may interpret these 2 bytes and they are normally removed from the data stream. If the OPEN DATASET interprets these bytes as belonging to the data stream, please add complementary keywords to the statement. See ABAP documentation for more information.