Page tree
Skip to end of metadata
Go to start of metadata

This article presents CSV (Comma Separated Values) Excel files. It is part of Excel file types.

It explains how Excel displays text file content: how it converts it, how it distributes data in cells (or not).

Table of contents

How to create a file from outside Excel

How does Excel recognize text files

When Excel opens a file, it identifies whether it is a native Excel file or a text file (and its encoding):

  • If the file starts with the bytes D0 CF 11 E0 A1 B1 1A E1 (source MSDN: Developing a tool to recognise MS Office file types (.doc, .xls, .mdb, .ppt)) + some other unknown conditions, it is identified as being an Excel native document
  • Else, if the file starts with the bytes EF BB BF (UTF-8 Byte Order Mark, aka UTF-8 BOM), it is identified as being a text file in UTF-8 encoding
  • Else, if the file starts with the bytes FF FE (UTF-16LE BOM), it is identified as being a text file in UTF-16LE (UTF-16 little endian) encoding
  • Else, if the file starts with the bytes FE FF (UTF-16BE BOM), it is identified as being a text file in UTF-16BE (UTF-16 big endian) encoding
  • Else, it is processed as if being a text file in current PC locale encoding (Windows control panel, Regional and language options, choose Country).

If the text file starts with characters <?xml version="1.0"?>, it is identified as being an XML text file.
Otherwise it will be identified as being a non-XML text file.

How to format text file, and how it is displayed in Excel

There is only one valid format: create the text file using UTF-16LE encoding, with Byte Order Mark, and with horizontal tabulation character as field separator.

For more information, see Note 1151258 - Error when sending Excel attachments, and also CSV tests of encoding and column separator which shows Excel screen captures of results according to Excel release, file encoding, Windows and user system locales, and column separator.

Data

Valid anytime

Valid sometimes (avoid)

How it is displayed in Excel and other comments

Number

=99999/100
=-50

999.99 or 999,99

The problem with using , or . is that it may not be interpreted correctly according to user's locale. A workaround is to define a formula with division by 100 to have 2 decimals. Negative numbers must be preceded by minus sign. If decimal part is zero, Excel doesn't display it (there is no workaround)

Code with only digits

="000099999"

 

Considered as a text. All leading zeroes are displayed

Date

YYYY-MM-DD

MM.DD.YYYY, DD/MM/YYYY, etc.

will be displayed in the user date format

Time

HH:MM:SS

 

will be displayed in the user time format

Text with several lines

(no universal solution)

"line 1<NL>line 2"

Only works with Excel 2007. <NL> is the new line character, see below; unfortunately excel does not autofit rows so we may not see immediately the remaining lines

Field separator

Horizontal tabulation character U+0009

Commas or Semicolons

Depends on user's locale

Comma or semicolon
(when also used as field separator)

Whole cell content containing comma or semicolon must be delimited by 2 double quotes; example with 3 cells where first contains comma: "Amount, currency",10,EUR

 

Avoid using comma or semicolon as field separator

New line

Either character U+000A, or character U+000D followed by U+000A

 

 

What about the "Text Import Wizard"

You may have seen or heard of the Text Import Wizard. It is a tool for choosing manually the encoding, the column separator, etc., when you import a text file.
When Excel opens a file and it identifies it as being a text file, it may or may not display the Text Import Wizard, depending from where you start the file:

  • if you open the file from an external application (File Explorer, Outlook, etc.), the Text Import Wizard is not displayed (though Microsoft states that it sometimes does)
  • if you open it using the Excel File Open menu, the Text Import Wizard is displayed

From Excel, saving files as CSV

Why CSV file is not recognized correctly by SAP

Excel (2003 and after) adds 2 leading bytes FF FE for identifying the file as having encoding UTF-16LE (these leading bytes are called BOM). ABAP statement OPEN DATASET may interpret these 2 bytes and they are normally removed from the data stream. If the OPEN DATASET interprets these bytes as belonging to the data stream, please add complementary keywords to the statement. See ABAP documentation for more information.