Plain Text
The term is sometimes used quite loosely, to mean files that contain only
"readable" content (or just files containing nothing that the speaker objects to).
For example, that could exclude any indication of fonts or layout (such as markup,
markdown, or even tabs); characters such as curly quotes, non-breaking spaces, soft
hyphens, em dashes, and/or ligatures; or other things.
In principle, plain text can be in any encoding, but occasionally the term is taken
to imply ASCII. As Unicode-based encodings such as UTF-8 and UTF-16 become more
common, that usage may be shrinking.
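As a rough illustration of the point about encodings, the following Python sketch encodes one short string in two Unicode-based encodings and checks whether it would also qualify under the narrower "ASCII only" reading of the term. The sample string and the helper name is_ascii_only are assumptions made for this example, not part of any standard.

```python
# Hypothetical sample: "café", whose last character is outside ASCII.
sample = "caf\u00e9"


def is_ascii_only(s: str) -> bool:
    """Return True if every character of s can be encoded as ASCII."""
    try:
        s.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# The same characters, expressed as different byte sequences.
utf8_bytes = sample.encode("utf-8")        # 5 bytes: é takes two bytes
utf16_bytes = sample.encode("utf-16-le")   # 8 bytes: two bytes per character

print(is_ascii_only("hello"))   # True
print(is_ascii_only(sample))    # False
print(utf8_bytes)
print(utf16_bytes)
```

Both byte sequences decode back to the same four characters, so under the broader definition either file is plain text; only the stricter ASCII reading would reject it.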
Plain text is also sometimes used only to exclude "binary" files: those in which at
least some parts of the file cannot be correctly interpreted via the character
encoding in effect. For example, a file or string consisting of "hello" (in
whatever encoding), followed by 4 bytes that express a binary integer rather than
one or more characters, is a binary file, and is not plain text by even the loosest
common usage. Put another way, translating a plain text file to a character encoding that
uses entirely different numbers to represent characters does not change the meaning
(so long as you know what encoding is in use), but for binary files such a
conversion does change the meaning of at least some parts of the file.
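The "hello" example above can be sketched in Python as follows. This is only an illustration under stated assumptions: the integer value 0xFFFE0001, its big-endian layout, and the helper name is_decodable are all chosen for the example, and Latin-1 is used in the second half simply because it accepts every byte value.

```python
import struct

text_bytes = "hello".encode("utf-8")

# Assumed layout: "hello" followed by a 4-byte big-endian unsigned integer.
# The value is hypothetical; its first byte (0xFF) is never valid in UTF-8.
number = 0xFFFE0001
binary_bytes = text_bytes + struct.pack(">I", number)


def is_decodable(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if data can be interpreted entirely in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Plain text: transcoding changes the bytes but not the meaning.
as_utf16 = text_bytes.decode("utf-8").encode("utf-16-le")
text_survives = as_utf16.decode("utf-16-le") == "hello"

# Binary: forcing the data through a character-set conversion
# (read as Latin-1, written as UTF-8) corrupts the integer.
transcoded = binary_bytes.decode("latin-1").encode("utf-8")
recovered = struct.unpack(">I", transcoded[5:9])[0]

print(is_decodable(text_bytes))    # True
print(is_decodable(binary_bytes))  # False
print(text_survives)               # True
print(recovered == number)         # False
```

A failed decode is only a heuristic for spotting binary content: some binary data happens to form valid character sequences, so a successful decode does not prove that a file is plain text.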
Contents
1 Plain text and rich text
2 Usage
3 Encoding
3.1 Character encodings
3.2 Control codes
4 See also
5 References