Why is \n converted into the \r\n combination when we write it to a file?


Question

I read in a book that when we write \n to a file using fputs(), fputs() converts the \n into the \r\n combination, and that when we read the same line back using fgets(), the reverse conversion happens: \r\n is converted back to \n. What is the purpose behind this?

Recommended answer

Succinctly, DOS is the reason for this.

Different systems have different conventions for line endings. Unix reckons one character, '\n', is sufficient to mark the end of a line. DOS decided that it needed two characters, '\r' and '\n', though other systems also used that convention. The versions of Mac OS 1-9 (prior to Mac OS X) used just '\r' instead. Other systems could use a count and the line data instead of a line ending, or could simulate punched cards with blanks up to a fixed length (72 or 80). Unix also doesn't distinguish between binary and text files; DOS does. (DOS also uses Control-Z to mark EOF in a text file. Unix doesn't have an EOF marker; it knows exactly how big the file is and uses that length to determine when it has reached EOF.)

C originated on Unix, but to make it easier to migrate code between systems, the standard I/O package defined that when it was working on text files, the input side would convert a native line ending to the single '\n' character for uniform input, and the output side would convert a '\n' to the native line ending.
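The mapping described above can be observed with a small round trip through a text stream. This is only a sketch: the function name text_round_trip and the temporary file path are made up for illustration, and the line is assumed to be shorter than 256 characters and to end in '\n'.

```c
#include <stdio.h>
#include <string.h>

/* Writes `line` to `path` through a text stream, reads it back through a
 * text stream, and returns 1 if what was read equals what was written.
 * The name and the fixed buffer size are illustrative assumptions. */
int text_round_trip(const char *path, const char *line)
{
    char buf[256];
    FILE *fp = fopen(path, "w");            /* "w" opens a text stream */
    if (fp == NULL)
        return 0;
    /* On Windows the library stores "\r\n" for each '\n';
     * on Unix the '\n' is stored unchanged. */
    if (fputs(line, fp) == EOF) {
        fclose(fp);
        return 0;
    }
    if (fclose(fp) != 0)
        return 0;

    fp = fopen(path, "r");                  /* "r" opens a text stream */
    if (fp == NULL)
        return 0;
    /* fgets delivers a single '\n' again, whatever is stored on disk. */
    if (fgets(buf, sizeof buf, fp) == NULL) {
        fclose(fp);
        return 0;
    }
    fclose(fp);
    remove(path);                           /* clean up the scratch file */
    return strcmp(buf, line) == 0;
}
```

Calling text_round_trip("scratch.txt", "hello, world\n") should return 1 on both Unix and Windows, even though the bytes stored on disk differ between the two.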

However, the mention of text files also meant that there needed to be binary files, where these mappings do not occur.
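For contrast, here is a sketch of the binary case, where the bytes you write are the bytes that are stored. The helper name binary_size_on_disk is hypothetical. The standard does allow an implementation to pad a binary stream with trailing null characters, so treat the "same size" result as what mainstream systems do rather than an absolute guarantee.

```c
#include <stdio.h>

/* Writes `len` bytes to `path` in binary mode, then counts the bytes
 * actually stored by reading the file back in binary mode.
 * Returns the count, or -1 on error. The name is illustrative. */
long binary_size_on_disk(const char *path, const char *data, size_t len)
{
    long count = 0;
    FILE *fp = fopen(path, "wb");           /* "b" requests a binary stream */
    if (fp == NULL)
        return -1;
    if (fwrite(data, 1, len, fp) != len) {  /* no '\n' mapping happens here */
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)                 /* count raw bytes, one by one */
        count++;
    fclose(fp);
    remove(path);                           /* clean up the scratch file */
    return count;
}
```

With the four bytes "a\nb\n" as input, the count comes back as 4 on Unix and on Windows. Writing the same data with "w" instead of "wb" on Windows would store 6 bytes.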

You might note that most of the internet protocols (HTTP, for example) mandate CRLF (carriage return, line feed, or '\r', '\n') for the end of line markers.
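As an illustration of what that means for C code, a protocol message has to spell out '\r' and '\n' itself rather than rely on the library's text-mode mapping. The sketch below builds a minimal HTTP/1.1 GET request; build_get_request is a hypothetical helper, and the host and path values are whatever the caller supplies.

```c
#include <stdio.h>

/* Formats a minimal HTTP/1.1 GET request into `out`, with every line
 * terminated by an explicit CRLF. Returns the length written, or -1 if
 * the buffer of `cap` bytes is too small. The name is illustrative. */
int build_get_request(char *out, size_t cap, const char *host, const char *path)
{
    int n = snprintf(out, cap,
                     "GET %s HTTP/1.1\r\n"   /* request line */
                     "Host: %s\r\n"          /* the one header HTTP/1.1 requires */
                     "\r\n",                 /* empty line ends the header block */
                     path, host);
    if (n < 0 || (size_t)n >= cap)
        return -1;                           /* encoding error or truncation */
    return n;
}
```

The resulting bytes should go out through a socket or a binary stream. Sending them through a text stream on Windows would map each '\n' again and produce "\r\r\n".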

(Actually, blaming DOS, as in MS-DOS or PC-DOS, is a little unfair. There were other systems that used the CRLF line end convention before DOS existed, and they may have been more influential on the Internet. However, almost all those ancestral systems are substantially defunct, and Windows is the environment that you'll run into these days where the distinction between binary and text files matters, and where you'll encounter CRLF line endings.)

Note that the C standard has this to say about text files:

¶2 A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined. Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment. Thus, there need not be a one-to-one correspondence between the characters in a stream and those in the external representation. Data read in from a text stream will necessarily compare equal to the data that were earlier written out to that stream only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is immediately preceded by space characters; and the last character is a new-line character. Whether space characters that are written out immediately before a new-line character appear when read in is implementation-defined.

That's a lot of things that might or might not happen. Note, in particular, that according to the standard, trailing blanks written to a file might or might not appear in the input. That allows systems that support punched card images or fixed-length records to comply with the standard.
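The conditions in the quoted paragraph can be turned into a simple check. The sketch below is one reading of them, not an official test: meets_round_trip_conditions is a made-up name, "space characters" is taken to mean ' ' only, and empty data is treated as failing because it has no final new-line.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `data` meets the conditions the quoted paragraph lists for
 * text that compares equal after being written and read back:
 *   - only printing characters, horizontal tab, and new-line;
 *   - no new-line immediately preceded by a space;
 *   - the last character is a new-line.
 * Returns 0 otherwise. The name and the empty-data rule are assumptions. */
int meets_round_trip_conditions(const char *data, size_t len)
{
    size_t i;

    if (len == 0 || data[len - 1] != '\n')
        return 0;                           /* must end with a new-line */

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)data[i];
        if (c == '\n') {
            if (i > 0 && data[i - 1] == ' ')
                return 0;                   /* trailing blank before new-line */
        } else if (c != '\t' && !isprint(c)) {
            return 0;                       /* e.g. '\r', Control-Z, or a null byte */
        }
    }
    return 1;
}
```

Data such as "abc \n" fails the check because of the trailing blank, which is exactly the case a fixed-length-record system may not preserve.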

Note, too (as pointed out by Giacomo Degli Eposti), that this all means that if you open a file in binary mode that was originally written as a text file, you may very well get a significantly different list of bytes back from the I/O system. You'll see two characters per newline; you might see a Control-Z followed by other characters (possibly null bytes) up to a 'block' boundary that might be a multiple of 256 bytes, etc.
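One way to see what is really stored is to open the file in binary mode and tally the interesting bytes. This is a rough sketch under assumptions: struct raw_counts and count_raw_bytes are hypothetical names, and the character values assume an ASCII-based system, where Control-Z is 0x1A.

```c
#include <stdio.h>

/* Tallies of the bytes that text-mode I/O normally hides. */
struct raw_counts {
    long cr;      /* '\r' bytes */
    long lf;      /* '\n' bytes */
    long ctrl_z;  /* 0x1A bytes (Control-Z) */
    long nul;     /* 0x00 bytes (possible padding) */
    long total;   /* every byte in the file */
};

/* Reads `path` in binary mode and fills `out` with the tallies.
 * Returns 0 on success, -1 if the file cannot be opened. */
int count_raw_bytes(const char *path, struct raw_counts *out)
{
    int c;
    FILE *fp = fopen(path, "rb");           /* binary: no mapping on input */
    if (fp == NULL)
        return -1;

    out->cr = out->lf = out->ctrl_z = out->nul = out->total = 0;
    while ((c = getc(fp)) != EOF) {
        out->total++;
        switch (c) {
        case '\r': out->cr++;     break;
        case '\n': out->lf++;     break;
        case 0x1A: out->ctrl_z++; break;
        case 0x00: out->nul++;    break;
        default:                  break;
        }
    }
    fclose(fp);
    return 0;
}
```

For a two-line text file written on Windows, you would expect cr and lf to both be 2. On Unix the same file would show cr as 0.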
