Modify existing contents of a file in C

Question

int main()
{
    FILE *ft;
    char ch;
    ft=fopen("abc.txt","r+");
    if(ft==NULL)
    {
        printf("can not open target file\n");
        exit(1);
    }
    while(1)
    {
        ch=fgetc(ft);
        if(ch==EOF)
        {
            printf("done");
            break;
        }
        if(ch=='i')
        {
            fputc('a',ft);
        }
    }
    fclose(ft);
    return 0;
}

As you can see, I want to edit abc.txt so that every i in it is replaced by a.
The program runs without errors, but when I open abc.txt externally, it appears to be unedited.
What could be the reason?

Why, in this case, is the character after i not replaced by a, as the answers suggest?

Recommended answer

There are multiple problems:


  1. fgetc() returns an int, not a char; it has to return every valid char value plus a separate value, EOF. As written, you can't reliably detect EOF. If char is an unsigned type, you'll never find EOF; if char is a signed type, you'll misidentify some valid character (often ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS) as EOF.

  2. If you switch between input and output on a file opened for update mode, you must use a file positioning operation (fseek(), rewind(), nominally fsetpos()) between reading and writing; and you must use a positioning operation or fflush() between writing and reading.

  3. It is a good idea to close what you open (now fixed in the code).

  4. If your writes worked, you'd overwrite the character after the i with a.

These changes lead to:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *ft;
    char const *name = "abc.txt";
    int ch;
    ft = fopen(name, "r+");
    if (ft == NULL)
    {
        fprintf(stderr, "cannot open target file %s\n", name);
        exit(1);
    }
    while ((ch = fgetc(ft)) != EOF)
    {
        if (ch == 'i')
        {
            fseek(ft, -1, SEEK_CUR);
            fputc('a',ft);
            fseek(ft, 0, SEEK_CUR);
        }
    }
    fclose(ft);
    return 0;
}

There is room for more error checking.
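As a rough illustration of what that extra error checking might look like, here is a minimal sketch that wraps the same loop in a function and checks every call that can fail. The name replace_char_in_file and its return convention are hypothetical, not part of the answer. It also opens the file with "rb+" rather than "r+"; that is an assumption made here because the C standard only defines a relative fseek() offset such as -1 for binary streams.

```c
#include <stdio.h>

/*
 * Replace every byte equal to `from` with `to`, in place.
 * Returns the number of replacements, or -1 on any I/O error.
 * (replace_char_in_file is a hypothetical helper, not from the answer.)
 */
long replace_char_in_file(const char *path, int from, int to)
{
    FILE *fp = fopen(path, "rb+");
    if (fp == NULL)
        return -1;

    long count = 0;
    int failed = 0;
    int ch;                     /* int, so EOF stays distinguishable */

    while ((ch = fgetc(fp)) != EOF)
    {
        if (ch != from)
            continue;
        /* Input followed by output: a positioning call must intervene. */
        if (fseek(fp, -1L, SEEK_CUR) != 0 || fputc(to, fp) == EOF)
        {
            failed = 1;
            break;
        }
        /* Output followed by input: fflush() or a positioning call. */
        if (fseek(fp, 0L, SEEK_CUR) != 0)
        {
            failed = 1;
            break;
        }
        count++;
    }
    if (ferror(fp))             /* the loop may have ended on a read error */
        failed = 1;
    if (fclose(fp) != 0)        /* buffered writes can still fail here */
        failed = 1;
    return failed ? -1 : count;
}
```

A caller could then write something like `if (replace_char_in_file("abc.txt", 'i', 'a') < 0)` and report the failure on stderr.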

The fseek(ft, 0, SEEK_CUR); statement is required by the C standard.

ISO/IEC 9899:2011 §7.21.5.3 The fopen function

¶7 When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. Opening (or creating) a text file with update mode may instead open (or create) a binary stream in some implementations.

(Emphasis added to the sentence beginning with "However".)
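The quoted rule also covers the opposite direction, output followed by input. A minimal sketch of that case, assuming a stream already opened in an update mode (for example one returned by tmpfile()); the function name first_byte_after_write is hypothetical:

```c
#include <stdio.h>

/*
 * Write `text` to an update-mode stream, then read back the first byte
 * of the file. Returns that byte (0..UCHAR_MAX) or EOF on failure.
 * (first_byte_after_write is a hypothetical name.)
 */
int first_byte_after_write(FILE *fp, const char *text)
{
    if (fputs(text, fp) == EOF)
        return EOF;
    /* Output followed by input: fflush() or a positioning call must
       intervene. rewind() qualifies and also returns to the start. */
    rewind(fp);
    return fgetc(fp);
}
```

Without the rewind() call (or an fflush() or fseek()), the fgetc() would directly follow output, which is exactly what ¶7 forbids.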

Quotes from ISO/IEC 9899:2011 (C11), the C standard that was current when this answer was written.

§7.21 Input/output <stdio.h>

§7.21.1 Introduction

EOF which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream;

§7.21.7.1 The fgetc function

int fgetc(FILE *stream);

¶2 If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).

Returns

¶3 If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF. 289)

289) An end-of-file and a read error can be distinguished by use of the feof and ferror functions.

So, EOF is a negative integer (conventionally it is -1, but the standard does not require that). The fgetc() function either returns EOF or the value of the character as an unsigned char (in the range 0..UCHAR_MAX, usually 0..255).
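To make the return-value contract concrete, here is a small sketch that reads with an int and uses ferror(), as footnote 289 suggests, to tell a read error apart from end-of-file. The helper name count_byte and its -1 error convention are assumptions made for illustration only.

```c
#include <stdio.h>

/*
 * Count how many of the remaining bytes in `fp` equal `target`.
 * Returns the count, or -1 if a read error (rather than end-of-file)
 * ended the loop. (count_byte is a hypothetical helper.)
 */
long count_byte(FILE *fp, unsigned char target)
{
    long count = 0;
    int ch;                     /* holds 0..UCHAR_MAX or EOF */

    while ((ch = fgetc(fp)) != EOF)
    {
        if (ch == target)       /* target promotes to an int in 0..UCHAR_MAX */
            count++;
    }
    /* fgetc() returns EOF in both cases; ferror() tells them apart. */
    return ferror(fp) ? -1 : count;
}
```

Because ch is an int, a byte with value 0xFF arrives as 255 and is counted like any other byte instead of being mistaken for EOF.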

§6.2.5 Types

¶3 An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

¶5 An object declared as type signed char occupies the same amount of storage as a "plain" char object.

¶6 For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.

¶15 The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char. 45)

45) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.

This justifies my assertion that plain char can be a signed or an unsigned type.
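Footnote 45 describes how to tell which choice an implementation made. A minimal sketch of that check, with the function name plain_char_is_signed being a hypothetical one:

```c
#include <limits.h>

/*
 * Returns 1 if plain char is a signed type on this implementation, else 0.
 * (plain_char_is_signed is a hypothetical name.)
 */
int plain_char_is_signed(void)
{
    /* Per footnote 45, CHAR_MIN is either 0 (unsigned) or SCHAR_MIN (signed). */
    return CHAR_MIN < 0;
}
```

The result depends on the compiler and target, and some compilers offer options that change it, so code that must be portable should not rely on either answer.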

Now consider:

char c = fgetc(fp);
if (c == EOF)
   …

Suppose fgetc() returns EOF, and plain char is an unsigned (8-bit) type, and EOF is -1. The assignment puts the value 0xFF into c, which is a positive integer. When the comparison is made, c is promoted to an int (and hence to the value 255), and 255 is not negative, so the comparison fails.

Conversely, suppose that plain char is a signed (8-bit) type and the character set is ISO 8859-15. If fgetc() returns ÿ, the value assigned will be the bit pattern 0b11111111, which is the same as -1, so in the comparison, c will be converted to -1 and the comparison c == EOF will return true even though a valid character was read.
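Both failure modes can be reproduced without a file by feeding a would-be fgetc() return value through the same narrowing assignment. This is only a sketch: the helper name narrowed_equals_eof is hypothetical, and the results described afterwards assume an 8-bit char, EOF equal to -1, and the usual wrap-around behaviour when an out-of-range value is converted to a signed char.

```c
#include <stdio.h>

/*
 * Mimic `char c = fgetc(fp); if (c == EOF)` for a given fgetc() result.
 * Returns 1 if the narrowed value compares equal to EOF, else 0.
 * (narrowed_equals_eof is a hypothetical name.)
 */
int narrowed_equals_eof(int fgetc_result)
{
    char c = (char)fgetc_result;    /* the lossy step: int squeezed into char */
    return c == EOF;                /* c is promoted back to int here */
}
```

Under those assumptions, where plain char is unsigned, narrowed_equals_eof(EOF) yields 0, so end-of-file is never detected; where plain char is signed, narrowed_equals_eof(0xFF) yields 1, so the byte for ÿ is mistaken for end-of-file.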

You can tweak the details, but the basic argument remains valid while sizeof(char) < sizeof(int). There are DSP chips where that doesn't apply; you have to rethink the rules. Even so, the basic point remains; fgetc() returns an int, not a char.

If your data is truly ASCII (7-bit data), then all characters are in the range 0..127 and you won't run into the misinterpretation of ÿ problem. However, if your char type is unsigned, you still have the 'cannot detect EOF' problem, so your program will run for a long time. If you need to consider portability, you will take this into account. These are the professional grade issues that you need to handle as a C programmer. You can kludge your way to programs that work on your system for your data relatively easily and without taking all these nuances into account. But your program won't work on other people's systems.
