Parse a file with multiple common delimiters in C

Question

I'm working on a basic command line music library in C that lets you open a file via command line, and add information like an artist, song title, and year published. Once it exits, it then writes that information back into the same file.

The problem I'm having is trying to find a solution to correctly parse the text file.

For example, the input file will look something like this:

Title: Heirloom, Artist: Basenji, Year Published: 2014
Title: With Me, Artist: Cashmere Cat, Year Published: 2014

My assignment specifies (against common practice) that we store the information from one line in a struct Song, which looks like this:

struct Song {
    char title[250];
    char artist[250];
    int year_published;
};

Each Song is stored in an array of type struct Song, called music_lib[].

I know how to separate each line into one specific struct Song by doing:

while(fscanf(input_file, "%s %s %ld", *temp_title, *temp_artist, *temp_year) != EOF)
    copy_song_to_music_library(temp_title, temp_artist, temp_year);

What I don't know how to do is how to properly parse the text file so that when I have a known format:

Title: Heirloom, Artist: Basenji, Year Published: 2014

For my title variable, I get "Heirloom" (and Title: is stripped out), for my artist variable, I get "Basenji" (with artist: stripped out), and for my year variable I get 2014 (with year published: stripped out).

Is there an easy way to do this?

Recommended answer

You need to change

while(fscanf(input_file, "%s %s %ld", *temp_title, *temp_artist, *temp_year) != EOF)

to

while(fscanf(input_file, "Title: %s, Artist: %s, Year Published: %ld", *temp_title, *temp_artist, *temp_year) != EOF)

Also, you need to check the return value of fscanf() to make sure the read succeeded.

From the man page of fscanf():

. . . return the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure.

Some related reference:

The signature of this (family of) function(s) is

int fscanf(FILE *stream, const char *format, ...);

where const char *format is described as

The format string consists of a sequence of directives which describe how to process the sequence of input characters.

and the expected format for format is [emphasis mine]

A directive is one of the following:

• A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.

• An ordinary character (i.e., one other than white space or '%'). This character must exactly match the next character of input.

• A conversion specification, which commences with a '%' (percent) character. A sequence of characters from the input is converted according to this specification, and the result is placed in the corresponding pointer argument. If the next item of input does not match the conversion specification, the conversion fails-this is a matching failure.

Note: to make this more general, I'd recommend using fgets() to read each line, then strtok() to tokenize it and use the tokens.
