How to parse quotation marks and commas in C++

Question

I have a huge file to parse. Previously, the values were separated by either a space or a comma, and I used sscanf(string, "%lf %lf ", &aa, &bb); to get the data into my program.

But now the data format has changed to "122635.670399999","209705.752799999", with both commas and quotation marks, and I have no idea how to deal with it. My previous code was something I found online, and I have had a really hard time finding proper documentation for this kind of problem. It would be great if you could recommend some. Thanks.

Recommended answer

Rather than read a string, remove the commas and quotes from it, and finally convert the data to numbers, I'd probably create a locale object that classifies commas and quotes as white space, imbue the stream with that locale, and read the numbers without further ado.

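#include <cstddef>
#include <iostream>
#include <iterator>
#include <locale>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// here's our ctype facet:
class my_ctype : public std::ctype<char> {
public:
    // static, so it can safely be called from the constructor's
    // initializer list before the base class has been constructed
    static mask const *get_table() {
        static std::vector<std::ctype<char>::mask>
            table(classic_table(), classic_table()+table_size);

        // tell it to classify quotes and commas as "space":
        table['"'] = (mask)space;
        table[','] = (mask)space;
        return &table[0];
    }
    my_ctype(std::size_t refs=0) : std::ctype<char>(get_table(), false, refs) { }
};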

Using that, we can read the data something like this:

int main() { 
    // Test input from question:
    std::string input("\"122635.670399999\",\"209705.752799999\"");

    // Open the "file" of the input (from the string, for test purposes).
    std::istringstream infile(input);

    // Tell the stream to use the locale we defined above:
    infile.imbue(std::locale(std::locale(), new my_ctype));

    // Read the numbers into a vector of doubles:
    std::vector<double> numbers{std::istream_iterator<double>(infile),
                               std::istream_iterator<double>()};

    // Print out the sum of the numbers to show we read them:
    std::cout << std::accumulate(numbers.begin(), numbers.end(), 0.0);
}

Note that once we've imbued the stream with a locale using our ctype facet, we can just read numbers as if the commas and quotes didn't exist at all. Since the ctype facet classifies them as white-space, they're completely ignored beyond acting as separators between other stuff.
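
If you want to convince yourself that the facet really does reclassify those two characters, something like the following should do it. This is only a sketch: csv_space_ctype is a renamed copy of the facet above so the snippet stands alone, and is_separator is a made-up helper for illustration, not part of the standard library.

```cpp
#include <cstddef>
#include <locale>
#include <vector>

// Renamed copy of the facet above (the name is an assumption for this sketch).
class csv_space_ctype : public std::ctype<char> {
public:
    static const mask* make_table() {
        // Start from the classic "C" table, then mark '"' and ',' as space.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>('"')] = static_cast<mask>(space);
        table[static_cast<unsigned char>(',')] = static_cast<mask>(space);
        return table.data();
    }
    explicit csv_space_ctype(std::size_t refs = 0)
        : std::ctype<char>(make_table(), false, refs) {}
};

// Hypothetical helper: asks a locale carrying the facet whether `c`
// counts as white space.
bool is_separator(char c) {
    static const std::locale loc(std::locale::classic(), new csv_space_ctype);
    return std::isspace(c, loc);
}
```

With this, is_separator(',') and is_separator('"') should both be true, while digits and the decimal point are still ordinary characters, which is why operator>> stops at the quote and skips it on the next read.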

I'm pointing this out primarily to make clear that there's no magic in any of the processing after that. There's nothing special about using istream_iterator instead of (for example) double value; infile >> value; if you prefer to do that. You can read the numbers any of the ways you'd normally read numbers that were separated by white space -- because as far as the stream cares, that's exactly what you have.
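
For example, the plain operator>> version might look roughly like this. Again a sketch under assumptions: quote_comma_space is a renamed copy of the facet so the snippet compiles on its own, and read_numbers / read_numbers_from are names I made up for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Renamed copy of the facet above (the name is an assumption for this sketch).
class quote_comma_space : public std::ctype<char> {
public:
    static const mask* make_table() {
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>('"')] = static_cast<mask>(space);
        table[static_cast<unsigned char>(',')] = static_cast<mask>(space);
        return table.data();
    }
    explicit quote_comma_space(std::size_t refs = 0)
        : std::ctype<char>(make_table(), false, refs) {}
};

// Hypothetical helper: imbues `in` with the facet, then reads doubles
// with ordinary operator>> until extraction fails (normally at end of input).
std::vector<double> read_numbers(std::istream& in) {
    in.imbue(std::locale(in.getloc(), new quote_comma_space));
    std::vector<double> values;
    double value;
    while (in >> value) {
        values.push_back(value);
    }
    return values;
}

// Convenience wrapper for trying it out on a string instead of a file.
std::vector<double> read_numbers_from(const std::string& text) {
    std::istringstream in(text);
    return read_numbers(in);
}
```

For the real file you would pass a std::ifstream to read_numbers instead. Note that the loop also stops at the first token that isn't a number, so malformed input ends the read early rather than being skipped.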
