How can I use non-default delimiters when reading a text file with std::fstream?

Question

In my C++ code, I want to read from a text file (*.txt) and tokenize every entry. More specifically, I want to be able to read individual words from a file, such as "format", "stack", "Jason", "europe", etc.

I chose to use fstream for this task, but I do not know how to set its delimiters to the ones I want to use (space, \n, as well as hyphens and even apostrophes, as in "McDonald's"). I figured out that space and \n are default delimiters and that hyphens are not. I want to treat hyphens as delimiters too, so that when parsing the file I get the words in "blah blah xxx animal--cat" as simply "blah", "blah", "xxx", "animal", "cat".

That is, I want to be able to get two strings each from "stack-overflow", "you're", etc., while still keeping \n and space as delimiters.

Recommended answer

An istream treats "white space" as delimiters. It uses a locale to tell it which characters are white space. A locale, in turn, includes a ctype facet that classifies character types. Such a facet could look something like this:

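#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// A ctype facet that classifies '-' and '\'' as white space,
// in addition to the characters the classic "C" locale already treats as such.
class my_ctype : public std::ctype<char>
{
    mask my_table[table_size];
public:
    my_ctype(std::size_t refs = 0)
        : std::ctype<char>(&my_table[0], false, refs)
    {
        // Start from the classic classification table, then override two entries.
        std::copy_n(classic_table(), table_size, my_table);
        // Assigning space replaces the character's previous classification.
        my_table['-'] = space;
        my_table['\''] = space;
    }
};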

And a little test program to show it works:

int main() {
    std::istringstream input("This is some input from McDonald's and Burger-King.");
    std::locale x(std::locale::classic(), new my_ctype);
    input.imbue(x);

    std::copy(std::istream_iterator<std::string>(input),
        std::istream_iterator<std::string>(),
        std::ostream_iterator<std::string>(std::cout, "\n"));

    return 0;
}

Result:

This
is
some
input
from
McDonald
s
and
Burger
King.

istream_iterator<std::string> uses operator>> to read the individual strings from the stream, so if you call >> directly, you should get the same results. The parts you need to carry over are creating the locale and using imbue to make the stream use that locale.
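
Since the question is about files rather than string streams, here is a minimal sketch of the same idea applied to a std::ifstream. The names dash_quote_space and read_words are hypothetical, chosen for illustration; the facet is the same as the one above under a different name so that the block compiles on its own. The sketch imbues the stream before opening the file, on the assumption that it is safest to have the locale in place before any input is read.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical facet name: classifies '-' and '\'' as white space,
// on top of the classic "C" locale classification.
class dash_quote_space : public std::ctype<char>
{
    mask table_[table_size];
public:
    explicit dash_quote_space(std::size_t refs = 0)
        : std::ctype<char>(table_, false, refs)
    {
        std::copy_n(classic_table(), table_size, table_);
        table_[static_cast<unsigned char>('-')] = space;
        table_[static_cast<unsigned char>('\'')] = space;
    }
};

// Hypothetical helper: reads every word from the file at `path` using >>.
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> read_words(const std::string& path)
{
    std::ifstream file;
    // The locale takes ownership of the facet (refs == 0), so no delete is needed.
    file.imbue(std::locale(std::locale::classic(), new dash_quote_space));
    file.open(path);

    std::vector<std::string> words;
    std::string word;
    while (file >> word)
        words.push_back(word);
    return words;
}
```

Calling read_words on a file that contains "stack-overflow you're" should yield the four strings "stack", "overflow", "you" and "re". Consecutive delimiters, as in "animal--cat", do not produce empty strings, because >> skips any run of white space before each word.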
