C++ program for reading an unknown-size CSV file (filled only with floats) with a constant (but unknown) number of columns into an array

Question

I was wondering if someone could give me a hand. I'm trying to build a program that reads a big block of floats of unknown size from a CSV file. I already wrote this in MATLAB, but I want to compile and distribute it, so I'm moving to C++.

I'm just learning, and to start I'm trying to read in this:

7,5,1989
2,4,2312

from a text file.

Code so far:

// Read in CSV
//
// Alex Byasse

#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <sstream>
#include <stdio.h>
#include <stdlib.h>

int main() {

  // First pass: count the lines and the values on the first line.
  unsigned int number_of_lines = 0;
  FILE *infile = fopen("textread.csv", "r");
  if (!infile) {
    std::cerr << "Failed to open File\n";
    return 1;
  }
  int ch;
  int c = 0;        // commas seen so far
  int X = 0;        // values per line, taken from the first line
  bool tmp = true;  // true until the first line has been counted
  while (EOF != (ch = getc(infile))) {
    if (',' == ch) {
      ++c;
    }
    if ('\n' == ch) {
      if (tmp) {
        X = c + 1;  // n commas separate n + 1 values
        tmp = false;
      }
      ++number_of_lines;
    }
  }
  fclose(infile);

  std::ifstream file("textread.csv");

  if (!file) {
    std::cerr << "Failed to open File\n";
    return 1;
  }

  // Second pass: parse the values into a ROWS x COLS table.
  // The sizes are only known at run time, so use vectors, not a plain array.
  const int ROWS = number_of_lines;
  const int COLS = X;
  std::vector<std::vector<int> > array(ROWS, std::vector<int>(COLS));
  std::string line;
  int col = 0;
  int row = 0;
  while (std::getline(file, line)) {
    if (row == ROWS) {
      std::cerr << "Went over length of ROWS " << ROWS << "\n";
      break;
    }
    std::istringstream iss(line);
    std::string result;
    while (std::getline(iss, result, ',')) {
      if (col == COLS) {
        std::cerr << "Went over number of columns " << COLS << "\n";
        break;
      }
      array[row][col] = atoi(result.c_str());
      std::cout << result << std::endl;
      std::cout << "column " << col << std::endl;
      std::cout << "row " << row << std::endl;
      col = col + 1;
    }
    row = row + 1;
    col = 0;
  }
  return 0;
}

The MATLAB code I use is:

fid = fopen(twoDM,'r');

s = textscan(fid,'%s','Delimiter','\n');
s = s{1};
s_e3t = s(strncmp('E3T',s,3));
s_e4q = s(strncmp('E4Q',s,3));
s_nd = s(strncmp('ND',s,2));

[~,cell_num_t,node1_t,node2_t,node3_t,mat] = strread([s_e3t{:}],'%s %u %u %u %u %u');
node4_t = node1_t;
e3t = [node1_t,node2_t,node3_t,node4_t];
[~,cell_num_q,node1_q,node2_q,node3_q,node_4_q,~] = strread([s_e4q{:}],'%s %u %u %u %u %u %u');
e4q = [node1_q,node2_q,node3_q,node_4_q];
[~,~,node_X,node_Y,~] = strread([s_nd{:}],'%s %u %f %f %f');

cell_id = [cell_num_t;cell_num_q];
[~,i] = sort(cell_id,1,'ascend');

cell_node = [e3t;e4q];
cell_node = cell_node(i,:);

Any help appreciated. Alex

Recommended answer

I would, obviously, just use IOStreams. Reading a homogeneous array or arrays from a CSV file without having to bother with any quoting is fairly trivial:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

std::istream& comma(std::istream& in)
{
    if ((in >> std::ws).peek() != std::char_traits<char>::to_int_type(',')) {
        in.setstate(std::ios_base::failbit);
    }
    return in.ignore();
}

int main()
{
    std::vector<std::vector<double>> values;
    std::istringstream in;
    for (std::string line; std::getline(std::cin, line); )
    {
        in.clear();
        in.str(line);
        std::vector<double> tmp;
        for (double value; in >> value; in >> comma) {
            tmp.push_back(value);
        }
        values.push_back(tmp);
    }

    for (auto const& vec: values) {
        for (auto val: vec) {
            std::cout << val << ", ";
        }
        std::cout << "\n";
    }
}
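
To see how the comma manipulator behaves on well-formed and malformed lines, the per-line loop can be pulled out into a small function. This is only a sketch: parse_row is a hypothetical helper name that does not appear in the answer, and treating "reached the end of the line" as success is an assumption about what counts as a valid row.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Same manipulator as in the answer: skip whitespace, require a ',',
// and consume it. Anything else puts the stream into a failed state.
std::istream& comma(std::istream& in)
{
    if ((in >> std::ws).peek() != std::char_traits<char>::to_int_type(',')) {
        in.setstate(std::ios_base::failbit);
    }
    return in.ignore();
}

// Hypothetical helper (not part of the answer): parse one line into `row`.
// Returns true only if parsing stopped because the end of the line was
// reached, and false if it stopped at a character it could not handle.
bool parse_row(const std::string& line, std::vector<double>& row)
{
    std::istringstream in(line);
    row.clear();
    for (double value; in >> value; in >> comma) {
        row.push_back(value);
    }
    return in.eof();
}
```

With this helper, parse_row("7,5,1989", row) succeeds and fills row with three values, while a line such as "7,x,3" or "7 5" is reported as malformed because parsing stops before the end of the line.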

Given the simple structure of the file, the logic can actually be simplified: instead of reading the values individually, each line can be viewed as a sequence of values, provided the separators are skipped automatically. Since formatted input skips whitespace but not commas, the commas are replaced by spaces before the string stream for the line is created. The corresponding code becomes:

#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::vector<double> > values;
    std::ifstream fin("textread.csv");
    for (std::string line; std::getline(fin, line); )
    {
        std::replace(line.begin(), line.end(), ',', ' ');
        std::istringstream in(line);
        values.push_back(
            std::vector<double>(std::istream_iterator<double>(in),
                                std::istream_iterator<double>()));
    }

    for (std::vector<std::vector<double> >::const_iterator
             it(values.begin()), end(values.end()); it != end; ++it) {
        std::copy(it->begin(), it->end(),
                  std::ostream_iterator<double>(std::cout, ", "));
        std::cout << "\n";
    }
}

Here is what happens:


  1. The destination, values, is defined as a vector of vectors of double. Nothing guarantees that the rows are all the same size, but this is trivial to check once the file has been read.
  2. A std::ifstream is defined and initialized with the file. It may be worth checking the stream after construction to see whether the file could be opened for reading (if (!fin) { std::cout << "failed to open...\n"; }).
  3. The file is processed one line at a time. Each line is read into a std::string using std::getline(). When std::getline() fails, there was no further line to read and the conversion ends.
  4. Once the line is read, all commas are replaced by spaces.
  5. A string stream for reading the line is constructed from the modified line. The first version of the code instead reused a std::istringstream declared outside the loop, to avoid the cost of constructing a stream for every line. Since the stream goes into a failed state once a line has been consumed, it first needs to be in.clear()ed before its content is set with in.str(line).
  6. The individual values are iterated using a std::istream_iterator<double>, which just reads a value from the stream it is constructed with. The iterator constructed from in is the start of the sequence and the default-constructed iterator is the end of the sequence.
  7. The sequence of values produced by the iterators is used to immediately construct a temporary std::vector<double> representing a row.
  8. The temporary vector is pushed to the end of values.
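
The row-size check mentioned in step 1 can be sketched as follows. The Matrix struct and the names read_rows and to_matrix are hypothetical and not part of the answer; the sketch assumes that the goal is a single row-major block with a fixed column count, as the question's title suggests, and that a file with rows of differing length should be rejected.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: a row-major matrix with a fixed column count.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // element (r, c) is data[r * cols + c]
};

// Reads comma-separated doubles, one row per line, using the same
// replace-commas-by-spaces approach as the answer.
std::vector<std::vector<double> > read_rows(std::istream& in)
{
    std::vector<std::vector<double> > values;
    for (std::string line; std::getline(in, line); ) {
        std::replace(line.begin(), line.end(), ',', ' ');
        std::istringstream row(line);
        values.push_back(std::vector<double>(
            std::istream_iterator<double>(row),
            std::istream_iterator<double>()));
    }
    return values;
}

// Copies the rows into `out`. Returns false as soon as a row has a
// different length than the first one.
bool to_matrix(const std::vector<std::vector<double> >& values, Matrix& out)
{
    out.rows = values.size();
    out.cols = values.empty() ? 0 : values.front().size();
    out.data.clear();
    for (std::size_t r = 0; r < values.size(); ++r) {
        if (values[r].size() != out.cols) {
            return false;
        }
        out.data.insert(out.data.end(), values[r].begin(), values[r].end());
    }
    return true;
}
```

Because read_rows takes a std::istream&, it works with a std::ifstream opened on textread.csv (after checking if (!fin), as in step 2) as well as with a std::istringstream for testing.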

Everything after that simply prints the content of the resulting matrix. The first version does so using C++11 features (range-based for and variables with automatically deduced type); the second uses explicit iterators with std::copy and std::ostream_iterator.
