How can I correctly parse a text file delimited by white space?

Question

Below is my sample text file:

{

{

This is my schema file:

[Sample File.txt]
ColNameHeader=True
Format=TabDelimited
CharacterSet=ANSI

Here is the code I have written so far to read the sample file above. The rows read from the text file are supposed to be returned for display in a DataGridView control. The problem is that the data comes back as a single column, whereas I want the white space to act as the column delimiter. I have tried different delimiter characters without success.

public DataSet LoadCSV(int numberOfRows)
{
    DataSet ds = new DataSet();

    // Creates and opens an ODBC connection
    string strConnString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + this.dirCSV.Trim() + ";Extensions=asc,csv,tab,txt;Persist Security Info=False";

    string sql_select;
    OdbcConnection conn;
    conn = new OdbcConnection(strConnString.Trim());
    conn.Open();

    // Creates the select command text
    if (numberOfRows == -1)
    {
        sql_select = "select * from [" + this.FileNevCSV.Trim() + "]";
    }
    else
    {
        sql_select = "select top " + numberOfRows + " * from [" + this.FileNevCSV.Trim() + "]";
    }

    // Creates the data adapter
    OdbcDataAdapter obj_oledb_da = new OdbcDataAdapter(sql_select, conn);

    // Fills the dataset with the records from the CSV file
    obj_oledb_da.Fill(ds, "csv");

    // Closes the connection
    conn.Close();

    return ds;
}

I set the DataGridView's data source like this:

// Loads the first 500 rows from the CSV file
this.dataGridView_preView.DataSource = LoadCSV(500);
this.dataGridView_preView.DataMember = "csv";

This is what I get in the DataGridView: a single column, although I expect the data to be returned as 7 columns.

Also, I have no idea where the F2 and F3 columns are coming from.

Recommended answer

I would probably do this a different way. I would use a StreamReader to read the file line by line, break each line up into object properties, and store the objects in a list. Then you bind the list to the DataGridView's DataSource. I demonstrate two quick ways to do this.

If the file is tab separated, as it seems to be, split each line into an array and assign each element to a property, like so:

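using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;

public partial class Form1 : Form
{
    private void Form1_Load(object sender, EventArgs e)
    {
        var rows = new List<Row>();

        // The using block disposes the reader even if parsing a line throws.
        using (var sr = new StreamReader(@"C:\so_test.txt"))
        {
            while (!sr.EndOfStream)
            {
                string s = sr.ReadLine();

                // Skip blank lines.
                if (!String.IsNullOrWhiteSpace(s))
                {
                    rows.Add(new Row(s));
                }
            }
        }

        // Each public property of Row becomes a column in the grid.
        dataGridView1.DataSource = rows;
    }
}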

public class Row
{
    public double Number1 { get; set; }
    public double Number2 { get; set; }
    public double Number3 { get; set; }
    public double Number4 { get; set; }
    public double Number5 { get; set; }
    public double Number6 { get; set; }
    public double Number7 { get; set; }
    public string Date1 { get; set; }

    public Row(string str)
    {
        string[] separator = { "\t" };
        var arr = str.Split(separator, StringSplitOptions.None);
        Number1 = Convert.ToDouble(arr[0]);
        Number2 = Convert.ToDouble(arr[1]);
        Number3 = Convert.ToDouble(arr[2]);
        Number4 = Convert.ToDouble(arr[3]);
        Number5 = Convert.ToDouble(arr[4]);
        Number6 = Convert.ToDouble(arr[5]);
        Number7 = Convert.ToDouble(arr[6]);
        Date1 = arr[7];
    }
}
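
The question asks about white space in general, not only tabs. If the columns turn out to be separated by runs of spaces, or by a mix of spaces and tabs, splitting on "\t" alone would leave the whole line in one field. Here is a minimal sketch for that case, assuming that only the last field (the date) may itself contain spaces. The names WhitespaceSplitter and SplitFields are hypothetical and not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceSplitter
{
    // One or more white-space characters (spaces, tabs, ...) count as a single delimiter.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    // Splits a line into at most maxFields fields. The last field keeps the
    // remainder of the line, so a trailing value that contains spaces stays whole.
    public static string[] SplitFields(string line, int maxFields)
    {
        // Trim first so leading or trailing white space does not create empty fields.
        return Whitespace.Split(line.Trim(), maxFields);
    }
}
```

In the Row constructor this could stand in for the Split call, for example `var arr = WhitespaceSplitter.SplitFields(str, 8);`, with 8 matching the seven numbers plus the date.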

2 - Hard start points and lengths

If the data is tab separated but also conforms to strict start and end points for each column, you could declare the start point and length of each column as constants and extract the values with Substring. This only needs a change to the constructor of your Row class, like this. I have left out the constants for brevity and just hardcoded the values.

    public Row(string str)
    {
        Number1 = Convert.ToDouble(str.Substring(4, 6));
        Number2 = Convert.ToDouble(str.Substring(16, 6));
        Number3 = Convert.ToDouble(str.Substring(28, 7));
        Number4 = Convert.ToDouble(str.Substring(40, 7));
        Number5 = Convert.ToDouble(str.Substring(52, 6));
        Number6 = Convert.ToDouble(str.Substring(64, 6));
        Number7 = Convert.ToDouble(str.Substring(76, 6));
        Date1 = str.Substring(88, 24);
    }
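
For reference, the constants mentioned above could look roughly like the following. This is only a sketch: the class FixedWidthColumns and its members are hypothetical names, only two columns are shown, and the offsets are copied from the hardcoded Substring calls, so they are assumptions about the file layout. The helper also tolerates lines shorter than the full layout and parses numbers with the invariant culture, which assumes '.' is the decimal separator in the file.

```csharp
using System;
using System.Globalization;

public static class FixedWidthColumns
{
    // Start offsets and lengths copied from the hardcoded example;
    // adjust them to the real file layout.
    public const int Number1Start = 4;
    public const int Number1Length = 6;
    public const int Date1Start = 88;
    public const int Date1Length = 24;

    // Returns the trimmed text of one column. A line that ends before the
    // column is complete yields whatever is available instead of throwing.
    public static string Slice(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }

    // Parses one column as a double, independent of the machine's regional settings.
    public static double ParseDouble(string line, int start, int length)
    {
        return double.Parse(Slice(line, start, length), CultureInfo.InvariantCulture);
    }
}
```

The constructor line for the first column would then read `Number1 = FixedWidthColumns.ParseDouble(str, FixedWidthColumns.Number1Start, FixedWidthColumns.Number1Length);`, and the remaining columns follow the same pattern.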
