How to read CSV file splitting by commas except if it's part of a field


Problem description

I have the following C# code, which reads a CSV file. The goal is to save its contents to a SQL table:

StreamReader sr = new StreamReader(tbCSVFileLocation.Text.ToString());
string line = sr.ReadLine();
string[] value = line.Split(',');
DataTable dt = new DataTable();
DataRow row;

foreach (string dc in value)
{
  dt.Columns.Add(new DataColumn(dc));
}

while (!sr.EndOfStream)
{
  value = sr.ReadLine().Split(',');
  if (value.Length == dt.Columns.Count)
  {
    row = dt.NewRow();
    row.ItemArray = value;
    dt.Rows.Add(row);
  }
}

The issue is that I don't know where the data in my table is coming from.

Here is a sample of the CSV file:

Name,Address,License Number,License Type,Year of Birth,Effective Date,Action,Misconduct Description,Date Updated
"563 Grand Medical, P.C.","563 Grand Street
Brooklyn, NY 11211",196275,,,09/29/2010,Revocation of certificate of incorporation.,"The corporation admitted guilt to the charge of ordering excessive tests, treatment, or use of treatment facilities not warranted by the condition of a patient.",09/29/2010
"Aaron, Joseph","2803 North 700 East
Provo, Utah 84604",072800,MD,1927,01/13/1999,License Surrender,"This action modifies the penalty previously imposed by Order# 93-40 on March 31, 1993, where the Hearing Committee sustained the charge that the physician was disciplined by the Utah State Medical Board, and ordered that if he intends to engage in practice in NY State, a two-year period of probation shall be imposed.",
"Aarons, Mark Gold","P.O.Box 845
Southern Pines, North Carolina 28388",161530,MD,1958,12/13/2005,"License limited until the physician's North Carolina medical license is fully restored without any conditions.The physician must also comply with the terms imposed on July 26, 2005 by the North Carolina State Medical Board. The physician has completed the monitoring terms.",The physician did not contest the charge of having been disciplined by the North Carolina State Medical Board for his addiction to drugs.,12/06/2005

When I look at my SQL table, this is what is shown:

Name    Address License Number  License Type    Year of Birth   Effective Date  Action  Misconduct Description  Date Updated                    
Orlando  FL 32836"  173309  MD  1938    2/29/2012   License surrender   The physician did not contest the charge of having had his DEA registration for Florida revoked by the U.S. Drug Enforcement Administration for improperly prescribing controlled substances.   2/22/2012                   
Miami    Florida 33156" 119545  MD  1945    10/10/2002  Censure and reprimand   The physician did not contest the charge of having been disciplined by the Florida State Board of Medicine for giving a patient excessive doses of radiation.   10/10/2002                  
Brooklyn     New York 11229"    192310          11/6/2003   Annulment of certificate of incorporation pursuant to Section 230-a of the New York State Public Health Law and Section 1503(d) of the New York State Business Corporation Law  The corporation admitted guilt to the charge of willfully failing to comply with Section 1503 of the Business Corporation Law in violation of New York State Education Law Section 6530(12).    10/31/2003                  

As you can see there is no ORLANDO for the first column for the first row. Not sure what is going on.

Please help me resolve it.

Solution

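Here is some code that should help you get started. It uses TextFieldParser (from the Microsoft.VisualBasic.FileIO namespace, so add a reference to the Microsoft.VisualBasic assembly), which understands quoted fields: commas and line breaks inside quotes stay within one field. Also use the debugger to step through the code.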

Declare a protected static DataTable csvData and assign it null initially

// Requires: using System; using System.Data; using Microsoft.VisualBasic.FileIO;
protected static DataTable csvData = null; // declared at the top of your class
csvData = GetDataTabletFromCSVFile(fileName); // converts the CSV file into a DataTable

private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
    // defaultTableName is a string defined elsewhere in your class
    csvData = new DataTable(defaultTableName);
    try
    {
        using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.SetDelimiters(new string[]
            {
                tableDelim // the delimiter string defined in your class, e.g. ","
            });
            csvReader.HasFieldsEnclosedInQuotes = true;

            // The first record supplies the column names
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn datecolumn = new DataColumn(column);
                datecolumn.AllowDBNull = true;
                csvData.Columns.Add(datecolumn);
            }

            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();

                // Skip rows that have any csv header information or blank rows in them.
                // This check sits outside the for loop so that "continue" skips the
                // whole row instead of only advancing the loop over the fields.
                if (string.IsNullOrEmpty(fieldData[0]) || fieldData[0].Contains("Disclaimer"))
                {
                    continue;
                }

                // Empty values are kept as empty strings; assign null here instead
                // if you want them stored as DBNull.
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == string.Empty)
                    {
                        fieldData[i] = string.Empty; //fieldData[i] = null
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
    }
    catch (Exception ex)
    {
        // Errors are swallowed here; log or rethrow ex in real code
    }
    return csvData;
}

The fieldData[0].Contains("Disclaimer") check is specific to a column in my .csv file. The logic is straightforward, so read through it and change it to fit your own .csv file as needed.
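To see how the parser treats quoted fields without touching a real file, here is a minimal sketch that feeds it an in-memory string through a StringReader. The class name QuotedCsvDemo, the helper ParseAll, and the shortened sample data are assumptions made for illustration; they are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // needs a reference to the Microsoft.VisualBasic assembly

public static class QuotedCsvDemo // hypothetical name, for illustration only
{
    // Reads every record from CSV text and returns the fields of each record.
    public static List<string[]> ParseAll(string csvText)
    {
        var records = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                records.Add(parser.ReadFields());
            }
        }
        return records;
    }

    public static void Main()
    {
        // A header plus one record modeled on the question's sample: the name
        // contains a comma, and the address contains a comma and a line break,
        // all inside double quotes.
        string csv =
            "Name,Address,License Number\n" +
            "\"Aaron, Joseph\",\"2803 North 700 East\nProvo, Utah 84604\",072800\n";

        foreach (string[] fields in ParseAll(csv))
        {
            Console.WriteLine(fields.Length + " fields, first: " + fields[0]);
        }
    }
}
```

Each record should come back with three fields, with the quoted name and address intact. That is the behavior the question's ReadLine/Split loop cannot provide.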

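If you want to try something simpler, and then parse out the "\" characters that you will see in the Quick Watch window, try the line below. Be aware that a plain Split(',') also splits on commas inside quoted fields, just like the code in the question.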

var lines = File.ReadLines("FilePath of Some .csv File").Select(a => a.Split(',')).ToArray(); 
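If you would rather stay with a line-by-line approach, the split itself has to become quote-aware. The following is a rough sketch of one way to do that. CsvLineSplitter and SplitCsvLine are hypothetical names, not library APIs, and the helper only handles a single line, so it does not cope with line breaks inside quoted fields the way TextFieldParser does.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLineSplitter // hypothetical helper, not part of the original answer
{
    // Splits one line on commas that are outside double quotes.
    // A doubled quote ("") inside a quoted field becomes a literal quote.
    public static List<string> SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"'); // escaped quote inside a quoted field
                        i++;
                    }
                    else
                    {
                        inQuotes = false; // closing quote
                    }
                }
                else
                {
                    current.Append(c); // commas in here are ordinary characters
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString()); // last field on the line
        return fields;
    }

    public static void Main()
    {
        string line = "\"Aaron, Joseph\",072800,MD";
        Console.WriteLine(line.Split(',').Length);   // plain split: 4 pieces
        Console.WriteLine(SplitCsvLine(line).Count); // quote-aware split: 3 fields
    }
}
```

Because the sample file has line breaks inside its quoted address fields, this helper alone would still mis-read it when combined with ReadLine. It is shown only to make the difference from a plain Split(',') concrete.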
