How to read CSV file splitting by commas except if it's part of a field
Problem description
I have the following C# code, which reads a CSV file; the goal is to save it to a SQL table:
    StreamReader sr = new StreamReader(tbCSVFileLocation.Text.ToString());
    string line = sr.ReadLine();
    string[] value = line.Split(',');
    DataTable dt = new DataTable();
    DataRow row;
    foreach (string dc in value)
    {
        dt.Columns.Add(new DataColumn(dc));
    }
    while (!sr.EndOfStream)
    {
        value = sr.ReadLine().Split(',');
        if (value.Length == dt.Columns.Count)
        {
            row = dt.NewRow();
            row.ItemArray = value;
            dt.Rows.Add(row);
        }
    }
The issue I am having is I don't know where the data is coming from in my table.
Here is a sample of the CSV file:
    Name,Address,License Number,License Type,Year of Birth,Effective Date,Action,Misconduct Description,Date Updated
    "563 Grand Medical, P.C.","563 Grand Street
    Brooklyn, NY 11211",196275,,,09/29/2010,Revocation of certificate of incorporation.,"The corporation admitted guilt to the charge of ordering excessive tests, treatment, or use of treatment facilities not warranted by the condition of a patient.",09/29/2010
    "Aaron, Joseph","2803 North 700 East
    Provo, Utah 84604",072800,MD,1927,01/13/1999,License Surrender,"This action modifies the penalty previously imposed by Order# 93-40 on March 31, 1993, where the Hearing Committee sustained the charge that the physician was disciplined by the Utah State Medical Board, and ordered that if he intends to engage in practice in NY State, a two-year period of probation shall be imposed.",
    "Aarons, Mark Gold","P.O.Box 845
    Southern Pines, North Carolina 28388",161530,MD,1958,12/13/2005,"License limited until the physician's North Carolina medical license is fully restored without any conditions.The physician must also comply with the terms imposed on July 26, 2005 by the North Carolina State Medical Board. The physician has completed the monitoring terms.",The physician did not contest the charge of having been disciplined by the North Carolina State Medical Board for his addiction to drugs.,12/06/2005
When I look at my SQL table, this is what is shown:
    Name Address License Number License Type Year of Birth Effective Date Action Misconduct Description Date Updated
    Orlando FL 32836" 173309 MD 1938 2/29/2012 License surrender The physician did not contest the charge of having had his DEA registration for Florida revoked by the U.S. Drug Enforcement Administration for improperly prescribing controlled substances. 2/22/2012
    Miami Florida 33156" 119545 MD 1945 10/10/2002 Censure and reprimand The physician did not contest the charge of having been disciplined by the Florida State Board of Medicine for giving a patient excessive doses of radiation. 10/10/2002
    Brooklyn New York 11229" 192310 11/6/2003 Annulment of certificate of incorporation pursuant to Section 230-a of the New York State Public Health Law and Section 1503(d) of the New York State Business Corporation Law The corporation admitted guilt to the charge of willfully failing to comply with Section 1503 of the Business Corporation Law in violation of New York State Education Law Section 6530(12). 10/31/2003
As you can see there is no ORLANDO for the first column for the first row. Not sure what is going on.
Please help me resolve it.
Solution
Here is some code that should help get you started. Also use the debugger to step through the code.
Declare a protected static DataTable csvData and assign it null initially:
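    using System;
    using System.Data;
    using Microsoft.VisualBasic.FileIO; // TextFieldParser (reference Microsoft.VisualBasic)

    // Declared up top in your class.
    protected static DataTable csvData = null;
    // The original answer does not define these two; the values below are assumptions.
    private const string tableDelim = ",";
    private const string defaultTableName = "CsvData";

    // Usage: csvData = GetDataTabletFromCSVFile(fileName);

    // Converts the CSV file into a DataTable.
    private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
    {
        csvData = new DataTable(defaultTableName);
        try
        {
            using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
            {
                csvReader.SetDelimiters(new string[] { tableDelim });
                csvReader.HasFieldsEnclosedInQuotes = true;

                // The first record holds the column names.
                string[] colFields = csvReader.ReadFields();
                foreach (string column in colFields)
                {
                    DataColumn datecolumn = new DataColumn(column);
                    datecolumn.AllowDBNull = true;
                    csvData.Columns.Add(datecolumn);
                }

                while (!csvReader.EndOfData)
                {
                    string[] fieldData = csvReader.ReadFields();

                    // Skip rows that have any CSV header information or blank rows in them.
                    // This check sits outside the per-field loop so that "continue"
                    // skips the whole row rather than only one loop iteration.
                    if (string.IsNullOrEmpty(fieldData[0]) || fieldData[0].Contains("Disclaimer"))
                    {
                        continue;
                    }

                    // Empty values are kept as empty strings; assign null here instead
                    // if you want them stored as null.
                    for (int i = 0; i < fieldData.Length; i++)
                    {
                        if (fieldData[i] == string.Empty)
                        {
                            fieldData[i] = string.Empty; // fieldData[i] = null
                        }
                    }

                    csvData.Rows.Add(fieldData);
                }
            }
        }
        catch (Exception ex)
        {
            // Left empty in the original answer; log or rethrow ex here,
            // otherwise parse errors are silently swallowed.
        }
        return csvData;
    }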
The fieldData[0].Contains("Disclaimer") check is specific to a column in my .csv file. The logic is straightforward, so read through it and make changes to fit your .csv file as needed.
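To see the quote handling in isolation, here is a minimal sketch that runs TextFieldParser over an in-memory string instead of a file. The class name QuotedCsvSketch and the method name ReadRecords are made up for this illustration, and the sample is a cut-down version of the question's data; it assumes the project can reference Microsoft.VisualBasic.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper for illustration; not part of the original answer.
public static class QuotedCsvSketch
{
    // Reads every record from CSV text, treating commas and line breaks
    // inside double-quoted fields as part of the field.
    public static List<string[]> ReadRecords(string csvText)
    {
        var records = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                records.Add(parser.ReadFields());
            }
        }
        return records;
    }
}
```

Calling QuotedCsvSketch.ReadRecords on a header line followed by a record such as "563 Grand Medical, P.C.","563 Grand Street(line break)Brooklyn, NY 11211",196275 should give two records of three fields each, with the name and the address each kept whole. That is the behavior the ReadLine/Split loop in the question cannot provide.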
If you want to try something easier, and then parse out the \" characters you will see when you use the Quick Watch window, try this. Note that it splits on every comma, including commas inside quoted fields:
    var lines = File.ReadLines("FilePath of Some .csv File").Select(a => a.Split(',')).ToArray();
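For comparison, a plain Split(',') cannot tell a delimiter from a comma inside quotes. The sketch below is a hand-rolled alternative for a single line, not code from the answer; CsvLineSketch and SplitLine are hypothetical names. It assumes fields are quoted with double quotes and that a doubled quote ("") stands for a literal quote. It does not handle quoted fields that span several lines, which the sample file contains, so TextFieldParser remains the safer choice for that data.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only.
public static class CsvLineSketch
{
    // Splits one CSV line on commas that are outside double quotes.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote: literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);   // commas here belong to the field
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // delimiter outside quotes
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing comma
        return fields;
    }
}
```

On the line "Aaron, Joseph",072800,MD a plain Split(',') yields four pieces because the name is cut in two, while SplitLine yields three fields with the name intact.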