How to work with large data and apply some validations


Problem description

I have a CSV file of roughly 200 to 500 MB. I need to apply some validation to it and then insert or update the rows in a MySQL database.

My plan is to load the CSV file into a dataset first and work on it there.

Is that a good approach, or is there a better one?

I need all of the work done in memory before I pass the data to the database.

Recommended answer

I would use a StreamReader:

http://msdn.microsoft.com/en-us/library/system.io.streamreader.aspx

Then validate the rows as you read the file, and build your INSERT statements from the rows that pass:

http://www.w3schools.com/sql/sql_insert.asp

Finally, insert them into your database.
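The read, validate, insert loop described above can be sketched as follows. This is a minimal illustration in Python rather than .NET, and it makes several assumptions that are not in the original post: the three-column layout (`id`, `name`, `amount`), the validation rule, the `items` table, and the batch size are all hypothetical, and the standard-library `sqlite3` module stands in for MySQL so that the example runs on its own. With MySQL, the same shape would apply using a MySQL driver and an upsert such as `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import io
import sqlite3

BATCH_SIZE = 1000  # hypothetical; tune for the real workload


def validate(fields):
    """Hypothetical rule: exactly 3 fields, integer id, non-empty name, numeric amount.

    Returns a cleaned tuple, or None if the row is invalid.
    """
    if len(fields) != 3:
        return None
    id_text, name, amount_text = (f.strip() for f in fields)
    if not name:
        return None
    try:
        return int(id_text), name, float(amount_text)
    except ValueError:
        return None


def load(lines, conn):
    """Stream lines one at a time, validate each, and write valid rows in batches.

    Only one batch is held in memory at a time, never the whole file.
    Returns the 1-based line numbers of rejected rows.
    """
    # Parameterized statement; INSERT OR REPLACE is SQLite's insert-or-update.
    sql = "INSERT OR REPLACE INTO items (id, name, amount) VALUES (?, ?, ?)"
    batch, rejected = [], []
    for line_no, line in enumerate(lines, start=1):
        # Naive split: assumes no quoted fields containing commas.
        row = validate(line.rstrip("\r\n").split(","))
        if row is None:
            rejected.append(line_no)
            continue
        batch.append(row)
        if len(batch) >= BATCH_SIZE:
            conn.executemany(sql, batch)
            batch.clear()
    if batch:  # flush the final partial batch
        conn.executemany(sql, batch)
    conn.commit()
    return rejected


# Small in-memory sample standing in for the large file (no header row assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, amount REAL)")
sample = io.StringIO("1,apple,2.5\n2,,3.0\nx,pear,1.0\n1,apple,4.0\n3,plum,7\n")
rejected = load(sample, conn)
rows = conn.execute("SELECT id, name, amount FROM items ORDER BY id").fetchall()
```

For a real file, `lines` would be an open file object, which Python also iterates one line at a time. Using a parameterized statement instead of concatenating values into the SQL text avoids quoting problems and SQL injection.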


First of all, if this is an occasional task, 0.5 GB is not that much for today's systems (especially a server). But I assume those 500 MB do not come from a single row, so you can simply parse the file row by row and validate as needed. Or do you also need some validation between rows?

Try using a ready-made CSV reader such as A Fast CSV Reader, or this one: http://blogs.msdn.com/b/jmstall/archive/2012/03/24/opensource-csv-reader-on-nuget.aspx
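If validation between rows is needed, it can often still be done while streaming, by keeping only a small amount of state rather than the whole file. The sketch below is a Python illustration, not the .NET readers linked above: it uses the standard-library `csv` module as the ready-made reader and checks one hypothetical cross-row rule, that the `id` column is unique. The column names and sample data are assumptions made for the example.

```python
import csv
import io


def find_duplicate_ids(text_stream, key_column="id"):
    """Stream CSV records and report repeated keys.

    Only the set of keys seen so far is kept in memory, not the rows themselves.
    Returns a list of (line_number, key) for each repeated key.
    """
    reader = csv.DictReader(text_stream)  # first line is treated as the header
    seen = set()
    duplicates = []
    for record in reader:
        key = record[key_column]
        if key in seen:
            duplicates.append((reader.line_num, key))
        else:
            seen.add(key)
    return duplicates


# A proper CSV reader handles quoted fields that contain commas,
# which a plain split(",") would break apart.
parsed = next(csv.reader(io.StringIO('1,"Smith, John"\n')))

sample = io.StringIO('id,name\n1,"Smith, John"\n2,Lee\n1,Other\n')
duplicates = find_duplicate_ids(sample)
```

Rules that compare each row against every other row in full would need more state than this, and at that point loading into a staging table in the database and validating there with SQL may be the simpler option.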

