What is the fastest way to load an XML file into MySQL using C#?


Question

What is the fastest way to dump a large (> 1 GB) XML file into a MySQL database?

The data in question is the StackOverflow Creative Commons Data Dump.

This will be used in an offline StackOverflow viewer I am building, since I want to do some studying and coding in places where I will not have internet access.

I would like to release this to the rest of the StackOverflow membership for their own use when the project is finished.

Originally, I was reading from the XML and writing to the DB one record at a time. This took about 10 hours to run on my machine. The hacktastic code I'm using now throws 500 records into an array, then creates an insertion query to load all 500 at once (e.g. "INSERT INTO posts VALUES (...), (...), (...) ... ;"). While this is faster, it still takes hours to run. Clearly this is not the best way to go about it, so I'm hoping the big brains on this site will know of a better way.

  • I am building the application using C# as a desktop application (i.e. WinForms).
  • I am using MySQL 5.1 as my database. This means that features such as "LOAD XML INFILE filename.xml" are not usable in this project, as this feature is only available in MySQL 5.4 and above. This constraint is largely due to my hope that the project will be useful to people other than myself, and I'd rather not force people to use Beta versions of MySQL.
  • I'd like the data load to be built into my application (i.e. no instructions to "Load the dump into MySQL using 'foo' before running this application.").
  • I'm using MySQL Connector/Net, so anything in the MySql.Data namespace is acceptable.

Thanks for any pointers you can provide!

Ideas so far

A stored procedure that loads an entire XML file into a column, then parses it using XPath

  • This doesn't work, because the file size is limited by the max_allowed_packet variable, which is set to 1 MB by default. That is far smaller than the data dump files.
Recommended answer

    There are two parts to this:

    • Reading the XML file
    • Writing to the database

    For reading the XML file, http://csharptutorial.blogspot.com/2006/10/reading-xml-fast.html shows that 1 MB can be read in 2.4 seconds using a stream reader. At that rate a 1 GB file would take about 2,400 seconds, or 40 minutes (if my maths is working this late).
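One way to read a file this large without holding it in memory is a forward-only XmlReader. The following is a minimal sketch, not the code from the linked tutorial. It assumes each record is a `<row>` element whose attributes are the column values; the class name `DumpReader`, the method `ReadRows`, and the sample attribute names are hypothetical and only there for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class DumpReader
{
    // Streams <row .../> elements one at a time, so memory use stays flat
    // no matter how large the file is.
    // Assumption: records are <row> elements with one attribute per column.
    public static IEnumerable<Dictionary<string, string>> ReadRows(TextReader input)
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element || reader.Name != "row")
                    continue;

                var row = new Dictionary<string, string>();
                while (reader.MoveToNextAttribute())
                    row[reader.Name] = reader.Value;
                reader.MoveToElement(); // step back from the last attribute to its element

                yield return row;
            }
        }
    }

    public static void Main()
    {
        // Tiny in-memory sample; for a real dump pass File.OpenText(path) instead.
        string xml = "<posts><row Id=\"1\" Title=\"Hello\" /><row Id=\"2\" Title=\"World\" /></posts>";
        foreach (var row in ReadRows(new StringReader(xml)))
            Console.WriteLine(row["Id"] + ": " + row["Title"]);
    }
}
```

Because the rows are produced lazily, the caller can write each one out as it arrives instead of collecting them in an array first.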

    From what I have read, the fastest way to get data into MySQL is to use LOAD DATA.

    http://dev.mysql.com/doc/refman/5.1/en/load-data.html

    Therefore, if you can read the XML data, write it to files that LOAD DATA can use, and then run LOAD DATA, the total time may be less than the hours you are experiencing.
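The intermediate file could be produced along the following lines. This is a rough sketch under stated assumptions, not a tested loader: it writes tab-separated lines using LOAD DATA's default escaping rules and builds the matching statement as a string. The names `LoadDataFile`, `EscapeField`, `WriteRows`, and `BuildLoadStatement`, the table name `posts`, and the column names are all hypothetical. The `utf8` character set and the use of `LOCAL` are also assumptions, and `LOCAL` only works if the server and client allow it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LoadDataFile
{
    // Escapes one value for LOAD DATA's text format with
    // FIELDS TERMINATED BY '\t' ESCAPED BY '\\' and LINES TERMINATED BY '\n'.
    // A null value becomes \N, which LOAD DATA reads as SQL NULL.
    public static string EscapeField(string value)
    {
        if (value == null) return "\\N";
        var sb = new StringBuilder(value.Length + 8);
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': sb.Append("\\\\"); break;
                case '\t': sb.Append("\\t"); break;
                case '\n': sb.Append("\\n"); break;
                case '\r': sb.Append("\\r"); break;
                case '\0': sb.Append("\\0"); break;
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }

    // Writes one tab-separated line per row, with columns in a fixed order.
    // A column missing from a row is written as NULL.
    public static void WriteRows(TextWriter output,
                                 IEnumerable<Dictionary<string, string>> rows,
                                 IList<string> columns)
    {
        foreach (var row in rows)
        {
            for (int i = 0; i < columns.Count; i++)
            {
                if (i > 0) output.Write('\t');
                string value;
                row.TryGetValue(columns[i], out value);
                output.Write(EscapeField(value));
            }
            output.Write('\n');
        }
    }

    // Builds the statement to execute once the file has been written.
    // Table and column names are inserted as-is, so they must be trusted.
    public static string BuildLoadStatement(string path, string table, IList<string> columns)
    {
        string sqlPath = path.Replace('\\', '/').Replace("'", "\\'");
        return "LOAD DATA LOCAL INFILE '" + sqlPath + "' INTO TABLE " + table +
               " CHARACTER SET utf8" +
               " FIELDS TERMINATED BY '\\t' ESCAPED BY '\\\\'" +
               " LINES TERMINATED BY '\\n'" +
               " (" + string.Join(", ", columns) + ")";
    }

    public static void Main()
    {
        var columns = new List<string> { "Id", "Body" };
        var rows = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { { "Id", "1" }, { "Body", "line one\nline two" } },
            new Dictionary<string, string> { { "Id", "2" } } // Body missing, written as \N
        };

        var file = new StringWriter(); // a StreamWriter on a temp file in real use
        WriteRows(file, rows, columns);
        Console.Write(file.ToString());
        Console.WriteLine(BuildLoadStatement(@"C:\temp\posts.tsv", "posts", columns));
    }
}
```

The resulting statement would then be run through Connector/Net, for example with MySqlCommand.ExecuteNonQuery. Connector/Net also provides a MySqlBulkLoader class that wraps LOAD DATA, which may be a tidier fit if the installed version includes it.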
