I have 100 XML files and want to assign them to 100 threads that execute in parallel


Problem description






Hi,

I have 100 XML files, and I want to assign them to 100 threads that execute in parallel.

When I read the 100 files in a Parallel.ForEach loop, I pass each file to somemethod(). That method inserts the file's data.

I want
100 files -> 100 somemethod() calls

with these 100 method calls executing in parallel on separate threads.

Please help me and tell me whether this approach is correct.

Thank you.

Recommended answer

Seeing the comments, what could help improve your performance is to launch a "pipeline".

Read in the XML and pass it to a thread that will process it into the database. While that one is running, you can start reading the second file. There are many articles out there that cover multithreading. I would work through an article and try a small prototype before tackling your main application.
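
As a starting point for such a prototype, here is a minimal sketch of the read-then-hand-off idea, assuming .NET's BlockingCollection<T> as the queue between the two stages. The class name XmlPipeline, the insertIntoDatabase callback (a stand-in for somemethod()) and the queue capacity of 4 are all hypothetical choices, and error handling is left out to keep it short.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class XmlPipeline
{
    // Reads files on the calling thread and hands each parsed document
    // to one consumer task through a bounded queue.
    // Returns the number of documents the consumer processed.
    public static int Run(IEnumerable<string> paths, Action<XDocument> insertIntoDatabase)
    {
        int processed = 0;

        // Bounded capacity (assumed value) stops the reader from
        // loading far more documents than the consumer can handle.
        using (var queue = new BlockingCollection<XDocument>(boundedCapacity: 4))
        {
            Task consumer = Task.Run(() =>
            {
                // Blocks until a document is available; ends once
                // CompleteAdding has been called and the queue is empty.
                foreach (XDocument doc in queue.GetConsumingEnumerable())
                {
                    insertIntoDatabase(doc); // hypothetical stand-in for somemethod()
                    processed++;
                }
            });

            try
            {
                // While the consumer inserts one document, the next file is read here.
                foreach (string path in paths)
                    queue.Add(XDocument.Load(path));
            }
            finally
            {
                queue.CompleteAdding();
            }

            consumer.Wait();
        }

        return processed;
    }
}
```

With this shape, reading file N+1 overlaps with inserting file N, which is all the pipeline is meant to buy you. Whether more than one consumer helps depends on where the bottleneck turns out to be.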

Hope this helps.


You have to define the scope of the problem you're trying to solve. In other words, where is the bottleneck that is holding up your processing of those files?

Having more threads does NOT mean your processing runs faster.

The benefit you get depends on the scope of the problem. How big are these XML files? Is there a lot of processing that needs to take place on the data? How are you doing the inserts into the database? Bulk insert or individual record inserts?

In other words, is the problem I/O bound? That is, is the time being spent waiting for disk I/O to complete, or waiting for network access or an SQL query?

Is the problem compute bound? Is there a lot of processing of data that needs to be done before storing the results in your database?
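
One way to answer these questions is to time each stage separately on a single file before adding any threads. The following is a rough sketch using System.Diagnostics.Stopwatch; the StageTimer class, its method names and the insert callback are hypothetical and only illustrate the measurement.

```csharp
using System;
using System.Diagnostics;
using System.Xml.Linq;

public static class StageTimer
{
    // Runs one stage and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action stage)
    {
        Stopwatch sw = Stopwatch.StartNew();
        stage();
        sw.Stop();
        return sw.Elapsed;
    }

    // Times reading/parsing a file and the insert step separately.
    // 'insert' is a hypothetical stand-in for the database work in somemethod().
    public static (TimeSpan Read, TimeSpan Insert) Profile(string path, Action<XDocument> insert)
    {
        XDocument doc = null;
        TimeSpan read = Measure(() => doc = XDocument.Load(path));
        TimeSpan ins = Measure(() => insert(doc));
        return (read, ins);
    }
}
```

If the insert time dwarfs the read time, the database or network is the likely bottleneck and extra reader threads will not help much. If parsing dominates, the work is closer to compute bound.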

If the problem is compute bound, the total number of cores across all CPUs in the machine running your code determines the maximum number of threads you can expect to execute at the same time. This in NO WAY MEANS that you will have x threads running at the exact same time. Remember, Windows already has a few hundred threads running in various processes besides yours. You share the CPU with every other process running on the machine.

You can have more threads created, waiting to run, than there are cores on the CPU, but swamping the CPU with work will only make your threads wait to run more than they are actually running.
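
For the compute-bound case, a sketch of how the Parallel.ForEach loop from the question could be capped at the core count rather than aiming for 100 threads. ParallelOptions.MaxDegreeOfParallelism and Environment.ProcessorCount are standard .NET; the BoundedParallel class and the someMethod callback are hypothetical, and using the core count as the cap is an assumption to tune, not a rule.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallel
{
    // Processes every path, running at most one operation per logical core
    // at a time. Returns how many files were processed.
    public static int ProcessAll(IEnumerable<string> paths, Action<string> someMethod)
    {
        var options = new ParallelOptions
        {
            // Assumed cap: one concurrent operation per logical processor.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        int completed = 0;
        Parallel.ForEach(paths, options, path =>
        {
            someMethod(path); // hypothetical stand-in for somemethod()
            Interlocked.Increment(ref completed); // thread-safe counter update
        });

        return completed;
    }
}
```

All 100 files still get processed; they are simply queued and picked up as cores become free instead of each getting a dedicated thread.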

As for I/O bound problems, it really depends on where the bottleneck is. Remember, a single disk has limited throughput, and a spinning disk effectively services one read at a time. So if your problem is disk I/O bound, throwing a couple hundred threads at the problem will do you no good at all.

If the I/O problem is waiting for network or SQL processing to finish, again, throwing all kinds of threads at the problem will only result in hundreds of threads waiting around for something to complete.

