Can we use multithreading to convert Microsoft Word documents to HTML in C#?


Problem description

I have a Windows Service which polls the database for uploaded documents of type doc, docx, pdf and rtf, converts them to HTML, and saves the result to the local file system. The documents are fetched from the database and queued in memory, then picked up from the shared queue by multiple threads for processing.
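For readers who want a concrete picture of this arrangement, here is a minimal sketch of a shared in-memory queue drained by several worker threads. It is not the asker's code: `ConversionQueue`, `ProcessAll`, and the `convert` delegate are hypothetical names, and the actual Word automation is left to whatever `convert` does.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class ConversionQueue
{
    // Starts workerCount threads that take items from the shared queue and
    // pass each one to convert. Returns the number of items processed once
    // the queue has been marked complete and fully drained.
    public static int ProcessAll(
        BlockingCollection<string> queue,
        int workerCount,
        Action<string> convert)
    {
        int processed = 0;
        var workers = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            var worker = new Thread(() =>
            {
                // Blocks while the queue is empty; ends after CompleteAdding().
                foreach (string documentPath in queue.GetConsumingEnumerable())
                {
                    convert(documentPath); // hypothetical conversion step
                    Interlocked.Increment(ref processed);
                }
            });
            worker.Start();
            workers.Add(worker);
        }

        foreach (Thread worker in workers)
        {
            worker.Join();
        }
        return processed;
    }
}
```

The poller would add document paths with `queue.Add(...)` and call `queue.CompleteAdding()` on shutdown. `BlockingCollection<T>` handles the locking, so the workers need no extra synchronization around the queue itself.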

The problem I am facing is that processing becomes slower over time. In the first few days a 50 KB document converts in about 2 seconds; after a few days the same document takes about 20 seconds. All I can see is the processing time getting steadily worse as the days go by, and I have not been able to pin down the cause. Even restarting the Windows Service does not help.

Microsoft Office is installed on the Windows Server for the document conversion, and nearly 2000 documents are converted to HTML per day.

So my question is: can we use multithreading to convert Microsoft Word documents to HTML?

Recommended answer

I think you are already using as much multithreading as is possible: you can't make Word itself more efficient, you can only run several Word instances in parallel (which you are doing). I'd suggest spending more time on investigation.

Do some logging/tracing and profiling. Find out which lines of code or methods are the ones that are really slow.
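One low-effort way to start is to time each stage of the conversion and write the result to the service log. The following is only a sketch: `StageTimer`, `Measure`, and the stage names are hypothetical, and the `log` delegate stands in for whatever logging the service already has.

```csharp
using System;
using System.Diagnostics;

public static class StageTimer
{
    // Runs action, measures how long it took, and writes one log line.
    // The line is written even if action throws, so failures are timed too.
    public static TimeSpan Measure(string stage, Action action, Action<string> log)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        try
        {
            action();
        }
        finally
        {
            stopwatch.Stop();
            log($"{DateTime.UtcNow:O} {stage} took {stopwatch.ElapsedMilliseconds} ms");
        }
        return stopwatch.Elapsed;
    }
}
```

Wrapping the fetch from the database, the opening of the document, the save as HTML, and the close in separate `Measure` calls should show, after a few days of logs, which stage accounts for the growth from 2 seconds to 20.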

If it turns out to be Word that is slow, try watching it and the system. Where does the slowness come from? Is it using up all the CPU? Perhaps the disk is being accessed too much? Maybe too many temporary files have piled up somewhere? Or perhaps you are running out of RAM and Windows is swapping like mad? In that last case, what is using it all? Maybe you're not closing something properly (like Word itself, or the files that you make it open)?
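Some of these questions can be answered by logging a periodic snapshot from inside the service. The sketch below counts the processes with a given name, sums their working sets, and counts the files at the top level of a directory. `HealthSnapshot` and `Capture` are hypothetical names, and passing `"WINWORD"` as the process name and the service account's temp folder as the directory are assumptions about this particular setup.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

public static class HealthSnapshot
{
    // Returns a one-line summary: how many processes with the given name are
    // running, their combined working set, and how many files sit directly
    // in tempDir (subdirectories are not scanned).
    public static string Capture(string processName, string tempDir)
    {
        Process[] processes = Process.GetProcessesByName(processName);
        long totalWorkingSetBytes = 0;

        foreach (Process process in processes)
        {
            try
            {
                totalWorkingSetBytes += process.WorkingSet64;
            }
            catch (Exception)
            {
                // The process may have exited between listing and reading it.
            }
            finally
            {
                process.Dispose();
            }
        }

        int tempFileCount = Directory.Exists(tempDir)
            ? Directory.EnumerateFiles(tempDir).Count()
            : 0;

        long megabytes = totalWorkingSetBytes / (1024 * 1024);
        return $"{processName}: {processes.Length} processes, {megabytes} MB working set; "
             + $"{tempFileCount} files in {tempDir}";
    }
}
```

Calling `HealthSnapshot.Capture("WINWORD", Path.GetTempPath())` every few minutes and logging the result would show whether Word instances or temporary files accumulate over the days. A process count that keeps rising would point to Word not being closed properly.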
