Multithreading or multiprocessing


Question


I am designing a dedicated syslog-processing daemon for Linux that needs to be robust and scalable, and I'm debating multithreading vs. multiprocessing.

The obvious objection to multithreading is complexity and nasty bugs. A multi-process design may hurt performance because of IPC overhead and context switching.

"The Art of Unix Programming" discusses this here.

Would you recommend a process-based system (like Apache) or a multi-threaded approach?

Answer

Both of them can get complicated in their own ways.

You can do either. In the grand scheme of things, it might not matter which you choose. What does matter is how well you do it. Therefore:

Do what you are most experienced with. Or if you're leading a team, do what the team is most experienced with.

---Threading!---

I have done a lot of threaded programming. I enjoy parts of it, and parts of it I do not. I've learned a lot, and now can usually write a multi-threaded application without too much pain, but it does have to be written in a very specific way. Namely:

1) It has to be written with very clearly defined data boundaries that are 100% thread safe. Otherwise, whatever condition can happen will happen, and it might not be when you have a debugger lying around. Plus, debugging threaded code is like peering into Schrödinger's box... The act of looking in there changes the timing, so other threads may or may not have had time to process more.

2) It has to be written with test code that stresses the machine. Many multi-threaded systems only show their bugs when the machines are heavily stressed.

3) There has to be some very smart person who owns the data-exchange code. If there is any way for a shortcut to be made, some developer will probably make it, and you will have an errant bug.

4) There have to be catch-all handlers that will reset the application with a minimum of fuss. This is for the production code that breaks because of some threading issue. In short: The show must go on.
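To make rules 1 and 4 a bit more concrete, here is a minimal sketch in Python (my choice for illustration, the question doesn't name a language). The names `run_workers` and `handle` are hypothetical. The only state the threads share is two `queue.Queue` objects, which are thread safe, and each worker wraps the handler in a catch-all so one bad message cannot take the worker down.

```python
import queue
import threading


def run_workers(lines, handle, n_workers=4):
    """Run handle(line) on worker threads; the only shared state is two queues."""
    inbox = queue.Queue(maxsize=1024)  # bounded, so the producer blocks under load
    results = queue.Queue()
    stop = object()  # sentinel that tells a worker to exit

    def worker():
        while True:
            item = inbox.get()
            if item is stop:
                break
            try:
                results.put(("ok", handle(item)))
            except Exception as exc:
                # Catch-all: report the failure and keep the worker alive.
                results.put(("error", f"{item!r}: {exc}"))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for line in lines:
        inbox.put(line)
    for _ in threads:
        inbox.put(stop)  # one sentinel per worker
    for t in threads:
        t.join()

    # All workers have exited, so draining the queue here is race free.
    out = []
    while not results.empty():
        out.append(results.get())
    return out
```

For rule 2, the same function can be driven with a large input and many workers (say, tens of thousands of lines and `n_workers=32`) to check that nothing is lost or duplicated under load. Results come back in no particular order, so compare them after sorting.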

---Cross-Process!---

I have less experience with process-based designs, but have recently been doing some cross-process stuff in Windows (where the IPC is web service calls... WOO!), and it is relatively clean and simple, but I follow some rules here as well. By and large, interprocess communication will be much more error free, because programs are already good at receiving input from the outside world, and those transport mechanisms are usually asynchronous. Anyway...

1) Define clear process boundaries and communication mechanisms. Message/eventing via, oh say, TCP or web services or pipes or whatever is fine, as long as the borders are clear, and there is a lot of validation and error checking code at those borders.

2) Be prepared for bottlenecks. Code forgiveness is very important. By this I mean, sometimes you won't be able to write to that pipe. You have to be able to requeue and retry those messages without the application locking up/tossing an exception.

3) There will be a lot more code in general, because transporting data across process boundaries means you have to serialize it in some fashion. This can be a source of problems, especially when you start maintaining and changing that code.
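The three rules above can be sketched roughly like this in Python. Everything here is an assumption for illustration: the JSON-lines format, the `severity`/`msg` fields, the size limit, and the `RetryingSender` class are all made up, and `write` is a hypothetical callable that either delivers a whole message or raises `OSError` (a real non-blocking pipe can also do partial writes, which this sketch ignores).

```python
import json
from collections import deque

MAX_MESSAGE_BYTES = 8192  # hypothetical limit, enforced at the border


def encode(record):
    """Serialize one record as a newline-terminated JSON line."""
    return (json.dumps(record, separators=(",", ":")) + "\n").encode("utf-8")


def decode(raw):
    """Validate and deserialize one line received from another process."""
    if len(raw) > MAX_MESSAGE_BYTES:
        raise ValueError("message too large")
    record = json.loads(raw.decode("utf-8"))  # bad JSON raises a ValueError subclass
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(record.get("severity"), int):
        raise ValueError("severity must be an integer")
    if not isinstance(record.get("msg"), str):
        raise ValueError("msg must be a string")
    return record


class RetryingSender:
    """Queue outgoing messages and retry them instead of raising to the caller."""

    def __init__(self, write, max_pending=1000):
        self._write = write
        # Bounded: when full, the oldest pending message is dropped.
        self._pending = deque(maxlen=max_pending)

    @property
    def pending(self):
        return len(self._pending)

    def send(self, record):
        self._pending.append(encode(record))
        return self.flush()

    def flush(self):
        """Try to send queued messages in order; return how many went out."""
        sent = 0
        while self._pending:
            try:
                self._write(self._pending[0])
            except OSError:
                break  # pipe is full or gone; keep the message for later
            self._pending.popleft()
            sent += 1
        return sent
```

The caller would call `flush()` again on a timer or when the pipe becomes writable. Note how much of this is serialization and validation code, which is exactly the maintenance cost rule 3 warns about.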

Hope this helps.
