Does multithreading always yield better performance than single threading?


Question

I know the answer is no. Here is an example: "Why single thread is faster than multithreading in Java?".

So when the work done in each thread is trivial, the cost of creating the thread outweighs whatever is gained by distributing the work. This is one case where a single thread is faster than multithreading.
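
To make the overhead argument concrete, here is a minimal sketch (not taken from the linked question) that adds up the numbers 1..n once inline and once with a separate thread for every single addition. The class name `TrivialTaskOverhead`, both method names and the task count are made up for illustration. The timings depend heavily on the machine and the JVM, so treat them as an indication only.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: a trivial task done inline versus with one thread per task. */
class TrivialTaskOverhead {

    /** Adds the numbers 1..n on the calling thread. */
    static long sumInline(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Adds the numbers 1..n, creating one short-lived thread per addition. */
    static long sumWithThreadPerTask(int n) {
        AtomicLong sum = new AtomicLong();
        Thread[] threads = new Thread[n];
        for (int i = 1; i <= n; i++) {
            final int value = i;
            threads[i - 1] = new Thread(() -> sum.addAndGet(value));
            threads[i - 1].start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // wait until every addition has been applied
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum.get();
    }

    public static void main(String[] args) {
        int n = 2_000; // hypothetical task count, small enough to run quickly

        long t0 = System.nanoTime();
        long inline = sumInline(n);
        long t1 = System.nanoTime();
        long threaded = sumWithThreadPerTask(n);
        long t2 = System.nanoTime();

        System.out.println("inline:   " + inline + " in " + (t1 - t0) / 1_000 + " us");
        System.out.println("threaded: " + threaded + " in " + (t2 - t1) / 1_000 + " us");
    }
}
```

Both variants return the same sum. The only difference is how much machinery is spent per addition.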


  • Are there more cases where a single thread will be faster than multithreading?
  • When should we decide to give up multithreading and use only a single thread to accomplish our goal?

Although the question is tagged java, discussion beyond Java is also welcome. A small example in the answer would be great.

Recommended answer

This is a very good question about threading and its link to the hardware that does the real work, meaning the available physical CPU(s) with their cores and hyperthreads.


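  1. Multiple threads may allow you to do things in parallel, if your CPU has more than one core available. So in an ideal world, calculating some primes, for example, might be 4 times faster with 4 threads, if your CPU has 4 cores available and your algorithm really works in parallel.

  2. If you start more threads than there are cores available, the thread management of your OS will spend more and more time on thread switches, so you use your CPU(s) less efficiently.

  3. If the compiler, the CPU cache and/or the runtime realize that you run more than one thread accessing the same data area in memory, they operate in a different optimization mode. As long as the compiler/runtime is sure that only one thread accesses the data, it can avoid writing data out to external RAM too often and may use the L1 cache of your CPU efficiently. If not, it has to use synchronization (such as semaphores) and flush cached data from the L1/L2 cache to RAM more often.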
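
As a rough illustration of the three points above, the following sketch counts primes with a worker pool whose size is taken from `Runtime.getRuntime().availableProcessors()`, which reports logical processors (hyperthreads included). Each worker keeps its own local count and only hands back the result at the end, so no data is shared while counting. The class `CoreSizedPrimeCount`, its methods and the limit are assumptions for this example and not part of the original answer. The interleaved split is deliberately simple and not perfectly balanced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch: prime counting on a pool sized to the available cores. */
class CoreSizedPrimeCount {

    /** Trial division; good enough for a demonstration. */
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Counts the primes in [2, limit] with `workers` threads, each on its own interleaved slice. */
    static int countPrimes(int limit, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int first = 2 + w;
                parts.add(pool.submit(() -> {
                    int local = 0; // thread-local result, nothing shared while counting
                    for (int n = first; n <= limit; n += workers) {
                        if (isPrime(n)) local++;
                    }
                    return local;
                }));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get(); // blocks until that worker is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int limit = 2_000_000; // hypothetical workload size
        for (int workers : new int[] {1, cores, 4 * cores}) {
            long t0 = System.nanoTime();
            int primes = countPrimes(limit, workers);
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.println(workers + " worker(s): " + primes + " primes in " + ms + " ms");
        }
    }
}
```

Going beyond the number of cores is not expected to help much for CPU-bound work like this. How much it hurts depends on the OS scheduler and the machine.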

So my lessons learned from highly parallel multithreading have been:


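  • If possible, use single-threaded, shared-nothing processes to be more efficient

  • If threads are required, decouple the shared data access as much as possible

  • If possible, don't try to allocate more loaded worker threads than there are available cores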

Here is a small program (JavaFX) to play with. It:


  • Allocates a byte array of 100,000,000 elements, filled with random bytes
  • Provides a method that counts the number of bits set in this array
  • The method can count the bits of every nth byte
  • count(0,1) counts the bits of all bytes
  • count(0,4) counts the bits of bytes 0, 4, 8, ..., which allows parallel interleaved counting
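
The interleaved count(start, step) scheme described above can also be tried without any GUI. The sketch below is a stripped-down variant under my own assumptions: the class `InterleavedBitCount` and the method `parallelCount` are hypothetical names, `Integer.bitCount` replaces the bit-mask loop, and the array is smaller than in the answer so it runs with a default heap.

```java
import java.util.Random;

/** Sketch: interleaved bit counting, each thread working on every step-th byte. */
class InterleavedBitCount {

    /** Counts the set bits of data[start], data[start + step], ... on the calling thread. */
    static long count(byte[] data, int start, int step) {
        long bits = 0;
        for (int i = start; i < data.length; i += step) {
            bits += Integer.bitCount(data[i] & 0xFF); // mask to drop the sign extension
        }
        return bits;
    }

    /** Runs count(t, threads) for t = 0..threads-1 in parallel and adds up the partial results. */
    static long parallelCount(byte[] data, int threads) {
        long[] partial = new long[threads]; // one slot per thread, no shared counter
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t;
            workers[t] = new Thread(() -> partial[slot] = count(data, slot, threads));
            workers[t].start();
        }
        long total = 0;
        try {
            for (int t = 0; t < threads; t++) {
                workers[t].join(); // join makes the worker's write to partial[t] visible
                total += partial[t];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[20_000_000]; // assumed size, smaller than in the answer
        new Random(4711).nextBytes(data);
        for (int threads : new int[] {1, 2, 4, 8}) {
            long t0 = System.nanoTime();
            long bits = parallelCount(data, threads);
            long ms = (System.nanoTime() - t0) / 1_000_000;
            System.out.println(threads + " thread(s): " + bits + " bits in " + ms + " ms");
        }
    }
}
```

Because this variant does far less work per byte than the mask loop, it may well be limited by memory bandwidth rather than by the CPU, so the scaling can look different from the numbers reported in the answer.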

Using a MacPro (4 cores) results in:


  1. Running one thread, count(0,1) needs 1326 ms to count all 399993625 bits
  2. Running two threads, count(0,2) and count(1,2) in parallel, needs 920 ms
  3. Running four threads needs 618 ms
  4. Running eight threads needs 631 ms
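
Put into numbers, these timings mean a speedup of roughly 1.4x with two threads and about 2.1x with both four and eight threads, well short of the ideal 4x on four cores. The small helper below just does that arithmetic. `SpeedupFromTimings` and its methods are names invented for this illustration.

```java
/** Sketch: speedup and parallel efficiency computed from the timings listed above. */
class SpeedupFromTimings {

    /** Speedup of a parallel run relative to the single-thread run. */
    static double speedup(double singleThreadMs, double parallelMs) {
        return singleThreadMs / parallelMs;
    }

    /** Fraction of the ideal linear speedup that was actually reached. */
    static double efficiency(double singleThreadMs, double parallelMs, int threads) {
        return speedup(singleThreadMs, parallelMs) / threads;
    }

    public static void main(String[] args) {
        double base = 1326;               // one-thread time in ms
        int[] threads = {2, 4, 8};
        double[] times = {920, 618, 631}; // measured times in ms
        for (int i = 0; i < threads.length; i++) {
            System.out.printf("%d threads: speedup %.2f, efficiency %.0f%%%n",
                    threads[i],
                    speedup(base, times[i]),
                    100 * efficiency(base, times[i], threads[i]));
        }
    }
}
```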




Changing the way of counting, e.g. incrementing one integer shared by all threads (an AtomicInteger or a synchronized counter) instead of a thread-local one, will dramatically change the performance when many threads are running.
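
To see that effect in isolation, here is a hedged sketch that counts the same bits twice: once with every thread incrementing a shared `AtomicLong` for each set bit, and once with a thread-local counter that is added to the shared total a single time per thread. The class `SharedVersusLocalCounter`, its methods and the array size are assumptions for this example. The measured difference will vary with hardware and JVM.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.IntConsumer;

/** Sketch: contended shared counter versus thread-local counting. */
class SharedVersusLocalCounter {

    /** Starts `threads` threads, passes each its index, and waits for all of them. */
    private static void runAll(int threads, IntConsumer worker) {
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool[t] = new Thread(() -> worker.accept(id));
            pool[t].start();
        }
        try {
            for (Thread thread : pool) thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Every set bit triggers an atomic increment on the counter shared by all threads. */
    static long countShared(byte[] data, int threads) {
        AtomicLong total = new AtomicLong();
        runAll(threads, id -> {
            for (int i = id; i < data.length; i += threads) {
                for (int m = 0x80; m > 0; m >>= 1) {
                    if ((data[i] & m) != 0) total.incrementAndGet();
                }
            }
        });
        return total.get();
    }

    /** Each thread counts locally and touches the shared counter only once. */
    static long countLocal(byte[] data, int threads) {
        AtomicLong total = new AtomicLong();
        runAll(threads, id -> {
            long local = 0;
            for (int i = id; i < data.length; i += threads) {
                for (int m = 0x80; m > 0; m >>= 1) {
                    if ((data[i] & m) != 0) local++;
                }
            }
            total.addAndGet(local);
        });
        return total.get();
    }

    public static void main(String[] args) {
        byte[] data = new byte[5_000_000]; // assumed size, kept small for the contended case
        new Random(4711).nextBytes(data);
        int threads = 4;

        long t0 = System.nanoTime();
        long shared = countShared(data, threads);
        long t1 = System.nanoTime();
        long local = countLocal(data, threads);
        long t2 = System.nanoTime();

        System.out.println("shared counter: " + shared + " bits in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("local counters: " + local + " bits in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both methods return the same total. The shared variant forces the cores to keep handing the counter's cache line back and forth, which is the kind of cost the third point in the list further up describes.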

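The full JavaFX program:

import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

import javafx.animation.KeyFrame;
import javafx.animation.Timeline;
import javafx.application.Application;
import javafx.beans.property.BooleanProperty;
import javafx.beans.property.SimpleBooleanProperty;
import javafx.scene.Scene;
import javafx.scene.control.Button;
import javafx.scene.control.Label;
import javafx.scene.control.ProgressBar;
import javafx.scene.layout.HBox;
import javafx.scene.layout.VBox;
import javafx.scene.paint.Color;
import javafx.stage.Stage;
import javafx.util.Duration;

public class MulithreadingEffects extends Application {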
    static class ParallelProgressBar extends ProgressBar {
        AtomicInteger myDoneCount = new AtomicInteger();
        int           myTotalCount;
        Timeline      myWhatcher = new Timeline(new KeyFrame(Duration.millis(10), e -> update()));
        BooleanProperty running = new SimpleBooleanProperty(false);

        public void update() {
            setProgress(1.0*myDoneCount.get()/myTotalCount);
            if (myDoneCount.get() >= myTotalCount) {
                myWhatcher.stop();
                myTotalCount = 0;
                running.set(false);
            }
        }

        public boolean isRunning() { return myTotalCount > 0; }
        public BooleanProperty runningProperty() { return running; }

        public void start(int totalCount) {
            myDoneCount.set(0);
            myTotalCount = totalCount;
            setProgress(0.0);
            myWhatcher.setCycleCount(Timeline.INDEFINITE);
            myWhatcher.play();
            running.set(true);
        }

        public void add(int n) {
            myDoneCount.addAndGet(n);
        }
    }

    int mySize = 100000000;
    byte[] inData = new byte[mySize];
    ParallelProgressBar globalProgressBar = new ParallelProgressBar();
    BooleanProperty iamReady = new SimpleBooleanProperty(false);
    AtomicInteger myCounter = new AtomicInteger(0);

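    // Counts the set bits of bytes start, start+step, start+2*step, ... on a new thread
    // and adds the thread-local result to myCounter once at the end.
    void count(int start, int step) {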
        new Thread(""+start){
            public void run() {
                int count = 0;
                int loops = 0;
                for (int i = start; i < mySize; i+=step) {
                    for (int m = 0x80; m > 0; m >>=1) {
                        if ((inData[i] & m) > 0) count++;
                    }
                    if (loops++ > 99) {
                        globalProgressBar.add(loops);
                        loops = 0;
                    }
                }
                myCounter.addAndGet(count);
                globalProgressBar.add(loops);
            }
        }.start();
    }

    // Starts n counting threads and shows the elapsed time once the progress bar reports completion.
    void pcount(Label result, int n) {
        result.setText("("+n+")");
        globalProgressBar.start(mySize);
        long start = System.currentTimeMillis();
        myCounter.set(0);
        globalProgressBar.runningProperty().addListener((p,o,v) -> {
            if (!v) {
                long ms = System.currentTimeMillis()-start;
                result.setText(""+ms+" ms ("+myCounter.get()+")");
            }
        });
        for (int t = 0; t < n; t++) count(t, n);
    }

    void testParallel(VBox box) {
        HBox hbox = new HBox();

        Label result = new Label("-");
        for (int i : new int[]{1, 2, 4, 8}) {
            Button run = new Button(""+i);
            run.setOnAction( e -> {
                if (globalProgressBar.isRunning()) return;
                pcount(result, i);
            });
            hbox.getChildren().add(run);
        }

        hbox.getChildren().addAll(result);
        box.getChildren().addAll(globalProgressBar, hbox);
    }


    @Override
    public void start(Stage primaryStage) throws Exception {        
        primaryStage.setTitle("ProgressBar's");

        globalProgressBar.start(mySize);
        new Thread("Prepare"){
            public void run() {
                iamReady.set(false);
                Random random = new Random();
                random.setSeed(4711);
                for (int i = 0; i < mySize; i++) {
                    inData[i] = (byte)random.nextInt(256);
                    globalProgressBar.add(1);
                }
                iamReady.set(true);
            }
        }.start();

        VBox box = new VBox();
        Scene scene = new Scene(box,400,80,Color.WHITE);
        primaryStage.setScene(scene);

        testParallel(box);
        // GUIHelper.allowImageDrag(box); // helper from the author's own code base, not shown here and not needed for the measurement

        primaryStage.show();   
    }

    public static void main(String[] args) { launch(args); }
}
