Why does myVar = strings.Fields(scanner.Text()) take much more time than the comparable operation in Python?


Question

Consider the following code in Go:

now := time.Now()
sec1 := now.Unix()

file, err := os.Open(file_name)
if err != nil {
    log.Fatal(err)
}
defer file.Close()

scanner := bufio.NewScanner(file)

var parsedLine []string

for scanner.Scan() {
    parsedLine = strings.Fields(scanner.Text())
}

fmt.Println(parsedLine)
now2 := time.Now()
sec2 := now2.Unix()
fmt.Println(sec2 - sec1) // takes 24 seconds for file1.txt

And consider this Python program:

start = time.time()

with open(file) as f:
    for line in f:
        parsedLine = line.split()

end = time.time()
print(end - start)  # takes 4.6450419426 seconds for file1.txt

I observe that the Go program is about 5 times slower than the Python program on a MacBook Pro.

In particular, this line

parsedLine = strings.Fields(scanner.Text())

is very slow.

If I change that line in Go to

if strings.Contains(scanner.Text(), "string_that_never_exist") {
     continue
}
// takes less than 1 second

and in Python to

if "string_that_never_exist" in line:
    continue
# takes 2.86928987503 seconds

The Go version is now much faster than the Python one.

Any thoughts on why strings.Fields(scanner.Text()) might be slower than line.split()?

I feel I am missing something silly. Can someone point out why the Go version takes longer than the Python one?

Recommended answer

Any benchmark should be a good scientific experiment. It must be reproducible.

First, define readily available input:

The Complete Works of William Shakespeare by William Shakespeare:

http://www.gutenberg.org/files/100/100-0.txt

Next, fully define the executable programs:

linesplit.py:

import time
start = time.time()

# http://www.gutenberg.org/files/100/100-0.txt
file = "/home/peter/shakespeare.100-0.txt"
with open(file) as f:
    for line in f:
        parsedLine = line.split()

end = time.time() 
print (end - start)

linesplit.go:

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strings"
    "time"
)

func main() {
    now := time.Now()
    sec1 := now.Unix()

    // http://www.gutenberg.org/files/100/100-0.txt
    file_name := "/home/peter/shakespeare.100-0.txt"
    file, err := os.Open(file_name)
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)

    var parsedLine []string

    for scanner.Scan() {
        parsedLine = strings.Fields(scanner.Text())
    }

    fmt.Println(parsedLine)
    now2 := time.Now()
    sec2 := now2.Unix()
    fmt.Println(sec2 - sec1) // takes 24 seconds for file1.txt
    fmt.Println(time.Since(now))
}

Then, provide the benchmark results:

$ python2 --version
Python 2.7.14
$ time python2 linesplit.py
0.07024809169769
real    0m0.089s
user    0m0.089s
sys     0m0.000s

$ python3 --version
Python 3.6.3
$ time python3 linesplit.py
0.12172794342041016
real    0m0.159s
user    0m0.155s
sys     0m0.004s

$ go version
go version devel +39ad208c13 Tue Jun 12 19:10:34 2018 +0000 linux/amd64
$ go build linesplit.go && time ./linesplit
[]
1
91.833622ms
real    0m0.100s
user    0m0.094s
sys     0m0.004s

$ 

We have Python2 < Go < Python3, or 0.0702 < 0.0918 < 0.1217 seconds, or, as a ratio, 1.00 < 1.31 < 1.73. Python2 is ASCII: its str.split() works on byte strings. Go and Python3 are Unicode: strings.Fields and Python 3's str.split() both split on Unicode white space.
