Speeding up JSON parsing in Go

Question

We have transaction log files in which each transaction is a single line in JSON format. We often need to take selected parts of the data, perform a single time conversion, and feed results into another system in a specific format. I wrote a Python script that does this as we need, but I hoped that Go would be faster, and would give me a chance to start learning Go. So, I wrote the following:

package main
import "encoding/json"
import "fmt"
import "time"
import "bufio"
import "os"

func main() {

    sep := ","

    reader := bufio.NewReader(os.Stdin)

    for {
        data, _ := reader.ReadString('\n')
        byt := []byte(data)

        var dat map[string]interface{}

        if err := json.Unmarshal(byt, &dat); err != nil {
            break
        }

        status := dat["status"].(string)
        a_status := dat["a_status"].(string)
        method := dat["method"].(string)
        path := dat["path"].(string)
        element_uid := dat["element_uid"].(string)
        time_local := dat["time_local"].(string)
        etime, _ := time.Parse("[02/Jan/2006:15:04:05 -0700]", time_local)
        fmt.Print(status, sep, a_status, sep, method, sep, path, sep, element_uid, sep, etime.Unix(), "\n")
    }
}

That compiles without complaint, but I'm surprised at the lack of performance improvement. To test, I placed 2,000,000 lines of logs into a tmpfs (to ensure that disk I/O would not be a limitation) and compared the two versions of the script. My results:

$ time cat /mnt/ramdisk/logfile | ./stdin_conv > /dev/null 
real    0m51.995s

$ time cat /mnt/ramdisk/logfile | ./stdin_conv.py > /dev/null 
real    0m52.471s

$ time cat /mnt/ramdisk/logfile > /dev/null 
real    0m0.149s

How can this be made faster? I have made some rudimentary efforts. The ffjson project, for example, proposes to create static functions that make reflection unnecessary; however, I have failed so far to get it to work, getting the error:

Error: Go Run Failed for: /tmp/ffjson-inception810284909.go
STDOUT:

STDERR:
/tmp/ffjson-inception810284909.go:9:2: import "json_parse" is a program, not an importable package

Besides, wouldn't what I have above be considered statically typed? Possibly not-- I am positively dripping behind the ears where Go is concerned. I have tried selectively disabling different attributes in the Go code to see if one is especially problematic. None have had an appreciable effect on performance. Any suggestions on improving performance, or is this simply a case where compiled languages have no substantial benefit over others?

Solution

Try using a struct type to remove all of these unnecessary assignments and type assertions:

type RenameMe struct {
    Status     string `json:"status"`
    Astatus    string `json:"a_status"`
    Method     string `json:"method"`
    Path       string `json:"path"`
    ElementUid string `json:"element_uid"`
    // time_local is not in RFC 3339 format, so decode it as a string
    // and parse it separately.
    TimeLocal  string `json:"time_local"`
    Etime      time.Time // deal with this after the fact
}

data := &RenameMe{}
if err := json.Unmarshal(byt, data); err != nil {
    break
}

// Plain assignment: := cannot have a struct field on its left-hand side.
data.Etime, _ = time.Parse("[02/Jan/2006:15:04:05 -0700]", data.TimeLocal)

I'm not going to test this to ensure it outperforms your code but I bet it does by a large margin. Give it a try and let me know please.
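For anyone who wants to try the suggestion end to end, here is a minimal sketch of the whole program built around a struct. It is untested against the original log data and makes a few assumptions that are not in the question: the names logEntry and convertLine are made up for illustration, all six fields are assumed to be JSON strings, lines are assumed to fit within 1 MiB, and malformed lines are skipped instead of ending the loop as the original code does. It also buffers stdout, which is a separate change from the struct decoding. No speedup is claimed here; it is only meant to be something to time against the map version.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// logEntry is a hypothetical name; its fields mirror the keys read in the question.
type logEntry struct {
	Status     string `json:"status"`
	AStatus    string `json:"a_status"`
	Method     string `json:"method"`
	Path       string `json:"path"`
	ElementUID string `json:"element_uid"`
	TimeLocal  string `json:"time_local"`
}

// timeLayout is the layout string used in the question.
const timeLayout = "[02/Jan/2006:15:04:05 -0700]"

// convertLine decodes one JSON log line and returns the comma-separated
// record, with time_local converted to Unix seconds.
func convertLine(line []byte) (string, error) {
	var e logEntry
	if err := json.Unmarshal(line, &e); err != nil {
		return "", err
	}
	t, err := time.Parse(timeLayout, e.TimeLocal)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s,%s,%s,%s,%s,%d",
		e.Status, e.AStatus, e.Method, e.Path, e.ElementUID, t.Unix()), nil
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	// Assumed limit: allow lines of up to 1 MiB.
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)

	// Buffer stdout so each record does not trigger its own write call.
	out := bufio.NewWriter(os.Stdout)

	for scanner.Scan() {
		rec, err := convertLine(scanner.Bytes())
		if err != nil {
			// Assumption: skip bad lines rather than stop, unlike the original loop.
			fmt.Fprintln(os.Stderr, "skipping line:", err)
			continue
		}
		fmt.Fprintln(out, rec)
	}
	out.Flush()

	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
}
```

It can be timed the same way as in the question, for example: time cat /mnt/ramdisk/logfile | ./stdin_conv > /dev/null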
