Faster sockets in Python

Question

I have a client written in Python for a server; the two communicate over a LAN. Part of the algorithm reads from the socket intensively, and it runs about 3-6 times slower than a nearly identical client written in C++. What solutions exist for making Python socket reading faster?

I have some simple buffering implemented, and my class for working with sockets looks like this:

import socket
import struct

class Sock():
    def __init__(self):
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.recv_buf = b''
        self.send_buf = b''

    def connect(self):
        self.s.connect(('127.0.0.1', 6666))

    def close(self):
        self.s.close()

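    def recv(self, lngth):
        # Keep reading until the buffer holds at least lngth bytes.
        while len(self.recv_buf) < lngth:
            data = self.s.recv(lngth - len(self.recv_buf))
            if not data:
                # recv() returns b'' once the peer has closed the connection.
                raise ConnectionError('socket closed before all data arrived')
            self.recv_buf += data

        # Take the requested bytes from the front of the buffer.
        res = self.recv_buf[:lngth]
        self.recv_buf = self.recv_buf[lngth:]
        return res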

    def next_int(self):
        return struct.unpack("i", self.recv(4))[0]

    def next_float(self):
        return struct.unpack("f", self.recv(4))[0]

    def write_int(self, i):
        self.send_buf += struct.pack('i', i)

    def write_float(self, f):
        self.send_buf += struct.pack('f', f)

    def flush(self):
        self.s.sendall(self.send_buf)
        self.send_buf = b''

P.S.: profiling also shows that the majority of time is spent reading sockets.

Because data is received in blocks with known size, I can read the whole block at once. So I've changed my code to this:

class Sock():
    def __init__(self):
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.send_buf = b''

    def connect(self):
        self.s.connect(('127.0.0.1', 6666))

    def close(self):
        self.s.close()

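    def recv_prepare(self, cnt):
        # Read one whole block of cnt bytes, then parse it starting at offset 0.
        self.recv_buf = bytearray()
        while len(self.recv_buf) < cnt:
            data = self.s.recv(cnt - len(self.recv_buf))
            if not data:
                # recv() returns b'' once the peer has closed the connection.
                raise ConnectionError('socket closed before the block was complete')
            self.recv_buf.extend(data)

        self.recv_buf_i = 0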

    def skip_read(self, cnt):
        self.recv_buf_i += cnt

    def next_int(self):
        self.recv_buf_i += 4
        return struct.unpack("i", self.recv_buf[self.recv_buf_i - 4:self.recv_buf_i])[0]

    def next_float(self):
        self.recv_buf_i += 4
        return struct.unpack("f", self.recv_buf[self.recv_buf_i - 4:self.recv_buf_i])[0]

    def write_int(self, i):
        self.send_buf += struct.pack('i', i)

    def write_float(self, f):
        self.send_buf += struct.pack('f', f)

    def flush(self):
        self.s.sendall(self.send_buf)
        self.send_buf = b''

Receiving from the socket looks optimal in this code. But now next_int and next_float have become the second bottleneck: they take about 1 µsec (roughly 3000 CPU cycles) per call just to unpack. Is it possible to make them faster, as in C++?

Recommended answer

Your latest bottleneck is in next_int and next_float because each call slices the bytearray, which creates an intermediate copy, and because you unpack only one value at a time.

The struct module has an unpack_from that takes a buffer and an offset. This is more efficient because there is no need to create an intermediate string from your bytearray:

def next_int(self):
    self.recv_buf_i += 4
    return struct.unpack_from("i", self.recv_buf, self.recv_buf_i-4)[0]
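As a quick sanity check that the two approaches decode the same values, here is a small self-contained sketch. The buffer contents and the helper names unpack_with_slice and unpack_in_place are made up for illustration and are not part of the original class.

```python
import struct

# Hypothetical receive buffer holding three native ints.
recv_buf = bytearray(struct.pack("iii", 10, 20, 30))

def unpack_with_slice(buf, offset):
    # Slicing a bytearray copies the 4 bytes into a new object first.
    return struct.unpack("i", buf[offset:offset + 4])[0]

def unpack_in_place(buf, offset):
    # unpack_from reads straight from the buffer at the given offset.
    return struct.unpack_from("i", buf, offset)[0]

values_sliced = [unpack_with_slice(recv_buf, off) for off in (0, 4, 8)]
values_in_place = [unpack_in_place(recv_buf, off) for off in (0, 4, 8)]
```

Both lists hold the same integers; the second version simply skips the per-call copy.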

Additionally, the struct module can unpack more than one value at a time. Currently, you call from Python into C (via the module) once per value. You would be better served by calling it fewer times and letting it do more work on each call:

def next_chunk(self, fmt): # fmt can be a group such as "iifff" 
    sz = struct.calcsize(fmt) 
    self.recv_buf_i += sz
    return struct.unpack_from(fmt, self.recv_buf, self.recv_buf_i-sz)
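To illustrate how a grouped format might be used, the sketch below wraps next_chunk in a stand-in class whose buffer is filled directly instead of from a socket. The class name BlockReader and the record layout (two ints followed by three floats) are assumptions chosen for the example.

```python
import struct

class BlockReader:
    """Hypothetical stand-in for Sock with an already-filled buffer."""

    def __init__(self, data):
        self.recv_buf = bytearray(data)
        self.recv_buf_i = 0

    def next_chunk(self, fmt):
        # One call into struct decodes every field named in fmt.
        sz = struct.calcsize(fmt)
        self.recv_buf_i += sz
        return struct.unpack_from(fmt, self.recv_buf, self.recv_buf_i - sz)

# Made-up record layout: two ints followed by three floats.
payload = struct.pack("iifff", 7, 42, 0.5, 1.5, 2.5)
reader = BlockReader(payload)
a, b, x, y, z = reader.next_chunk("iifff")
```

A single call replaces five separate next_int / next_float calls, so the Python-to-C call overhead is paid once per record instead of once per field.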

If you know that fmt will always consist only of 4-byte integer and float codes (no byte-order prefix, repeat counts, or other format characters), you can replace struct.calcsize(fmt) with 4 * len(fmt).
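The shortcut only holds when every character in fmt is a single 4-byte code. The following sketch compares it with struct.calcsize for a few format strings; quick_size is a hypothetical helper named here for illustration, and the figures assume a platform where a native int and float are 4 bytes each.

```python
import struct

def quick_size(fmt):
    # Shortcut from the answer: assumes every character is a 4-byte code.
    return 4 * len(fmt)

plain = "iifff"
with_prefix = "<iif"   # '<' is a byte-order marker, not a field
with_count = "3i"      # '3' is a repeat count, not a field

# Map each format to (real size, shortcut size).
sizes = {fmt: (struct.calcsize(fmt), quick_size(fmt))
         for fmt in (plain, with_prefix, with_count)}
```

The two numbers agree for "iifff" but not for the other two formats, so the shortcut is best kept for code where the format strings are fully under your control.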

Finally, as a matter of preference I think this reads more cleanly:

def next_chunk(self, fmt): 
    sz = struct.calcsize(fmt) 
    chunk = struct.unpack_from(fmt, self.recv_buf, self.recv_buf_i)
    self.recv_buf_i += sz
    return chunk
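If the same format is unpacked over and over, one further option (not part of the answer above) is to compile it once with struct.Struct, which exposes a size attribute and an unpack_from method. The sketch below assumes a block of back-to-back records with a made-up "iifff" layout; RECORD and RecordReader are hypothetical names.

```python
import struct

# Assumed record layout, compiled once and reused for every record.
RECORD = struct.Struct("iifff")

class RecordReader:
    """Hypothetical reader over a block that holds back-to-back records."""

    def __init__(self, data):
        self.recv_buf = bytearray(data)
        self.recv_buf_i = 0

    def next_record(self):
        # Decode one record at the current offset, then advance past it.
        chunk = RECORD.unpack_from(self.recv_buf, self.recv_buf_i)
        self.recv_buf_i += RECORD.size
        return chunk

block = RECORD.pack(1, 2, 0.25, 0.5, 0.75) + RECORD.pack(3, 4, 1.0, 2.0, 4.0)
reader = RecordReader(block)
first = reader.next_record()
second = reader.next_record()
```

Whether this gives a measurable gain over calling struct.unpack_from with a format string depends on the workload, so it is worth profiling before adopting it.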
