Send a dictionary containing a file through a socket (Python)

Question

Is it possible to send a dict that contains a file (image or document) as a value through a socket?

I tried something like the code below, but it failed:

with open("cat.jpeg", "rb") as f:
    myFile = f.read(2048)

data = {"id": "1283", "filename": "cat.jpeg", "file": myFile}

dataToSend = json.dumps(data).encode("utf-8")

This raises a JSON error: myFile is a bytes object, which can't be serialized.

I tried converting myFile into a string using base64 encoding, but that didn't work.

What partially worked was casting myFile to a string with str(myFile). The JSON serializer accepted it and I sent it through the socket. The dict arrived intact, but the myFile data was corrupted, so I couldn't recreate the picture.

So is this approach workable? If not, how should I send the file and the data through a socket so that they are easy to parse on the other side?

Later edit:

It still doesn't work with base64 encoding. myFile is still of type bytes, and json raises this error: TypeError: Object of type 'bytes' is not JSON serializable

Client:

import os
import base64
import json
import socket

currentPath = os.path.dirname(os.path.abspath(__file__)) + "\\downloads\\"

with open(currentPath + "cat.png", "rb") as f:
    l = f.read()

print(type(l))   #prints <class 'bytes'>

myFile = base64.b64encode(l)

print(type(myFile))    #prints <class 'bytes'>

data = {"id": "12", "filename": "cat.png", "message": "So cute!", "file": myFile}

dataToSend = json.dumps(data).encode("utf-8")   #prints TypeError: Object of type 'bytes' is not JSON serializable

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("127.0.0.1", 1234))
s.sendall(dataToSend)
s.close()

And the server:

import socket
import json
import os
import sys
import time
import base64

currentPath = os.path.dirname(os.path.abspath(__file__)) + "\\fileCache\\"
tempData = bytearray()

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 1234))
s.listen(5)
conn, addr = s.accept()

while True:
    dataReceived = conn.recv(2048)
    if sys.getsizeof(dataReceived) > 17:
        tempData = tempData + dataReceived
    else:
        data = json.loads(tempData.decode("utf-8"))
        break
    time.sleep(1)

print(data)

myFile = base64.b64decode(data["file"])

with open(currentPath + data["filename"], "wb") as f:
    f.write(myFile)
    f.close()

Recommended answer

As I said in my comment, packing binary data into a text format like JSON is wasteful. Base64 increases the transfer size by about 33%, and the JSON decoder has to work through the whole structure, file contents included, just to extract the few small fields you actually need.
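For reference, the base64 route from the question can be made to work: `base64.b64encode()` returns `bytes`, so its result has to be decoded to `str` before `json.dumps()` will accept it. The sketch below uses made-up sample bytes in place of the image (an assumption for illustration) and also shows the size overhead:

```python
import base64
import json

# Stand-in for the file contents (hypothetical data, not a real image).
raw = bytes(range(256)) * 12  # 3072 bytes

# b64encode() returns bytes; decode to str so json.dumps() can serialize it.
encoded = base64.b64encode(raw).decode("ascii")
message = json.dumps({"id": "12", "filename": "cat.png", "file": encoded})

# The receiving side reverses both steps.
restored = base64.b64decode(json.loads(message)["file"])

print(restored == raw)          # the round trip is lossless
print(len(encoded) / len(raw))  # base64 emits 4 characters per 3 input bytes
```

This works, but every byte of the file travels inside the JSON string, which is the cost the rest of this answer avoids.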

It's much better to send them separately: the JSON as JSON, and then the file contents directly as binary. You need a way to tell the two apart, and the easiest is to prefix the JSON data with its length. The server then knows how many bytes to read to obtain the JSON and can treat everything after that as the file contents. This gives a very simple protocol whose packets look like this:

[JSON LENGTH][JSON][FILE CONTENTS]
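This layout can be tried out in memory, without any sockets. The sketch below assumes a 4-byte big-endian unsigned length prefix; `HEADER`, `build_packet` and `parse_packet` are hypothetical names chosen for illustration, and the file bytes are fake:

```python
import json
import struct

HEADER = struct.Struct(">I")  # big-endian 32-bit unsigned int, 4 bytes


def build_packet(payload, file_bytes):
    """Return [JSON LENGTH][JSON][FILE CONTENTS] as one bytes object."""
    json_bytes = json.dumps(payload).encode("utf-8")
    return HEADER.pack(len(json_bytes)) + json_bytes + file_bytes


def parse_packet(packet):
    """Split a complete packet back into (payload, file_bytes)."""
    (json_length,) = HEADER.unpack_from(packet, 0)
    json_end = HEADER.size + json_length
    payload = json.loads(packet[HEADER.size:json_end].decode("utf-8"))
    return payload, packet[json_end:]


packet = build_packet({"id": 12, "filename": "cat.png"}, b"\x89PNG fake bytes")
payload, file_bytes = parse_packet(packet)
print(payload, file_bytes)
```

Over a real socket the packet arrives in pieces, so the receiver reads the three parts one after another rather than slicing a single buffer.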

Assuming the JSON will never be larger than 4 GB (and if it is, you have much bigger problems, as parsing it would be a nightmare), a fixed 4-byte (32-bit) unsigned integer is more than enough for JSON LENGTH. You can even use 16 bits if you don't expect the JSON to exceed 64 KB. The whole strategy then works on the client side as follows:

  1. Create the payload
  2. Encode it as JSON, then encode the JSON as bytes using UTF-8
  3. Get the length of the encoded JSON and send it as the first 4 bytes of the stream
  4. Send the JSON itself
  5. Read and send the file contents

On the server side you do the same process in reverse:

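  1. Read the first 4 bytes of the received data to get the JSON payload length
  2. Read that many further bytes
  3. Decode them into a string using UTF-8, then decode the JSON to get the payload
  4. Read the rest of the stream and store it in a file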

In code, the client looks like this:

import json
import os
import socket
import struct

BUFFER_SIZE = 4096  # a uniform buffer size to use for our transfers

# build an absolute path relative to the script folder (optional)
file_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "downloads", "cat.png"))

# let's first prepare the payload to send over
payload = {"id": 12, "filename": os.path.basename(file_path), "message": "So cute!"}
# now JSON encode it and then turn it into a bytes stream by encoding it as UTF-8
json_data = json.dumps(payload).encode("utf-8")
# then connect to the server and send everything
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:  # create a socket
    print("Connecting...")
    s.connect(("127.0.0.1", 1234))  # connect to the server
    # first send the JSON payload length
    print("Sending `{filename}` with a message: {message}.".format(**payload))
    s.sendall(struct.pack(">I", len(json_data)))  # pack as BE 32-bit unsigned int
    # now send the JSON payload itself
    s.sendall(json_data)  # let Python deal with the buffer on its own for the JSON...
    # finally, open the file and 'stream' it to the socket
    with open(file_path, "rb") as f:
        chunk = f.read(BUFFER_SIZE)
        while chunk:
            s.sendall(chunk)  # sendall() retries until the whole chunk is sent; send() may not
            chunk = f.read(BUFFER_SIZE)
    # alternatively, if you're using Python 3.5+ you can just use socket.sendfile() instead
    print("Sent.")
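The `socket.sendfile()` alternative mentioned in the last comment could replace the manual read loop. A minimal sketch, assuming Python 3.5+ and a connected, blocking stream socket; `send_file` is a hypothetical helper name:

```python
import socket


def send_file(sock, path):
    """Stream the file at `path` over a connected, blocking stream socket."""
    with open(path, "rb") as f:  # sendfile() requires a file opened in binary mode
        # Uses os.sendfile() where available and falls back to plain send() calls.
        return sock.sendfile(f)  # returns the total number of bytes sent
```

In the client above, `send_file(s, file_path)` would take the place of the whole `with open(...)` block.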

And the server:

import json
import os
import socket
import struct

BUFFER_SIZE = 4096  # a uniform buffer size to use for our transfers

target_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "fileCache"))

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(("127.0.0.1", 1234))  # bind to the 1234 port on localhost
    s.listen(0)  # allow only one connection so we don't have to deal with data separation
    while True:
        print("Waiting for a connection...")
        connection, address = s.accept()  # wait for and accept the incoming connection
        print("Connection from `{}` accepted.".format(address))
        with connection:  # close the connection once this transfer has been handled
            # read the starting 32 bits; recv() may return fewer bytes than requested
            header = b""
            while len(header) < 4:
                chunk = connection.recv(4 - len(header))
                if not chunk:  # connection closed before the full header arrived
                    break
                header += chunk
            if len(header) < 4:
                continue  # incomplete header, go back to waiting for a connection
            # unpack the header into an int to get the JSON length
            json_length = struct.unpack(">I", header)[0]
            # now read the JSON data of the given size and JSON decode it
            json_data = b""  # initiate an empty bytes structure
            while len(json_data) < json_length:
                chunk = connection.recv(min(BUFFER_SIZE, json_length - len(json_data)))
                if not chunk:  # no data, possibly broken connection/bad protocol
                    break  # just exit for now, you should deal with this case in production
                json_data += chunk
            if len(json_data) < json_length:
                continue  # incomplete JSON, go back to waiting for a connection
            payload = json.loads(json_data.decode("utf-8"))  # JSON decode the payload
            # now read the rest and store it into a file at the target path;
            # basename() drops any directory parts the client put into the filename
            file_path = os.path.join(target_path, os.path.basename(payload["filename"]))
            with open(file_path, "wb") as f:  # open the target file for writing...
                chunk = connection.recv(BUFFER_SIZE)  # and stream the socket data to it...
                while chunk:
                    f.write(chunk)
                    chunk = connection.recv(BUFFER_SIZE)
        # finally, let's print out that we received the data
        print("Received `{filename}` with a message: {message}".format(**payload))

NOTE: Keep in mind that this is Python 3.x code. In Python 2.x, sockets are not context managers, so you'll have to close them yourself instead of using the with ... block.

And that's all there is to it. Of course, in a real setting you need to deal with disconnects, multiple clients, etc. But this is the underlying process.
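As one example of dealing with disconnects, fixed-size reads such as the 4-byte length header can go through a helper that either returns exactly the requested number of bytes or raises. This is only a sketch; `recv_exact` is a hypothetical name, and raising `ConnectionError` is one possible policy among several:

```python
import socket


def recv_exact(sock, size):
    """Read exactly `size` bytes from `sock`, or raise ConnectionError."""
    buffer = bytearray()
    while len(buffer) < size:
        chunk = sock.recv(size - len(buffer))  # may return fewer bytes than asked for
        if not chunk:  # empty read: the peer closed the connection
            raise ConnectionError(
                "connection closed after {} of {} bytes".format(len(buffer), size)
            )
        buffer += chunk
    return bytes(buffer)
```

With it, the server could read the header as `struct.unpack(">I", recv_exact(connection, 4))[0]` and the JSON as `recv_exact(connection, json_length)`.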
