How to read a Parquet bytes object in Python


Problem description

I have a Python object that I know holds the contents of a Parquet file. (I have no way of reading it from an actual file.)

The object var_1 contains b'PAR1\x15\x....1\x00PAR1'

When I check the type:

type(var_1)

the result is bytes.

Is there a way to read this, say into a pandas DataFrame?

I have tried the following.

1)

from fastparquet import ParquetFile
pf = ParquetFile(var_1)

This raised:

TypeError: a bytes-like object is required, not 'str'

2)

import pyarrow.parquet as pq
dataset = pq.ParquetDataset(var_1)

This raised:

TypeError: not a path-like object

Note that the solution from "How to read a Parquet file into Pandas DataFrame?", i.e. pd.read_parquet(var_1, engine='fastparquet'), also fails, with TypeError: a bytes-like object is required, not 'str'

Recommended answer

You can do this by wrapping the bytes object in a pyarrow.BufferReader, which exposes the in-memory bytes as a readable file-like object.

import pyarrow as pa
import pyarrow.parquet as pq

var_1 = …  # the bytes object holding the Parquet data
reader = pa.BufferReader(var_1)
table = pq.read_table(reader)
df = table.to_pandas()  # This results in a pandas.DataFrame
