How to read a parquet bytes object in python
Question
I have a Python object that I know is a Parquet file loaded into memory. (I have no way of actually reading it from a file.)
The object var_1
contains b'PAR1\x15\x....1\x00PAR1'
When I check the type:
type(var_1)
the result is bytes.
Is there a way to read this, say into a pandas DataFrame?
I have tried:
1)
from fastparquet import ParquetFile
pf = ParquetFile(var_1)
which gave:
TypeError: a bytes-like object is required, not 'str'
2)
import pyarrow.parquet as pq
dataset = pq.ParquetDataset(var_1)
and got:
TypeError: not a path-like object
Note that the solution from "How to read a Parquet file into Pandas DataFrame?", i.e. pd.read_parquet(var_1, engine='fastparquet'),
results in TypeError: a bytes-like object is required, not 'str'
Recommended answer
You can do this by wrapping the bytes object in a pyarrow.BufferReader.
import pyarrow as pa
import pyarrow.parquet as pq
var_1 = …
reader = pa.BufferReader(var_1)
table = pq.read_table(reader)
df = table.to_pandas() # This results in a pandas.DataFrame