Converting pandas.DataFrame to bytes

Views: 483 Posted: 2020/5/18 20:02:08 Tags: python numpy pandas type-conversion dataframe


Problem description

I need to convert the data stored in a pandas.DataFrame into a byte string, where each column can have its own data type (integer or floating point). Here is a simple set of data:

import numpy as np
import pandas as pd

df = pd.DataFrame([10, 15, 20], dtype='u1', columns=['a'])
df['b'] = np.array([np.iinfo('u8').max, 230498234019, 32094812309], dtype='u8')
df['c'] = np.array([1.324e10, 3.14159, 234.1341], dtype='f8')

and df looks like this:

    a            b                  c
0   10  18446744073709551615    1.324000e+10
1   15  230498234019            3.141590e+00
2   20  32094812309             2.341341e+02

The DataFrame knows the type of each column (df.dtypes), so I'd like to do something like this:

data_to_pack = [tuple(record) for _, record in df.iterrows()]
data_array = np.array(data_to_pack, dtype=zip(df.columns, df.dtypes))
data_bytes = data_array.tostring()

This typically works fine, but in this case (due to the maximum value stored in df['b'][0]) the second line above, which converts the list of tuples to an np.array with a given set of types, causes the following error:

OverflowError: Python int too large to convert to C long

I believe the error originates in the first line, which extracts each record as a Series with a single data type (defaulting to float64). The float64 representation chosen for the maximum uint64 value is not directly convertible back to uint64.
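To illustrate the suspected cause, here is a minimal sketch that rebuilds the example frame and inspects what iterrows() hands back. The variable names (first_row, row_dtype, as_float, fits_in_uint64) are introduced here for illustration only, and the exact exception raised further down the line may differ between NumPy and pandas versions.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame([10, 15, 20], dtype='u1', columns=['a'])
df['b'] = np.array([np.iinfo('u8').max, 230498234019, 32094812309], dtype='u8')
df['c'] = np.array([1.324e10, 3.14159, 234.1341], dtype='f8')

# iterrows() yields each row as a Series with a single common dtype,
# so the per-column integer types are lost at this point.
_, first_row = next(df.iterrows())
row_dtype = first_row.dtype

# float64 has a 53-bit significand, so 2**64 - 1 rounds up to 2**64,
# which is one past the largest value a uint64 can hold.
u8_max = np.iinfo('u8').max
as_float = float(u8_max)
fits_in_uint64 = int(as_float) <= u8_max

print(row_dtype, as_float, fits_in_uint64)
```

If the row dtype is float64 as assumed, the value that comes out of the row for column b is no longer representable as uint64, which is consistent with the overflow described above.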

1) Since the DataFrame already knows the type of each column, is there a way to get around creating a row of tuples for input into the typed numpy.array constructor? Or is there a better way than the one outlined above to preserve the type information in such a conversion?

2) Is there a way to go directly from the DataFrame to a byte string representing the data, using the type information for each column?
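One possible direction for both questions, offered as a sketch rather than a confirmed answer from the original thread: DataFrame.to_records(index=False) returns a NumPy record array that keeps one dtype per column, and tobytes() serializes it row by row. A second option, assuming all column names are strings, allocates a structured array from df.dtypes and fills it column by column so that no row ever passes through a common dtype. The names rec, manual, fields and restored are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame([10, 15, 20], dtype='u1', columns=['a'])
df['b'] = np.array([np.iinfo('u8').max, 230498234019, 32094812309], dtype='u8')
df['c'] = np.array([1.324e10, 3.14159, 234.1341], dtype='f8')

# Option A: let pandas build a record array with per-column dtypes.
rec = df.to_records(index=False)
data_bytes = rec.tobytes()

# Option B: build the structured dtype from df.dtypes and copy each
# column in whole, which avoids the row-wise upcast of iterrows().
fields = [(str(name), dtype) for name, dtype in df.dtypes.items()]
manual = np.empty(len(df), dtype=fields)
for name in df.columns:
    manual[name] = df[name].to_numpy()
manual_bytes = manual.tobytes()

# Round trip: reinterpret the bytes with the same record dtype.
restored = np.frombuffer(data_bytes, dtype=rec.dtype)
print(rec.dtype, len(data_bytes), restored['b'][0])
```

Both options depend on the reader knowing the record dtype (field order, sizes and byte order) in order to decode the bytes, so that layout would need to be stored or agreed on separately.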
