Python Pandas: pivot only certain columns in the DataFrame while keeping others

Views: 352  Posted: 2020/5/23 22:55:10  Tags: python pandas pivot-table


Problem description

I am trying to rearrange a DataFrame that I read in automatically from JSON using Pandas. I've searched but have had no success.

I have the following JSON (saved as a string for copy/paste convenience) with a list of JSON objects/dictionaries under the key 'values':

json_str = '''{"preferred_timestamp": "internal_timestamp",
    "internal_timestamp": 3606765503.684,
    "stream_name": "ctdpf_j_cspp_instrument",
    "values": [{
        "value_id": "temperature",
        "value": 9.8319
    }, {
        "value_id": "conductivity",
        "value": 3.58847
    }, {
        "value_id": "pressure",
        "value": 22.963
    }, {
        "value_id": "salinity",
        "value": 32.8947
    }]
}'''

I use the function 'json_normalize' to load the JSON into a flattened Pandas DataFrame.

>>> from pandas.io.json import json_normalize
>>> import simplejson as json
>>> df = json_normalize(json.loads(json_str), 'values', ['preferred_timestamp', 'stream_name', 'internal_timestamp'])
>>> df
      value      value_id preferred_timestamp  internal_timestamp  \
0   9.83190   temperature  internal_timestamp        3.606766e+09   
1   3.58847  conductivity  internal_timestamp        3.606766e+09   
2  22.96300      pressure  internal_timestamp        3.606766e+09   
3  32.89470      salinity  internal_timestamp        3.606766e+09   

               stream_name  
0  ctdpf_j_cspp_instrument  
1  ctdpf_j_cspp_instrument  
2  ctdpf_j_cspp_instrument  
3  ctdpf_j_cspp_instrument
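
For readers on a recent pandas release, here is a minimal sketch of the same loading step. It assumes pandas 1.0 or later, where `json_normalize` is exposed at the top level as `pd.json_normalize`, and it uses the standard-library `json` module instead of `simplejson`. The salinity entry is included on the assumption that the original data contained it, since it appears in the output above. Column order may differ from the transcript depending on the pandas version.

```python
import json

import pandas as pd

json_str = '''{"preferred_timestamp": "internal_timestamp",
    "internal_timestamp": 3606765503.684,
    "stream_name": "ctdpf_j_cspp_instrument",
    "values": [
        {"value_id": "temperature", "value": 9.8319},
        {"value_id": "conductivity", "value": 3.58847},
        {"value_id": "pressure", "value": 22.963},
        {"value_id": "salinity", "value": 32.8947}
    ]
}'''

# record_path="values" produces one row per measurement; each key listed
# in meta is repeated on every row.
df = pd.json_normalize(
    json.loads(json_str),
    record_path="values",
    meta=["preferred_timestamp", "stream_name", "internal_timestamp"],
)
print(df)
```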

Here is where I am stuck. I want to take the value and value_id columns and pivot them into new columns based on value_id.

I want the DataFrame to look like the following:

stream_name              preferred_timestamp  internal_timestamp  conductivity  pressure  salinity  temperature    
ctdpf_j_cspp_instrument  internal_timestamp   3.606766e+09        3.58847       22.96300  32.89470  9.83190

I've tried both the pivot and pivot_table Pandas functions, and even tried to pivot the table manually using 'set_index' and 'stack', but the result is not quite what I want.

>>> df.pivot_table(values='value', index=['stream_name', 'preferred_timestamp', 'internal_timestamp', 'value_id'])
stream_name              preferred_timestamp  internal_timestamp  value_id    
ctdpf_j_cspp_instrument  internal_timestamp   3.606766e+09        conductivity     3.58847
                                                                  pressure        22.96300
                                                                  salinity        32.89470
                                                                  temperature      9.83190
Name: value, dtype: float64

This is close, but it didn't pivot the values in 'value_id' into separate columns.

And:

>>> df.pivot('stream_name', 'value_id', 'value')
value_id                 conductivity  pressure  salinity  temperature
stream_name                                                           
ctdpf_j_cspp_instrument       3.58847    22.963   32.8947       9.8319

Close again, but it lacks the other columns that I want to be associated with this row.
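
One possible way to combine the two attempts, offered as a sketch and not as the accepted answer: pass every column that should be kept as the `index` of `pivot_table`, and name `value_id` as `columns`. The DataFrame below is rebuilt by hand from the output shown earlier, and `keys` and `wide` are names chosen for illustration. `aggfunc="first"` is an assumption that each `value_id` occurs once per group; the default would average duplicates.

```python
import pandas as pd

# Long-format rows equivalent to the DataFrame in the question.
df = pd.DataFrame({
    "value": [9.8319, 3.58847, 22.963, 32.8947],
    "value_id": ["temperature", "conductivity", "pressure", "salinity"],
    "preferred_timestamp": ["internal_timestamp"] * 4,
    "internal_timestamp": [3606765503.684] * 4,
    "stream_name": ["ctdpf_j_cspp_instrument"] * 4,
})

# Columns to keep as-is; everything else is pivoted.
keys = ["stream_name", "preferred_timestamp", "internal_timestamp"]

# index=keys keeps the identifying columns; columns="value_id" spreads the
# measurements into one column each.
wide = df.pivot_table(
    index=keys, columns="value_id", values="value", aggfunc="first"
).reset_index()

# Drop the leftover "value_id" label on the column axis.
wide.columns.name = None
print(wide)
```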

I'm stuck here. Is there an elegant way of doing this, or should I split the DataFrame and re-merge the pieces into the shape I want?
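
Splitting and re-merging is probably not necessary. As an alternative sketch along the 'set_index' route mentioned above, `unstack` (the inverse of `stack`) can move the `value_id` index level into the columns. The sample DataFrame and the names `keys` and `wide` are illustrative assumptions, and this approach raises an error if a `value_id` repeats within the same group.

```python
import pandas as pd

# Long-format rows equivalent to the DataFrame in the question.
df = pd.DataFrame({
    "value": [9.8319, 3.58847, 22.963, 32.8947],
    "value_id": ["temperature", "conductivity", "pressure", "salinity"],
    "preferred_timestamp": ["internal_timestamp"] * 4,
    "internal_timestamp": [3606765503.684] * 4,
    "stream_name": ["ctdpf_j_cspp_instrument"] * 4,
})

keys = ["stream_name", "preferred_timestamp", "internal_timestamp"]

# Index by the identifying columns plus value_id, select the value column,
# then move the value_id level from the row index into the columns.
wide = df.set_index(keys + ["value_id"])["value"].unstack("value_id")

# Turn the remaining index levels back into ordinary columns.
wide = wide.reset_index()
wide.columns.name = None
print(wide)
```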
