How to replace comma with dash using python pandas?

Problem description

I have a file like this:

name|count_dic
name1 |{'x1':123,'x2,bv.':435,'x3':4}
name2|{'x2,bv.':435,'x5':98}
etc.

I am trying to load the data into a dataframe and count the number of keys in count_dic. The problem is that the dict items are separated by commas, and some of the keys also contain a comma. I am looking for a way to replace the commas inside the keys with '-' and then separate the different key, value pairs in count_dic. Something like this:

name|count_dic
name1 |{'x1':123,'x2-bv.':435,'x3':4}
name2|{'x2-bv.':435,'x5':98}
etc.

This is what I did:

df = pd.read_csv('file' ,names = ['name','count_dic'],delimiter='|')
data = json.loads(df.count_dic)

I get the following error:

TypeError: the JSON object must be str, not 'Series'

Does anybody have any suggestions?

Recommended answer

You can use ast.literal_eval as a converter when loading the dataframe, since your data looks like Python dict literals rather than JSON (JSON requires double-quoted strings). For example:

import pandas as pd
import ast

df = pd.read_csv('file', delimiter='|', converters={'count_dic': ast.literal_eval})

This gives you the following DataFrame:

    name                            count_dic
0  name1  {'x2,bv.': 435, 'x3': 4, 'x1': 123}
1  name2            {'x5': 98, 'x2,bv.': 435}
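
As a quick check of the point about quoting, the sketch below runs one count_dic cell through both parsers. The variable names (raw, json_ok, parsed) are only illustrative, and the string is copied from the sample file in the question.

```python
import ast
import json

# One count_dic cell from the sample file, as raw text.
raw = "{'x1':123,'x2,bv.':435,'x3':4}"

# json.loads rejects it: JSON strings must use double quotes.
try:
    json.loads(raw)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# ast.literal_eval parses it as a Python dict literal.
# The comma inside the quoted key 'x2,bv.' is not treated as a separator.
parsed = ast.literal_eval(raw)

print(json_ok)
print(parsed)
```

Because the comma in 'x2,bv.' sits inside a quoted string, the parser keeps it as part of the key, so no replacement is needed just to split the pairs correctly.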

Since each count_dic value is now an actual dict, you can apply len to get the number of keys, e.g.:

df.count_dic.apply(len)

Result:

0    3
1    2
Name: count_dic, dtype: int64
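
If you still want the keys rewritten with a dash, as in the desired output in the question, one possible approach is to rebuild each dict after loading. The following is a minimal sketch, not part of the original answer: the in-memory sample (via io.StringIO), the helper name dash_keys, and the n_keys column are all assumptions made for illustration.

```python
import ast
import io

import pandas as pd

# In-memory stand-in for the pipe-delimited file from the question.
sample = io.StringIO(
    "name|count_dic\n"
    "name1 |{'x1':123,'x2,bv.':435,'x3':4}\n"
    "name2|{'x2,bv.':435,'x5':98}\n"
)

# Parse each count_dic cell into a real dict while reading.
df = pd.read_csv(sample, delimiter='|', converters={'count_dic': ast.literal_eval})


def dash_keys(d):
    """Return a copy of d with commas in the keys replaced by dashes."""
    return {key.replace(',', '-'): value for key, value in d.items()}


# Hypothetical helper applied row by row; values are left untouched.
df['count_dic'] = df['count_dic'].apply(dash_keys)

# Number of keys per row, stored in an illustrative extra column.
df['n_keys'] = df['count_dic'].apply(len)

print(df)
```

Doing the replacement on the parsed dicts rather than on the raw text avoids having to tell key commas apart from separator commas.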
