Convert string to dict, then access key:values??? How to access data in a <class 'dict'> for Python?
Problem description
I am having issues accessing data inside a dictionary.
Sys: Macbook 2012
Python: Python 3.5.1 :: Continuum Analytics, Inc.
I am working with a dask.dataframe created from a csv.
Assume I start out with a Pandas Series:
df.Coordinates
130 {u'type': u'Point', u'coordinates': [-43.30175...
278 {u'type': u'Point', u'coordinates': [-51.17913...
425 {u'type': u'Point', u'coordinates': [-43.17986...
440 {u'type': u'Point', u'coordinates': [-51.16376...
877 {u'type': u'Point', u'coordinates': [-43.17986...
1313 {u'type': u'Point', u'coordinates': [-49.72688...
1734 {u'type': u'Point', u'coordinates': [-43.57405...
1817 {u'type': u'Point', u'coordinates': [-43.77649...
1835 {u'type': u'Point', u'coordinates': [-43.17132...
2739 {u'type': u'Point', u'coordinates': [-43.19583...
2915 {u'type': u'Point', u'coordinates': [-43.17986...
3035 {u'type': u'Point', u'coordinates': [-51.01583...
3097 {u'type': u'Point', u'coordinates': [-43.17891...
3974 {u'type': u'Point', u'coordinates': [-8.633880...
3983 {u'type': u'Point', u'coordinates': [-46.64960...
4424 {u'type': u'Point', u'coordinates': [-43.17986...
The problem is, this is not a true dataframe of dictionaries. Instead, it's a column full of strings that LOOK like dictionaries. Running this shows it:
df.Coordinates.apply(type)
130 <class 'str'>
278 <class 'str'>
425 <class 'str'>
440 <class 'str'>
877 <class 'str'>
1313 <class 'str'>
1734 <class 'str'>
1817 <class 'str'>
1835 <class 'str'>
2739 <class 'str'>
2915 <class 'str'>
3035 <class 'str'>
3097 <class 'str'>
3974 <class 'str'>
3983 <class 'str'>
4424 <class 'str'>
My goal: access the coordinates key and value in the dictionary. That's it. But it's a str.
I converted the strings to dictionaries using eval.
new = df.Coordinates.apply(eval)
130 {'coordinates': [-43.301755, -22.990065], 'typ...
278 {'coordinates': [-51.17913026, -30.01201896], ...
425 {'coordinates': [-43.17986794, -22.91000096], ...
440 {'coordinates': [-51.16376782, -29.95488677], ...
877 {'coordinates': [-43.17986794, -22.91000096], ...
1313 {'coordinates': [-49.72688407, -29.33757253], ...
1734 {'coordinates': [-43.574057, -22.928059], 'typ...
1817 {'coordinates': [-43.77649254, -22.86940539], ...
1835 {'coordinates': [-43.17132318, -22.90895217], ...
2739 {'coordinates': [-43.1958313, -22.98755333], '...
2915 {'coordinates': [-43.17986794, -22.91000096], ...
3035 {'coordinates': [-51.01583481, -29.63593292], ...
3097 {'coordinates': [-43.17891379, -22.96476163], ...
3974 {'coordinates': [-8.63388008, 41.14594453], 't...
3983 {'coordinates': [-46.64960938, -23.55902666], ...
4424 {'coordinates': [-43.17986794, -22.91000096], ...
Next I test the type of each object and get:
130 <class 'dict'>
278 <class 'dict'>
425 <class 'dict'>
440 <class 'dict'>
877 <class 'dict'>
1313 <class 'dict'>
1734 <class 'dict'>
1817 <class 'dict'>
1835 <class 'dict'>
2739 <class 'dict'>
2915 <class 'dict'>
3035 <class 'dict'>
3097 <class 'dict'>
3974 <class 'dict'>
3983 <class 'dict'>
4424 <class 'dict'>
If I try to access my dictionaries with new.apply(lambda x: x['coordinates']), I get:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-71-c0ad459ed1cc> in <module>()
----> 1 dfCombined.Coordinates.apply(coord_getter)
/Users/linwood/anaconda/envs/dataAnalysisWithPython/lib/python3.5/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
2218 else:
2219 values = self.asobject
-> 2220 mapped = lib.map_infer(values, f, convert=convert_dtype)
2221
2222 if len(mapped) and isinstance(mapped[0], Series):
pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62658)()
<ipython-input-68-748ce2d8529e> in coord_getter(row)
1 import ast
2 def coord_getter(row):
----> 3 return (ast.literal_eval(row))['coordinates']
TypeError: 'bool' object is not subscriptable
It's some type of class, because when I run dir I get this for one object:
new.apply(lambda x: dir(x))[130]
130 __class__
130 __contains__
130 __delattr__
130 __delitem__
130 __dir__
130 __doc__
130 __eq__
130 __format__
130 __ge__
130 __getattribute__
130 __getitem__
130 __gt__
130 __hash__
130 __init__
130 __iter__
130 __le__
130 __len__
130 __lt__
130 __ne__
130 __new__
130 __reduce__
130 __reduce_ex__
130 __repr__
130 __setattr__
130 __setitem__
130 __sizeof__
130 __str__
130 __subclasshook__
130 clear
130 copy
130 fromkeys
130 get
130 items
130 keys
130 pop
130 popitem
130 setdefault
130 update
130 values
Name: Coordinates, dtype: object
My problem: I just want to access the dictionary. But the object is <class 'dict'>. How do I convert this to a regular dict, or just access the key:value pairs?
Any ideas?
Recommended answer
My first instinct is to use json.loads to cast the strings into dicts. But the example you've posted does not follow the JSON standard, since it uses single instead of double quotes. So you have to convert the strings first.
A second option is to just use regex to parse the strings. If the dict strings in your actual DataFrame do not exactly match my examples, I expect the regex method to be more robust, since lat/long coords are fairly standard.
import json
import re
import pandas as pd

df = pd.DataFrame(data={'Coordinates': ["{u'type': u'Point', u'coordinates': [-43.30175, 123.45]}",
                                        "{u'type': u'Point', u'coordinates': [-51.17913, 123.45]}"],
                        'idx': [130, 278]})

##
# Solution 1 - use json.loads
##
def string_to_dict(dict_string):
    # Convert to proper json format: double quotes, no u prefix
    dict_string = dict_string.replace("'", '"').replace('u"', '"')
    return json.loads(dict_string)

# Assign with [] so that a real column is created (df.CoordDicts = ... would not create one)
df['CoordDicts'] = df.Coordinates.apply(string_to_dict)
df.CoordDicts[0]['coordinates']
#>>> [-43.30175, 123.45]

##
# Solution 2 - use regex
##
def get_lat_lon(dict_string):
    # Get the coordinates string with regex
    rs = re.search(r"(-?\d+(\.\d+)?),\s*(-?\d+(\.\d+)?)", dict_string).group()
    # Cast to floats
    coords = [float(x) for x in rs.split(',')]
    return coords

df['Coords'] = df.Coordinates.apply(get_lat_lon)
df.Coords[0]
#>>> [-43.30175, 123.45]
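A third possibility, offered only as a sketch: the traceback in the question ends with "'bool' object is not subscriptable" inside ast.literal_eval(row)['coordinates'], which suggests (this is an assumption, I cannot see the data) that at least one row holds something like the string "False" rather than a dict-like string. A guarded version of coord_getter could skip such rows instead of failing. The sample values below are hypothetical and only there to exercise each branch.

```python
import ast


def coord_getter(value):
    """Return the 'coordinates' list from a dict-like string, or None."""
    if not isinstance(value, str):
        return None  # e.g. a missing value that is not a string at all
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return None  # the string is not a valid Python literal
    if not isinstance(parsed, dict):
        return None  # e.g. the string "False" parses to a bool
    return parsed.get("coordinates")


# Hypothetical rows: two dict-like strings, one "False", one missing value
rows = [
    "{u'type': u'Point', u'coordinates': [-43.30175, 123.45]}",
    "False",
    None,
    "{u'type': u'Point', u'coordinates': [-51.17913, 123.45]}",
]

coords = [coord_getter(row) for row in rows]
print(coords)
```

The same function should work with df.Coordinates.apply(coord_getter). Unlike eval, ast.literal_eval only accepts Python literals, so it does not execute arbitrary code found in the csv.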