Convert string to dict, then access key:values? How to access data in a <class 'dict'> in Python?

Question

I am having issues accessing data inside a dictionary.

Sys: Macbook 2012
Python: Python 3.5.1 :: Continuum Analytics, Inc.

I am working with a dask.dataframe created from a csv.

Assume I start out with a Pandas Series:

df.Coordinates
130      {u'type': u'Point', u'coordinates': [-43.30175...
278      {u'type': u'Point', u'coordinates': [-51.17913...
425      {u'type': u'Point', u'coordinates': [-43.17986...
440      {u'type': u'Point', u'coordinates': [-51.16376...
877      {u'type': u'Point', u'coordinates': [-43.17986...
1313     {u'type': u'Point', u'coordinates': [-49.72688...
1734     {u'type': u'Point', u'coordinates': [-43.57405...
1817     {u'type': u'Point', u'coordinates': [-43.77649...
1835     {u'type': u'Point', u'coordinates': [-43.17132...
2739     {u'type': u'Point', u'coordinates': [-43.19583...
2915     {u'type': u'Point', u'coordinates': [-43.17986...
3035     {u'type': u'Point', u'coordinates': [-51.01583...
3097     {u'type': u'Point', u'coordinates': [-43.17891...
3974     {u'type': u'Point', u'coordinates': [-8.633880...
3983     {u'type': u'Point', u'coordinates': [-46.64960...
4424     {u'type': u'Point', u'coordinates': [-43.17986...

The problem is, this is not a true dataframe of dictionaries. Instead, it's a column full of strings that LOOK like dictionaries. Running this shows it:

df.Coordinates.apply(type)
130      <class 'str'>
278      <class 'str'>
425      <class 'str'>
440      <class 'str'>
877      <class 'str'>
1313     <class 'str'>
1734     <class 'str'>
1817     <class 'str'>
1835     <class 'str'>
2739     <class 'str'>
2915     <class 'str'>
3035     <class 'str'>
3097     <class 'str'>
3974     <class 'str'>
3983     <class 'str'>
4424     <class 'str'>

My Goal: Access the coordinates key and value in the dictionary. That's it. But it's a str.

I converted the strings to dictionaries using eval.

new = df.Coordinates.apply(eval)
130      {'coordinates': [-43.301755, -22.990065], 'typ...
278      {'coordinates': [-51.17913026, -30.01201896], ...
425      {'coordinates': [-43.17986794, -22.91000096], ...
440      {'coordinates': [-51.16376782, -29.95488677], ...
877      {'coordinates': [-43.17986794, -22.91000096], ...
1313     {'coordinates': [-49.72688407, -29.33757253], ...
1734     {'coordinates': [-43.574057, -22.928059], 'typ...
1817     {'coordinates': [-43.77649254, -22.86940539], ...
1835     {'coordinates': [-43.17132318, -22.90895217], ...
2739     {'coordinates': [-43.1958313, -22.98755333], '...
2915     {'coordinates': [-43.17986794, -22.91000096], ...
3035     {'coordinates': [-51.01583481, -29.63593292], ...
3097     {'coordinates': [-43.17891379, -22.96476163], ...
3974     {'coordinates': [-8.63388008, 41.14594453], 't...
3983     {'coordinates': [-46.64960938, -23.55902666], ...
4424     {'coordinates': [-43.17986794, -22.91000096], ...

Next I test the type of each object and get:

130      <class 'dict'>
278      <class 'dict'>
425      <class 'dict'>
440      <class 'dict'>
877      <class 'dict'>
1313     <class 'dict'>
1734     <class 'dict'>
1817     <class 'dict'>
1835     <class 'dict'>
2739     <class 'dict'>
2915     <class 'dict'>
3035     <class 'dict'>
3097     <class 'dict'>
3974     <class 'dict'>
3983     <class 'dict'>
4424     <class 'dict'>

If I try to access my dictionaries with new.apply(lambda x: x['coordinates']), I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-71-c0ad459ed1cc> in <module>()
----> 1 dfCombined.Coordinates.apply(coord_getter)

/Users/linwood/anaconda/envs/dataAnalysisWithPython/lib/python3.5/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   2218         else:
   2219             values = self.asobject
-> 2220             mapped = lib.map_infer(values, f, convert=convert_dtype)
   2221 
   2222         if len(mapped) and isinstance(mapped[0], Series):

pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62658)()

<ipython-input-68-748ce2d8529e> in coord_getter(row)
      1 import ast
      2 def coord_getter(row):
----> 3     return (ast.literal_eval(row))['coordinates']

TypeError: 'bool' object is not subscriptable

It's some type of class, because when I run dir I get this for one object:

new.apply(lambda x: dir(x))[130]
130           __class__
130        __contains__
130         __delattr__
130         __delitem__
130             __dir__
130             __doc__
130              __eq__
130          __format__
130              __ge__
130    __getattribute__
130         __getitem__
130              __gt__
130            __hash__
130            __init__
130            __iter__
130              __le__
130             __len__
130              __lt__
130              __ne__
130             __new__
130          __reduce__
130       __reduce_ex__
130            __repr__
130         __setattr__
130         __setitem__
130          __sizeof__
130             __str__
130    __subclasshook__
130               clear
130                copy
130            fromkeys
130                 get
130               items
130                keys
130                 pop
130             popitem
130          setdefault
130              update
130              values
Name: Coordinates, dtype: object

My Problem: I just want to access the dictionary. But, the object is <class 'dict'>. How do I convert this to a regular dict or just access the key:value pairs?

Any ideas?

Recommended answer

My first instinct is to use json.loads to cast the strings into dicts. But the example you've posted does not follow the JSON standard, since it uses single quotes instead of double quotes. So you have to convert the strings first.
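To illustrate the quoting problem, here is a minimal sketch using one of the sample strings from the question. The names raw, normalized, and parsed_ok are made up for this illustration, and the quote swap assumes the values themselves contain no apostrophes.

```python
import json

# One of the dict-like strings from the question (Python 2 repr style).
raw = "{u'type': u'Point', u'coordinates': [-43.30175, 123.45]}"

# json.loads rejects it: JSON requires double quotes and has no u prefix.
try:
    json.loads(raw)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

# After dropping the u prefixes and swapping the quotes, it parses.
# Assumption: no apostrophes or "u'" sequences occur inside the values.
normalized = raw.replace("u'", "'").replace("'", '"')
point = json.loads(normalized)

print(parsed_ok)             # False
print(point['coordinates'])  # [-43.30175, 123.45]
```

The result of json.loads is an ordinary dict, so normal key access works on it.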

A second option is to just use regex to parse the strings. If the dict strings in your actual DataFrame do not exactly match my examples, I expect the regex method to be more robust since lat/long coords are fairly standard.

import json
import re

import pandas as pd

df = pd.DataFrame(data={'Coordinates':["{u'type': u'Point', u'coordinates': [-43.30175, 123.45]}",
    "{u'type': u'Point', u'coordinates': [-51.17913, 123.45]}"],
    'idx': [130, 278]})


##
# Solution 1 - use json.loads
##

def string_to_dict(dict_string):
    # Convert to proper json format: swap single quotes for double quotes,
    # then drop the u prefixes left over from the Python 2 repr
    dict_string = dict_string.replace("'", '"').replace('u"', '"')
    return json.loads(dict_string)

# Bracket assignment creates a real column (attribute assignment does not)
df['CoordDicts'] = df.Coordinates.apply(string_to_dict)
df.CoordDicts[0]['coordinates']
#>>> [-43.30175, 123.45]


##
# Solution 2 - use regex
##
def get_lat_lon(dict_string):
    # Get the "number, number" coordinates substring with a regex
    rs = re.search(r"(-?\d+(\.\d+)?),\s*(-?\d+(\.\d+)?)", dict_string).group()
    # Cast to floats (float() tolerates the leading space after the comma)
    coords = [float(x) for x in rs.split(',')]
    return coords

df['Coords'] = df.Coordinates.apply(get_lat_lon)
df.Coords[0]
#>>> [-43.30175, 123.45]
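A note on the error in the question: <class 'dict'> is simply how Python 3 prints the ordinary dict type, so no further conversion is needed once a row has been parsed. The traceback instead shows that ast.literal_eval returned a bool for at least one row, which suggests that some rows hold something other than a dict string. Here is a hedged sketch of a parser that tolerates such rows. The function name safe_coord_getter and the sample rows are hypothetical, and returning None for unusable rows is an assumption about the desired behavior.

```python
import ast

def safe_coord_getter(value):
    """Return the 'coordinates' list from a dict-like string, else None."""
    # Non-string cells (e.g. NaN from missing data) cannot be parsed.
    if not isinstance(value, str):
        return None
    try:
        # literal_eval only accepts Python literals, unlike eval.
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return None
    # Strings such as "False" parse fine but are not dicts.
    if not isinstance(parsed, dict):
        return None
    return parsed.get('coordinates')

# Hypothetical sample rows: one valid dict string, one bool string, one NaN.
rows = [
    "{u'type': u'Point', u'coordinates': [-43.30175, 123.45]}",
    "False",
    float("nan"),
]
results = [safe_coord_getter(r) for r in rows]
print(results)  # [[-43.30175, 123.45], None, None]
```

On the real data the same function could be passed to df.Coordinates.apply, after which the rows that came back as None can be inspected or dropped.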
