TypeError: 'Zipcode' object is not subscriptable

Problem description

I'm using Python 3 and have a pandas DataFrame that looks like this:

    zip
0   07105
1   00000
2   07030
3   07032
4   07032

I would like to add state and city columns using the Python package uszipcode:

import uszipcode
search = SearchEngine(simple_zipcode=False)
def zco(x):
    print(search.by_zipcode(x)['City'])

df['City'] = df[['zip']].fillna(0).astype(int).apply(zco)

However, I get the following error:

TypeError: 'Zipcode' object is not subscriptable

Can someone help with the error? Thank you in advance.

Recommended answer

The call search.by_zipcode(x) returns a Zipcode instance, not a dictionary, so subscripting that object with ['City'] fails.

Instead, use either the .major_city attribute or its shorter alias, the .city attribute. You also want to return that value, not print it:

def zco(x):
    return search.by_zipcode(x).city
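
To see why the subscript fails while attribute access works, here is a minimal sketch that does not need uszipcode at all. FakeZipcode and city_of are hypothetical names invented for this illustration, and the sample values are only illustrative; the real result object has many more attributes.

```python
from dataclasses import dataclass


@dataclass
class FakeZipcode:
    """Hypothetical stand-in for the object returned by search.by_zipcode()."""
    zipcode: str
    major_city: str
    state: str


def city_of(result):
    # Plain objects expose their data as attributes, not as dictionary keys.
    return result.major_city


result = FakeZipcode(zipcode="07105", major_city="Newark", state="NJ")

error_message = None
try:
    result["City"]  # the class defines no __getitem__, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)

city = city_of(result)
print(error_message)
print(city)
```

The same reasoning applies to any object that is not a mapping or sequence: read its attributes with dot notation.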

If all you are going to use the uszipcode project for is mapping zip codes to state and city names, you don't need the full database (a 450MB download). Stick with the 'simple' version, which is only 9MB, by leaving out the simple_zipcode=False argument to SearchEngine().

Next, this is going to be really slow. .apply() uses a simple loop under the hood, and for each row the .by_zipcode() method queries a SQLite database through SQLAlchemy, creates a single result object with all the columns from the matching row, and returns that object, just so you can read a single attribute from it.
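
One general way to reduce that per-row cost, independent of uszipcode, is to look up each distinct zip code only once and map the results back onto the column. The sketch below assumes a hypothetical slow_lookup function standing in for the per-row database call; the city names are illustrative sample data.

```python
import pandas as pd

calls = 0


def slow_lookup(zip_code):
    """Hypothetical stand-in for an expensive per-value database query."""
    global calls
    calls += 1
    sample = {"07105": "Newark", "07030": "Hoboken", "07032": "Kearny"}
    return sample.get(zip_code)


df = pd.DataFrame({"zip": ["07105", "00000", "07030", "07032", "07032"]})

# One lookup per distinct zip code instead of one per row.
unique_zips = df["zip"].dropna().unique()
city_by_zip = {z: slow_lookup(z) for z in unique_zips}

# Series.map translates every row through the small dictionary.
df["city"] = df["zip"].map(city_by_zip)
print(df)
print("lookups performed:", calls)
```

This still issues one query per distinct value, so it only helps when the column contains many repeats.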

You'd be much better off querying the database directly with the pandas SQL methods. The uszipcode package is still useful here, because it handles downloading the database and creating a SQLAlchemy session for you. The SearchEngine.ses attribute gives you direct access to that session, and from there I'd just do:

import pandas as pd
from uszipcode import SearchEngine, SimpleZipcode

search = SearchEngine()
query = (
    search.ses.query(
        SimpleZipcode.zipcode.label('zip'),
        SimpleZipcode.major_city.label('city'),
        SimpleZipcode.state.label('state'),
    ).filter(
        SimpleZipcode.zipcode.in_(df['zip'].dropna().unique())
    )
).selectable
zipcode_df = pd.read_sql_query(query, search.ses.connection(), index_col='zip')

This creates a pandas DataFrame with all your unique zip codes mapped to city and state columns. You can then join your dataframe with the zipcode dataframe:

df = pd.merge(df, zipcode_df, how='left', left_on='zip', right_index=True)
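
To illustrate what that merge does without needing the database, here is a small sketch with a hand-built stand-in for zipcode_df. The lookup rows are illustrative sample data, and the sketch assumes that zip codes are stored as zero-padded strings on both sides so that the keys compare equal.

```python
import pandas as pd

df = pd.DataFrame({"zip": ["07105", "00000", "07030", "07032", "07032"]})

# Hand-built stand-in for the frame that read_sql_query would return,
# indexed by zip code.
zipcode_df = pd.DataFrame(
    {"city": ["Newark", "Hoboken", "Kearny"], "state": ["NJ", "NJ", "NJ"]},
    index=pd.Index(["07105", "07030", "07032"], name="zip"),
)

# A left join keeps every row of df; zip codes without a match get NaN.
merged = pd.merge(df, zipcode_df, how="left", left_on="zip", right_index=True)
print(merged)
```

The row with zip 00000 has no match, so its city and state are missing values rather than the row being dropped.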

This adds city and state columns to your original dataframe. If you need to pull in more columns, add them to the search.ses.query(...) portion, using .label() to give them a suitable column name in the output dataframe (without a .label(), they'll get prefixed with simple_zipcode_ or zipcode_, depending on the class you are using). Pick from the documented model attributes, but keep in mind that if you need access to the full Zipcode model attributes, you need to use SearchEngine(simple_zipcode=False) to ensure you have the full 450MB dataset at your disposal, and then use Zipcode.<column>.label(...) instead of SimpleZipcode.<column>.label(...) in the query.

With the zip codes as the index of the zipcode_df dataframe, that's going to be a lot faster (zippier :-)) than using SQLAlchemy on each row individually.
