Pandas KeyError: "['value'] not in index"

Problem description

I'm having some issues with the index from a Pandas data frame. What I'm trying to do is load data from a JSON file, create a Pandas data frame and then select specific fields from that data frame and send it to my database.

The following is a link to what's in the JSON file so you can see the fields actually exist: https://pastebin.com/Bzatkg4L

import pandas as pd
from pandas.io import sql
import MySQLdb
from sqlalchemy import create_engine

# Open and read the text file where all the Tweets are
with open('US_tweets.json') as f:
    tweets = f.readlines()

# Convert the list of Tweets into a structured dataframe
df = pd.DataFrame(tweets)
# Attributes needed should be here
df = df[['created_at', 'screen_name', 'id', 'country_code', 'full_name', 'lang', 'text']]

# To create connection and write table into MySQL
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="blah",
                               pw="blah",
                               db="blah"))

df.to_sql(con=engine, name='US_tweets_Table', if_exists='replace', flavor='mysql')

Thanks for your help!

Recommended answer

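Pandas doesn't map every object in the JSON file to a column in the dataframe: only the top-level keys become columns, and nested objects such as user and place each stay inside a single column. Your example file contains 24 columns: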

import pandas as pd

# lines=True: the file holds one JSON object per line
with open('tweets.json') as f:
    df = pd.read_json(f, lines=True)
df.columns

Returns:

Index(['contributors', 'coordinates', 'created_at', 'entities',
   'favorite_count', 'favorited', 'geo', 'id', 'id_str',
   'in_reply_to_screen_name', 'in_reply_to_status_id',
   'in_reply_to_status_id_str', 'in_reply_to_user_id',
   'in_reply_to_user_id_str', 'is_quote_status', 'lang', 'metadata',
   'place', 'retweet_count', 'retweeted', 'source', 'text', 'truncated',
   'user'],
  dtype='object')
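
Of the seven names requested in the question, only created_at, id, lang and text appear in this list, which is why selecting all seven raises the KeyError. The following is a minimal sketch of checking for that before selecting. It uses a made-up one-row frame rather than the real tweet file, and the helper name missing_columns is hypothetical, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for the tweet frame: top-level keys become columns,
# nested objects (here "user" and "place") stay as dicts in a single column.
df = pd.DataFrame([
    {"created_at": "Mon Jan 01 00:00:00 +0000 2018", "id": 1, "lang": "en",
     "text": "hello",
     "user": {"screen_name": "alice"},
     "place": {"full_name": "Austin, TX", "country_code": "US"}},
])

wanted = ["created_at", "screen_name", "id", "country_code",
          "full_name", "lang", "text"]


def missing_columns(frame, names):
    """Return the requested names that are not columns of frame (hypothetical helper)."""
    return [name for name in names if name not in frame.columns]


# These are the names that df[wanted] would complain about.
missing = missing_columns(df, wanted)
print(missing)

# Selecting only the names that exist avoids the KeyError.
present = [name for name in wanted if name in df.columns]
subset = df[present]
```

The missing names are exactly the ones that live inside the nested user and place objects, so they have to be extracted before they can be selected.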

To dig deeper into the JSON data, I found this solution, but I hope a more elegant approach exists: How do I access embedded json objects in a Pandas DataFrame?

For example, df['entities'].apply(pd.Series)['urls'].apply(pd.Series)[0].apply(pd.Series)['indices'][0][0] returns 117.

To access full_name and copy it into df as a top-level column, try this: df['full_name'] = df['place'].apply(pd.Series)['full_name']. Printing df['full_name'] afterwards shows 0 Austin, TX (row index 0, value Austin, TX).
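
One possibly more elegant route, offered as a sketch and not as the answerer's method, is pd.json_normalize (available in pandas 1.0 and later), which expands nested dicts into columns named parent.child. The two sample tweets below are invented and trimmed to the fields from the question, and the assumption that screen_name sits under user and country_code under place comes from the usual tweet layout, so check it against your own file.

```python
import json
import pandas as pd

# Two hypothetical tweets in JSON Lines form (one object per line), trimmed
# to the fields the question asks for. Real tweets carry many more keys.
raw_lines = [
    '{"created_at": "Mon Jan 01 00:00:00 +0000 2018", "id": 1, "lang": "en",'
    ' "text": "hello", "user": {"screen_name": "alice"},'
    ' "place": {"full_name": "Austin, TX", "country_code": "US"}}',
    '{"created_at": "Mon Jan 01 00:01:00 +0000 2018", "id": 2, "lang": "en",'
    ' "text": "world", "user": {"screen_name": "bob"},'
    ' "place": {"full_name": "Dallas, TX", "country_code": "US"}}',
]

# Parse each line into a dict instead of keeping it as a raw string.
records = [json.loads(line) for line in raw_lines]

# json_normalize expands nested dicts into "parent.child" columns.
flat = pd.json_normalize(records)

# Map the flattened names back to the names used in the question.
flat = flat.rename(columns={
    "user.screen_name": "screen_name",
    "place.full_name": "full_name",
    "place.country_code": "country_code",
})

# All seven names now exist, so the selection no longer raises KeyError.
df = flat[["created_at", "screen_name", "id", "country_code",
           "full_name", "lang", "text"]]
print(df)
```

With a real file, the raw_lines list would be replaced by the lines read from US_tweets.json; the resulting df can then be passed to to_sql as in the question.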
