解析json文件 [英] Parsing json files
问题描述
下面是用于可视化分析.json文件中获得的一组推文的代码.解释后,会在map()函数中显示错误.有什么办法解决吗?
Below is the code for visual analysis of a set of tweets obtained in a .json file. Upon interpreting , an error is shown at the map() function. Any way to fix it?
import json
import pandas as pd
import matplotlib.pyplot as plt
tweets_data_path = 'import_requests.txt'
tweets_data = []
tweets_file = open(tweets_data_path, "r")
for line in tweets_file:
try:
tweet = json.loads(line)
tweets_data.append(tweet)
except:
continue
print(len(tweets_data))
tweets = pd.DataFrame()
tweets['text'] = map(lambda tweet: tweet['text'], tweets_data)
这些是导致我获得上述代码的"ValueError"消息的行:
These are the lines leading up to the 'ValueError' message I am getting for the above code :
回溯(最近通话最近): 在第21行的文件"tweet_len.py"中 tweets ['text'] = map(lambda tweet:tweet ['text'],tweets_data)
setitem 中的文件"/usr/lib/python3/dist-packages/pandas/core/frame.py",行1887 self._set_item(键,值)
_set_item
中第1966行的文件"/usr/lib/python3/dist-packages/pandas/core/frame.py" self._ensure_valid_index(值) _ensure_valid_index
中的文件"/usr/lib/python3/dist-packages/pandas/core/frame.py",行1943 引发ValueError('无法设置没有定义索引的框架' ValueError:无法设置没有定义索引和无法转换为系列的值的框架
Traceback (most recent call last): File "tweet_len.py", line 21, in tweets['text'] = map(lambda tweet: tweet['text'], tweets_data)
File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 1887, in setitem self._set_item(key, value)
File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 1966, in _set_item
self._ensure_valid_index(value) File "/usr/lib/python3/dist-packages/pandas/core/frame.py", line 1943, in _ensure_valid_index
raise ValueError('Cannot set a frame with no defined index ' ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
我正在使用Python3.
I am using Python3.
以下是收集的Twitter数据的示例(.json格式).
EDIT : Below is a sample of the twitter data collected ( .json format).
{
"created_at": "Sat Mar 05 05:47:23 +0000 2016",
"id": 705993088574033920,
"id_str": "705993088574033920",
"text": "Tumi Inc. civil war: Staff manning US ceasefire hotline 'can't speak Arabic' #fakeheadlinebot #learntocode #makeatwitterbot #javascript",
"source": "\u003ca href=\"http://javascriptiseasy.com\" rel=\"nofollow\"\u003eJavaScript is Easy\u003c/a\u003e",
"truncated": false,
"in_reply_to_status_id": null,
"in_reply_to_status_id_str": null,
"in_reply_to_user_id": null,
"in_reply_to_user_id_str": null,
"in_reply_to_screen_name": null,
"user": {
"id": 4382400263,
"id_str": "4382400263",
"name": "JavaScript is Easy",
"screen_name": "javascriptisez",
"location": "Your Console",
"url": "http://javascriptiseasy.com",
"description": "Get learning!",
"protected": false,
"verified": false,
"followers_count": 167,
"friends_count": 68,
"listed_count": 212,
"favourites_count": 11,
"statuses_count": 55501,
"created_at": "Sat Dec 05 11:18:00 +0000 2015",
"utc_offset": null,
"time_zone": null,
"geo_enabled": false,
"lang": "en",
"contributors_enabled": false,
"is_translator": false,
"profile_background_color": "000000",
"profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
"profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
"profile_background_tile": false,
"profile_link_color": "FFCC4D",
"profile_sidebar_border_color": "000000",
"profile_sidebar_fill_color": "000000",
"profile_text_color": "000000",
"profile_use_background_image": false,
"profile_image_url": "http://pbs.twimg.com/profile_images/673099606348070912/xNxp4zOt_normal.jpg",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/673099606348070912/xNxp4zOt_normal.jpg",
"profile_banner_url": "https://pbs.twimg.com/profile_banners/4382400263/1449314370",
"default_profile": false,
"default_profile_image": false,
"following": null,
"follow_request_sent": null,
"notifications": null
},
"geo": null,
"coordinates": null,
"place": null,
"contributors": null,
"is_quote_status": false,
"retweet_count": 0,
"favorite_count": 0,
"entities": {
"hashtags": [{
"text": "fakeheadlinebot",
"indices": [77, 93]
}, {
"text": "learntocode",
"indices": [94, 106]
}, {
"text": "makeatwitterbot",
"indices": [107, 123]
}, {
"text": "javascript",
"indices": [124, 135]
}],
"urls": [],
"user_mentions": [],
"symbols": []
},
"favorited": false,
"retweeted": false,
"filter_level": "low",
"lang": "en",
"timestamp_ms": "1457156843690"
}
推荐答案
I think you can use read_json
:
import pandas as pd
df = pd.read_json('file.json')
print df.head()
contributors coordinates created_at entities \
contributors_enabled NaN NaN 2016-03-05 05:47:23 NaN
created_at NaN NaN 2016-03-05 05:47:23 NaN
default_profile NaN NaN 2016-03-05 05:47:23 NaN
default_profile_image NaN NaN 2016-03-05 05:47:23 NaN
description NaN NaN 2016-03-05 05:47:23 NaN
favorite_count favorited filter_level geo \
contributors_enabled 0 False low NaN
created_at 0 False low NaN
default_profile 0 False low NaN
default_profile_image 0 False low NaN
description 0 False low NaN
id id_str \
contributors_enabled 705993088574033920 705993088574033920
created_at 705993088574033920 705993088574033920
default_profile 705993088574033920 705993088574033920
default_profile_image 705993088574033920 705993088574033920
description 705993088574033920 705993088574033920
... is_quote_status lang \
contributors_enabled ... False en
created_at ... False en
default_profile ... False en
default_profile_image ... False en
description ... False en
place retweet_count retweeted \
contributors_enabled NaN 0 False
created_at NaN 0 False
default_profile NaN 0 False
default_profile_image NaN 0 False
description NaN 0 False
source \
contributors_enabled <a href="http://javascriptiseasy.com" rel="nof...
created_at <a href="http://javascriptiseasy.com" rel="nof...
default_profile <a href="http://javascriptiseasy.com" rel="nof...
default_profile_image <a href="http://javascriptiseasy.com" rel="nof...
description <a href="http://javascriptiseasy.com" rel="nof...
text \
contributors_enabled Tumi Inc. civil war: Staff manning US ceasefir...
created_at Tumi Inc. civil war: Staff manning US ceasefir...
default_profile Tumi Inc. civil war: Staff manning US ceasefir...
default_profile_image Tumi Inc. civil war: Staff manning US ceasefir...
description Tumi Inc. civil war: Staff manning US ceasefir...
timestamp_ms truncated \
contributors_enabled 2016-03-05 05:47:23.690 False
created_at 2016-03-05 05:47:23.690 False
default_profile 2016-03-05 05:47:23.690 False
default_profile_image 2016-03-05 05:47:23.690 False
description 2016-03-05 05:47:23.690 False
user
contributors_enabled False
created_at Sat Dec 05 11:18:00 +0000 2015
default_profile False
default_profile_image False
description Get learning!
[5 rows x 25 columns]
这篇关于解析json文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!