How to do sentiment analysis on Twitter data using Python and the AFINN library?


Problem description

The question is as follows:

Find out the views of different people on demonetisation by analysing tweets from Twitter. Use the Twitter API to extract all tweets related to demonetisation in India by using appropriate filters. Split each tweet's text into words to calculate the sentiment of the whole tweet. Rate each word according to its meaning, from +5 to -5, using the AFINN dictionary. AFINN is a dictionary of roughly 2,500 words, each rated from +5 to -5 depending on its meaning. You can download the dictionary from the following link: **AFINN dictionary ( https://drive.google.com/file/d/1AjcJCNH2Oc9j-bJgjdELaUMlWZ3pXyfZ/view )** Now, you have to generate a sentiment rating for each tweet. After that, perform the sentiment analysis by separating the positive-sentiment tweets from the negative-sentiment tweets.

What I have tried:

I have extracted the Twitter data, which is in JSON format, but I do not know how to proceed because I am not familiar with JSON. Can anybody please give me code for what to do next?

import tweepy
import json
import time

#Twitter API credentials
consumer_key = "tUnfouXORfnaVNEAyTrLmW2ZU"
consumer_secret = "Yxmd1sLKqp2YwXzJ5IJjaVO6PtrOeq1lKyl5AS2Zu2zktjYZKQ"
access_key = "1215780002-2fC55jHbZ4X7NDHgKFJMO1g63Aw0jn1zdmhJjs8"
access_secret = "MJfwXrZ9hKvfb8EUba7eoKlu5BIPDwRDKAXHZOBPdPc2p"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
#refer http://docs.tweepy.org/en/v3.2.0/api.html#API
#tells tweepy.API to automatically wait for rate limits to replenish

#Put your search term
searchquery = "#Demonetisation"

tweets = tweepy.Cursor(api.search, q=searchquery).items()
count = 0
errorCount = 0

# Write one JSON object per line so the file can be read back line by line.
with open('search.json', 'w') as file:
    while True:
        try:
            tweet = next(tweets)
            count += 1
            # use count-break during dev to avoid twitter restrictions
            # if count > 10:
            #     break
        except tweepy.TweepError:
            # catches TweepError when rate limiting occurs, sleeps, then retries.
            # nominally 15 minutes, make a bit longer to avoid attention.
            print("sleeping....")
            time.sleep(60 * 16)
            continue
        except StopIteration:
            break
        try:
            print("Writing to JSON tweet number:" + str(count))
            json.dump(tweet._json, file, sort_keys=True)
            file.write("\n")
        except UnicodeEncodeError:
            errorCount += 1
            print("UnicodeEncodeError,errorCount =" + str(errorCount))

print("completed, errorCount =" + str(errorCount) + " total tweets=" + str(count))

Solution

#Twitter API credentials
consumer_key = "tUnfouXORfnaVNEAyTrLmW2ZU"
consumer_secret = "Yxmd1sLKqp2YwXzJ5IJjaVO6PtrOeq1lKyl5AS2Zu2zktjYZKQ"
access_key = "1215780002-2fC55jHbZ4X7NDHgKFJMO1g63Aw0jn1zdmhJjs8"
access_secret = "MJfwXrZ9hKvfb8EUba7eoKlu5BIPDwRDKAXHZOBPdPc2p"


Looks like you just created a big problem.
You have just published your Twitter credentials to the public internet.
Advice: have those credentials invalidated as fast as possible.
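
One common way to avoid this in future is to keep the secrets out of the source file, for example by reading them from environment variables. The following is only a minimal sketch: the variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are hypothetical and are not part of tweepy or of the original post.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_KEY",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, failing loudly if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

The returned values could then be passed to `tweepy.OAuthHandler` and `auth.set_access_token` in place of the string literals, so the script itself can be shared without exposing the keys.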


See section 19.2, "json: JSON encoder and decoder", in the Python 3.6.5 documentation.
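
As a rough illustration of the next step, the sketch below parses tweet JSON with `json.loads`, splits each tweet's text into words, and sums the AFINN scores. It rests on several assumptions: that the AFINN file has one `word<TAB>score` entry per line (as the published AFINN-111 list does), that each tweet object carries its text in a `"text"` field, and that a simple regular-expression tokenizer is good enough. The helper names `load_afinn`, `tweet_sentiment`, and `label` are made up for this example, and the small word list and tweets are sample data, not real results.

```python
import json
import re


def load_afinn(lines):
    """Build a word -> score dict from lines of the form 'word<TAB>score'."""
    scores = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        word, score = line.rsplit("\t", 1)
        scores[word] = int(score)
    return scores


def tweet_sentiment(text, scores):
    """Sum the scores of the words in a tweet; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(scores.get(word, 0) for word in words)


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up sample data standing in for the AFINN file and the saved tweets.
afinn = load_afinn(["good\t3", "bad\t-3", "disaster\t-2"])
raw_tweets = [
    '{"text": "Demonetisation is a good move"}',
    '{"text": "What a disaster, really bad planning"}',
]

for raw in raw_tweets:
    tweet = json.loads(raw)
    score = tweet_sentiment(tweet["text"], afinn)
    print(score, label(score), tweet["text"])
```

With a real dictionary file, an open file object could be passed to `load_afinn` directly, since iterating over a file yields its lines. Note that this word-by-word approach ignores any multi-word entries in the dictionary and does not handle negation, so the scores are only a first approximation.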

