Is there a way to extract IMDb reviews using IMDbPY?

Question

I do not need the dataset that is available on Kaggle. I want to extract movie reviews from IMDb using IMDbPY or any other scraping method.

https://imdbpy.github.io/

Solution

It is not obvious from the IMDbPY docs, but you can always inspect what a movie object contains by checking its keys. Not all of the information you are looking for is available immediately when you fetch a movie with IMDbPY. In your case you want the reviews, so you have to add them. The list of infosets shows that there are three different types of reviews: 'reviews', 'external reviews', and 'critic reviews'. The keys associated with these are not present until you update the movie with the corresponding infoset. The example below shows how it is done.

from imdb import IMDb

# create an instance of the IMDb class
ia = IMDb()

the_matrix = ia.get_movie('0133093')
print(sorted(the_matrix.keys()))

# show all information sets that can be fetched for a movie
print(ia.get_movie_infoset())

# fetch the three review-related infosets; each update adds its keys to the movie
ia.update(the_matrix, ['external reviews'])
ia.update(the_matrix, ['reviews'])
ia.update(the_matrix, ['critic reviews'])

# show which keys were added by each information set
print(the_matrix.infoset2keys['external reviews'])  # no external reviews, so no key is added
print(the_matrix.infoset2keys['reviews'])  # a lot of reviews; adds the key 'reviews'
print(the_matrix.infoset2keys['critic reviews'])  # adds the keys 'metascore' and 'metacritic url'

# print(the_matrix['reviews'])
print(sorted(the_matrix.keys()))  # check out the new keys that we have added
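
Once the 'reviews' infoset has been added, the_matrix['reviews'] should hold a list of review entries. Below is a minimal sketch of reading them. It assumes each entry is a dict with 'author', 'rating' and 'content' fields; those field names are an assumption and are not confirmed by the answer above, so check them first with print(the_matrix['reviews'][0].keys()). The helper names summarize_reviews and fetch_reviews, and the sample data, are hypothetical and introduced only for illustration.

```python
from typing import Any, Dict, List


def summarize_reviews(reviews: List[Dict[str, Any]], max_chars: int = 80) -> List[str]:
    """Return one short line per review dict.

    Assumes each review may carry 'author', 'rating' and 'content'
    entries (assumed field names); missing entries get placeholders.
    """
    lines = []
    for review in reviews:
        author = review.get('author') or 'unknown'
        rating = review.get('rating')
        rating_text = 'n/a' if rating is None else str(rating)
        content = (review.get('content') or '').strip()
        # shorten long review bodies so each summary stays on one line
        if len(content) > max_chars:
            content = content[:max_chars].rstrip() + '...'
        lines.append(f'{author} [{rating_text}]: {content}')
    return lines


def fetch_reviews(movie_id: str) -> List[Dict[str, Any]]:
    """Fetch the 'reviews' infoset for one movie (needs IMDbPY and network access)."""
    from imdb import IMDb  # imported here so summarize_reviews works without IMDbPY

    ia = IMDb()
    movie = ia.get_movie(movie_id)
    ia.update(movie, ['reviews'])
    # the 'reviews' key is only present if the infoset returned something
    return movie.get('reviews', [])


# hypothetical sample data, shaped like the assumed review dicts
sample = [
    {'author': 'user_a', 'rating': 9, 'content': 'Great film.'},
    {'content': 'No rating given.'},
]
for line in summarize_reviews(sample):
    print(line)
```

To run it against live data, replace the sample with fetch_reviews('0133093'). Keeping the formatting separate from the fetching makes it easy to adjust the field names if the actual review entries differ from what is assumed here.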
