Getting the list of likers for an Instagram post - Python & Selenium
Question
I'm practicing web crawling. To do so, I've challenged myself to get the list of all the people who liked a post on Instagram. My problem is that I'm stuck at the point where I only get the first 11 usernames of likers. I can't find the right way to automate the scrolling while collecting the likes.
Here is my process in a Jupyter Notebook (it doesn't work as a script yet):
from selenium import webdriver
import pandas as pd
driver = webdriver.Chrome()
driver.get('https://www.instagram.com/p/BuE82VfHRa6/')
userid_element = driver.find_elements_by_xpath('//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/div/a')[0].click()
elems = driver.find_elements_by_xpath("//*[@id]/div/a")
users = []
for elem in elems:
    users.append(elem.get_attribute('title'))
print(users)
Do you have any ideas?
Thank you very much.
Recommended answer
I guess the Instagram site keeps at most 17 liker elements on the page at a time.
So one pass of the loop does the following:
- get the element list from the page
- save it to my list
- scroll down to get new elements
- check whether this was the last scroll
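The four steps above can be sketched without a browser, which makes the stop condition easier to reason about. This is only a minimal illustration under stated assumptions: `collect_all` and `FakeLikersDialog` are hypothetical names, and the fake dialog merely imitates a list that renders a few entries at a time. It is not Instagram's actual behavior.

```python
from typing import Callable, List


def collect_all(
    get_visible: Callable[[], List[str]],
    scroll_down: Callable[[], None],
    get_position: Callable[[], object],
) -> List[str]:
    """Collect every name from a lazily loaded list, in first-seen order."""
    seen = {}  # dict keeps insertion order and gives fast membership checks
    position = get_position()
    while True:
        # Steps 1 and 2: read the currently rendered names and store new ones.
        for name in get_visible():
            seen.setdefault(name, None)
        # Step 3: scroll so the next batch gets loaded.
        scroll_down()
        # Step 4: stop once scrolling no longer changes the position marker.
        new_position = get_position()
        if new_position == position:
            break
        position = new_position
    return list(seen)


class FakeLikersDialog:
    """Hypothetical stand-in for the likers dialog: shows `window` names at a time."""

    def __init__(self, names, window=3):
        self.names = names
        self.window = window
        self.offset = 0

    def visible(self):
        return self.names[self.offset:self.offset + self.window]

    def scroll(self):
        # Move forward, but never past the point where the last window is shown.
        max_offset = max(len(self.names) - self.window, 0)
        self.offset = min(self.offset + self.window - 1, max_offset)

    def position(self):
        return self.offset


dialog = FakeLikersDialog([f"user{i}" for i in range(10)])
likers = collect_all(dialog.visible, dialog.scroll, dialog.position)
print(len(likers))
```

With Selenium, the three callables would wrap the XPath lookup, the `scrollIntoView` call and the `padding-top` read respectively.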
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.instagram.com/p/BuE82VfHRa6/')
# Click the likes link to open the list of likers.
driver.find_elements(By.XPATH, '//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/div/a')[0].click()
time.sleep(2)
# Here you can see the user list you want.
# You have to scroll down to load more data from the Instagram server.
# Loop until the padding-top of the user list container stops changing.
users = []
container_xpath = "/html/body/div[3]/div/div[2]/div/div"
height = driver.find_element(By.XPATH, container_xpath).value_of_css_property("padding-top")
match = False
while not match:
    last_height = height
    # step 1: get the element list from the page
    elements = driver.find_elements(By.XPATH, "//*[@id]/div/a")
    # step 2: save new usernames to the list
    for element in elements:
        title = element.get_attribute('title')
        if title not in users:
            users.append(title)
    # step 3: scroll the last element into view to load new elements
    driver.execute_script("return arguments[0].scrollIntoView();", elements[-1])
    time.sleep(1)
    # step 4: stop when scrolling no longer changes the container
    height = driver.find_element(By.XPATH, container_xpath).value_of_css_property("padding-top")
    if last_height == height:
        match = True
print(users)
print(len(users))
driver.quit()
I tested this on a post with nearly 100 likes, and it worked.
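The fixed `time.sleep` calls in the answer assume the next batch always arrives within one or two seconds. A possible alternative is to poll for a condition with a timeout. Selenium ships `WebDriverWait` for this purpose; the helper below is a library-independent sketch, and `wait_until` and `new_batch_loaded` are hypothetical names used only for illustration.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated page state: the "new batch" shows up on the third check.
checks = {"count": 0}


def new_batch_loaded() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


loaded = wait_until(new_batch_loaded, timeout=2.0, interval=0.01)
timed_out = wait_until(lambda: False, timeout=0.05, interval=0.01)
print(loaded, timed_out)
```

In the scrolling loop, the condition could be a lambda that compares the current `padding-top` value with the previous one, so the loop moves on as soon as new entries arrive and only waits the full timeout at the end of the list.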