Getting list of likers for an Instagram post - Python & Selenium


Question

I'm practicing web crawling. As a challenge, I'm trying to get the list of everyone who liked a post on Instagram. My problem is that I only ever get the first 11 usernames of likers. I can't find the right way to automate scrolling while collecting the likes.

Here is my process in a Jupyter Notebook (it doesn't work as a script yet):

from selenium import webdriver
import pandas as pd

driver = webdriver.Chrome()

driver.get('https://www.instagram.com/p/BuE82VfHRa6/')

userid_element = driver.find_elements_by_xpath('//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/div/a')[0].click()

elems = driver.find_elements_by_xpath("//*[@id]/div/a")

users = []

for elem in elems:
    users.append(elem.get_attribute('title'))

print(users)


Do you have any ideas?

Many thanks.

Recommended answer

I guess the Instagram site keeps at most 17 liker elements in the page at a time.
So, one pass of the loop looks like this:

  1. Get the list of elements from the page
  2. Save them to my own list
  3. Scroll down to load new elements
  4. Check whether this was the last scroll

import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.instagram.com/p/BuE82VfHRa6/')

# click the likes link to open the likers dialog
driver.find_elements_by_xpath('//*[@id="react-root"]/section/main/div/div/article/div[2]/section[2]/div/div/a')[0].click()
time.sleep(2)

# The likers dialog is open now and shows the first users.
# Instagram only loads more users as you scroll, so keep scrolling
# until the padding-top of the list container stops changing.

users = []

height = driver.find_element_by_xpath("/html/body/div[3]/div/div[2]/div/div").value_of_css_property("padding-top")
match = False
while not match:
    lastHeight = height

    # step 1
    elements = driver.find_elements_by_xpath("//*[@id]/div/a")

    # step 2
    for element in elements:
        if element.get_attribute('title') not in users:
            users.append(element.get_attribute('title'))

    # step 3
    driver.execute_script("return arguments[0].scrollIntoView();", elements[-1])
    time.sleep(1)

    # step 4
    height = driver.find_element_by_xpath("/html/body/div[3]/div/div[2]/div/div").value_of_css_property("padding-top")
    if lastHeight==height:
        match = True

print(users)
print(len(users))
driver.quit()

I tested this on a post with nearly 100 likes, and it worked.
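For reference, the four steps can also be sketched without a browser, which makes the stopping condition easier to check. This is only a minimal sketch, not the code I ran: `collect_until_stable` and `FakeLikersDialog` are hypothetical names, and the fake dialog simply assumes that only 17 rows are rendered at a time, as guessed above.

```python
from typing import Callable, Hashable, Iterable, List


def collect_until_stable(
    read_visible: Callable[[], Iterable[str]],
    scroll_down: Callable[[], None],
    read_position: Callable[[], Hashable],
    max_rounds: int = 1000,
) -> List[str]:
    """Collect names from a list that only renders a window of rows at a time."""
    seen = {}  # a dict keeps insertion order, so it works as an ordered set
    position = read_position()
    for _ in range(max_rounds):
        # steps 1 and 2: read the rows currently rendered, keep the new ones
        for name in read_visible():
            if name:
                seen.setdefault(name, None)
        # step 3: scroll so that the next rows get rendered
        scroll_down()
        # step 4: stop once scrolling no longer changes the position
        new_position = read_position()
        if new_position == position:
            break
        position = new_position
    return list(seen)


class FakeLikersDialog:
    """Hypothetical stand-in for the likers dialog (assumption: 17 rows shown)."""

    def __init__(self, names, window=17):
        self.names = list(names)
        self.window = window
        self.offset = 0

    def visible(self):
        return self.names[self.offset:self.offset + self.window]

    def scroll(self):
        # advance by almost one window, but never past the last full window
        last_offset = max(0, len(self.names) - self.window)
        self.offset = min(self.offset + self.window - 1, last_offset)

    def position(self):
        return self.offset


dialog = FakeLikersDialog([f"user{i}" for i in range(100)])
likers = collect_until_stable(dialog.visible, dialog.scroll, dialog.position)
print(len(likers))
```

With Selenium, the three callables would wrap the XPath lookup for the `title` attributes, the `scrollIntoView` call on the last element, and the `padding-top` read. The `max_rounds` cap is an extra safety net so the loop cannot run forever if the position keeps changing.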
