Comments extraction from YouTube using Python Selenium


Question

I am using Python Selenium to extract comments from YouTube:

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Firefox()
browser.get("https://www.youtube.com/watch?v=a6NhKKl-iR0")
# Print the visible text of the whole page body
for elem in browser.find_elements(By.XPATH, '//body'):
    print(elem.text)

How can I get the comments?

Recommended answer

The comments are inside div elements with the class comment-renderer-text-content:

from selenium.webdriver.common.by import By

for elem in browser.find_elements(By.XPATH, '//div[@class="comment-renderer-text-content"]'):
    print(elem.text)

Which gives you:

great stuff man. question: why use selenium for this site when the data you're looking for is in the source code and could be scraped with requests/beautifulsoup? disclaimer: i'm commenting a year later so the source code may be different :)
Good question, if the data is in source you're right, selenium is overkill.  I use selenium when I find it quicker to not have to reverse engineer a site looking for sever calls which return json data that only exists inside the browser etc...  So the bottom line is if you're really crafty picking off JSON calls to the server and replicating that without needing to have the DOM built for you than it's a much better be to use BeautifulSoup or Python Requests.   However if you're creating for instance an automated program to automatically pin, like stuff on facebook etc... you will most likely not be able to pull that off very easily just using BeautifulSoup. 
Answered my questions very well.
Great job! I do have a questions though. What if the site is built in silverlight? Then I cannot see the Xpath of each element...
the first test was slow because of a slow loading adserver, you can see it in firefox at the bottom bar.
This is good stuff.
Clear and useful although i'm using java. Thx
YOU ARE BETTER THAN A PROFESSIONAL TEACHER MAN!!!.. 
thanks man
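
If you want the comment texts as data rather than printed output, a small helper can gather them into a list. This is only a sketch: `collect_texts` is a hypothetical name, and it assumes each element exposes a `.text` attribute the way Selenium's WebElement does.

```python
def collect_texts(elements):
    """Return the stripped, non-empty .text of each element as a list."""
    texts = []
    for elem in elements:
        # Drop surrounding whitespace and skip elements with no visible text
        text = elem.text.strip()
        if text:
            texts.append(text)
    return texts
```

You would pass it the list returned by the XPath lookup on the comment-renderer-text-content divs.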

The comments are loaded dynamically, so you might need to wait for the presence of the elements:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


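def wait(dr, x):
    # Block for up to 20 seconds until at least one element matches the XPath,
    # then return the list of matching elements (raises TimeoutException otherwise).
    elements = WebDriverWait(dr, 20).until(
        EC.presence_of_all_elements_located((By.XPATH, x))
    )
    return elements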


from selenium import webdriver

browser = webdriver.Firefox()
browser.get("https://www.youtube.com/watch?v=a6NhKKl-iR0")

for elem in wait(browser, '//div[@class="comment-renderer-text-content"]'):
    print(elem.text)
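
To make the idea of an explicit wait more concrete, here is a rough pure-Python sketch of the polling pattern: call a condition repeatedly until it returns something truthy or a deadline passes. It is not Selenium's actual implementation; `wait_until`, `timeout` and `poll_interval` are hypothetical names chosen for illustration.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value, then return that value.

    Raises TimeoutError if no truthy value is seen before the timeout elapses.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met within %.1f seconds" % timeout)
        # Sleep briefly so the loop does not spin at full speed
        time.sleep(poll_interval)
```

With Selenium, the condition could be a lambda that runs the XPath lookup for the comment divs, since that lookup returns an empty list until matching elements exist. In practice WebDriverWait, as used in the answer, is the better choice because it also handles driver-specific exceptions.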
