Getting current video tag URL with selenium

Question

I'm trying to get the current html5 video tag URL using selenium (with python bindings):

from selenium import webdriver
from selenium.webdriver.common.by import By


driver = webdriver.Chrome()
driver.get('https://www.youtube.com/watch?v=9x6YclsLHN0')

# Locate the <video> element and read its currentSrc property
video = driver.find_element(By.TAG_NAME, 'video')
url = driver.execute_script("return arguments[0].currentSrc;", video)
print(url)

driver.quit()

The problem is that the url value is printed empty. Why is that and how can I fix it?

I suspect that this is because the script is executed and the currentSrc value is returned before the video tag has been initialized. I've tried to add an Explicit Wait, but still got an empty string printed:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 5)
video = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'video')))

This makes me think I need to do it asynchronously, perhaps by listening for the media events and waiting for the video to start playing.

I'm also pretty sure currentSrc should work: if I execute the code in the console and manually wait for the video to start, it prints the video's currentSrc property value.

FYI, I also tried the Java bindings and got the same result, an empty string:

WebDriver driver = new ChromeDriver();
driver.get("https://www.youtube.com/watch?v=9x6YclsLHN0");

WebElement video = driver.findElement(By.tagName("video"));

JavascriptExecutor js = (JavascriptExecutor) driver;
String url = (String) js.executeScript("return arguments[0].currentSrc;", video);

System.out.println(url);


Recommended answer

According to the W3 video tag specification:

The currentSrc DOM attribute is initially the empty string. Its value is changed by the resource selection algorithm.

This explains the behavior described in the question. It also means that to get the currentSrc value reliably, we need to wait until it has been set for the media resource.
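
One way to picture that waiting step is a plain polling loop around the same synchronous script used in the question. This is only a sketch under assumptions: the helper name `poll_current_src` and its `timeout` and `interval` parameters are made up for illustration, and `driver` and `video` are expected to be the objects created in the question's code.

```python
import time


def poll_current_src(driver, video, timeout=10.0, interval=0.25):
    """Poll video.currentSrc until it is non-empty or `timeout` seconds pass.

    Hypothetical helper: `driver` only needs an execute_script() method,
    and `video` is the element located earlier.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Same synchronous call as in the question, repeated until it yields a value.
        src = driver.execute_script("return arguments[0].currentSrc;", video)
        if src:
            return src
        if time.monotonic() >= deadline:
            raise TimeoutError("currentSrc was still empty after %s seconds" % timeout)
        time.sleep(interval)
```

Calling `poll_current_src(driver, video)` would then replace the single `execute_script` call. Selenium's `WebDriverWait(driver, 10).until(...)` with a callable that runs the same script could serve the same purpose.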

Subscribing to the loadstart media event through execute_async_script() did the trick:

driver.set_script_timeout(10) 

url = driver.execute_async_script("""
    var video = arguments[0],
        callback = arguments[arguments.length - 1];

    video.addEventListener('loadstart', listener);

    function listener() {
        callback(video.currentSrc);
    };
""", video)
print(url)
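
A possible refinement, offered as a sketch and not as part of the original answer: if loadstart has already fired by the time the listener is attached, the callback is never invoked and the script runs into the timeout. The variant below returns currentSrc immediately when it is already set and only otherwise waits for the event. The names `CURRENT_SRC_SCRIPT` and `get_current_src` are hypothetical, chosen for illustration.

```python
# JavaScript passed to execute_async_script: resolve at once if currentSrc
# is already set, otherwise wait for the next 'loadstart' event.
CURRENT_SRC_SCRIPT = """
    var video = arguments[0],
        callback = arguments[arguments.length - 1];

    if (video.currentSrc) {
        callback(video.currentSrc);
        return;
    }

    video.addEventListener('loadstart', function listener() {
        video.removeEventListener('loadstart', listener);
        callback(video.currentSrc);
    });
"""


def get_current_src(driver, video, timeout=10):
    """Return video.currentSrc, waiting up to `timeout` seconds for it.

    Hypothetical helper: `driver` is a Selenium WebDriver and `video`
    is the <video> element located earlier.
    """
    driver.set_script_timeout(timeout)
    return driver.execute_async_script(CURRENT_SRC_SCRIPT, video)
```

With the driver and element from the question, `print(get_current_src(driver, video))` would stand in for the snippet above.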
