Using Python/Selenium/Best Tool For The Job to get URI of image requests generated through JavaScript?
Question
I have some JavaScript from a 3rd party vendor that is initiating an image request. I would like to figure out the URI of this image request.
I can load the page in my browser, and then monitor "Live HTTP Headers" or "Tamper Data" in order to figure out the image request URI, but I would prefer to create a command line process to do this.
My intuition is that it might be possible using python + qtwebkit, but perhaps there is a better way.
To clarify: I might have this (overly simplified code).
<script>
  suffix = magicNumberFunctionIDontHaveAccessTo();
  url = "http://foobar.com/function?parameter=" + suffix;
  img = document.createElement('img'); img.src = url; document.body.appendChild(img);
</script>
Then, once the page is loaded, I can figure out the URL by sniffing the packets. But I can't work it out from the source alone, because I can't predict the result of magicNumberFunctionIDontHaveAccessTo().
Any help would be muchly appreciated!
Thanks.
Recommended answer
Ultimately, I did it in Python, using Selenium-RC. This solution requires the Python files for Selenium-RC, and you need to start the Java server first ("java -jar selenium-server.jar").
# Legacy Selenium RC client (no longer shipped with current Selenium releases)
from selenium import selenium
import unittest
import lxml.html


class TestMyDomain(unittest.TestCase):
    def setUp(self):
        # Connect to the Selenium RC server started with "java -jar selenium-server.jar"
        self.selenium = selenium("localhost", 4444, "*firefox", "http://www.MyDomain.com")
        self.selenium.start()

    def test_mydomain(self):
        # site-list.html holds links to every page that should be checked
        with open('site-list.html') as f:
            htmldoc = f.read()
        url_list = [link for (element, attribute, link, pos) in lxml.html.iterlinks(htmldoc)]
        for url in url_list:
            try:
                sel = self.selenium
                sel.open(url)
                sel.select_window("null")
                js_code = '''
                    myDomainWindow = this.browserbot.getUserWindow();
                    for(obj in myDomainWindow) {
                        /* This code grabs the OMNITURE tracking pixel img */
                        if ((obj.substring(0,4) == 's_i_') && (myDomainWindow[obj].src)) {
                            var ret = myDomainWindow[obj].src;
                        }
                    }
                    ret;
                '''
                # The last expression (ret) is returned as the result of get_eval
                omniture_url = sel.get_eval(js_code)  # parse & process this however you want
            except Exception as e:
                print('We ran into an error: %s' % (e,))

        # Placeholder assertion: set observedValue from omniture_url before using it
        self.assertEqual("expectedValue", observedValue)

    def tearDown(self):
        self.selenium.stop()


if __name__ == "__main__":
    unittest.main()
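
The answer leaves the parsing of the captured URL open ("parse & process this however you want"). As a minimal sketch of one way to do that step, the snippet below splits a captured image-request URL into its endpoint and query parameters using only the standard library. It assumes Python 3 (`urllib.parse`), unlike the Selenium-RC code above, and the function name `split_tracking_url` and the sample URL are hypothetical, modeled on the example in the question.

```python
from urllib.parse import urlsplit, parse_qs


def split_tracking_url(url):
    """Return (endpoint, params) for a captured image-request URL.

    endpoint is the URL without its query string; params maps each
    query parameter name to a list of its values.
    """
    parts = urlsplit(url)
    endpoint = "{0}://{1}{2}".format(parts.scheme, parts.netloc, parts.path)
    # keep_blank_values=True keeps parameters that were sent with an empty value
    params = parse_qs(parts.query, keep_blank_values=True)
    return endpoint, params


if __name__ == "__main__":
    # Hypothetical URL in the shape of the question's example
    endpoint, params = split_tracking_url("http://foobar.com/function?parameter=12345")
    print(endpoint)
    print(params)
```

In the test above, the string returned by `sel.get_eval(js_code)` could be passed to this function, and the assertion could then compare individual parameters rather than the whole URL.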