Using Python/Selenium/the best tool for the job to get the URI of image requests generated through JavaScript?

Question

I have some JavaScript from a 3rd party vendor that is initiating an image request. I would like to figure out the URI of this image request.

I can load the page in my browser, and then monitor "Live HTTP Headers" or "Tamper Data" in order to figure out the image request URI, but I would prefer to create a command line process to do this.

My intuition is that it might be possible using python + qtwebkit, but perhaps there is a better way.

To clarify: I might have this (overly simplified code).

<script>
var suffix = magicNumberFunctionIDontHaveAccessTo();
var url = "http://foobar.com/function?parameter=" + suffix;
var img = document.createElement('img');
img.src = url;
document.body.appendChild(img);
</script>

Then once the page is loaded, I can go figure out the url by sniffing the packets. But I can't just figure it out from the source, because I can't predict the outcome of magicNumberFunction...().

Any help would be much appreciated!

Thanks.

Recommended answer

Ultimately, I did it in Python using Selenium RC. This solution requires the Selenium RC Python client files, and the Java server must be running before the script starts ("java -jar selenium-server.jar").

from selenium import selenium  # legacy Selenium RC client, not the WebDriver API
import unittest
import lxml.html


class TestMyDomain(unittest.TestCase):
    def setUp(self):
        # Needs a Selenium RC server already listening on localhost:4444.
        self.selenium = selenium("localhost", 4444, "*firefox",
                                 "http://www.MyDomain.com")
        self.selenium.start()

    def test_mydomain(self):
        # site-list.html is a local file whose links are the pages to visit.
        with open('site-list.html') as f:
            htmldoc = f.read()
        url_list = [link for (element, attribute, link, pos)
                    in lxml.html.iterlinks(htmldoc)]
        sel = self.selenium
        omniture_urls = {}  # page URL -> tracking pixel URL

        for url in url_list:
            try:
                sel.open(url)
                sel.select_window("null")
                # get_eval returns the value of the last expression in the script.
                js_code = '''
                var myDomainWindow = this.browserbot.getUserWindow();
                var ret = "";
                for (var obj in myDomainWindow) {
                    /* This code grabs the OMNITURE tracking pixel img */
                    if ((obj.substring(0, 4) == 's_i_') && (myDomainWindow[obj].src)) {
                        ret = myDomainWindow[obj].src;
                    }
                }
                ret;
                '''
                # Parse and process this however you want.
                omniture_urls[url] = sel.get_eval(js_code)
            except Exception as e:
                print('We ran into an error: %s' % (e,))

        # Placeholder check: replace with a comparison against the value you expect.
        self.assertEqual(len(url_list), len(omniture_urls))

    def tearDown(self):
        self.selenium.stop()


if __name__ == "__main__":
    unittest.main()
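
Once the image-request URL has been captured, its query string can be pulled apart with the standard library. The following is only a minimal sketch, assuming Python 3 (the Selenium RC script above dates from the Python 2 era); the helper name describe_pixel_url and the sample URL are made up for illustration and are not part of the original answer.

```python
from urllib.parse import urlparse, parse_qs


def describe_pixel_url(pixel_url):
    """Split a captured image-request URL into host, path and query parameters."""
    parts = urlparse(pixel_url)
    # parse_qs maps each parameter name to a list of values;
    # keep_blank_values keeps parameters that were sent with an empty value.
    params = parse_qs(parts.query, keep_blank_values=True)
    return {"host": parts.netloc, "path": parts.path, "params": params}


if __name__ == "__main__":
    # Hypothetical URL shaped like the one in the question.
    info = describe_pixel_url("http://foobar.com/function?parameter=12345")
    print(info["host"])
    print(info["params"]["parameter"][0])
```

Each parameter comes back as a list because a query string may repeat a name, so take the first element when you only expect a single value.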
