Python Link to File Iterator not Iterating


Problem Description

This one has had me stumped for a couple of days now, and I believe I've finally narrowed it down to this block of code. If anyone can tell me how to fix this, and why it is happening, that would be awesome.

import urllib2

GetLink = 'http://somesite.com/search?q=datadata#page'
holder = range(1,3)

for LinkIncrement in holder:
    h = GetLink + str(LinkIncrement)
    ReadLink = urllib2.urlopen(h)
    f = open('test.txt', 'w')

    for line in ReadLink:
        f.write(line)  

    f.close()
    main() #calls function main that does stuff with the file
    continue

The problem is that it will only write the data from 'http://somesite.com/search?q=datadata#page'. If I do the below, the results print correctly:

for LinkIncrement in holder:
    h = GetLink + str(LinkIncrement)
    print h

The link I am copying does indeed increment in this manner, and I am able to open the URLs by copying and pasting. Additionally, I have tried this with a while loop but always get the same results.

The code below opens 3 tabs with the incremented URLs /search?q=datadata#page1, /search?q=datadata#page2, and /search?q=datadata#page3. I just can't make it work in my code.

import webbrowser
import urllib2

h = ''

def tab(passed):
    url = passed
    webbrowser.open_new_tab(url + '/')

def test():
    g = 'http://somesite.com/search?q=datadata#page'
    f = urllib2.urlopen(g)
    NewVar = 1
    PageCount = 1

    while PageCount < 4:
        h = g + str(NewVar)
        PageCount += 1
        NewVar += 1
        tab(h)

test()

Thanks to Falsetru for helping me figure this out. The website was using JSON for any pages after the first page.

Recommended Answer

In the URL, the part after # (the fragment identifier) is not passed to the web server; the server responds with the same content because the parts before the fragment identifier are identical.
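You can see this by splitting the URL into its components: the fragment is parsed out separately and is not part of the path/query that the request is built from. A minimal sketch using Python 2's urlparse (in Python 3 the module is urllib.parse):

from urlparse import urlparse  # Python 2; use urllib.parse in Python 3

for n in range(1, 4):
    url = 'http://somesite.com/search?q=datadata#page' + str(n)
    parts = urlparse(url)
    # The part before '#' is the same for every n, so the server sees the
    # same request each time; only the client-side fragment changes.
    print parts.path + '?' + parts.query  # '/search?q=datadata' every time
    print parts.fragment                  # 'page1', 'page2', 'page3'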

#something is handled by the browser (JavaScript). You need to see what happens in the JavaScript.
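In practice that means requesting whatever URL the page's JavaScript fetches for pages 2 and up (visible in the browser's developer tools, under the network tab) instead of the '#pageN' URLs. A hedged sketch, assuming a purely hypothetical JSON endpoint and parameter names ('/search.json', 'q', 'page') that would need to be replaced with the real ones observed in the network tab:

import json
import urllib
import urllib2

def fetch_page(query, page):
    # Hypothetical endpoint and parameters -- substitute the request that
    # the page's JavaScript actually makes.
    params = urllib.urlencode({'q': query, 'page': page})
    return json.load(urllib2.urlopen('http://somesite.com/search.json?' + params))

for page in range(1, 4):
    data = fetch_page('datadata', page)
    # ...process the parsed JSON for this page...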
