Selenium-python downloading but file is saved as .part


Problem description


My script works, but it saves the file as a .part file, although when I check it against a manually downloaded file it's the same size and thankfully complete. I can't understand why it's being saved as a partial file. It's sort of inconvenient for my next idea. Does anybody have an idea why this might be? Here's my code, which works:

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import time
import mechanize
import urllib
from urllib import urlretrieve

fp = webdriver.FirefoxProfile()

fp.set_preference("browser.download.folderList", 1)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", 'Users/matthewyoung/Downloads')
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "Plain text")
fp.set_preference("browser.download.manager.scanWhenDone", False)
fp.set_preference("browser.download.manager.showAlertOnComplete", True)
fp.set_preference("browser.download.manager.useWindow", False)
fp.set_preference("browser.helperApps.alwaysAsk.force", False)

browser = webdriver.Firefox(firefox_profile=fp)

#browser = webdriver.Firefox()  # Get local session of firefox
browser.get("http://vizier.u-strasbg.fr/vizier/surveys.htx")  # Load page
assert "VizieR" in browser.title
#p = raw_input('Star name? ')
elem = browser.find_element_by_name('-c')  # Find the query box
elem.send_keys('mwc 560' + Keys.RETURN)
time.sleep(0.2)  # Let the page load, will be added to the API
elem = browser.find_element_by_name('-out.max')
elem.send_keys('unlimited' + Keys.TAB)
elem2 = browser.find_element_by_name('-out.form')
time.sleep(0.5)
elem2.send_keys('; -Separated-Values')
time.sleep(0.5)
elem2.send_keys(Keys.TAB)
elem2.send_keys(Keys.TAB)
time.sleep(0.2)
browser.find_element_by_class_name('data').submit()
time.sleep(3.0)
#df = elem2.send_keys(Keys.SPACE)
#print df
browser.close()

Solution

It is downloading as .part because the "Save As" popup dialog appears, and Selenium's Python bindings cannot interact with that native popup window. I have found that when you try to set preferences for a custom profile in webdriver, they don't necessarily take effect (for instance, I was able to set a custom profile in Selenium to download a csv but not a pdf). However, I was able to solve my pdf problem by creating a custom profile in Firefox itself. I am not very experienced with tsv files, so I am not sure what setting that would be. If you create a new Firefox profile (following the instructions here: https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles), you can try to set that profile to save tsv files by default. If you don't know the exact setting to change in "about:config", you can try just clicking the checkbox on the popup to always save those kinds of files.
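If you would rather try the preference route from the script before building a profile by hand, here is a rough sketch of what those settings might look like. Two things I am fairly sure of: `browser.helperApps.neverAsk.saveToDisk` expects a comma-separated list of MIME types (so "Plain text" in the question would not match anything), and `browser.download.folderList` has to be 2 for `browser.download.dir` to be used. What I do not know is which content type VizieR actually sends for its separated-values output, so the MIME types in the usage comment are guesses, and `download_prefs` and `apply_prefs` are just helper names made up for this example.

```python
import os


def download_prefs(download_dir, mime_types):
    """Build Firefox preferences that save the given MIME types without a dialog.

    The list of MIME types is an assumption; check the Content-Type header
    the server really sends and adjust it.
    """
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": os.path.abspath(download_dir),
        "browser.download.manager.showWhenStarting": False,
        # Comma-separated MIME types that are saved without asking
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def apply_prefs(profile, prefs):
    """Copy each preference onto an object that has set_preference(name, value)."""
    for name, value in prefs.items():
        profile.set_preference(name, value)
    return profile


# Usage with Selenium (not executed here, needs selenium and Firefox installed):
#   fp = webdriver.FirefoxProfile()
#   apply_prefs(fp, download_prefs("/Users/matthewyoung/Downloads",
#                                  ["text/plain", "text/csv",
#                                   "text/tab-separated-values"]))
#   browser = webdriver.Firefox(firefox_profile=fp)
```

If the dialog still shows up with these set, the server is probably sending a content type that is not in the list, and the hand-made profile is the more reliable fallback.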

From there you set your profile to that custom profile you created like this:

    profile = webdriver.firefox.firefox_profile.FirefoxProfile("/Users/matthewyoung/Library/Application Support/Firefox/Profiles/YOUR PROFILE NAME")
    browser = webdriver.Firefox(firefox_profile=profile)

Keep in mind that YOUR PROFILE NAME will have a bunch of random letters first, so follow that path to find the actual profile name.
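Since the folder on disk is named something like `xxxxxxxx.YOUR PROFILE NAME`, you could also look it up instead of copying it by hand. This is only a sketch: `find_profile_dir` is a hypothetical helper, and it assumes the usual Firefox layout where each profile folder is a random prefix, a dot, then the name you chose in the profile manager.

```python
import glob
import os


def find_profile_dir(profiles_root, profile_name):
    """Return the profile folder under profiles_root whose name ends in '.<profile_name>'.

    Assumes the conventional '<random prefix>.<name>' folder naming.
    """
    pattern = os.path.join(glob.escape(profiles_root),
                           "*." + glob.escape(profile_name))
    matches = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
    if not matches:
        raise FileNotFoundError(
            "no profile named %r under %r" % (profile_name, profiles_root))
    return matches[0]


# Usage (not executed here; "selenium" is a placeholder profile name):
#   root = "/Users/matthewyoung/Library/Application Support/Firefox/Profiles"
#   profile = webdriver.FirefoxProfile(find_profile_dir(root, "selenium"))
```

If two folders end in the same name it just takes the first one alphabetically, which is probably fine for a profile you created only for Selenium.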
