Download files using Python 3.4 from Google Patents
Problem description
I would like to download (using Python 3.4) all the .zip files on the Google Patents bulk download page: http://www.google.com/googlebooks/uspto-patents-grants-text.html
(I am aware that this amounts to a large amount of data.) I would like to save all files for a given year in a directory named after that year, so a directory 1976 would hold all the (weekly) files from 1976. The year directories should be created in the directory that my Python script is in.
I've tried using the urllib.request package, but I only got as far as retrieving the page's HTML text; I don't know how to "click" on each file link to download it.
import urllib.request

# Saves the HTML of the listing page itself, not the .zip files it links to.
url = 'http://www.google.com/googlebooks/uspto-patents-grants-text.html'
savename = 'google_patent_urltext'
urllib.request.urlretrieve(url, savename)
Thank you very much for your help.
Recommended answer
As I understand it, you are looking for a command that simulates left-clicking on a file link so that the file downloads automatically. If so, you can use Selenium. Something like:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

# Configure Firefox to save downloads without prompting.
profile = FirefoxProfile()
profile.set_preference("browser.download.folderList", 2)  # 2 = use the custom directory set below
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", 'D:\\')  # choose the folder to download to
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", 'application/octet-stream')

driver = webdriver.Firefox(firefox_profile=profile)
driver.get('https://www.google.com/googlebooks/uspto-patents-grants-text.html#2015')

# Locate one archive by its link text and click it; loop over the names to fetch every zip file.
filename = driver.find_element_by_xpath('//a[contains(text(),"ipg150106.zip")]')
filename.click()
Update: the MIME type for the zip files should be 'application/octet-stream', not 'application/zip'. It should work now :)
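If the page's links point directly at the .zip files, a browser may not be needed at all. The following is a minimal sketch of that alternative using only the standard library (it is not the Selenium approach above): it parses the listing page for .zip links and works out a per-year target path for each one. The helper names (ZipLinkParser, find_zip_links, year_of, target_path, download_all) are made up for this illustration, and it assumes that each link's URL contains the year as a four-digit path segment, which may not hold for the real page.

```python
import os
import re
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

PAGE_URL = 'http://www.google.com/googlebooks/uspto-patents-grants-text.html'


class ZipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .zip file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if href and href.lower().endswith('.zip'):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def find_zip_links(html, base_url=PAGE_URL):
    """Return all .zip links found in the given HTML text."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    return parser.links


def year_of(link):
    """Return the first four-digit path segment (ASSUMED to be the year), or None."""
    for segment in urlparse(link).path.split('/'):
        if re.fullmatch(r'(19|20)\d\d', segment):
            return segment
    return None


def target_path(link, root):
    """Build root/<year>/<file name> for a link; 'unknown' if no year is found."""
    year = year_of(link) or 'unknown'
    return os.path.join(root, year, os.path.basename(urlparse(link).path))


def download_all(root):
    """Fetch the listing page and download every .zip into its year directory."""
    with urllib.request.urlopen(PAGE_URL) as response:
        html = response.read().decode('utf-8', errors='replace')
    for link in find_zip_links(html):
        path = target_path(link, root)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        if not os.path.exists(path):  # skip files already downloaded
            urllib.request.urlretrieve(link, path)


# To save next to the script, as asked in the question:
# download_all(os.path.dirname(os.path.abspath(__file__)))
```

Skipping files that already exist lets the script be restarted after an interruption, which matters given how much data is involved. If the links turn out to be generated by JavaScript rather than present in the HTML, this sketch will find nothing and Selenium remains the way to go.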