Can't store downloaded files in their respective folders


Question

I've written a script in Python in combination with Selenium to download a few document files (ending with .doc) from a webpage. The reason I do not wish to use the requests or urllib module to download the files is that the website I'm currently working with does not have a true URL connected to each file. The links are JavaScript-encrypted. However, I've chosen a link within my script that mimics the same behavior.

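What my script currently does:

  1. Creates a main folder on the desktop
  2. Creates subfolders inside the main folder, named after the files to be downloaded
  3. Downloads the files by clicking their links and puts them in the main folder (this is what I need rectified)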
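
The folder layout described in the steps above can be sketched as a small path helper. This is only an illustration: `target_paths` is a hypothetical name, not part of the script, and it assumes the link's last path segment is the file name.

```python
import os
import posixpath
from urllib.parse import urlparse


def target_paths(main_dir, href):
    """Return (subfolder, file_path) for a download link, without touching the disk."""
    # Last segment of the URL path, ignoring any query string: "example.doc"
    filename = posixpath.basename(urlparse(href).path)
    # File name without its extension becomes the subfolder name: "example"
    folder_name = os.path.splitext(filename)[0]
    subfolder = os.path.join(main_dir, folder_name)
    return subfolder, os.path.join(subfolder, filename)
```

Calling `target_paths(desk_location, href)` would give the subfolder to create and the path where the file should finally live. Unlike `split(".")[0]`, `os.path.splitext` strips only the last extension, so a name such as `a.b.doc` would map to a folder called `a.b`.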

How can I modify my script so that it downloads the files by clicking their links and puts each downloaded file in its respective folder?

This is my attempt so far:

import os
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

link = 'https://www.online-convert.com/file-format/doc'

dirf = os.path.expanduser('~')
desk_location = dirf + r'\Desktop\file_folder'
if not os.path.exists(desk_location):
    os.mkdir(desk_location)

def download_files():
    driver.get(link)
    for item in driver.find_elements(By.CSS_SELECTOR, "a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        # create a new folder named after the file to hold the downloaded file
        folder_name = filename.split(".")[0]
        # location of the folder to be created
        new_location = os.path.join(desk_location, folder_name)
        if not os.path.exists(new_location):
            os.mkdir(new_location)
        # location where the downloaded file is supposed to end up
        file_location = os.path.join(new_location, filename)
        item.click()

        time_to_wait = 10
        time_counter = 0
        try:
            while not os.path.exists(file_location):
                time.sleep(1)
                time_counter += 1
                if time_counter > time_to_wait:
                    break
        except Exception:
            pass

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    prefs = {
        'download.default_directory': desk_location,
        'profile.default_content_setting_values.automatic_downloads': 1,
    }
    chromeOptions.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(options=chromeOptions)
    download_files()

The following image shows how the downloaded files are currently stored (the files are outside their respective folders):

Recommended answer

I just added a rename of the file to move it. The script works just as you have it, but once a file has been downloaded, it is moved to the correct path:

os.rename(os.path.join(desk_location, filename), file_location)
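
The move only succeeds once the browser has finished writing the file, so it can help to see the wait and the move as one step. Below is a minimal sketch of that idea, not the answer's exact code: `move_when_ready` is a hypothetical helper, and the `.crdownload` suffix for Chrome's in-progress files is an assumption about browser behavior that may vary between versions.

```python
import os
import shutil
import time


def move_when_ready(download_dir, filename, target_dir, timeout=10.0, poll=0.5):
    """Wait for filename to appear in download_dir, then move it into target_dir.

    Returns the new path, or None if the file did not show up before the timeout.
    """
    source = os.path.join(download_dir, filename)
    partial = source + ".crdownload"  # assumed name of Chrome's in-progress file
    deadline = time.monotonic() + timeout

    while time.monotonic() < deadline:
        # Treat the download as finished once the final file exists
        # and no partial file is left next to it.
        if os.path.exists(source) and not os.path.exists(partial):
            os.makedirs(target_dir, exist_ok=True)
            return shutil.move(source, os.path.join(target_dir, filename))
        time.sleep(poll)
    return None
```

In the script this could be called right after `item.click()`, for example `move_when_ready(desk_location, filename, new_location)`. Returning `None` on timeout lets the caller decide whether to retry or skip the file instead of silently swallowing the failure.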

Full code:

import os
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

link = 'https://www.online-convert.com/file-format/doc'

dirf = os.path.expanduser('~')
desk_location = dirf + r'\Desktop\file_folder'
if not os.path.exists(desk_location):
    os.mkdir(desk_location)

def download_files():
    driver.get(link)
    for item in driver.find_elements(By.CSS_SELECTOR, "a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        # create a new folder named after the file to hold the downloaded file
        folder_name = filename.split(".")[0]
        # location of the folder to be created
        new_location = os.path.join(desk_location, folder_name)
        if not os.path.exists(new_location):
            os.mkdir(new_location)
        # where Chrome saves the file, and where it should end up
        downloaded_file = os.path.join(desk_location, filename)
        file_location = os.path.join(new_location, filename)
        item.click()

        time_to_wait = 10
        time_counter = 0
        # wait until the download shows up in the main folder
        while not os.path.exists(downloaded_file):
            time.sleep(1)
            time_counter += 1
            if time_counter > time_to_wait:
                break
        # move the downloaded file into its own folder
        if os.path.exists(downloaded_file):
            os.rename(downloaded_file, file_location)

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    prefs = {
        'download.default_directory': desk_location,
        'profile.default_content_setting_values.automatic_downloads': 1,
    }
    chromeOptions.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(options=chromeOptions)
    download_files()
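
If some files still end up loose in the main folder (for instance because a download took longer than the wait), a cleanup pass after the run is one possible fallback. The sketch below is an assumption-laden illustration rather than part of the answer: `sort_into_folders` is a hypothetical helper that files every `.doc` found directly in the main folder into a subfolder named after it.

```python
import os
import shutil


def sort_into_folders(main_dir, extension=".doc"):
    """Move each matching file in main_dir into a subfolder named after its stem."""
    moved = []
    for name in sorted(os.listdir(main_dir)):
        source = os.path.join(main_dir, name)
        stem, ext = os.path.splitext(name)
        # Skip subfolders and files with other extensions (such as partial downloads).
        if not os.path.isfile(source) or ext.lower() != extension:
            continue
        target_dir = os.path.join(main_dir, stem)
        os.makedirs(target_dir, exist_ok=True)
        moved.append(shutil.move(source, os.path.join(target_dir, name)))
    return moved
```

Calling `sort_into_folders(desk_location)` once all downloads have finished would return the list of new file paths, which makes it easy to check that nothing was left behind.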
