How can I run a Python script on many files to get many output files?


Question

I am new to programming and I have written a script to extract text from a vcf file. I am using a Linux virtual machine running Ubuntu. I run the script from the command line by changing to the directory that contains the vcf file and then entering python script.py.

My script knows which file to process because it begins with:

my_file = open("inputfile1.vcf", "r+")
outputfile = open("outputfile.txt", "w")

The script puts the information I need into a list, which I then write to outputfile. However, I have many input files (all .vcf) and want to write each one to its own output file, named after the input (such as input_processed.txt).

Do I need to run a shell script to iterate over the files in the folder? If so, how would I change the Python script to accommodate this, i.e., writing the list to an output file?

Recommended answer

I would integrate the loop into the Python script itself. That lets you run it easily on other platforms too, and it doesn't add much code anyway.

import glob
import os

# Find all files ending in '.vcf' in the current directory
for vcf_filename in glob.glob('*.vcf'):
    # Similar name with a different extension
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'

    # The input only needs to be read; 'with' closes both files afterwards
    with open(vcf_filename, 'r') as vcf_file, open(output_filename, 'w') as outputfile:
        # Process the data
        ...
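
For illustration, here is a minimal sketch of how the loop above might look once the processing step is filled in and the outputs are named in the input_processed.txt style the question asks for. The names extract_lines and process_all are hypothetical, and the extraction itself (keeping every line that does not start with '#') is only a stand-in for whatever the original script actually pulls out of each file.

```python
import glob
import os


def extract_lines(vcf_path):
    """Hypothetical extraction step: collect the non-header lines of a VCF file."""
    with open(vcf_path, "r") as vcf_file:
        # Metadata and header lines in a VCF file start with '#'; keep the rest.
        return [line.rstrip("\n") for line in vcf_file if not line.startswith("#")]


def process_all(pattern="*.vcf", suffix="_processed.txt"):
    """Write one output file per matching input file and return the output names."""
    written = []
    for vcf_filename in sorted(glob.glob(pattern)):
        # 'sample.vcf' becomes 'sample_processed.txt'
        output_filename = os.path.splitext(vcf_filename)[0] + suffix
        records = extract_lines(vcf_filename)
        with open(output_filename, "w") as outputfile:
            outputfile.writelines(record + "\n" for record in records)
        written.append(output_filename)
    return written


if __name__ == "__main__":
    for name in process_all():
        print("wrote", name)
```

Keeping the per-file work in its own function means the loop stays short, and the list that the original script builds can simply be returned from that function and written out in one place.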

To write the resulting files to a separate directory, I would do this:

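import glob
import os

# Create the output directory if it does not exist yet
output_dir = 'processed'
os.makedirs(output_dir, exist_ok=True)

# Find all files ending in '.vcf' in the current directory
for vcf_filename in glob.glob('*.vcf'):
    # Similar name with a different extension, placed inside output_dir
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'
    output_path = os.path.join(output_dir, output_filename)

    # The input only needs to be read; 'with' closes both files afterwards
    with open(vcf_filename, 'r') as vcf_file, open(output_path, 'w') as outputfile:
        # Process the data
        ...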
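
If you would rather let the shell choose the files, as the question suggests, an alternative is to have the script read its input names from the command line. The sketch below is one way that might look, not part of the original answer. The helper names output_name_for and main are hypothetical, and the processing step is again a placeholder that copies non-header lines.

```python
import os
import sys


def output_name_for(vcf_filename, output_dir="processed"):
    """Map e.g. 'a/b/sample.vcf' to 'processed/sample_processed.txt'."""
    stem = os.path.splitext(os.path.basename(vcf_filename))[0]
    return os.path.join(output_dir, stem + "_processed.txt")


def main(filenames, output_dir="processed"):
    """Process every named file and return the output paths that were written."""
    os.makedirs(output_dir, exist_ok=True)
    outputs = []
    for vcf_filename in filenames:
        output_path = output_name_for(vcf_filename, output_dir)
        with open(vcf_filename, "r") as vcf_file, open(output_path, "w") as outputfile:
            # Placeholder for the real processing: copy the non-header lines.
            for line in vcf_file:
                if not line.startswith("#"):
                    outputfile.write(line)
        outputs.append(output_path)
    return outputs


if __name__ == "__main__":
    # sys.argv[0] is the script name; the remaining items are the input files.
    main(sys.argv[1:])
```

On Ubuntu you could then run python script.py *.vcf and the shell expands the pattern into the list of files before Python starts, so no separate shell loop is needed.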
