How can I run a Python script on many files to get many output files?
Question
I am new to programming and have written a script to extract text from a VCF file. I am using a Linux virtual machine running Ubuntu. I run the script from the command line by changing into the directory that contains the VCF file and then entering python script.py.
My script knows which file to process because it begins with:
my_file = open("inputfile1.vcf", "r+")
outputfile = open("outputfile.txt", "w")
The script puts the information I need into a list, which I then write to outputfile. However, I have many input files (all .vcf) and want to write the results to separate output files, each named after its input (for example, input_processed.txt).
Do I need to run a shell script to iterate over the files in the folder? If so, how would I change the Python script to accommodate this, i.e. when writing the list to an output file?
Recommended answer
I would do the iteration inside the Python script itself. That keeps the script easy to run on other platforms and adds very little code.
import glob
import os
# Find all files ending in '.vcf' in the current directory
for vcf_filename in glob.glob('*.vcf'):
    # Similar name with a different extension
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'
    # Read-only mode is enough for extraction; 'with' closes both files afterwards
    with open(vcf_filename, 'r') as vcf_file, open(output_filename, 'w') as outputfile:
        # Process the data
        ...
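The question asked for output names such as input_processed.txt rather than a bare .txt extension. A minimal sketch of that naming step follows; the helper processed_name and its suffix parameter are hypothetical and not part of the original answer.

```python
import os


def processed_name(vcf_filename, suffix="_processed.txt"):
    """Derive an output name from the input name (hypothetical helper)."""
    # splitext separates 'inputfile1.vcf' into ('inputfile1', '.vcf')
    stem, _extension = os.path.splitext(vcf_filename)
    return stem + suffix


print(processed_name("inputfile1.vcf"))  # inputfile1_processed.txt
```

Inside the loop, output_filename could then be set with processed_name(vcf_filename) in place of the splitext expression.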
To write the resulting files to a separate directory, I would do this:
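import glob
import os
output_dir = 'processed'
# Create the output directory if it does not exist yet
os.makedirs(output_dir, exist_ok=True)
# Find all files ending in '.vcf' in the current directory
for vcf_filename in glob.glob('*.vcf'):
    # Similar name with a different extension, placed inside output_dir
    output_filename = os.path.splitext(vcf_filename)[0] + '.txt'
    output_path = os.path.join(output_dir, output_filename)
    # Read-only mode is enough for extraction; 'with' closes both files afterwards
    with open(vcf_filename, 'r') as vcf_file, open(output_path, 'w') as outputfile:
        # Process the data
        ...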
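Putting the pieces together, the whole job can be wrapped in one function. This is only a sketch under stated assumptions: extract_lines is a stand-in for the asker's real extraction step (here it just drops lines that start with '#'), and the function names, the input_dir parameter, and the _processed.txt suffix are illustrative choices rather than anything given in the original answer.

```python
import glob
import os


def extract_lines(vcf_file):
    """Placeholder extraction step (an assumption, not the asker's logic).

    Keeps every line that does not start with '#'.
    """
    return [line for line in vcf_file if not line.startswith("#")]


def process_directory(input_dir=".", output_dir="processed"):
    """Process every .vcf file in input_dir, writing one output file per input."""
    os.makedirs(output_dir, exist_ok=True)
    written = []
    # sorted() gives a predictable processing order
    for vcf_path in sorted(glob.glob(os.path.join(input_dir, "*.vcf"))):
        # Build '<name>_processed.txt' from the input file's base name
        stem = os.path.splitext(os.path.basename(vcf_path))[0]
        output_path = os.path.join(output_dir, stem + "_processed.txt")
        with open(vcf_path, "r") as vcf_file, open(output_path, "w") as outputfile:
            outputfile.writelines(extract_lines(vcf_file))
        written.append(output_path)
    return written
```

Calling process_directory() from the directory that holds the VCF files would then handle all of them in a single run, so no shell loop is needed.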