How to merge two programs with scheduled execution
Question
I am trying to merge two programs, or write a third program that calls these two programs as functions. They are supposed to run one after the other, separated by an interval of a certain number of minutes. Something like a makefile, which will have a few more programs included later. I am not able to merge them, nor to put them into a format that lets me call them from a new main program.
program_master_id.py picks the *.csv files from a folder location and, after computing, appends to the master_ids.csv file in another folder location.
Program_master_count.py divides the count by the count of Ids in the respective timeseries.
Program 1: master_id.py
import pandas as pd
import numpy as np
# csv file contents
# Need to change to path as the Transition_Data has several *.CSV files
csv_file1 = 'Transition_Data/Test_1.csv'
csv_file2 = '/Transition_Data/Test_2.csv'
#master file to be appended only
master_csv_file = 'Data_repository/master_lac_Test.csv'
csv_file_all = [csv_file1, csv_file2]
# read csv into df using list comprehension
# I use buffer here, replace stringIO with your file path
df_all = [pd.read_csv(csv_file) for csv_file in csv_file_all]
# processing
# =====================================================
# concat along axis=0, outer join on axis=1
merged = pd.concat(df_all, axis=0, ignore_index=True, join='outer').set_index('Ids')
# custom function to handle/merge duplicates on Ids (axis=0)
def apply_func(group):
    # forward-fill missing values within the group, then keep the last row
    return group.ffill().iloc[-1]
# remove Ids duplicates
merged_unique = merged.groupby(level='Ids').apply(apply_func)
# do the subtraction
df_master = pd.read_csv(master_csv_file, index_col=['Ids']).sort_index()
# select matching records and horizontal concat
df_matched = pd.concat([df_master,merged_unique.reindex(df_master.index)], axis=1)
# use broadcasting
df_matched.iloc[:, 1:] = df_matched.iloc[:, 1:].sub(df_matched.iloc[:, 0], axis=0)
print(df_matched)
Program 2: master_count.py
#This does not give any error nor gives any output.
import pandas as pd
import numpy as np
csv_file1 = '/Data_repository/master_lac_Test.csv'
csv_file2 = '/Data_repository/lat_lon_master.csv'
df1 = pd.read_csv(csv_file1).set_index('Ids')
# need to sort index in file 2
df2 = pd.read_csv(csv_file2).set_index('Ids').sort_index()
# df1 and df2 has a duplicated column 00:00:00, use df1 without 1st column
temp = df2.join(df1.iloc[:, 1:])
# do the division by number of occurence of each Ids
# and add column 00:00:00
def my_func(group):
    num_obs = len(group)
    # process with column name after 00:30:00 (inclusive)
    group.iloc[:,4:] = (group.iloc[:,4:]/num_obs).add(group.iloc[:,3], axis=0)
    return group
result = temp.groupby(level='Ids').apply(my_func)
I am trying to write a main program that will call master_id.py first and then master_count.py. Is there a way to merge both into one program, or to write them as functions and call those functions from a new program? Please suggest.
Recommended answer
Okay, let's say you have program1.py:
import pandas as pd
import numpy as np
def main_program1():
    csv_file1 = 'Transition_Data/Test_1.csv'
    ...
    return df_matched
And then program2.py:
import pandas as pd
import numpy as np
def main_program2():
    csv_file1 = '/Data_repository/master_lac_Test.csv'
    ...
    result = temp.groupby(level='Ids').apply(my_func)
    return result
You can now use these in a separate Python program, say main.py:
import time
import program1 # imports program1.py
import program2 # imports program2.py
df_matched = program1.main_program1()
print(df_matched)
# wait
min_wait = 1
time.sleep(60*min_wait)
# call the second one
result = program2.main_program2()
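Since more programs are expected to be added later, the same pattern can be generalized to a list of functions. The following is only a minimal sketch: run_in_sequence, step_one and step_two are hypothetical names that do not appear in the original programs, and the two step functions merely stand in for program1.main_program1 and program2.main_program2.

```python
import time


def run_in_sequence(steps, wait_minutes, sleep=time.sleep):
    """Call each function in `steps` in order, pausing between consecutive calls."""
    results = []
    for i, step in enumerate(steps):
        if i > 0:
            # wait only between steps, not before the first one
            sleep(60 * wait_minutes)
        results.append(step())
    return results


# Hypothetical stand-ins for program1.main_program1 and program2.main_program2
def step_one():
    return "ids done"


def step_two():
    return "counts done"


if __name__ == "__main__":
    # wait_minutes=0 so this demonstration finishes immediately
    print(run_in_sequence([step_one, step_two], wait_minutes=0))
```

Adding another program later would then mean appending one more function to the list. The sleep parameter is there only so the waiting can be replaced during testing.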
There are lots of ways to 'improve' these, but hopefully this shows you the gist. In particular, I would recommend using the if __name__ == "__main__": idiom (see the question "What does if __name__ == "__main__": do?") in each of the files, so that they can easily be executed from the command line or imported and called from Python.
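As a rough illustration of that idiom, program1.py could end with a guard like the one below. This is a sketch only: the real pandas processing is elided in the answer, so the function body here is a placeholder and its return value is an assumption made for the example.

```python
# program1.py (sketch; the real processing is elided in the answer)

def main_program1():
    # Placeholder result standing in for df_matched
    return {"status": "program1 finished"}


if __name__ == "__main__":
    # Runs for `python program1.py`, but not for `import program1`
    print(main_program1())
```

With this guard in place, main.py can still import program1 and call program1.main_program1() without triggering the print.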
Another option is a shell script, which for your 'master_id.py' and 'master_count.py' becomes (in its simplest form):
python master_id.py
sleep 60
python master_count.py
Saved as 'main.sh', this can be executed as:
sh main.sh