Python - loop through same query with different variables, merge data frames

Problem description

I have a query in SAS where I use a macro variable to repeat the same Teradata query with a different value each time. We have 5 databases, one per state. I run the same query against each one, using the variable to switch the state, and then combine all the resulting data sets. How can I do this in Python?

Loop through {state1, state2, state3, state4, state5}, save each query result as {stateX}_df, then merge them all:

import teradata as td
import pandas as pd
from teradata import tdodbc

udaExec = td.UdaExec(appConfigFile="udaexec.ini")

with udaExec.connect("${dataSourceName}", LoginTimeout=120) as session:

    # Pseudocode: {state1} and {stateX} stand for the value that should change on each run
    query1 = """database my_db_{state1};"""

    query2 = """
        select distinct
        {state1}, item_a, item_b
        from table
    """

    session.execute(query1)
    session.execute(query2)

    {stateX}_df = pd.read_sql(query2, session)

Recommended answer

Here is an improved version that uses volatile tables (related question: "Python SQL loop variables through multiple queries"):

import pandas as pd
import teradata as td

udaExec = td.UdaExec(appConfigFile="udaexec.ini")

with udaExec.connect("${dataSourceName}") as session:

    state_dataframes = []
    STATES = ["state1", "state2", "state3", "state4", "state5"]

    for state in STATES:

        # Switch the default database to the one for this state
        query1 = """database my_db_{};"""

        # Stage the rows in a volatile table, tagging each row with the state
        query2 = """
        create set volatile table v_table
        ,no fallback, no before journal, no after journal as
        (
        select top 10
        '{}' as state
        ,t.*
        from table t
        )
        with data
        primary index (dw_key)
        on commit preserve rows;
        """

        # Second volatile table built from the first
        query3 = """
        create set volatile table v_table_2
        ,no fallback, no before journal, no after journal as
        (
        select t.*
        from v_table t
        )
        with data
        primary index (dw_clm_key)
        on commit preserve rows;
        """

        # Final select that is loaded into pandas
        query4 = """
        select t.*
        from v_table_2 t
        """

        session.execute(query1.format(state))
        session.execute(query2.format(state))
        session.execute(query3)
        # read_sql runs query4 itself, so no separate execute call is needed
        state_dataframes.append(pd.read_sql(query4, session))
        # Drop the volatile tables so the next iteration can recreate them
        session.execute("DROP TABLE v_table")
        session.execute("DROP TABLE v_table_2")

# Stack the per-state results into a single DataFrame
all_states_df = pd.concat(state_dataframes)
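
The loop-then-concat pattern above can be tried without a Teradata connection. The following is only a minimal sketch under stated assumptions: it uses an in-memory SQLite database from the standard library as a stand-in for the per-state Teradata databases, and the table names (`items_state1`, ...), the helper functions `build_demo_connection` and `query_state`, and the sample rows are all hypothetical. It keeps one DataFrame per state in a dict, which plays the role of the `{stateX}_df` variables from the question, and then stacks them.

```python
import sqlite3

import pandas as pd

# Hypothetical state names; the real list would come from your environment.
STATES = ["state1", "state2", "state3"]


def build_demo_connection():
    """Create an in-memory SQLite database with one table per state.

    This only stands in for the per-state Teradata databases.
    """
    conn = sqlite3.connect(":memory:")
    for i, state in enumerate(STATES, start=1):
        conn.execute(f"create table items_{state} (item_a integer, item_b text)")
        # Each table gets one duplicated row so that "distinct" has an effect.
        conn.executemany(
            f"insert into items_{state} values (?, ?)",
            [(i, "x"), (i, "x"), (i * 10, "y")],
        )
    return conn


def query_state(conn, state):
    """Run the same query for one state and return the result as a DataFrame."""
    # Table names cannot be bound as parameters, so check against the
    # known list before formatting the name into the SQL text.
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    sql = f"select distinct ? as state, item_a, item_b from items_{state}"
    return pd.read_sql_query(sql, conn, params=(state,))


conn = build_demo_connection()

# One DataFrame per state, keyed by name instead of dynamically named variables.
frames = {state: query_state(conn, state) for state in STATES}

# Stack all per-state results; ignore_index gives a clean 0..n-1 row index.
all_states_df = pd.concat(list(frames.values()), ignore_index=True)
conn.close()

print(all_states_df)
```

Keeping the frames in a dict means an individual state's result is still available as `frames["state1"]` after the merge, and `ignore_index=True` avoids repeated row labels in the combined frame.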
