Which is more efficient: join queries using SQL, or merge queries using pandas?


Question

I want to use data from multiple tables in a pandas dataframe. I have two ideas for downloading the data from the server: one is to use a SQL join and retrieve the joined result, and the other is to download the tables as separate dataframes and merge them with pandas.merge.

SQL join, when I want to download the data into pandas:

query = '''SELECT table1.c1, table2.c2
    FROM table1
    INNER JOIN table2 ON table1.ID = table2.ID
    WHERE condition;'''
df = pd.read_sql(query, engine)

Pandas merge:

# ID must be selected in both queries, otherwise merge has no key column to join on
df1 = pd.read_sql('select ID, c1 from table1 where condition;', engine)
df2 = pd.read_sql('select ID, c2 from table2 where condition;', engine)
df = pd.merge(df1, df2, on='ID', how='inner')

Which one is faster? Assume that I want to do this for more than 2 tables and 2 columns. Is there a better idea? In case it matters, I use PostgreSQL.

Recommended answer

The former is faster than the latter. The former makes a single call to the database and returns a result that is already joined and filtered. The latter makes two calls to the database, transfers every row that passes each query's filter (including rows that have no match in the other table), and only then merges the result sets in the application.
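The difference between the two approaches can be sketched on a small scale. The example below is only an illustration under stated assumptions: an in-memory SQLite database stands in for the PostgreSQL server, the table contents are made up, and the WHERE condition from the question is left out. It shows that both approaches produce the same rows, while the separate queries pull more rows out of the database before the merge.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (ID INTEGER PRIMARY KEY, c1 TEXT);
    CREATE TABLE table2 (ID INTEGER PRIMARY KEY, c2 TEXT);
    -- Made-up sample rows; only IDs 2 and 3 exist in both tables.
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (2, 'x'), (3, 'y'), (4, 'z');
    """
)

# Approach 1: a single query; the database performs the join.
joined = pd.read_sql(
    """
    SELECT table1.ID, table1.c1, table2.c2
    FROM table1
    INNER JOIN table2 ON table1.ID = table2.ID
    ORDER BY table1.ID
    """,
    conn,
)

# Approach 2: two queries; pandas performs the join on the client.
df1 = pd.read_sql("SELECT ID, c1 FROM table1", conn)
df2 = pd.read_sql("SELECT ID, c2 FROM table2", conn)
merged = (
    pd.merge(df1, df2, on="ID", how="inner")
    .sort_values("ID")
    .reset_index(drop=True)
)

# Rows that had to leave the database in each approach.
rows_transferred_join = len(joined)
rows_transferred_merge = len(df1) + len(df2)

# Both approaches should yield the same dataframe.
same_result = joined.equals(merged)
print(same_result, rows_transferred_join, rows_transferred_merge)
```

With this sample data the join returns 2 rows, while the two separate queries return 6 rows in total. On real tables the gap depends on how selective the join and the filter are, so timing both variants against the actual PostgreSQL database is the only way to confirm the difference for a given workload.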
