Cloudera Impala中的多个查询执行 [英] Multiple query execution in cloudera impala
问题描述
是否可以在impala中同时执行多个查询?如果是,则impala如何处理它?</ p>
Is it possible to execute multiple queries at the same time in impala ? If yes, how does impala handle it?
推荐答案
我当然会自己做一些测试,但是我无法使多个查询得以执行:
我正在使用Impala连接,并从.sql文件读取查询。
I would certainly do some tests on your own, but I was not able to get multiple queries to execute: I was using Impala connection, and reading query from a .sql file. This works for single commands.
from impala.dbapi import connect
# actual server and port changed for this post for security
conn=connect(host='impala server', port=11111,auth_mechanism="GSSAPI")
cursor = conn.cursor()
cursor.execute((open("sandbox/z_temp.sql").read()))
这是我收到的错误。
HiveServer2Error: AnalysisException: Syntax error in line 2:
这是.sql文件中SQL的外观。
This is what the SQL looked like in the .sql file.
Select * FROM database1.table1;
Select * FROM database1.table2;
我能够在单独的.sql文件中遍历所有.sql的SQL命令运行多个命令在指定文件夹中的文件。
I was able to run multiple commands with the SQL commands in separate .sql files iterating over all .sql files in a specified folder.
#Create list of file names for recon .sql files this will be sorted
#Numbers at begining of filename are important to sort so that files will be executed in correct order
file_names = glob.glob('folder/.sql')
asc_names = sorted(file_names, reverse = False)
filename = ""
for file_name in asc_names:
str_filename = str(file_name)
print(filename)
query = (open(str_filename).read())
cursor = conn.cursor()
# creates an error log dataframe to print, or write to file at end of job.
try:
# Each SQL command must be executed seperately
cursor.execute(query)
df_id= pd.DataFrame([{'test_name': str_filename[-40:], 'test_status': 'PASS'}])
df_log = df_log.append(df_id, ignore_index=True)
except:
df_id= pd.DataFrame([{'test_name': str_filename[-40:], 'test_status': 'FAIL'}])
df_log = df_log.append(df_id, ignore_index=True)
continue
另一种方法是将一个.sql文件中的所有SQL语句用;分隔;然后循环通过.sql文件将语句分割出来;一次运行一次。
Another way to do this would be to have all of the SQL statements in one .sql file separated by ; then loop thru the .sql file splitting statements out by ; running one at a time.
from impala.dbapi import connect
from impala.util import as_pandas
conn=connect(host='impalaserver', port=11111, auth_mechanism='GSSAPI')
cursor = conn.cursor()
# split SQL statements from one file seperated by ';', Note: last command will not have semicolon at end.
sql_file = open("sandbox/temp.sql").read()
sql = sql_file.split(';')
for cmd in sql:
# This gets rid of the non printing characters you may have
cmd = cmd.replace('/r','')
cmd = cmd.replace('/n','')
# This runs your SQL commands one at a time.
cursor.execute(cmd)
print(cmd)
这篇关于Cloudera Impala中的多个查询执行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!