Python: read Cassandra data into pandas

Views: 42 Published: 2021/12/31 17:22:22 Tags: python pandas cassandra


Question

What is the proper and fastest way to read Cassandra data into pandas? Right now I use the following code, but it's very slow...

import pandas as pd

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra.query import dict_factory

auth_provider = PlainTextAuthProvider(username=CASSANDRA_USER, password=CASSANDRA_PASS)
cluster = Cluster(contact_points=[CASSANDRA_HOST], port=CASSANDRA_PORT,
    auth_provider=auth_provider)

session = cluster.connect(CASSANDRA_DB)
session.row_factory = dict_factory

sql_query = "SELECT * FROM {}.{};".format(CASSANDRA_DB, CASSANDRA_TABLE)

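df = pd.DataFrame()

# Slow path: every iteration builds a one-row DataFrame and appends it,
# which copies the whole accumulated frame each time.
# Note: DataFrame.append and pd.np were removed in pandas 2.0.
for row in session.execute(sql_query):
    df = df.append(pd.DataFrame(row, index=[0]))

df = df.reset_index(drop=True).fillna(pd.np.nan)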

Reading 1000 rows takes 1 minute, and I have a "bit more"... If I run the same query in DBeaver, for example, I get the whole result set (~40k rows) within a minute.

Thank you!!!

Recommended answer

I got the answer on the official mailing list (it works perfectly):

Try defining your own pandas row factory:

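def pandas_factory(colnames, rows):
    # Build a single DataFrame from all rows the driver passes in
    return pd.DataFrame(rows, columns=colnames)

session.row_factory = pandas_factory
session.default_fetch_size = None  # disable paging so all rows arrive in one result

query = "SELECT ..."
rslt = session.execute(query, timeout=None)  # timeout=None: no client-side timeout
df = rslt._current_rows  # private attribute that holds the factory's output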

That's the way I do it, and it should be faster...
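
To see what the factory does without a running cluster, here is a minimal sketch. The driver calls a row factory with a list of column names and a list of row tuples; the sample names and values below are made up for illustration, and only `pandas_factory` comes from the answer. The point is that the whole batch of rows becomes a DataFrame in one constructor call, instead of one tiny DataFrame per row.

```python
import pandas as pd


def pandas_factory(colnames, rows):
    # Same factory as in the answer: one DataFrame for the whole batch of rows.
    return pd.DataFrame(rows, columns=colnames)


# Hypothetical stand-ins for what the driver would pass to the factory.
sample_colnames = ["id", "name", "score"]
sample_rows = [(1, "alice", 0.5), (2, "bob", 0.75), (3, "carol", 1.0)]

sample_df = pandas_factory(sample_colnames, sample_rows)
print(sample_df.shape)
```

Because `default_fetch_size = None` turns paging off, the factory receives the full result at once, so the result set has to fit in memory.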

If you find a faster method, I'd be interested :)

Michael
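
The slowdown in the question comes from growing the DataFrame row by row: each append copies everything accumulated so far. One possible variation, which keeps `dict_factory` from the question and avoids the private `_current_rows` attribute, is to collect the rows first and build the frame once. This is only a sketch and not the answer author's method; `rows_to_frame` is a hypothetical helper, and the sample dicts stand in for what iterating over `session.execute(...)` would yield.

```python
import pandas as pd


def rows_to_frame(rows):
    # Materialize the iterable of dict rows, then build the DataFrame in one call.
    return pd.DataFrame(list(rows))


# Hypothetical rows, shaped like dict_factory output (one dict per row).
fake_rows = iter([
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": None},
])

frame = rows_to_frame(fake_rows)
print(len(frame))
```

With a live session this would presumably read `df = rows_to_frame(session.execute(sql_query))`; iterating the result set fetches further pages as needed, so paging can stay enabled.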
