Elasticsearch bulk index in chunks using PyEs

Views: 129 Published: 2017/2/24 17:34:15 Tags: python csv elasticsearch

Problem description

I have a simple Python script for indexing a CSV file containing 1 million rows:

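import csv
from pyes import ES

# Python 2 script: the csv module expects the file opened in binary mode there
reader = csv.reader(open('data.csv', 'rb'))

conn = ES('127.0.0.1:9200', timeout=20.0)

counter = 0
for row in reader:
    try:
        # Only the sixth column is indexed, as the "name" field
        data = {"name": row[5]}
        conn.index(data, 'namesdb', counter, bulk=True)
        counter += 1
    except:
        # Any failure (for example a row with fewer than six columns) is silently skipped
        pass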

This works quite well, but once we get into the thousands of rows, it all slows down exponentially.

I'm guessing that if I did the indexing in smaller chunks, ES would perform better.

Is there a more efficient way of doing this? Would a sleep() delay help? Or is there an easy way to break up the CSV into smaller chunks programmatically?

Thanks.
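On the last point, breaking the CSV into smaller chunks programmatically, here is a minimal sketch using only the standard library (Python 3). It is an illustration and not a tested PyEs recipe: `iter_chunks`, `index_in_chunks`, and the `send_chunk` callback are hypothetical names introduced here, and `send_chunk` stands in for whatever call actually sends one batch to Elasticsearch. The chunk size of 1000 is an arbitrary assumption, not a recommended value.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def index_in_chunks(csv_file, send_chunk, chunk_size=1000, name_column=5):
    """Read csv_file and pass documents to send_chunk one batch at a time.

    send_chunk is a hypothetical callback that receives a list of dicts;
    in a real script it would wrap the client's bulk indexing call.
    Returns the number of documents handed over.
    """
    reader = csv.reader(csv_file)
    total = 0
    for chunk in iter_chunks(reader, chunk_size):
        # Skip rows that are too short instead of swallowing every exception
        docs = [{"name": row[name_column]} for row in chunk if len(row) > name_column]
        if docs:
            send_chunk(docs)
            total += len(docs)
    return total


# Small in-memory demonstration: five rows, batches of two
sample = io.StringIO("\n".join("a,b,c,d,e,name%d" % i for i in range(5)))
batches = []
count = index_in_chunks(sample, batches.append, chunk_size=2)
```

Because the reader is consumed lazily, only one chunk of rows is held in memory at a time, so the same loop should work for the full 1-million-row file by passing an open file object instead of the in-memory sample.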
