Insert multiple documents in Elasticsearch


Question

I have to insert a JSON array into Elasticsearch. The accepted answer in the linked question suggests inserting a header line before each JSON entry. That answer is two years old; is there a better solution available now? Do I need to edit my JSON file manually?

Is there any way to import a JSON file (containing 100 documents) into an Elasticsearch server?

[
  {
    "id":9,
    "status":"This is cool."
  },
  ...
]

Recommended answer

OK, then there's something pretty simple you can do with a small shell script (see below). The idea is that you don't edit your file manually; instead, you let Python do it and create another file whose format complies with what the _bulk endpoint expects. It does the following:

  1. First, we declare a little Python script that reads your JSON file and creates a new one with the file format required by the _bulk endpoint.
  2. Then, we run that Python script and store the bulk file.
  3. Finally, we send the file created in step 2 to the _bulk endpoint using a simple curl command.
  4. There you go, you now have a new ES index containing your documents.
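To make the target format concrete, here is a minimal standalone sketch of the transformation described in step 1. The function names `to_bulk_lines` and `convert` are hypothetical and chosen only for illustration; the sample document is the one from the question. Each document becomes two lines: an action line, then the document itself on a single line.

```python
import json


def to_bulk_lines(docs):
    """Yield the lines of a _bulk body: an action line, then the document."""
    for doc in docs:
        # Empty action metadata: the index name comes from the request URL
        yield json.dumps({"index": {}})
        # json.dumps without indent keeps each document on one line
        yield json.dumps(doc)


def convert(json_in_path, bulk_out_path):
    """Read a JSON array from json_in_path and write the bulk file."""
    with open(json_in_path) as json_in:
        docs = json.load(json_in)
    with open(bulk_out_path, "w") as out:
        for line in to_bulk_lines(docs):
            # Every line, including the last one, must end with a newline
            out.write(line + "\n")


if __name__ == "__main__":
    sample = [{"id": 9, "status": "This is cool."}]
    print("\n".join(to_bulk_lines(sample)))
```

For the sample document this prints `{"index": {}}` followed by `{"id": 9, "status": "This is cool."}`, which is the two-line pattern the bulk file repeats for every document.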

bulk.sh:

#!/bin/sh

# 0. Some constants to re-define to match your environment
ES_HOST=localhost:9200
JSON_FILE_IN=/path/to/your/file.json
JSON_FILE_OUT=/path/to/your/bulk.json

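# 1. Python code to transform your JSON file: each document is written on
#    its own line, preceded by the action line the _bulk endpoint expects
PYTHON="import json
with open('$JSON_FILE_IN') as json_in, open('$JSON_FILE_OUT', 'w') as out:
    docs = json.load(json_in)
    for doc in docs:
        out.write('%s\n' % json.dumps({'index': {}}))
        out.write('%s\n' % json.dumps(doc))
"

# 2. run the Python script from step 1
#    (use python3 instead if that is the interpreter name on your system)
python -c "$PYTHON"

# 3. use the output file from step 2 in the curl command
#    "index" and "type" are placeholders for your own names. Elasticsearch 6.0
#    and later require the Content-Type header; newer versions have dropped
#    mapping types, so there the path is $ES_HOST/index/_bulk
curl -s -XPOST "$ES_HOST/index/type/_bulk" -H 'Content-Type: application/x-ndjson' --data-binary "@$JSON_FILE_OUT"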
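
The curl command prints the JSON response of the _bulk endpoint. A bulk request can succeed as a whole while individual documents are rejected, so it can be worth inspecting that response. The following is a minimal sketch, assuming the response text has been captured in a string; the function name `failed_items` is hypothetical, and the sample response is made up for illustration, following the documented response shape (a top-level `errors` flag and an `items` list).

```python
import json


def failed_items(response_text):
    """Return (action, status, error) tuples for rejected bulk items."""
    body = json.loads(response_text)
    # "errors" is false when every item in the request succeeded
    if not body.get("errors"):
        return []
    failures = []
    for item in body.get("items", []):
        # Each item has a single key naming the action, e.g. "index"
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("status"), result["error"]))
    return failures


if __name__ == "__main__":
    # Illustrative response only, not output from a real cluster
    sample = json.dumps({
        "took": 5,
        "errors": True,
        "items": [
            {"index": {"_id": "1", "status": 201}},
            {"index": {"_id": "2", "status": 400,
                       "error": {"type": "mapper_parsing_exception"}}},
        ],
    })
    for action, status, error in failed_items(sample):
        print(action, status, error["type"])
```

An empty list means every document was accepted.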

You need to:

  1. save the above script in a file named bulk.sh and make it executable (i.e. chmod u+x bulk.sh)
  2. modify the three variables at the top (step 0) to match your environment
  3. run it using ./bulk.sh
