Upload new line JSON to Elasticsearch bulk API

Problem description

I'm trying to upload newline-delimited JSON to Elasticsearch using the Bulk API. The bulk JSON I'm uploading looks like this, with each JSON document on its own line:

{"ip": "x.x.x.x", "seen": true, "classification": "malicious", "spoofable": false, "first_seen": "2020-03-31", "last_seen": "2020-04-15", "actor": "unknown", "tags": ["ADB Worm", "HTTP Alt Scanner", "Mirai", "Web Scanner"], "cve": [], "metadata": {"country": "United Kingdom", "country_code": "GB", "city": "redacted", "organization": "redacted", "rdns": "", "asn": "ASxxx", "tor": false, "os": "Linux 2.2-3.x", "category": "isp"}, "raw_data": {"scan": [{"port": 80, "protocol": "TCP"}, {"port": 81, "protocol": "TCP"}, {"port": 88, "protocol": "TCP"}, {"port": 5555, "protocol": "TCP"}, {"port": 8080, "protocol": "TCP"}], "web": {}, "ja3": []}}
{"ip": "x.x.x.x", "seen": true, "classification": "malicious", "spoofable": true, "first_seen": "2020-04-09", "last_seen": "2020-04-11", "actor": "unknown", "tags": ["Eternalblue", "SMB Scanner"], "cve": ["CVE-2017-0144"], "metadata": {"country": "United Kingdom", "country_code": "GB", "city": "redacted", "organization": "redacted", "rdns": "host.somehost.com", "asn": "ASxxx", "tor": false, "os": "Windows 7/8", "category": "isp"}, "raw_data": {"scan": [{"port": 445, "protocol": "TCP"}], "web": {}, "ja3": []}}
{"ip": "x.x.x.x", "seen": true, "classification": "malicious", "spoofable": true, "first_seen": "2019-09-05", "last_seen": "2020-04-06", "actor": "unknown", "tags": ["Mirai"], "cve": [], "metadata": {"country": "United Kingdom", "country_code": "GB", "city": "redacted", "organization": "redacted", "rdns": "redacted", "asn": "ASxxx", "tor": false, "os": "Linux 2.2.x-3.x (Embedded)", "category": "isp"}, "raw_data": {"scan": [{"port": 23, "protocol": "TCP"}, {"port": 2323, "protocol": "TCP"}], "web": {}, "ja3": []}}

There's no index or key at the head of the JSON. So, of course, it fails when I try to upload it with this command (my_index is a blank index with no mapping):

curl -s -H 'Content-Type: application/x-ndjson' -X POST http://localhost:9200/my_index/_bulk --data-binary @my_newline_json.json

I get the error message:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The bulk request must be terminated by a newline [\\n]"}],"type":"illegal_argument_exception","reason":"The bulk request must be terminated by a newline [\\n]"},"status":400}

So if I understand the problem correctly as per the docs, the error occurs because there's no index or type specified at the start of the JSON. My problem is that I don't understand how to add the necessary index and type so that the JSON can be read.

I'm using curl to create my index and add data to it, so what would be the best way to format a curl command to create the index properly and allow my JSON to be uploaded?

(I have previously used the excellent Elasticsearch_loader tool by MosheZada, which lets you specify the index and type in the command. This worked well, but I'm trying to understand what is happening in that command and how I could do the same thing with curl if needed.)

Recommended answer

curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/index-name/doc-type/_bulk?pretty' --data-binary @my_newline_json.json

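Change your bulk JSON to the following format, in which every document line is preceded by an action line. Because the index is already given in the request URL, the action line can be an empty {"index":{}}. Your my_newline_json.json should look like this: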

{"index":{}}
{"ip": "x.x.x.x", "seen": true, "classification": "malicious", "spoofable": false, "first_seen": "2020-03-31", "last_seen": "2020-04-15", "actor": "unknown", "tags": ["ADB Worm", "HTTP Alt Scanner", "Mirai", "Web Scanner"], "cve": [], "metadata": {"country": "United Kingdom", "country_code": "GB", "city": "redacted", "organization": "redacted", "rdns": "", "asn": "ASxxx", "tor": false, "os": "Linux 2.2-3.x", "category": "isp"}, "raw_data": {"scan": [{"port": 80, "protocol": "TCP"}, {"port": 81, "protocol": "TCP"}, {"port": 88, "protocol": "TCP"}, {"port": 5555, "protocol": "TCP"}, {"port": 8080, "protocol": "TCP"}], "web": {}, "ja3": []}}
{"index":{}}
{"ip": "x.x.x.x", "seen": true, "classification": "malicious", "spoofable": true, "first_seen": "2020-04-09", "last_seen": "2020-04-11", "actor": "unknown", "tags": ["Eternalblue", "SMB Scanner"], "cve": ["CVE-2017-0144"], "metadata": {"country": "United Kingdom", "country_code": "GB", "city": "redacted", "organization": "redacted", "rdns": "host.somehost.com", "asn": "ASxxx", "tor": false, "os": "Windows 7/8", "category": "isp"}, "raw_data": {"scan": [{"port": 445, "protocol": "TCP"}], "web": {}, "ja3": []}}
{"index":{}}
{"ip": "x.x.x.x", "seen": true, "classification": "malicious", "spoofable": true, "first_seen": "2019-09-05", "last_seen": "2020-04-06", "actor": "unknown", "tags": ["Mirai"], "cve": [], "metadata": {"country": "United Kingdom", "country_code": "GB", "city": "redacted", "organization": "redacted", "rdns": "redacted", "asn": "ASxxx", "tor": false, "os": "Linux 2.2.x-3.x (Embedded)", "category": "isp"}, "raw_data": {"scan": [{"port": 23, "protocol": "TCP"}, {"port": 2323, "protocol": "TCP"}], "web": {}, "ja3": []}}
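If the source file is large, adding the action lines by hand is impractical. The following is a minimal sketch of one way to generate the format above with awk. The helper name to_bulk and the output file name my_bulk.json are assumptions made for illustration; they are not part of Elasticsearch or curl.

```shell
#!/bin/sh
# Sketch: turn plain newline-delimited JSON into a Bulk API body.
# to_bulk is a hypothetical helper name chosen for this example.
to_bulk() {
    # For every non-empty input line, print an empty "index" action line
    # followed by the document itself. awk's print ends each line with "\n",
    # so the output is newline-terminated even if the input's last line is not.
    awk 'NF { print "{\"index\":{}}"; print }' "$@"
}

# Example usage (my_bulk.json and my_index are assumed names):
#   to_bulk my_newline_json.json > my_bulk.json
#   curl -s -H 'Content-Type: application/x-ndjson' -X POST \
#        'http://localhost:9200/my_index/_bulk?pretty' --data-binary @my_bulk.json
```

The function reads the files named as arguments, or standard input when none are given, and leaves the original file untouched. It assumes one complete JSON document per input line, as in the question.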

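Don't forget to add a new line at the end of the content. The "The bulk request must be terminated by a newline" error shown in the question is what Elasticsearch returns when the last line of the request body does not end with a newline character.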
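Whether a file already ends with a newline is hard to see in an editor. As a rough sketch, the check and the fix can be scripted as below. The helper name ensure_trailing_newline is an assumption made for illustration, and the function modifies the file in place.

```shell
#!/bin/sh
# Sketch: append a final "\n" to a file only when it is missing.
# ensure_trailing_newline is a hypothetical helper name chosen for this example.
ensure_trailing_newline() {
    file=$1
    # tail -c 1 prints the last byte of the file. Command substitution strips
    # a trailing newline, so the result is empty when the file already ends
    # with "\n" (or when the file is empty).
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}

# Example usage:
#   ensure_trailing_newline my_newline_json.json
```

Running it a second time on the same file changes nothing, so it is safe to call just before every upload.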
