Elasticsearch getting too many results, need help filtering query
I'm having a lot of trouble understanding how the ES query system works under the hood.
For example, I've got the following query:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "referer": "www.xx.yy.com"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now",
              "lt": "now-1h"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "interval": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "0.5h"
      },
      "aggs": {
        "what": {
          "cardinality": {
            "field": "host"
          }
        }
      }
    }
  }
}
That request seems to pull in too much data and fails with:
"status" : 500, "reason" : "ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; nested: UncheckedExecutionException[org.elasticsearch.common.breaker.CircuitBreakingException: Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; nested: CircuitBreakingException[Data too large, data for field [@timestamp] would be larger than limit of [3200306380/2.9gb]]; "
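Two things stand out in the first request and its error. The `CircuitBreakingException` appears to come from the fielddata circuit breaker, which trips when loading the `@timestamp` values needed by the aggregation would exceed the configured memory limit, so it is probably about memory rather than the number of hits. Separately, the `range` bounds look inverted: `gte: now` combined with `lt: now-1h` describes an empty window. Below is a minimal sketch of the one-hour window, assuming that is what was intended. The helper name `last_hour_range` is hypothetical and not part of any Elasticsearch client.

```python
import json


def last_hour_range(field="@timestamp"):
    """Build a range clause covering the last hour (hypothetical helper).

    The lower bound is the older instant (now-1h) and the upper bound
    is the newer one (now), written as Elasticsearch date math strings.
    """
    return {"range": {field: {"gte": "now-1h", "lt": "now"}}}


if __name__ == "__main__":
    # Print the clause as JSON so it can be pasted into a request body.
    print(json.dumps(last_hour_range(), indent=2))
```

Fixing the bounds changes which documents match, but on its own it may not avoid the breaker, since fielddata is generally loaded per segment rather than per matching document.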
I've also tried this request:
{
  "size": 0,
  "filter": {
    "and": [
      {
        "term": {
          "referer": "www.geoportail.gouv.fr"
        }
      },
      {
        "range": {
          "@timestamp": {
            "from": "2014-10-04",
            "to": "2014-10-05"
          }
        }
      }
    ]
  },
  "aggs": {
    "interval": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "0.5h"
      },
      "aggs": {
        "what": {
          "cardinality": {
            "field": "host"
          }
        }
      }
    }
  }
}
I would like to filter the data so that I can get a correct result. Any help would be much appreciated!
I found a solution, though it's kind of weird. I followed dimzak's advice and cleared the cache:
curl --noproxy localhost -XPOST "http://localhost:9200/_cache/clear"
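For reference, the same cache-clear call could be issued from a script. This is only a sketch, assuming the node listens on `localhost:9200` as in the curl command. `build_clear_cache_request` and `clear_cache` are hypothetical helper names, and the empty `ProxyHandler` is meant to mirror `--noproxy`.

```python
import urllib.request

# Assumed base URL, taken from the curl command above.
ES_URL = "http://localhost:9200"


def build_clear_cache_request(base_url=ES_URL):
    """Return a POST request for the _cache/clear endpoint (not yet sent)."""
    return urllib.request.Request(base_url + "/_cache/clear", method="POST")


def clear_cache(base_url=ES_URL):
    """Send the request while bypassing any configured proxy."""
    # An empty ProxyHandler disables proxies, similar to curl's --noproxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_clear_cache_request(base_url)) as response:
        return response.read().decode("utf-8")
```

Calling `clear_cache()` requires a running node. Clearing caches probably frees memory only temporarily, because fielddata is likely to be loaded again the next time an aggregation needs it.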
Then I used filtering instead of querying as Olly suggested:
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "term": {
          "referer": "www.xx.yy.fr"
        }
      },
      "filter": {
        "range": {
          "@timestamp": {
            "from": "2014-10-04T00:00",
            "to": "2014-10-05T00:00"
          }
        }
      }
    }
  },
  "aggs": {
    "interval": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "0.5h"
      },
      "aggs": {
        "what": {
          "cardinality": {
            "field": "host"
          }
        }
      }
    }
  }
}
I can't give the accepted answer to both of you. I think dimzak deserves it most, but thumbs up to you both :)
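The working request above can also be built programmatically. Below is a minimal sketch under a few assumptions. The `filtered` query matches the Elasticsearch 1.x syntax used in this post; it was deprecated in 2.0 and removed in 5.0, where a `bool` query with a `filter` clause plays the same role. The function name `build_body` and its `legacy` flag are hypothetical. The sketch uses `gte`/`lt` for a half-open window instead of the post's `from`/`to`, and it only builds JSON without contacting a cluster.

```python
import json


def build_body(referer, start, end, legacy=True):
    """Build the aggregation request body (hypothetical helper).

    legacy=True  -> "filtered" query, as used in this post (ES 1.x).
    legacy=False -> "bool" query with a "filter" clause (newer versions).
    """
    term = {"term": {"referer": referer}}
    time_range = {"range": {"@timestamp": {"gte": start, "lt": end}}}

    if legacy:
        query = {"filtered": {"query": term, "filter": time_range}}
    else:
        query = {"bool": {"must": [term], "filter": [time_range]}}

    return {
        "size": 0,
        "query": query,
        "aggs": {
            "interval": {
                # Interval copied from the post; newer versions may expect
                # a different parameter name or value format.
                "date_histogram": {"field": "@timestamp", "interval": "0.5h"},
                "aggs": {"what": {"cardinality": {"field": "host"}}},
            }
        },
    }


if __name__ == "__main__":
    body = build_body("www.xx.yy.fr", "2014-10-04T00:00", "2014-10-05T00:00")
    print(json.dumps(body, indent=2))
```

Keeping the time restriction in the filter part means it does not contribute to scoring, which is the usual reason for preferring a filter over a query for this kind of clause.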