Getting all records from Elasticsearch using Java API
I am trying to get all the records from Elasticsearch using the Java API, but I receive the error below:
n[[Wild Thing][localhost:9300][indices:data/read/search[phase/dfs]]]; nested: QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [10101].
My code is as follows:
Client client;
try {
    client = TransportClient.builder().build()
            .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));
    int from = 1;
    int to = 100;
    while (from <= 131881) {
        SearchResponse response = client
                .prepareSearch("demo_risk_data")
                .setSearchType(SearchType.DFS_QUERY_THEN_FETCH).setFrom(from)
                .setQuery(QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("user_agent", "")))
                .setSize(to).setExplain(true).execute().actionGet();
        if (response.getHits().getHits().length > 0) {
            for (SearchHit searchData : response.getHits().getHits()) {
                JSONObject value = new JSONObject(searchData.getSource());
                System.out.println(value.toString());
            }
        }
    }
}
The total number of records currently present is 131881, so I start with from = 1 and to = 100 and then fetch 100 records at a time while from <= 131881. Is there a way to fetch records in sets of, say, 100 until there are no further records in Elasticsearch?
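The value 10101 in the error can be checked with arithmetic alone. The snippet above does not show how from and to advance between iterations, so the sketch below tries two hypothetical stepping schemes against the limit of 10000 quoted in the error message. The class, method, and parameter names are made up for illustration, and no Elasticsearch call is involved.

```java
// Illustrative arithmetic only; names are hypothetical and no Elasticsearch API is used.
class ResultWindowDemo {

    /**
     * Advances from and size by fixed steps (at least one step must be positive)
     * and returns the first from + size that exceeds maxResultWindow.
     */
    static int firstRejectedSum(int from, int size, int fromStep, int sizeStep, int maxResultWindow) {
        while (from + size <= maxResultWindow) {
            from += fromStep;
            size += sizeStep;
        }
        return from + size;
    }

    public static void main(String[] args) {
        int window = 10000; // limit quoted in the error message
        // Scheme A (assumed): size stays at 100, from advances by 100.
        System.out.println(firstRejectedSum(1, 100, 100, 0, window));   // prints 10001
        // Scheme B (assumed): from and the value passed as size both advance by 100.
        System.out.println(firstRejectedSum(1, 100, 100, 100, window)); // prints 10101
    }
}
```

Under these assumptions, only the scheme in which the value passed to setSize also grows by 100 per iteration is rejected at exactly 10101. With a fixed size of 100 the first rejected sum would be 10001. This would suggest that to was treated as an end index rather than a page size, but since the loop's increment is not shown, that remains a guess.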
Yes, you can do so using the scroll API, which the Java client also supports.
You can do it like this:
Client client;
try {
    client = TransportClient.builder().build()
            .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));
    QueryBuilder qb = QueryBuilders.boolQuery().mustNot(QueryBuilders.termQuery("user_agent", ""));
    // Open a scroll sorted on _doc; each scroll context is kept alive for 60 seconds
    SearchResponse scrollResp = client.prepareSearch("demo_risk_data")
            .addSort(SortParseElement.DOC_FIELD_NAME, SortOrder.ASC)
            .setScroll(new TimeValue(60000))
            .setQuery(qb)
            .setSize(100).execute().actionGet();
    // Scroll until no hits are returned
    while (true) {
        // Break condition: no hits are returned
        if (scrollResp.getHits().getHits().length == 0) {
            break;
        }
        // Otherwise read the results
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            JSONObject value = new JSONObject(hit.getSource());
            System.out.println(value.toString());
        }
        // Prepare the next query, reusing the scroll ID from the last response
        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(60000)).execute().actionGet();
    }
} catch (UnknownHostException e) {
    // InetAddress.getByName throws this when the host cannot be resolved
    e.printStackTrace();
}
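The termination logic of the loop above (keep fetching until a batch comes back empty) can be tried without a cluster. The following is a minimal sketch that replaces the scroll with a hypothetical in-memory cursor; BatchCursor, cursorOver, and drain are illustrative names, not part of the Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory stand-in for a scroll; not an Elasticsearch API.
class ScrollLoopSketch {

    /** Minimal cursor abstraction: each call returns the next batch. */
    interface BatchCursor<T> {
        List<T> nextBatch();
    }

    /** Cursor over a list that returns at most batchSize items per call, then empty batches. */
    static <T> BatchCursor<T> cursorOver(List<T> items, int batchSize) {
        return new BatchCursor<T>() {
            private int position = 0;

            @Override
            public List<T> nextBatch() {
                if (position >= items.size()) {
                    return Collections.emptyList();
                }
                int end = Math.min(position + batchSize, items.size());
                List<T> batch = new ArrayList<>(items.subList(position, end));
                position = end;
                return batch;
            }
        };
    }

    /** Drains the cursor, stopping at the first empty batch. */
    static <T> List<T> drain(BatchCursor<T> cursor) {
        List<T> all = new ArrayList<>();
        List<T> batch = cursor.nextBatch();
        while (!batch.isEmpty()) {
            all.addAll(batch);
            batch = cursor.nextBatch();
        }
        return all;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            records.add(i);
        }
        List<Integer> fetched = drain(cursorOver(records, 100));
        System.out.println("fetched " + fetched.size() + " records");
    }
}
```

The caller never needs to know the total record count in advance, which is the property the question asks for: the empty batch is the only stop signal.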