ElasticsearchTemplate: retrieving big data sets
Question
I am new to ElasticsearchTemplate. I want to get 1000 documents from Elasticsearch based on my query.
I have used QueryBuilder to create my query, and it is working perfectly.
I have gone through the following links, which state that big data sets can be retrieved using scan and scroll.
link one
link two
I am trying to implement this in the code below, which I copied from one of the links mentioned above, but I am getting the following error:
The type ResultsMapper is not generic; it cannot be parameterized with arguments <MyInputDto>
MyInputDto is a class with the @Document annotation in my project.
At the end of the day, I just want to retrieve 1000 documents from Elasticsearch.
I tried to find a size parameter, but I think it is not supported.
String scrollId = esTemplate.scan(searchQuery, 1000, false);
List<MyInputDto> sampleEntities = new ArrayList<MyInputDto>();
boolean hasRecords = true;
while (hasRecords) {
    Page<MyInputDto> page = esTemplate.scroll(scrollId, 5000L,
        new ResultsMapper<MyInputDto>() {
            @Override
            public Page<MyInputDto> mapResults(SearchResponse response) {
                List<MyInputDto> chunk = new ArrayList<MyInputDto>();
                for (SearchHit searchHit : response.getHits()) {
                    if (response.getHits().getHits().length <= 0) {
                        return null;
                    }
                    MyInputDto user = new MyInputDto();
                    user.setId(searchHit.getId());
                    user.setMessage((String) searchHit.getSource().get("message"));
                    chunk.add(user);
                }
                return new PageImpl<MyInputDto>(chunk);
            }
        });
    if (page != null) {
        sampleEntities.addAll(page.getContent());
        hasRecords = page.hasNextPage();
    } else {
        hasRecords = false;
    }
}
What is the issue here? Is there any alternative way to achieve this? I would be thankful if somebody could tell me how this code works in the back end.
Recommended answer
Solution 1
If you want to use ElasticsearchTemplate, it is simpler and more readable to use CriteriaQuery, because it lets you set the page size with the setPageable method. With scrolling, you can then fetch the subsequent sets of data:
CriteriaQuery criteriaQuery = new CriteriaQuery(Criteria.where("productName").is("something"));
criteriaQuery.addIndices("prods");
criteriaQuery.addTypes("prod");
criteriaQuery.setPageable(PageRequest.of(0, 1000));

ScrolledPage<TestDto> scroll = (ScrolledPage<TestDto>) esTemplate.startScroll(3000, criteriaQuery, TestDto.class);
while (scroll.hasContent()) {
    LOG.info("Next page with 1000 elem: " + scroll.getContent());
    scroll = (ScrolledPage<TestDto>) esTemplate.continueScroll(scroll.getScrollId(), 3000, TestDto.class);
}
esTemplate.clearScroll(scroll.getScrollId());
Solution 2
If you'd like to use org.elasticsearch.client.Client instead of ElasticsearchTemplate, then the request builder returned by prepareSearch lets you set the number of search hits to return per scroll page with setSize; each page comes back as a SearchResponse:
QueryBuilder prodBuilder = ...;

// Open the scroll: at most 1000 hits per page, scroll context kept alive for 60 s.
SearchResponse scrollResp = client.prepareSearch("prods")
    .setScroll(new TimeValue(60000))
    .setSize(1000)
    .setTypes("prod")
    .setQuery(prodBuilder)
    .execute().actionGet();

ObjectMapper mapper = new ObjectMapper();
List<TestDto> products = new ArrayList<>();
try {
    do {
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            products.add(mapper.readValue(hit.getSourceAsString(), TestDto.class));
        }
        LOG.info("Next page with 1000 elem: " + products);
        products.clear();
        // Fetch the next page using the scroll id from the previous response.
        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
            .setScroll(new TimeValue(60000))
            .execute()
            .actionGet();
    } while (scrollResp.getHits().getHits().length != 0);
} catch (IOException e) {
    // Pass the exception as the last argument so the stack trace is logged.
    LOG.error("Exception while executing query", e);
}
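As for how this works in the back end: both solutions follow the same pattern. The first request opens a scroll and returns one batch of hits plus a scroll id, each following request passes that id back to get the next batch, and an empty batch means the result set is exhausted. The sketch below imitates that loop with plain Java collections so it can run without a cluster, and stops once 1000 documents have been collected. The names ScrollDrain, Batch, page, and drain are hypothetical and are not part of Spring Data Elasticsearch or the Elasticsearch client. With a real client, the next function would wrap continueScroll or prepareSearchScroll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ScrollDrain {

    /** Hypothetical stand-in for one scroll response: a batch of hits and the id for the next call. */
    static final class Batch<T> {
        final String scrollId;
        final List<T> hits;

        Batch(String scrollId, List<T> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Collects hits batch by batch until a batch is empty or `limit` hits have been gathered. */
    static <T> List<T> drain(Batch<T> first, Function<String, Batch<T>> next, int limit) {
        List<T> out = new ArrayList<>();
        Batch<T> batch = first;
        while (!batch.hits.isEmpty() && out.size() < limit) {
            for (T hit : batch.hits) {
                if (out.size() == limit) {
                    break;
                }
                out.add(hit);
            }
            if (out.size() < limit) {
                // Ask for the next batch with the scroll id of the previous one.
                batch = next.apply(batch.scrollId);
            }
        }
        return out;
    }

    /** In-memory imitation of a scroll: the "scroll id" is simply the next offset as a string. */
    static Batch<Integer> page(List<Integer> source, int offset, int size) {
        int end = Math.min(offset + size, source.size());
        return new Batch<>(String.valueOf(end), new ArrayList<>(source.subList(offset, end)));
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            source.add(i);
        }
        // Batches of 400 hits; stop as soon as 1000 hits have been collected.
        List<Integer> firstThousand = drain(
            page(source, 0, 400),
            id -> page(source, Integer.parseInt(id), 400),
            1000);
        System.out.println(firstThousand.size()); // prints 1000
    }
}
```

Capping the loop this way means the page size does not have to equal the number of documents you want. A real scroll should still be cleared afterwards, as Solution 1 does with clearScroll.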