Unable to fetch _source dictionary key-val from elastic client search response

Question

I am trying to access the hits _source dictionary to load it into a database. hits returns null; what am I doing wrong?

Note: searchResponse is returned with the JSON data, and the debug information confirms it.

However, the Hit and _Source classes and their underlying data members are not accessible, and the hits variable returns null.

The local variable, viewed in debug mode, shows the data. If needed, I can include more data, or an image of the locals or debug information window, if that would help scope the problem.

Thanks.

I tried accessing the _source key-value pairs via searchResponse.Documents and a foreach statement to reach the elements inside hits, but was not able to get at the _source key-value pairs (a rough reconstruction of the attempt is sketched below).
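
For illustration, the failing access looked roughly like this (a reconstruction, not the original code; Doc and searchResponse are declared in the snippet below):

// Hypothetical reconstruction of the failing access (not the original code).
// client.Search<Doc>() maps each hit's _source onto Doc, so the
// response-envelope properties on Doc (took, hits, ...) come back null.
foreach (var doc in searchResponse.Documents)
{
    var hits = doc.hits;    // null here, so _source is never reached
    if (hits != null)
    {
        var source = hits.hits[0]._source;
        Console.WriteLine(source.group_id);
    }
}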

/* Classes declared in a Visual Studio C# console application (.NET Framework 4.5). */
// Using directives assumed (not shown in the original snippet):
using System;
using System.Windows.Forms;   // MessageBox
using Elasticsearch.Net;      // SingleNodeConnectionPool, SearchType
using Nest;                   // ElasticClient, ConnectionSettings

class Program
{

    public class Doc
    {
        public int took { get; set; }
        public bool timed_out { get; set; }
        public _Shards _shards { get; set; }
        public Hits hits { get; set; }
    }

    public class _Shards
    {
        public int total { get; set; }
        public int successful { get; set; }
        public int skipped { get; set; }
        public int failed { get; set; }
    }

    public class Hits
    {
        public int total { get; set; }
        public float max_score { get; set; }
        public Hit[] hits { get; set; }
    }

    public class Hit
    {
        public string _index { get; set; }
        public string _type { get; set; }
        public string _id { get; set; }
        public float _score { get; set; }
        public _Source _source { get; set; }
    }

    public class _Source
    {
        public int duration { get; set; }
        public string group_id { get; set; }
        public DateTime var_time { get; set; }
        public string var_name { get; set; }
    }

    static void Main(string[] args)
    {
        var uri = new Uri("http://domain_name.val.url:9203/");
        var pool = new SingleNodeConnectionPool(uri);
        var connectionSettings = new ConnectionSettings(pool)
                                .DisableDirectStreaming();
        var resolver = new IndexNameResolver(connectionSettings);
        var client = new ElasticClient(connectionSettings);

        if (!client.IndexExists("test_index").Exists)
        {
            client.CreateIndex("test_index");
        }

        var searchResponse = client.Search<Doc>(s => s
            .Index("test_index")
            .AllTypes()
            .Size(1)
            .Query(q => q
                .MatchAll())
            .TypedKeys(null)
            .SearchType(Elasticsearch.Net.SearchType.DfsQueryThenFetch)
            .Scroll("30s")
        );

        MessageBox.Show("searchResponse.DebugInformation=" + searchResponse.DebugInformation);
    }
}





Elasticsearch sample response data:

{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2700881,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index",
        "_type" : "doc",
        "_id" : "R22224!!5333e7e4-9ee3-45f4-9dc3-2a8b8d8cdcf8",
        "_score" : 1.0,
        "_source" : {
          "duration" : 14986283,
          "group_id" : "com",
          "var_time" : "2018-04-24T17:05:13.082+02:00",
          "var_name" : "2"
        }
      }
    ]
  }
}



Update: Someone in the office suggested the following code, and then iterating through the key-value pairs (a sketch of that iteration follows after the snippet).

        var searchResponse = client.Search<Doc>(s => s
            .Index("test_index")
            .AllTypes()
            .Size(10)
            .Query(q => q
                .MatchAll())
            .TypedKeys(null)
            .SearchType(Elasticsearch.Net.SearchType.DfsQueryThenFetch)
            .Scroll("30s")
            .RequestConfiguration(r => r
                .DisableDirectStreaming()
            )
        );

        // Requires: using System.Text; using System.Collections.Generic;
        // using System.Web.Script.Serialization; (assembly System.Web.Extensions)
        var raw = Encoding.UTF8.GetString(searchResponse.ApiCall.ResponseBodyInBytes);
        JavaScriptSerializer jss = new JavaScriptSerializer();
        jss.MaxJsonLength = Int32.MaxValue;
        var pairs = jss.Deserialize<Dictionary<string, dynamic>>(raw);
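
One way to then walk that dictionary down to each hit's "_source" might look like the following (a minimal sketch against the sample response shape shown above; JavaScriptSerializer deserializes nested JSON objects to Dictionary<string, object>, and dynamic is used here to avoid assuming the concrete array type):

        // Sketch: iterate the deserialized response down to each hit's _source.
        // "pairs" comes from the snippet above.
        dynamic hitsNode = pairs["hits"];          // outer "hits" object
        foreach (dynamic hit in hitsNode["hits"])  // inner "hits" array
        {
            var source = (Dictionary<string, object>)hit["_source"];
            foreach (var kv in source)
            {
                Console.WriteLine(kv.Key + " = " + kv.Value);
            }
        }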

Answer

It looks like you've misunderstood the API of the client; you don't need to declare _Shards, Hit, Hits, _Source, etc. The client takes care of deserializing these parts of the Elasticsearch response for you.

The only part that you need to define is a POCO that will map to the JSON object in each "_source" field in the response, i.e.

{
  "duration" : 14986283,
  "group_id" : "com",
  "var_time" : "2018-04-24T17:05:13.082+02:00",
  "var_name" : "2"
}

which it looks like the _Source POCO does (although I'd be inclined to give it a more meaningful name!). Let's just call it MyDocument for now.

With MyDocument defined as

public class MyDocument
{
    [PropertyName("duration")]
    public int Duration { get; set; }

    [PropertyName("group_id")]
    public string GroupId { get; set; }

    [PropertyName("var_time")]
    public DateTime Time { get; set; }

    [PropertyName("var_name")]
    public string Name { get; set; }
}

a simple search would be

var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));

var settings = new ConnectionSettings(pool)
    .DefaultMappingFor<MyDocument>(m => m
        .IndexName("test_index")
        .TypeName("doc")
    );

var client = new ElasticClient(settings);

var searchResponse = client.Search<MyDocument>();

// A collection of the top 10 matching documents
var documents = searchResponse.Documents;
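
Each document in searchResponse.Documents is a MyDocument, so the "_source" values come back as typed properties. For example (a minimal sketch):

// Sketch: read the _source fields through the typed POCO properties
foreach (var document in searchResponse.Documents)
{
    Console.WriteLine(document.Name + ": duration=" + document.Duration
        + ", group=" + document.GroupId + ", time=" + document.Time);
}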

The DefaultMappingFor<MyDocument>(...) will use the index name "test_index" and type name "doc" whenever the generic type of document is MyDocument, and they are not explicitly defined in the request.

The above search generates the following query to Elasticsearch

POST http://localhost:9200/test_index/doc/_search
{}
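
If the index or type is specified explicitly on the request, that takes precedence over the default mapping. For example (a minimal sketch; the index name here is made up):

// Sketch: an explicit index on the request overrides the DefaultMappingFor index
var otherResponse = client.Search<MyDocument>(s => s
    .Index("another_test_index")
    .Query(q => q.MatchAll())
);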

Now, it looks like you want to use the Scroll API to return all matching documents. To do this with the Scroll API, you would write a loop to keep making scroll requests so long as documents are being returned

var searchResponse = client.Search<MyDocument>(s => s
    .Size(1000)
    .Scroll("30s")
);

while (searchResponse.Documents.Any())
{
    foreach (var document in searchResponse.Documents)
    {
        // do something with this set of 1000 documents
    }

    // make an additional request
    searchResponse = client.Scroll<MyDocument>("30s", searchResponse.ScrollId);
}

// clear scroll id at the end
var clearScrollResponse = client.ClearScroll(c => c.ScrollId(searchResponse.ScrollId));

There is a ScrollAll observable helper that you can use to make this easier to write, and that parallelizes the operation using sliced_scroll. The same operation as above, but using ScrollAll

// set to number of shards in targeted indices
var numberOfSlices = 4;

var scrollAllObservable = client.ScrollAll<MyDocument>("30s", numberOfSlices);

Exception exception = null;
var manualResetEvent = new ManualResetEvent(false);

var scrollAllObserver = new ScrollAllObserver<MyDocument>(
    onNext: s => 
    {
        var documents = s.SearchResponse.Documents;

        foreach (var document in documents)
        {
            // do something with this set of documents
        }
    },
    onError: e =>
    {
        exception = e;
        manualResetEvent.Set();
    },
    onCompleted: () => manualResetEvent.Set()
);

scrollAllObservable.Subscribe(scrollAllObserver);

manualResetEvent.WaitOne();

if (exception != null)
    throw exception;

If you don't need all of the control over the observer, you can use the simplified version. With this, you do need to specify a maximum run time for the overall operation though

var numberOfSlices = 4;

var scrollAllObservable = client.ScrollAll<MyDocument>("30s", numberOfSlices)
    .Wait(TimeSpan.FromHours(2), onNext: s =>
    {
        var documents = s.SearchResponse.Documents;

        foreach (var document in documents)
        {
            // do something with this set of documents
        }
    });

