Fetching esJsonRDD from Elasticsearch with complex filtering in Spark

Question

I am currently fetching the Elasticsearch RDD in our Spark job, filtering with a one-line Elasticsearch query such as this (example):

val elasticRdds = sparkContext.esJsonRDD(esIndex, s"?default_operator=AND&q=director.name:DAVID + \n movie.name:SEVEN")

Now suppose our search query becomes more complex, like this:

{
    "query": {
        "filtered": {
            "query": {
                "query_string": {
                    "default_operator": "AND",
                    "query": "director.name:DAVID + \n movie.name:SEVEN"
                }
            },
            "filter": {
                "nested": {
                    "path": "movieStatus.boxoffice.status",
                    "query": {
                        "bool": {
                            "must": [
                                {
                                    "match": {
                                        "movieStatus.boxoffice.status.rating": "A"
                                    }
                                },
                                {
                                    "match": {
                                        "movieStatus.boxoffice.status.oscar": "false"
                                    }
                                }
                            ]
                        }
                    }
                }
            }
        }
    }
}

Can I still convert that query to an inline Elasticsearch query to use it with esJsonRDD? Or is there any way the query above could be used as is with esJsonRDD? If not, what is a better way to fetch such RDDs in Spark?

I ask because esJsonRDD seems to accept only inline (one-line) Elasticsearch queries.

Recommended answer

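Use a triple-quoted string. A Scala """ literal may span multiple lines, so the full query DSL body can be passed to esJsonRDD as is. Escapes such as \n are not processed inside a triple-quoted literal, so they reach Elasticsearch unchanged as part of the JSON: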

val query = """{
"query": {
    "filtered": {
        "query": {
            "query_string": {
                "default_operator": "AND",
                "query": "director.name:DAVID + \n movie.name:SEVEN"
            }
        },
        "filter": {
            "nested": {
                "path": "movieStatus.boxoffice.status",
                "query": {
                    "bool": {
                        "must": [
                            {
                                "match": {
                                    "movieStatus.boxoffice.status.rating": "A"
                                }
                            },
                            {
                                "match": {
                                    "movieStatus.boxoffice.status.oscar": "false"
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
}
}"""

val elasticRdds = sparkContext.esJsonRDD(esIndex, query)
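
As a minimal sketch of why the triple-quoted form works, the snippet below uses only the Scala standard library (no Spark or Elasticsearch dependency) and a shortened version of the query. It shows that a triple-quoted literal keeps the \n in the query text as two raw characters, so it is left for the JSON parser to interpret. The object name QueryCheck and the helper compact are hypothetical, introduced only for illustration; compact collapses the body onto one line in case a single-line string is preferred.

```scala
object QueryCheck {
  // Triple-quoted literal: line breaks are kept and backslash escapes
  // such as \n are NOT processed by Scala. stripMargin removes the
  // leading whitespace and '|' from each continuation line.
  val query: String =
    """{
      |  "query": {
      |    "query_string": {
      |      "default_operator": "AND",
      |      "query": "director.name:DAVID + \n movie.name:SEVEN"
      |    }
      |  }
      |}""".stripMargin

  // Hypothetical helper (not part of elasticsearch-hadoop):
  // trims every line and joins them with single spaces.
  def compact(q: String): String =
    q.split("\n").map(_.trim).mkString(" ")

  def main(args: Array[String]): Unit = {
    // The multi-line body and its one-line form carry the same JSON.
    println(query)
    println(compact(query))
  }
}
```

The string returned by compact(query) could be passed in place of query in the esJsonRDD call above; the multi-line form should not need it.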
