Input data to AWS Elastic Search using Glue
Problem description
I'm looking for a way to insert data into AWS Elastic Search using AWS Glue (Python or PySpark). I have seen the Boto3 SDK for Elastic Search, but could not find any function in it for inserting data. Can anyone help me find a solution? Any useful links or code?
Recommended answer
For AWS Glue, you need to add an additional jar (the elasticsearch-spark connector) to the job.
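One way to attach the jar, sketched under assumptions: Glue jobs accept the special job parameter `--extra-jars`, which takes the S3 path of a jar to put on the classpath. The helper names, the job name, the role ARN and the S3 paths below are hypothetical placeholders, and the connector jar is assumed to have been uploaded to S3 already.

```python
def glue_default_arguments(connector_jar_s3_path):
    """Build Glue DefaultArguments that add the connector jar to the job classpath."""
    return {"--extra-jars": connector_jar_s3_path}


def create_glue_job(job_name, role_arn, script_s3_path, connector_jar_s3_path):
    """Create a Glue Spark job with the connector jar attached (all names are placeholders)."""
    import boto3  # imported lazily so the helper above can be used without boto3

    glue = boto3.client("glue")
    return glue.create_job(
        Name=job_name,
        Role=role_arn,
        Command={"Name": "glueetl", "ScriptLocation": script_s3_path},
        DefaultArguments=glue_default_arguments(connector_jar_s3_path),
    )
```

The same `--extra-jars` key can also be set in the job's parameters in the Glue console instead of through Boto3.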
df.write.format("org.elasticsearch.spark.sql") \
    .option("es.resource", "index/document") \
    .option("es.nodes", host) \
    .option("es.port", port) \
    .save()
If you are using AWS managed Elasticsearch, try setting this option to true:
option("es.nodes.wan.only", "true")
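Putting the write options together, a minimal sketch might look like the following. The helper names `es_write_options` and `write_to_es` are made up for illustration, `df` is assumed to be a Spark DataFrame inside a Glue job that already has the connector jar on its classpath, and the host, port and index values are placeholders.

```python
def es_write_options(host, port, resource, wan_only=True):
    """Collect the connector settings used in the answer into one dict."""
    options = {
        "es.nodes": host,
        "es.port": str(port),       # passed as a string, like other Spark options
        "es.resource": resource,    # e.g. "index/document"
    }
    if wan_only:
        # Suggested in the answer for AWS managed Elasticsearch
        options["es.nodes.wan.only"] = "true"
    return options


def write_to_es(df, host, port, resource, wan_only=True):
    """Write a Spark DataFrame through the elasticsearch-spark data source."""
    (df.write
       .format("org.elasticsearch.spark.sql")
       .options(**es_write_options(host, port, resource, wan_only))
       .save())
```

In a job this could be called as `write_to_es(df, host, 443, "index/document")`, where port 443 is only an assumed value for an HTTPS endpoint.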
For more properties, see https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html
NOTE: The elasticsearch-spark connector must match the Scala version of your Spark build. When this answer was written, the connector was prebuilt for Scala 2.11 only, so it worked with Scala 2.11 builds such as Spark 2.3 but not with Spark builds on Scala 2.12 (Spark 3.0, and the Scala 2.12 builds of Spark 2.4).