Automating Hive Activity using AWS

Views: 238 | Published: 2015/12/1 13:24:54 | Tags: hadoop, amazon-web-services, hive, amazon-data-pipeline


Question

I would like to run my Hive script automatically every day, and AWS Data Pipeline is one option for that. The problem is this: I export data from DynamoDB to S3, and I use a Hive script to manipulate that data. I define the input and output inside the Hive script itself, and that is where the trouble starts, because a HiveActivity has to have an input and an output, whereas I need to specify them in the script file.

I am looking for a way to automate this Hive script and would welcome any ideas.

Cheers,

Recommended answer

You can disable staging on the HiveActivity to run any arbitrary Hive script:

stage = false

For example, a pipeline object like this:

{
  "name": "DefaultActivity1",
  "id": "ActivityId_1",
  "type": "HiveActivity",
  "stage": "false",
  "scriptUri": "s3://bucket/query.hql",
  "scriptVariable": [
    "param1=value1",
    "param2=value2"
  ],
  "schedule": {
    "ref": "ScheduleId_l"
  },
  "runsOn": {
    "ref": "EmrClusterId_1"
  }
},
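If you generate pipeline definitions from code, the same object can be built programmatically. Below is a minimal Python sketch, not an official API: the helper name `hive_activity` and the bucket `example-bucket` are made up for illustration, and the ids are simply the ones from the snippet above. It only assembles and prints the JSON; uploading the definition to Data Pipeline is left out.

```python
import json


def hive_activity(script_uri, variables, schedule_ref, cluster_ref,
                  activity_id="ActivityId_1", name="DefaultActivity1"):
    """Build a HiveActivity pipeline object with staging disabled.

    Hypothetical helper for illustration; it mirrors the JSON in the answer.
    """
    return {
        "name": name,
        "id": activity_id,
        "type": "HiveActivity",
        # Staging is off, so the Hive script is expected to handle
        # its own input and output locations.
        "stage": "false",
        "scriptUri": script_uri,
        # Each entry is a "name=value" string, as in the answer's example.
        "scriptVariable": [f"{key}={value}" for key, value in variables.items()],
        "schedule": {"ref": schedule_ref},
        "runsOn": {"ref": cluster_ref},
    }


activity = hive_activity(
    "s3://example-bucket/query.hql",  # placeholder bucket and script path
    {"param1": "value1", "param2": "value2"},
    schedule_ref="ScheduleId_l",
    cluster_ref="EmrClusterId_1",
)
print(json.dumps(activity, indent=2))
```

Keeping the variables in a dict makes it easier to pass the S3 input and output paths to the script as parameters rather than hard-coding them in the `.hql` file.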
