Accessing Shark tables (Hive) from Scala (shark-shell)

Question

I have shark-0.8.0 running on hive-0.9.0. I can work with Hive by invoking shark, and I have created a few tables and loaded data into them.

Now I am trying to access the data in these tables from Scala. I started the Scala shell with shark-shell, but when I run a select, I get an error saying that the table is not present.

scala> val artists = sc.sql2rdd("select artist from default.lastfm")

Hive history file=/tmp/hduser2/hive_job_log_hduser2_201405091617_1513149542.txt
151.738: [GC 317312K->83626K(1005568K), 0.0975990 secs]
151.836: [Full GC 83626K->76005K(1005568K), 0.4523880 secs]
152.313: [GC 80536K->76140K(1005568K), 0.0030990 secs]
152.316: [Full GC 76140K->62214K(1005568K), 0.1716240 secs]
FAILED: Error in semantic analysis: Line 1:19 Table not found 'lastfm'
shark.api.QueryExecutionException: FAILED: Error in semantic analysis: Line 1:19 Table not found 'lastfm'
    at shark.SharkDriver.tableRdd(SharkDriver.scala:149)
    at shark.SharkContext.sql2rdd(SharkContext.scala:100)
    at <init>(<console>:17)
    at <init>(<console>:22)
    at <init>(<console>:24)
    at <init>(<console>:26)
    at <init>(<console>:28)
    at <init>(<console>:30)
    at <init>(<console>:32)
    at .<init>(<console>:36)
    at .<clinit>(<console>)
    at .<init>(<console>:11)
    at .<clinit>(<console>)
    at $export(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:629)
    at org.apache.spark.repl.SparkIMain$Request$$anonfun$10.apply(SparkIMain.scala:890)
    at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
    at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
    at java.lang.Thread.run(Thread.java:744)

According to the documentation (https://github.com/amplab/shark/wiki/Shark-User-Guide), these steps should be enough to get Shark up and running and to select data using Scala. Am I missing something? Is there a configuration file that needs to be modified to enable access to Shark tables from shark-shell?

Recommended answer

Have you updated the configuration in your shark-hive directory so that it properly reflects the Hive metastore JDBC connection info?

You will need to copy hive-default.xml to hive-site.xml, then ensure that the metastore properties are set.

Here is the basic info in hive-site.xml:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://myhost/metastore</value>
  <description>the URL of the MySQL database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mypassword</value>
</property>
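
If these properties are unset, Hive normally falls back to its default embedded Derby metastore, which does not contain tables created elsewhere and could produce exactly the "Table not found" error above. As a quick sanity check, here is a minimal sketch in Scala that reports which of the four properties are missing or empty in a hive-site.xml. The object and method names (MetastoreConfigCheck, parseProperties, missingMetastoreProps) are hypothetical helpers written for this illustration; they are not part of Shark or Hive. Only the JDK's built-in XML parser is used.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical helper for illustration only; not part of Shark or Hive.
object MetastoreConfigCheck {
  // The four JDO connection properties listed in the hive-site.xml fragment.
  val requiredProps: Seq[String] = Seq(
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword"
  )

  // Parse Hadoop-style configuration XML into a name -> value map.
  def parseProperties(xml: String): Map[String, String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    val props = doc.getElementsByTagName("property")
    (0 until props.getLength).flatMap { i =>
      val elem = props.item(i).asInstanceOf[org.w3c.dom.Element]
      val names = elem.getElementsByTagName("name")
      val values = elem.getElementsByTagName("value")
      // Skip <property> entries that lack a <name> or a <value> child.
      if (names.getLength > 0 && values.getLength > 0)
        Some((names.item(0).getTextContent.trim, values.item(0).getTextContent.trim))
      else
        None
    }.toMap
  }

  // Return the required properties that are absent or have an empty value.
  def missingMetastoreProps(xml: String): Seq[String] = {
    val props = parseProperties(xml)
    requiredProps.filter(name => props.get(name).forall(_.isEmpty))
  }
}
```

To use it, read the file into a string, for example with scala.io.Source.fromFile(path).mkString, where path points at the hive-site.xml that Shark actually loads (its location depends on your installation), and pass the string to MetastoreConfigCheck.missingMetastoreProps. An empty result only means the properties are present; it does not verify that the database is reachable or that the credentials are correct.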

You can get more details here: configuring hive metastore (http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_18_4.html)
