Importing an Avro schema in Scala


Question

I am writing a simple Twitter program in which I read tweets using Kafka and want to use Avro for serialization. So far I have only set up the Twitter configuration in Scala, and now I want to read tweets using this configuration.

How do I import the following Avro schema, defined in the file tweets.avsc, into my program?

{
    "namespace": "tweetavro",
    "type": "record",
    "name": "Tweet",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "text", "type": "string"}
    ]
}

I followed some examples on the web that use something like import tweetavro.Tweet to import the schema in Scala, so that it can be used like this:

// Entry point: open the Twitter stream and forward each matching tweet to Kafka.
def main(args: Array[String]): Unit = {
  val twitterStream = TwitterStream.getStream
  twitterStream.addListener(new OnTweetPosted(s => sendToKafka(toTweet(s))))
  twitterStream.filter(filterUsOnly)
}

// Convert a Twitter status into the Avro-generated Tweet record.
private def toTweet(s: Status): Tweet = {
  new Tweet(s.getUser.getName, s.getText)
}

// Encode the tweet as Avro binary and send it to the Kafka topic.
private def sendToKafka(t: Tweet): Unit = {
  println(toJson(t.getSchema).apply(t))
  val tweetEnc = toBinary[Tweet].apply(t)
  val msg = new KeyedMessage[String, Array[Byte]](KafkaTopic, tweetEnc)
  kafkaProducer.send(msg)
}

I am following the same approach and using the following plugins in pom.xml:

<!-- AVRO MAVEN PLUGIN -->
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.7.7</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
        <outputDirectory>${project.basedir}/src/main/scala/</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>


<!-- MAVEN COMPILER PLUGIN -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <source>1.7</source>
    <target>1.7</target>
  </configuration>
</plugin>

After doing all this, I still cannot import tweetavro.Tweet.

Can anyone please help?

Thanks!

Recommended answer

You should first compile the schema into a class. I'm not sure whether there is a production-ready Avro library for Scala, but you can generate a Java class and use it from Scala:

java -jar /path/to/avro-tools-1.7.7.jar compile schema tweet.avsc .

Adjust this command to your needs (jar path, schema file name, output directory) and the tool will generate the tweetavro.Tweet class. You can then place it in your project and use it in the way you described.
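If generating a class is not convenient, a different technique is to parse the schema at runtime and work with Avro's generic records instead of a generated Tweet class. The following is only a minimal sketch under that assumption: the object name TweetAvroSketch and the helpers toBinary and fromBinary are hypothetical names chosen for illustration, the schema is inlined as a string rather than read from tweets.avsc, and the Avro library is assumed to be on the classpath.

```scala
import java.io.ByteArrayOutputStream

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}

// Hypothetical helper object; not part of the question or of Avro itself.
object TweetAvroSketch {

  // Same schema as tweets.avsc, inlined here so the sketch is self-contained.
  private val schemaJson: String =
    """{
      |  "namespace": "tweetavro",
      |  "type": "record",
      |  "name": "Tweet",
      |  "fields": [
      |    {"name": "name", "type": "string"},
      |    {"name": "text", "type": "string"}
      |  ]
      |}""".stripMargin

  // Parse the schema once at startup instead of generating a class from it.
  val schema: Schema = new Schema.Parser().parse(schemaJson)

  // Build a generic record and encode it with Avro's binary encoding.
  def toBinary(name: String, text: String): Array[Byte] = {
    val record: GenericRecord = new GenericData.Record(schema)
    record.put("name", name)
    record.put("text", text)

    val out = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    new GenericDatumWriter[GenericRecord](schema).write(record, encoder)
    encoder.flush()
    out.toByteArray
  }

  // Decode bytes produced by toBinary back into a generic record.
  def fromBinary(bytes: Array[Byte]): GenericRecord = {
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    new GenericDatumReader[GenericRecord](schema).read(null, decoder)
  }

  def main(args: Array[String]): Unit = {
    val bytes = toBinary("alice", "hello avro")
    val decoded = fromBinary(bytes)
    println(s"Encoded ${bytes.length} bytes; decoded name = ${decoded.get("name")}")
  }
}
```

The byte array returned by toBinary could be used as the message payload in place of tweetEnc from the question. The trade-off is that fields are accessed by name at runtime, so a mistyped field name is not caught by the compiler the way it would be with a generated class.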

More information is available in the Apache Avro documentation.

Update: FYI, it seems there is an Avro library for Scala, but I have never used it.
