Avro - java.io.IOException: Not a data file


Problem description

I am using https://github.com/allegro/json-avro-converter to convert my JSON message into an Avro file. After calling the convertToAvro method I get a byte array, byte[] byteArrayJson. I then write it to disk with Apache Commons IO:

FileUtils.writeByteArrayToFile(new File("myFile.avro"), byteArrayJson);

The file is created. When I try to convert it back to JSON with the command below, I get the exception that follows it:

java -jar avro-tools-1.8.1.jar tojson myFile.avro > testCheck.json


Exception in thread "main" java.io.IOException: Not a data file.
    at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:105)
    at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:84)
    at org.apache.avro.tool.DataFileReadTool.run(DataFileReadTool.java:71)
    at org.apache.avro.tool.Main.run(Main.java:87)
    at org.apache.avro.tool.Main.main(Main.java:76)

I created a JUnit test that calls the convertToJson method (from the library linked above) and asserts on the resulting strings, and everything passes. With the jar, however, it does not work. Am I doing something wrong? I am using cmd rather than PowerShell, because I saw in an SO post that PowerShell can change the encoding. I think the problem is related to encoding, but I have no idea where to look. (I am on Windows.)

Recommended answer

The reason is that the Avro files produced in these two ways do not contain the same data, and this is expected behavior.

As a test, generate an Avro file with this command:

java -jar avro-tools-1.8.2.jar fromjson --schema-file avroschema.json testCheck.json > myFile2.avro

Now read this file and print its contents in Java. Notice that it does not contain ONLY the Avro record: at the very least it also contains the schema (see the String-converted data below). This means the data in an Avro file generated with avro-tools differs from the data produced by the converter.

bjavro.schemaœ{"type":"record","name":"Acme","fields":[{"name":"username","type":"string"}]}avro.c

The validation inside the tools API "fails" when you use the tojson command to read an Avro file that was generated by the converter.
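To see which kind of file you have before picking a command, you can check its first bytes. The following is a minimal sketch, not the actual avro-tools code: it relies on the Avro specification's rule that an object container file starts with the four bytes 'O', 'b', 'j', 1. The class name AvroMagicCheck and the method hasContainerHeader are made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of avro-tools or json-avro-converter.
class AvroMagicCheck {
    // Per the Avro specification, an object container file begins with 'O', 'b', 'j', 1.
    static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true when the data starts with the container-file magic bytes.
    static boolean hasContainerHeader(byte[] data) {
        if (data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AvroMagicCheck <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        if (hasContainerHeader(data)) {
            System.out.println("Container file header found: tojson should be able to read it.");
        } else {
            System.out.println("No container header: likely a bare record, try fragtojson with --schema-file.");
        }
    }
}
```

A file written from the converter's byte array would be expected to fall into the second branch, which matches the "Not a data file" error from tojson.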

When the file was generated by the converter, the correct avro-tools command for reading it back as JSON is fragtojson. Here we really are reading only a fragment (a single Avro record), so the schema has to be supplied separately:

java -jar avro-tools-1.8.2.jar fragtojson --schema-file avroschema.json myFile.avro > myFile21.json
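Why fragtojson needs --schema-file becomes clearer once you look at what a bare record holds. The sketch below hand-encodes a record for the Acme schema shown earlier (one string field, username) following the Avro specification's binary encoding: a string is a zigzag varint length followed by its UTF-8 bytes. It assumes the converter emits this standard encoding, which is what the answer above implies but which is not verified here. RawAvroRecordSketch and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of Avro's binary encoding; not the converter's code.
class RawAvroRecordSketch {
    // Avro writes int/long values as zigzag-encoded variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);      // zigzag: small magnitudes become small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                 // final byte, continuation bit clear
    }

    // A record is the concatenation of its fields; here there is one string field.
    static byte[] encodeUsernameRecord(String username) {
        byte[] utf8 = username.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, utf8.length);        // string length prefix
        out.write(utf8, 0, utf8.length);    // string payload
        return out.toByteArray();
    }
}
```

The resulting bytes carry no field names, no schema, and no header, so a reader cannot interpret them without being handed the schema.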

Another option is to avoid avro-tools altogether: build your own executable jar with the converter as a dependency, and use it to read the Avro records back as JSON.
