Generate Avro Schema File and Store in HDFS
Question
I'm using avro-tools to generate a schema file from an Avro file in HDFS and dump it to the Linux file system with this command:
hadoop jar /usr/bin/Avro/avro-tools-1.8.1.jar getschema /dw/hpm/ap_drg/ap_drg.avro > usr/bin/StageSchema/ap_drg.avsc
This works great and gets me the file I need. However, I would like the schema file to be in HDFS, not on the Linux file system. How can I change this command to accomplish that? Or is there another way I should be doing this?
Recommended answer
Played around for a bit and finally figured out something that worked:
hadoop jar /usr/bin/Avro/avro-tools-1.8.1.jar getschema /dw/hpm/ap_drg/ap_drg.avro | hadoop fs -put -f - /dw/schemas/hpm/ap_drg/ap_drg.avsc
This extracts the Avro schema from an Avro file on HDFS and writes it to an Avro schema file in HDFS. The lone - tells -put to read from standard input, and -f ensures that any existing schema file is overwritten.
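If this needs to be repeated for several files, the one-liner can be wrapped in a small function. The following is only a sketch: export_schema is a hypothetical helper name (not part of avro-tools or Hadoop), the jar path is the one from the command above, and the mkdir step assumes the target directory may not exist yet.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the getschema | put pipeline from the answer above.
# AVRO_TOOLS_JAR defaults to the jar path used in the answer; override as needed.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-/usr/bin/Avro/avro-tools-1.8.1.jar}"

# export_schema SRC_AVRO DEST_AVSC   (hypothetical helper name)
# The body runs in a subshell so that pipefail does not leak into the caller.
export_schema() (
  set -o pipefail            # fail if getschema fails, not only if -put fails
  src="$1"
  dest="$2"
  # Create the parent directory in HDFS; -p makes this harmless if it exists.
  hadoop fs -mkdir -p "$(dirname "$dest")" &&
    # "-" makes -put read from stdin; -f overwrites an existing file.
    hadoop jar "$AVRO_TOOLS_JAR" getschema "$src" | hadoop fs -put -f - "$dest"
)
```

It would be called as export_schema /dw/hpm/ap_drg/ap_drg.avro /dw/schemas/hpm/ap_drg/ap_drg.avsc, and the result can be inspected with hadoop fs -cat /dw/schemas/hpm/ap_drg/ap_drg.avsc.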