How to read data files generated by Flume from Twitter

Views: 112 | Posted: 2020/11/8 23:52:52 | Tags: hadoop, twitter, flume

Question

I have generated a few Twitter data log files on HDFS using Flume. What is the actual format of these log files? I was expecting data in JSON format, but it looks like this. Could someone help me read this data, or tell me what is wrong with the way I have done this?

Recommended answer

Download the file hive-serdes-1.0-SNAPSHOT.jar from this link:
http://files.cloudera.com/samples/hive-serdes-1.0-SNAPSHOT.jar

Then put this file in your $HIVE_HOME/lib and add the JAR in the Hive shell:

hive> ADD JAR file:///home/hadoop/work/hive-0.10.0/lib/hive-serdes-1.0-SNAPSHOT.jar;

Create the table in Hive:

hive> CREATE TABLE tweets (
        id BIGINT,
        created_at STRING,
        source STRING,
        favorited BOOLEAN,
        retweeted_status STRUCT<
          text:STRING,
          user:STRUCT<screen_name:STRING,name:STRING>,
          retweet_count:INT>,
        entities STRUCT<
          urls:ARRAY<STRUCT<expanded_url:STRING>>,
          user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
          hashtags:ARRAY<STRUCT<text:STRING>>>,
        text STRING,
        user STRUCT<
          screen_name:STRING,
          name:STRING,
          friends_count:INT,
          followers_count:INT,
          statuses_count:INT,
          verified:BOOLEAN,
          utc_offset:INT,
          time_zone:STRING>,
        in_reply_to_screen_name STRING
      )
      ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe';
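
To make the nested column types easier to picture, here is a minimal Python sketch of how one JSON tweet record might line up with the table definition: a STRUCT corresponds to a nested object and an ARRAY<STRUCT<...>> to a list of objects. The sample record and the helper names (hashtag_texts, screen_name) are invented for illustration; only the field names come from the CREATE TABLE statement.

```python
import json

# Hypothetical sample record: the field names follow the CREATE TABLE
# statement, the values are invented for illustration only.
SAMPLE_LINE = json.dumps({
    "id": 1,
    "created_at": "Sun Nov 08 23:52:52 +0000 2020",
    "text": "Reading Flume output with Hive #hadoop #flume",
    "favorited": False,
    "user": {"screen_name": "example_user", "name": "Example User"},
    "entities": {
        "urls": [],
        "user_mentions": [],
        "hashtags": [{"text": "hadoop"}, {"text": "flume"}],
    },
    "in_reply_to_screen_name": None,
})


def hashtag_texts(tweet):
    """Return the text of every entry in entities.hashtags, or [] if absent."""
    entities = tweet.get("entities") or {}
    return [tag["text"] for tag in entities.get("hashtags") or []]


def screen_name(tweet):
    """Return user.screen_name, or None if the user object is missing."""
    user = tweet.get("user") or {}
    return user.get("screen_name")


tweet = json.loads(SAMPLE_LINE)
print(screen_name(tweet), hashtag_texts(tweet))
```

In Hive the same paths would be written with dot notation on the struct columns, for example user.screen_name.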

Load the data into the table from HDFS:

hive> load data inpath '/home/hadoop/work/flumedata' into table tweets;

Now analyze your Twitter data from this table:

hive> select id,text,user from tweets;
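
As a rough way to sanity-check a file outside Hive, the same projection can be sketched in Python. This assumes the Flume output is plain text with one JSON object per line, which may not hold for your sink settings; the function name select_id_text_user and the sample lines are invented for illustration.

```python
import json


def select_id_text_user(lines):
    """Yield (id, text, user) for each non-empty line of JSON.

    Assumption: every non-empty line holds exactly one JSON tweet object.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        yield tweet.get("id"), tweet.get("text"), tweet.get("user")


# Invented sample input standing in for the lines of a Flume data file.
sample_lines = [
    '{"id": 1, "text": "first tweet", "user": {"screen_name": "alice"}}',
    "",
    '{"id": 2, "text": "second tweet", "user": {"screen_name": "bob"}}',
]

rows = list(select_id_text_user(sample_lines))
for row in rows:
    print(row)
```

If json.loads fails on the first line of a real file, the file is probably not line-delimited JSON text, and the sketch does not apply as written.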

You are done. The data you see here is deserialized; you can now serialize it from the Hive table as needed.
