Import data from HDFS to HBase (cdh3u2)
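Problem description
I have installed Hadoop and HBase (cdh3u2). In Hadoop I have a file at the path /home/file.txt. It has data like this:
one,1
two,2
three,3
I want to import this file into HBase. The first field should be parsed as a String and the second field as an integer, and then the record should be pushed into HBase. Please help me do this.
Thanks in advance.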
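I like using Apache Pig for ingest into HBase because it is simple, straightforward, and flexible.
Here is a Pig script that would do the job for you, after you have created the table and the column family. To create the table and the column family, run:
$ hbase shell
> create 'mydata', 'mycf'
Move the file to HDFS:
$ hadoop fs -put /home/file.txt /user/surendhar/file.txt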
Then, write the Pig script to store with HBaseStorage (you may have to look up how to set up and run Pig):
A = LOAD 'file.txt' USING PigStorage(',') as (strdata:chararray, intdata:long);
STORE A INTO 'hbase://mydata'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
'mycf:intdata');
Note that in the above script, the key is going to be strdata. If you want to create your own key from something, use a FOREACH statement to generate the key. HBaseStorage assumes that the first field in the previous relation (A::strdata in this case) is the key.
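Some other options would be:
Push the data up with the hbase shell using some sort of script (e.g., sed, perl, python) that transforms the lines of CSV into shell put commands. Again, this should only be done if the number of records is small.
$ cat /home/file.txt | transform.pl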
put 'mydata', 'one', 'mycf:intdata', '1'
put 'mydata', 'two', 'mycf:intdata', '2'
put 'mydata', 'three', 'mycf:intdata', '3'
$ cat /home/file.txt | transform.pl | hbase shell
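The answer does not show transform.pl itself. As a rough illustration of what such a script might do, here is a minimal Python sketch (a hypothetical transform.py, not the author's script). It assumes two comma-separated fields per line, an integer in the second field, and no single quotes in the data; the function names and the default table and column arguments are chosen for this example only.

```python
"""Hypothetical transform.py: turn 'key,value' CSV lines into HBase shell put commands."""
import sys


def to_put(line, table="mydata", column="mycf:intdata"):
    """Build one HBase shell put command from a 'key,value' line."""
    key, value = line.strip().split(",", 1)
    int(value)  # raises ValueError if the second field is not an integer
    return "put '%s', '%s', '%s', '%s'" % (table, key, column, value)


def transform(lines):
    """Yield a put command for every non-blank input line."""
    for line in lines:
        if line.strip():
            yield to_put(line)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python transform.py /home/file.txt | hbase shell
    with open(sys.argv[1]) as handle:
        for command in transform(handle):
            print(command)
```

Unlike transform.pl in the pipeline above, this sketch takes the input file as an argument instead of reading standard input, so it would be run as: python transform.py /home/file.txt | hbase shell. The same caveat applies: it is only reasonable for a small number of records.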