Create a Custom InputFormat of ColumnFamilyInputFormat for Cassandra


Question

I am working on a project using Cassandra 1.2 and Hadoop 1.2.

I have created a normal Cassandra mapper and reducer, but I want to create my own input format class that reads the records from Cassandra and extracts the desired column's value by splitting the value and indexing into the result. So I plan to create a custom InputFormat class, but I am confused about how to do it: which classes should I extend and implement, and how can I fetch the row key, column names, column values, and so on?

My Mapper class is as follows:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.util.SortedMap;

    import org.apache.cassandra.db.IColumn;
    import org.apache.cassandra.utils.ByteBufferUtil;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Note: MyJDBC, getDateDiffe, d1, d2, condition1, condition2, myhits and like
    // are not defined in the snippet as posted; they come from the asker's own code.
    public class MyMapper extends
            Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, Text> {
        private Text word = new Text();
        MyJDBC db = new MyJDBC();

        @Override
        public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns,
                Context context) throws IOException, InterruptedException {

            // Decode the row key once and reuse it below.
            String rowKey = ByteBufferUtil.string(key);
            long std_id = Long.parseLong(rowKey);
            long newSavePoint = 0;
            // sb was used without a declaration; assumed here to be a per-row buffer.
            StringBuilder sb = new StringBuilder();
            if (columns.values().isEmpty()) {
                System.out.println("EMPTY ITERATOR");
                sb.append("column_N/A" + ":" + "N/A" + " , ");
            } else {
                for (IColumn cell : columns.values()) {
                    String name = ByteBufferUtil.string(cell.name());
                    String value;
                    // Columns whose name contains "int" are decoded as integers.
                    if (name.contains("int")) {
                        value = String.valueOf(ByteBufferUtil.toInt(cell.value()));
                    } else {
                        value = ByteBufferUtil.string(cell.value());
                    }
                    String[] data = value.split(",");
                    // if (data[0].equalsIgnoreCase("login")) {
                    Long[] dif = getDateDiffe(d1, d2);

                    // Logic I want to perform inside my custom input class rather
                    // than here; I just want a simple mapper class.
                    if (condition1 && condition2) {
                        myhits++;
                        sb.append(":\t " + data[0] + "  " + data[2] + "  " + data[1] /* + " " + data[3] */ + "\n");
                        newSavePoint = d2;
                    }
                }
                sb.append("~" + like + "~" + newSavePoint + "~");
                word.set(sb.toString().replace("\t", ""));
            }

            db.setInterval(std_id, newSavePoint);
            db.setHits(std_id, like + "");
            context.write(new Text(rowKey), word);
        }
    }

I want to reduce the logic in my Mapper class and perform the same calculations in my custom input class instead.

Please help; I am hoping for a positive response from the Stack Overflow community.

Recommended answer

You can accomplish the intended task by moving the Mapper logic into your custom input class, as you have already indicated.

I found this nice post, which explains a problem similar to yours. I think it might solve your problem.
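One way to picture "moving the Mapper logic into the input class" is a record reader that wraps another reader and does the split-and-index step before the mapper ever sees the row. The sketch below uses only the Java standard library so it can run on its own: `RowReader`, `FieldExtractingReader`, and `InMemoryReader` are hypothetical names invented for illustration, and column values are simplified to plain strings. In a real job, the wrapper would presumably extend Hadoop's `org.apache.hadoop.mapreduce.RecordReader`, delegate to the reader produced by `ColumnFamilyInputFormat`, and be returned from the `createRecordReader` method of a custom `InputFormat`.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for Hadoop's RecordReader: advance, then read key and value.
interface RowReader<K, V> {
    boolean nextKeyValue();
    K getCurrentKey();
    V getCurrentValue();
}

// Wraps another reader and replaces every column value with one comma-separated field.
final class FieldExtractingReader implements RowReader<String, SortedMap<String, String>> {
    private final RowReader<String, SortedMap<String, String>> delegate;
    private final int fieldIndex;
    private SortedMap<String, String> current;

    FieldExtractingReader(RowReader<String, SortedMap<String, String>> delegate, int fieldIndex) {
        this.delegate = delegate;
        this.fieldIndex = fieldIndex;
    }

    @Override
    public boolean nextKeyValue() {
        if (!delegate.nextKeyValue()) {
            current = null;
            return false;
        }
        SortedMap<String, String> extracted = new TreeMap<>();
        for (Map.Entry<String, String> column : delegate.getCurrentValue().entrySet()) {
            String[] fields = column.getValue().split(",");
            // Columns that do not have the requested field are skipped.
            if (fieldIndex < fields.length) {
                extracted.put(column.getKey(), fields[fieldIndex].trim());
            }
        }
        current = extracted;
        return true;
    }

    @Override
    public String getCurrentKey() {
        return delegate.getCurrentKey();
    }

    @Override
    public SortedMap<String, String> getCurrentValue() {
        return current;
    }
}

// In-memory reader used only to exercise the wrapper; it plays the role of the Cassandra reader.
final class InMemoryReader implements RowReader<String, SortedMap<String, String>> {
    private final Iterator<Map.Entry<String, SortedMap<String, String>>> rows;
    private Map.Entry<String, SortedMap<String, String>> current;

    InMemoryReader(Map<String, SortedMap<String, String>> data) {
        this.rows = data.entrySet().iterator();
    }

    @Override
    public boolean nextKeyValue() {
        if (!rows.hasNext()) {
            return false;
        }
        current = rows.next();
        return true;
    }

    @Override
    public String getCurrentKey() {
        return current.getKey();
    }

    @Override
    public SortedMap<String, String> getCurrentValue() {
        return current.getValue();
    }
}

final class FieldExtractingReaderDemo {
    public static void main(String[] args) {
        SortedMap<String, String> columns = new TreeMap<>();
        columns.put("event1", "login,2013-01-01,web");
        columns.put("event2", "logout");
        Map<String, SortedMap<String, String>> data = new LinkedHashMap<>();
        data.put("42", columns);

        RowReader<String, SortedMap<String, String>> reader =
                new FieldExtractingReader(new InMemoryReader(data), 1);
        while (reader.nextKeyValue()) {
            // prints: 42 -> {event1=2013-01-01}
            System.out.println(reader.getCurrentKey() + " -> " + reader.getCurrentValue());
        }
    }
}
```

With this shape, the mapper only receives the already extracted fields, so the parsing, filtering, and indexing code lives in one place next to the reader rather than inside `map`.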
