MapReduce program to implement a data structure in the Hadoop framework


Problem description


This is a data structure implementation in Hadoop. I want to implement indexing in Hadoop using MapReduce programming.

Part 1: store each word of a text file in a table together with its index number. [Able to complete]
Part 2: perform hashing on this newly created table. [Not able to complete]

I was able to complete the first part, but I am having difficulty with the second.

Suppose I have a text file containing 3 lines:

how is your job
how is your family
hi how are you

I want to store this text file using indexing. I have MapReduce code that returns the index value of every word, and I am able to store this index value in an index table (hash table). Output that contains the index value of every word:

how 0,
how 14,
is 3,
is 18,
job 12,
your 7,

Now, to store the words in the hash table, I want to apply a hash function to each word's index value: the modulus ('%') by the number of distinct elements in the file, let's say 4. If there is a collision at a location, go to the next location and store the word there.

  0%4=0 (store 'how' at hash index 0)
  14%4=2 (store 'how' at hash index 2)
  18%4=2 (store 'is' at hash index 3 because of collision)
  7%4=3 (store 'your' at index 4 because of collision)

Solution

You can create a Hashtable object and put the keys and values into it.

Hashtable<Integer, String> hashtable = new Hashtable<>();

How do you find the key? You already have the total count of distinct words and each word's index, so:

key = index % number of distinct words
value = word

Before inserting a record into the hashtable, check whether a collision occurs for that key. How can you check whether a collision occurs?

boolean collision = hashtable.containsKey(key);

If collision is true, check key+1, key+2, ... linearly. As soon as collision is false for one of these keys, insert that key and the value into the hashtable using the line below.

hashtable.put(key, value);
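The steps in this answer can be put together in a minimal sketch. The class name LinearProbeIndex and the helper insert are hypothetical names chosen for illustration, and the input is the four (index, word) pairs from the question's worked example. The sketch assumes non-negative index values and, like the answer, probes key+1, key+2, ... without wrapping around, so a slot number can be larger than the modulus minus one.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical helper class: illustrates the answer's approach, not a Hadoop API.
class LinearProbeIndex {

    // Stores word at (index % modulus); on collision tries key+1, key+2, ...
    // Returns the slot that was actually used. Assumes index >= 0.
    static int insert(Map<Integer, String> table, int index, String word, int modulus) {
        int key = index % modulus;
        while (table.containsKey(key)) {
            key++; // linear probing without wrap-around, as described in the answer
        }
        table.put(key, word);
        return key;
    }

    public static void main(String[] args) {
        Hashtable<Integer, String> table = new Hashtable<>();
        int modulus = 4; // number of distinct words assumed in the question
        int[] indexes = {0, 14, 18, 7};
        String[] words = {"how", "how", "is", "your"};
        for (int i = 0; i < indexes.length; i++) {
            int slot = insert(table, indexes[i], words[i], modulus);
            System.out.println(indexes[i] + " % " + modulus
                    + " -> slot " + slot + " (" + words[i] + ")");
        }
    }
}
```

With this input the slots come out as 0, 2, 3 and 4, which matches the placement described in the question. If the table must stay within exactly 4 slots, the probe would instead have to wrap around with (key + 1) % modulus, and the table could then hold at most 4 entries.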
