Run MapReduce Job from a web application


Question


With reference to similar questions: Running a Hadoop Job From another Java Program and Calling a mapreduce job from a simple java program

I too have a mapreduce job jar file on a remote Hadoop machine, and I'm creating a web application that, on a button click event, calls out to the jar file and executes the job. The web app runs on a separate machine.

I've tried the suggestions from both posts above but could not get them to work, even on the provided wordcount example; I still hit the error NoClassDefFoundError.

Are there any lines of code I'm missing?

Below is the code I have:

    import java.io.IOException;
    import java.security.PrivilegedExceptionAction;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.security.UserGroupInformation;

    public void buttonClick(ClickEvent event) {
        UserGroupInformation ugi;
        try {
            // Submit the job as hadoopUser, proxied on top of the local login user
            ugi = UserGroupInformation.createProxyUser("hadoopUser", UserGroupInformation.getLoginUser());
            ugi.doAs(new PrivilegedExceptionAction<Object>() {
                public Object run() throws Exception {
                    runHadoopJob();
                    return null;
                }
            });
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    private boolean runHadoopJob() {
        try {
            // Point the client at the remote HDFS namenode and job tracker
            Configuration conf = new Configuration();
            conf.set("fs.default.name", "hdfs://192.168.4.248:9000");
            conf.set("mapred.job.tracker", "192.168.4.248:9001");
            Job job = new Job(conf, "WordCount");
            job.setMapperClass(TokenizerMapper.class);
            job.setReducerClass(IntSumReducer.class);
            job.setJarByClass(TokenizerMapper.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path("/flume/events/160114/*"));
            // Delete any previous output directory so the job can run again
            Path out = new Path("output");
            FileSystem fs = FileSystem.get(conf);
            fs.delete(out, true);
            FileOutputFormat.setOutputPath(job, out);
            job.waitForCompletion(true);
            System.out.println("Job Finished");
        } catch (Exception e) {
            e.printStackTrace();
        }
        return true;
    }
    

    Caused by: java.lang.NoClassDefFoundError: org/codehaus/jackson/map/JsonMappingException
        at org.apache.hadoop.mapreduce.Job$1.run(Job.java:513)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapreduce.Job.connect(Job.java:511)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:499)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
        at com.example.hadoopclient.HDFSTable.runHadoopJob(HDFSTable.java:181)
        at com.example.hadoopclient.HDFSTable.access$0(HDFSTable.java:120)
        at com.example.hadoopclient.HDFSTable$SearchButtonClickListener.buttonClick(HDFSTable.java:116)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at com.vaadin.event.ListenerMethod.receiveEvent(ListenerMethod.java:510)
        ... 36 more

I added the following to my hadoop core-site.xml file, where hadoop is the usergroup my hadoopUser belongs to:

    <property>
        <name>hadoop.proxyuser.kohtianan.groups</name>
        <value>hadoop</value>
        <description></description>
    </property>
    <property>
        <name>hadoop.proxyuser.kohtianan.hosts</name>
        <value>*</value>
        <description></description>
    </property>
    

Solution

For a map-reduce program to run, you need the jackson-mapper-asl-*.jar and jackson-core-asl-*.jar files present on your map-reduce program's class-path. The actual jar file names will vary based on the hadoop distribution and version you are using.

These files are present under the $HADOOP_HOME/lib folder. There are two ways to solve this problem:

• Invoke the map-reduce program using the hadoop jar command. This ensures that all the required jar files are automatically included in your map-reduce program's class-path.

• If you wish to trigger the map-reduce job from your application, make sure you include these jar files (and any other required jar files) on your application's class-path, so that when you spawn the map-reduce program it picks them up automatically; a quick way to check this is sketched below.
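
As a quick sanity check, you can probe for the class named in the stack trace from inside the web application before submitting the job; if the probe fails, the Jackson jars are missing from the application class-path:

    // Minimal class-path probe for the class the NoClassDefFoundError names.
    try {
        Class.forName("org.codehaus.jackson.map.JsonMappingException");
        System.out.println("Jackson found on the class-path; Job.connect() should get past this error");
    } catch (ClassNotFoundException e) {
        System.err.println("jackson-mapper-asl-*.jar is not on the application class-path");
    }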

    org.apache.hadoop.ipc.RemoteException: User: kohtianan is not allowed to impersonate hadoopUser

This error indicates that the user kohtianan does not have access to the Hadoop DFS. What you can do is simply create a directory on HDFS (as the hdfs superuser) and change the owner of that directory to kohtianan. This should resolve your issue.
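
A minimal sketch of that fix, using the same FileSystem API as the code above (the /user/kohtianan path and the "hadoop" group are assumptions; the client must be authenticated as the HDFS superuser):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CreateUserHome {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.default.name", "hdfs://192.168.4.248:9000");
            FileSystem fs = FileSystem.get(conf);
            // Create kohtianan's home directory and hand it over;
            // the "hadoop" group matches the proxyuser setting in core-site.xml above.
            Path home = new Path("/user/kohtianan");
            fs.mkdirs(home);
            fs.setOwner(home, "kohtianan", "hadoop");
        }
    }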
