Writing to HDFS: File is overwritten


Problem description



I am writing to the Hadoop file system. But every time I append something, it overwrites the data instead of adding it to the existing data/file. The code which does this is provided below. This code is called again and again for different data. Is opening a new SequenceFile.Writer every time the problem?

Each time I am getting the path as new Path("someDir");

  // Opens a brand-new SequenceFile.Writer on every call, which re-creates
  // the file at 'path' rather than appending to it.
  public void writeToHDFS(Path path, long uniqueId, String data) throws IOException {
      FileSystem fs = path.getFileSystem(conf);
      SequenceFile.Writer inputWriter = new SequenceFile.Writer(fs, conf,
          path, LongWritable.class, MyWritable.class);
      inputWriter.append(new LongWritable(uniqueId++), new MyWritable(data));
      inputWriter.close();
  }

Solution

There is currently no way to append to an existing SequenceFile through the API. When you make a new SequenceFile.Writer object, it will not append to an existing file at that Path, but will instead overwrite it. See my earlier question.

As Thomas points out, if you keep the same SequenceFile.Writer object, you will be able to append to the file until you call close().
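A minimal sketch of that approach, assuming the caller can hold one long-lived writer for the duration of the run (the HdfsAppender class name is hypothetical, and MyWritable stands in for the asker's custom Writable):

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.SequenceFile;

  public class HdfsAppender {
      private final SequenceFile.Writer writer;

      public HdfsAppender(Configuration conf, Path path) throws IOException {
          FileSystem fs = path.getFileSystem(conf);
          // Open the writer once; it stays usable across many append() calls.
          this.writer = new SequenceFile.Writer(fs, conf,
              path, LongWritable.class, MyWritable.class);
      }

      // Each call adds one more record to the same open file;
      // the file is never re-created.
      public void writeToHDFS(long uniqueId, String data) throws IOException {
          writer.append(new LongWritable(uniqueId), new MyWritable(data));
      }

      // Call once, after the last record has been written.
      public void close() throws IOException {
          writer.close();
      }
  }

Note that once close() has been called, a later run still cannot reopen the same file for append through this API; it would have to write its records to a new Path.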

