Read/Write files on HDFS using Python
Question
I am new to Python. I want to read a file from HDFS, which I have managed to do.
After reading the file, I perform some string operations, and I want to write the modified contents to an output file.
Since open() didn't work for me, I read the file using subprocess, which took a lot of time:
from subprocess import Popen, PIPE
cat = Popen(["hadoop", "fs", "-cat", "/user/hdfs/test-python/input/test_replace"], stdout=PIPE)
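The Popen call above only starts the process; the output still has to be collected from its stdout. A minimal sketch of one way to do that, assuming the file is UTF-8 text and the hadoop CLI is on the PATH. The helper name hdfs_cat and its cat_cmd parameter are hypothetical, introduced here only for illustration:

```python
import subprocess


def hdfs_cat(path, cat_cmd=("hadoop", "fs", "-cat")):
    """Run `hadoop fs -cat <path>` and return the file contents as text.

    cat_cmd is a hypothetical parameter so the command prefix can be swapped
    out (for example in tests); the default matches the command in the question.
    """
    result = subprocess.run(
        list(cat_cmd) + [path],
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the command fails
    )
    # Assumption: the file is UTF-8 encoded text.
    return result.stdout.decode("utf-8")
```

With this helper, the read would look like text = hdfs_cat("/user/hdfs/test-python/input/test_replace"). Each call still launches a JVM, which is probably why this approach feels slow.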
The question now is how to write the modified contents to the output file.
Any help is much appreciated.
Recommended answer
You can use a library for reading and writing to HDFS, like https://github.com/mtth/hdfs
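As a rough sketch of how that library might be used for the read, modify, write cycle described in the question: the package installs with pip install hdfs and talks to the cluster over WebHDFS. The NameNode URL, user name, output path, and replacement strings below are placeholders, and the function name replace_in_hdfs_file is hypothetical; adjust all of them to your setup.

```python
def replace_in_hdfs_file(client, src, dst, old, new):
    """Read src from HDFS, replace old with new, and write the result to dst.

    client is expected to be an hdfs client object such as hdfs.InsecureClient.
    """
    # Assumption: the file is UTF-8 text and small enough to fit in memory.
    with client.read(src, encoding="utf-8") as reader:
        text = reader.read()

    modified = text.replace(old, new)

    # overwrite=True replaces dst if it already exists.
    with client.write(dst, encoding="utf-8", overwrite=True) as writer:
        writer.write(modified)


def main():
    # Imported here so the helper above can be used without the package installed.
    from hdfs import InsecureClient

    # Placeholder WebHDFS address and user; 9870 is the usual NameNode HTTP
    # port on Hadoop 3, older clusters often use 50070.
    client = InsecureClient("http://namenode:9870", user="hdfs")
    replace_in_hdfs_file(
        client,
        "/user/hdfs/test-python/input/test_replace",
        "/user/hdfs/test-python/output/test_replace",  # hypothetical output path
        "old text",  # hypothetical search string
        "new text",  # hypothetical replacement
    )
```

Call main() from a machine that can reach the cluster's WebHDFS endpoint. Because no hadoop process is launched per file, this is likely to be faster than the subprocess approach.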