Curl download to HDFS
Question
I have the following code:
curl -o fileName.csv url | xargs hdfs dfs -moveFromLocal $1 /somePath/
When I run this, curl writes the response to fileName.csv and the file is then moved to HDFS. Is it possible to keep the curl output in memory, send it through the pipe, and write it straight to HDFS?
Something like this (but that works):
curl url | xargs hdfs dfs -put $1 /somePath
Recommended answer
The hdfs dfs -put command can accept file input from stdin, using the familiar idiom of specifying - to mean stdin:
> curl -sS https://www.google.com/robots.txt | hdfs dfs -put - /robots.txt
> hdfs dfs -ls /robots.txt
-rw-r--r-- 3 cnauroth supergroup 6880 2017-07-06 09:07 /robots.txt
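In a script, it may be worth making the pipeline report a failed download, since by default a pipeline returns only the exit status of its last command (here, hdfs). The following is a minimal sketch, assuming bash; stream_to_hdfs is a hypothetical helper name, not part of Hadoop or curl:

```shell
#!/usr/bin/env bash
# Sketch only: stream_to_hdfs is a hypothetical helper, not a Hadoop command.
# The function body runs in a subshell ( ... ), so pipefail and the
# variables below do not leak into the calling shell.
stream_to_hdfs() (
  set -o pipefail          # pipeline fails if curl fails, not only if hdfs fails
  url=$1
  dest=$2
  # -f: exit non-zero on HTTP errors; -sS: no progress bar, but still show errors
  # "-" tells hdfs dfs -put to read the file content from stdin
  curl -fsS "$url" | hdfs dfs -put - "$dest"
)
```

It could be called as stream_to_hdfs https://www.google.com/robots.txt /robots.txt, followed by a check of $?. A failed download may still leave an empty or partial file at the destination, so the caller would likely want to remove it in that case.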
Another option is to use shell process substitution, which lets you treat the stdout of curl (or really any command you choose) as if it were a file input to another command:
> hdfs dfs -put <(curl -sS https://www.google.com/robots.txt) /robots.txt
> hdfs dfs -ls /robots.txt
-rw-r--r-- 3 cnauroth supergroup 6880 2017-07-05 15:07 /robots.txt
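To see why this works, it can help to look at what the receiving command is actually given. The sketch below assumes bash (process substitution is not part of POSIX sh); show_arg is a hypothetical helper used only for illustration:

```shell
#!/usr/bin/env bash
# Process substitution is available in bash, zsh and ksh, not in plain POSIX sh.

# show_arg is a hypothetical helper that prints its first argument.
show_arg() { printf '%s\n' "$1"; }

# The shell replaces <(...) with a path to a pipe, so the command
# receives an ordinary-looking filename as its argument.
show_arg <(printf 'hello\n')   # prints a path such as /dev/fd/63 (exact value varies)

# Reading that path yields the stdout of the inner command.
cat <(printf 'hello\n')        # prints: hello
```

Because the path refers to a pipe rather than a regular file, it can be read only once and is not seekable, which is fine for a sequential copy like the one above.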