No FileSystem for scheme: sftp


Problem Description

I am trying to use sftp in Hadoop with distcp, like below:

hadoop distcp -D fs.sftp.credfile=/home/bigsql/cred.prop sftp://<<ip address>>:22/export/home/nz/samplefile hdfs:///user/bigsql/distcp

But I am getting the following error:

15/11/23 13:29:06 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[sftp://<<source ip>>:22/export/home/nz/samplefile], targetPath=hdfs:/user/bigsql/distcp, targetPathExists=true, preserveRawXattrs=false}
15/11/23 13:29:09 INFO impl.TimelineClientImpl: Timeline service address: http://bigdata.ibm.com:8188/ws/v1/timeline/
15/11/23 13:29:09 INFO client.RMProxy: Connecting to ResourceManager at bigdata.ibm.com/<<target ip>>:8050
15/11/23 13:29:10 ERROR tools.DistCp: Exception encountered
java.io.IOException: No FileSystem for scheme: sftp
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
        at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:76)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:84)
        at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:353)
        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:160)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:401)

Can anyone suggest what the cause of the problem might be?

Recommended Answer

The exception occurs because Hadoop cannot find a file system implementation for the scheme sftp.

The exception is thrown in FileSystem.java: the framework looks up the value of the configuration parameter fs.sftp.impl, and when no value is found it throws this exception.

As far as I know, Hadoop does not support the sftp file system by default. The JIRA ticket "Add SFTP FileSystem" (https://issues.apache.org/jira/browse/HADOOP-5732) indicates that SFTP support only ships from Hadoop version 2.8.0.

To fix this, you need to do two things:

  1. Add a jar containing an sftp file system implementation to your Hadoop deployment.
  2. Set the configuration parameter fs.sftp.impl to the fully qualified class name of the sftp implementation.
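Step 2 can be done once in core-site.xml so every job picks it up. A minimal sketch, assuming the implementation class is org.apache.hadoop.fs.sftp.SFTPFileSystem and that its jar is already on the Hadoop classpath (both depend on which sftp implementation you installed):

```xml
<!-- core-site.xml: register a FileSystem implementation for the sftp:// scheme.
     The class name below is an assumption; use the class provided by
     whichever sftp implementation jar you deployed in step 1. -->
<property>
  <name>fs.sftp.impl</name>
  <value>org.apache.hadoop.fs.sftp.SFTPFileSystem</value>
</property>
```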

I came across this git repository, which contains an sftp implementation for Hadoop: https://github.com/wnagele/hadoop-filesystem-sftp. To use it, set the property fs.sftp.impl to org.apache.hadoop.fs.sftp.SFTPFileSystem.
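Instead of editing core-site.xml, the property can also be passed per-invocation with -D, and the implementation jar supplied with the generic -libjars option. A sketch of the original command with these additions; the jar path is an assumption and must point at wherever you placed the built jar:

```shell
# Sketch only: /path/to/hadoop-filesystem-sftp.jar is a placeholder for
# the jar built from the sftp implementation; adjust to your deployment.
hadoop distcp \
  -libjars /path/to/hadoop-filesystem-sftp.jar \
  -D fs.sftp.impl=org.apache.hadoop.fs.sftp.SFTPFileSystem \
  -D fs.sftp.credfile=/home/bigsql/cred.prop \
  sftp://<<ip address>>:22/export/home/nz/samplefile \
  hdfs:///user/bigsql/distcp
```

Note that generic options such as -libjars and -D must come before the source and target paths.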
