Printing all keys of files and folders recursively doesn't work as expected

Question

We have stored several files and folders in Amazon S3.

We are using the following code to iterate over all the files and folders under a given root folder:

ObjectListing listing = s3.listObjects( bucketName, prefix );
List<S3ObjectSummary> summaries = listing.getObjectSummaries();
while (listing.isTruncated()) {
   listing = s3.listNextBatchOfObjects (listing);
   summaries.addAll (listing.getObjectSummaries());
}

Assume the root folder has 1000 files and 10 folders. One of the folders has 100 sub-folders, and each sub-folder has 500 files.

The above program works fine: it lists and traverses all the files.

The problem is that it does not print the keys of all the sub-folders.

The interesting thing is that it does print the first sub-folder.

Example

Root folder: Emp
Folders under the root folder: FolderA, FolderB, FolderC
Sub-folders under FolderA: 0, 1, 2, 3, 4, 5 ... 100
Each of the sub-folders 0, 1, 2, ... has 500 files

What could be the problem? Is there a limitation in AWS, should folder names not be numeric, or is there a logical issue?

When using the above code, FolderA/0/ is returned as a key, whereas FolderA/1 ... FolderA/10 are not.

Thanks.

Recommended answer

There is no such thing as folders or directories in Amazon S3. Amazon S3 is a flat key-data store. Folders and sub-folders are a human interpretation of the "/" character in object keys. S3 doesn't know or care about them.

You can "fake" the creation of an empty folder in S3 by creating a 0-byte object whose key ends with the "/" character.

When iterating over the list of objects, these 0-byte "folders" will be included.

However, you may also have objects such as "folder1/object1" where, in your mind, "folder1" is a sub-folder off the root. But in S3, there may be no object with the key "folder1/". In that case, "folder1/" will not appear on its own in your result list.
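The symptom described in the question can be reproduced without S3 at all. The sketch below assumes a hypothetical set of keys in which only `FolderA/0/` was ever created as a 0-byte marker object. The class and method names (`FolderMarkers`, `markerKeys`) are illustrative and not part of the AWS SDK.

```java
import java.util.ArrayList;
import java.util.List;

final class FolderMarkers {
    // Returns only the keys that are explicit "folder" marker objects,
    // i.e. keys that end with "/".
    static List<String> markerKeys(List<String> keys) {
        List<String> markers = new ArrayList<>();
        for (String key : keys) {
            if (key.endsWith("/")) {
                markers.add(key);
            }
        }
        return markers;
    }

    public static void main(String[] args) {
        // Hypothetical listing: only FolderA/0/ exists as its own object.
        List<String> keys = List.of(
            "FolderA/0/",
            "FolderA/0/file1.txt",
            "FolderA/1/file1.txt",
            "FolderA/2/file1.txt");
        System.out.println(markerKeys(keys)); // prints [FolderA/0/]
    }
}
```

FolderA/1/ and FolderA/2/ hold files, yet neither shows up as a key of its own, because no object with that exact key exists.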

If you need a list of all "sub-folders", it is not enough to look for objects whose keys end with the "/" character. You also need to examine every object key for a "/" character and infer the sub-folder from the key, because the 0-byte object for the folder itself may not exist.

For example:

  • folder1/object1
  • folder2/
  • folder2/object1

In this example, there's only one sub-folder object (folder2/), but you could say there are actually two sub-folders (folder1 and folder2).
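To make the inference concrete, here is a minimal sketch that derives every implied folder, at any depth, from a plain list of keys. It works on in-memory strings rather than a live bucket, and `InferredFolders` and `inferFolders` are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class InferredFolders {
    // Collects every "folder" implied by the keys, at any depth:
    // each prefix of a key that ends with "/" counts as a folder.
    static Set<String> inferFolders(List<String> keys) {
        Set<String> folders = new TreeSet<>(); // sorted, no duplicates
        for (String key : keys) {
            int slash = key.indexOf('/');
            while (slash >= 0) {
                folders.add(key.substring(0, slash + 1));
                slash = key.indexOf('/', slash + 1);
            }
        }
        return folders;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("folder1/object1", "folder2/", "folder2/object1");
        System.out.println(inferFolders(keys)); // prints [folder1/, folder2/]
    }
}
```

Using a set means a folder is reported once whether it comes from a marker object, from the keys beneath it, or from both.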

Java-like code to get the sub-folders:

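import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public class S3Folders {
  // Returns the names of the immediate sub-folders of currentFolder.
  // currentFolder is expected to be empty (bucket root) or to end with "/".
  static Set<String> getSubFolders(AmazonS3 s3, String bucketName, String currentFolder) {
    // Use the current folder as the S3 prefix
    String prefix = currentFolder;

    // Get all objects, following truncated (paged) listings
    ObjectListing listing = s3.listObjects(bucketName, prefix);
    List<S3ObjectSummary> summaries = new ArrayList<>(listing.getObjectSummaries());
    while (listing.isTruncated()) {
      listing = s3.listNextBatchOfObjects(listing);
      summaries.addAll(listing.getObjectSummaries());
    }

    // Split the list into files in the current folder and sub-folders.
    // A set removes duplicate sub-folder entries automatically.
    Set<String> subFolders = new TreeSet<>();
    List<String> files = new ArrayList<>();
    for (S3ObjectSummary summary : summaries) {
      // The key includes the prefix, so remove it
      String key = summary.getKey().substring(prefix.length());

      // Skip the 0-byte marker object of the current folder itself
      if (key.isEmpty()) {
        continue;
      }

      // If the key includes a / character, then
      // it's in a subfolder. Just save the subfolder part
      // of this object.
      // Otherwise, save the key in our list of files.
      int slashIndex = key.indexOf('/');
      if (slashIndex >= 0) {
        subFolders.add(key.substring(0, slashIndex));
      } else {
        files.add(key);
      }
    }

    return subFolders;
  }
}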
