Java - Sort Strings like Windows Explorer

Question

I am trying to use code suggested by Sander Pham in another question. I need my Java ArrayList of string names to be sorted the way Windows Explorer sorts them. His code works for everything except one case. I would have commented on that question, but I do not have enough reputation points to comment. He suggested implementing a custom Comparator class and using it to compare the string names. Here is the code of that class:

class IntuitiveStringComparator implements Comparator<String>
{
    private String str1, str2;
    private int pos1, pos2, len1, len2;

    public int compare(String s1, String s2)
    {
        str1 = s1;
        str2 = s2;
        len1 = str1.length();
        len2 = str2.length();
        pos1 = pos2 = 0;

        int result = 0;
        while (result == 0 && pos1 < len1 && pos2 < len2)
        {
            char ch1 = str1.charAt(pos1);
            char ch2 = str2.charAt(pos2);

            if (Character.isDigit(ch1))
            {
                result = Character.isDigit(ch2) ? compareNumbers() : -1;
            }
            else if (Character.isLetter(ch1))
            {
                result = Character.isLetter(ch2) ? compareOther(true) : 1;
            }
            else
            {
                result = Character.isDigit(ch2) ? 1
                : Character.isLetter(ch2) ? -1
                : compareOther(false);
            }

            pos1++;
            pos2++;
        }

        return result == 0 ? len1 - len2 : result;
    }

    private int compareNumbers()
    {
        // Find out where the digit sequence ends, save its length for
        // later use, then skip past any leading zeroes.
        int end1 = pos1 + 1;
        while (end1 < len1 && Character.isDigit(str1.charAt(end1)))
        {
            end1++;
        }
        int fullLen1 = end1 - pos1;
        while (pos1 < end1 && str1.charAt(pos1) == '0')
        {
            pos1++;
        }

        // Do the same for the second digit sequence.
        int end2 = pos2 + 1;
        while (end2 < len2 && Character.isDigit(str2.charAt(end2)))
        {
            end2++;
        }
        int fullLen2 = end2 - pos2;
        while (pos2 < end2 && str2.charAt(pos2) == '0')
        {
            pos2++;
        }

        // If the remaining subsequences have different lengths,
        // they can't be numerically equal.
        int delta = (end1 - pos1) - (end2 - pos2);
        if (delta != 0)
        {
            return delta;
        }

        // We're looking at two equal-length digit runs; a sequential
        // character comparison will yield correct results.
        while (pos1 < end1 && pos2 < end2)
        {
            delta = str1.charAt(pos1++) - str2.charAt(pos2++);
            if (delta != 0)
            {
                return delta;
            }
        }

        pos1--;
        pos2--;

        // They're numerically equal, but they may have different
        // numbers of leading zeroes. A final length check will tell.
        return fullLen2 - fullLen1;
    }

    private int compareOther(boolean isLetters)
    {
        char ch1 = str1.charAt(pos1);
        char ch2 = str2.charAt(pos2);

        if (ch1 == ch2)
        {
            return 0;
        }

        if (isLetters)
        {
            ch1 = Character.toUpperCase(ch1);
            ch2 = Character.toUpperCase(ch2);
            if (ch1 != ch2)
            {
                ch1 = Character.toLowerCase(ch1);
                ch2 = Character.toLowerCase(ch2);
            }
        }

        return ch1 - ch2;
    }   
}

This works great except when the string name has no number after it. A name without a number is put at the end of the list, which is wrong; it should be at the beginning.

For example, the expected order is:

filename.jpg
filename2.jpg
filename03.jpg
filename3.jpg

Currently it sorts them as:

filename2.jpg
filename03.jpg
filename3.jpg
filename.jpg

What do I need to change in the code to correct this behavior?

Thanks

Recommended answer

This is my second attempt at answering this. I used http://www.interact-sw.co.uk/iangblog/2007/12/13/natural-sorting as a starting point. Unfortunately, I think I found problems there as well, but I believe my code addresses them correctly.

Info: Windows Explorer uses the API function StrCmpLogicalW() to do its sorting. This ordering is called natural sort order.

So here is my understanding of the Windows Explorer sort algorithm:


    • Filenames are compared part by part. So far I have identified the following parts: numbers, '.', spaces, and the rest.
    • Each number within the filename is a candidate for numeric comparison.
    • Numbers are compared as numbers, but if they are equal, the longer string comes first. This happens with leading zeros.
      • Example: filename00.txt sorts before filename0.txt
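
The third rule above can be sketched in isolation. This is only an illustration, not the answer's actual code: the class `NumericPartDemo` and the helper `compareNumericParts` are hypothetical names, and the sketch assumes both arguments contain only digits and fit into a `long`.

```java
class NumericPartDemo {

    // Hypothetical helper: compares two digit-only strings by numeric value,
    // falling back to "longer string first" when the values are equal.
    static int compareNumericParts(String a, String b) {
        int byValue = Long.compare(Long.parseLong(a), Long.parseLong(b));
        if (byValue != 0) {
            return byValue;
        }
        // Equal values: the part with more leading zeros (the longer one) sorts first.
        return Integer.compare(b.length(), a.length());
    }

    public static void main(String[] args) {
        System.out.println(compareNumericParts("2", "03") < 0);  // true: 2 < 3
        System.out.println(compareNumericParts("00", "0") < 0);  // true: same value, "00" is longer
        System.out.println(compareNumericParts("03", "3") < 0);  // true: same value, "03" is longer
    }
}
```

A negative result means the first argument sorts first, so "00" precedes "0" and "03" precedes "3".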

This list is based partly on trial and error. I increased the number of test filenames to cover more of the pitfalls mentioned in the comments, and checked the result against Windows Explorer.

So here is the output:

      filename
      filename 00
      filename 0
      filename 01
      filename.jpg
      filename.txt
      filename00.jpg
      filename00a.jpg
      filename00a.txt
      filename0
      filename0.jpg
      filename0a.txt
      filename0b.jpg
      filename0b1.jpg
      filename0b02.jpg
      filename0c.jpg
      filename01.0hjh45-test.txt
      filename01.0hjh46
      filename01.1hjh45.txt
      filename01.hjh45.txt
      Filename01.jpg
      Filename1.jpg
      filename2.hjh45.txt
      filename2.jpg
      filename03.jpg
      filename3.jpg
      

The new comparator, WindowsExplorerComparator, splits each filename into the parts mentioned above and compares two filenames part by part. Note that the new comparator takes Strings as its input, so to sort File objects one has to create an adapter Comparator like this:

      new Comparator<File>() {
          private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();
      
          @Override
          public int compare(File o1, File o2) {
              return NATURAL_SORT.compare(o1.getName(), o2.getName());
          }
      }
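
On Java 8 or later, the same adapter can probably be written more compactly with `Comparator.comparing`. The sketch below is an assumption-laden illustration: `FileNameAdapterDemo` and `byName` are hypothetical names, and `String.CASE_INSENSITIVE_ORDER` is only a stand-in so that the example runs on its own.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class FileNameAdapterDemo {

    // Hypothetical helper: wraps any String comparator so that it
    // compares File objects by their names.
    static Comparator<File> byName(Comparator<String> nameOrder) {
        return Comparator.comparing(File::getName, nameOrder);
    }

    public static void main(String[] args) {
        List<File> files = new ArrayList<>(Arrays.asList(
                new File("b.txt"), new File("A.txt"), new File("c.txt")));
        // Stand-in comparator, used here only to keep the sketch self-contained.
        files.sort(byName(String.CASE_INSENSITIVE_ORDER));
        System.out.println(files); // [A.txt, b.txt, c.txt]
    }
}
```

In real use, one would pass `new WindowsExplorerComparator()` instead of the stand-in.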
      

Here is the new comparator's source code and its test:

      import java.io.File;
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.Collections;
      import java.util.Comparator;
      import java.util.Iterator;
      import java.util.List;
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;
      
      public class WindowsSorter {
      
          public static void main(String args[]) {
              //huge test data set ;)
              List<File> filenames = Arrays.asList(new File[]{new File("Filename01.jpg"),
                  new File("filename"), new File("filename0"), new File("filename 0"),
                  new File("Filename1.jpg"), new File("filename.jpg"), new File("filename2.jpg"), 
                  new File("filename03.jpg"), new File("filename3.jpg"), new File("filename00.jpg"),
                  new File("filename0.jpg"), new File("filename0b.jpg"), new File("filename0b1.jpg"),
                  new File("filename0b02.jpg"), new File("filename0c.jpg"), new File("filename00a.jpg"),
                  new File("filename.txt"), new File("filename00a.txt"), new File("filename0a.txt"),
                  new File("filename01.0hjh45-test.txt"), new File("filename01.0hjh46"),
                  new File("filename2.hjh45.txt"), new File("filename01.1hjh45.txt"),
                  new File("filename01.hjh45.txt"), new File("filename 01"),
                  new File("filename 00")});
      
              //adaptor for comparing files
              Collections.sort(filenames, new Comparator<File>() {
                  private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();
      
                  @Override
                  public int compare(File o1, File o2) {
                      return NATURAL_SORT.compare(o1.getName(), o2.getName());
                  }
              });
      
              for (File f : filenames) {
                  System.out.println(f);
              }
          }
      
          public static class WindowsExplorerComparator implements Comparator<String> {
      
              private static final Pattern splitPattern = Pattern.compile("\\d+|\\.|\\s");
      
              @Override
              public int compare(String str1, String str2) {
                  Iterator<String> i1 = splitStringPreserveDelimiter(str1).iterator();
                  Iterator<String> i2 = splitStringPreserveDelimiter(str2).iterator();
                  while (true) {
                      //Both iterators are exhausted: everything compared so far is equal.
                      if (!i1.hasNext() && !i2.hasNext()) {
                          return 0;
                      }
                      //first has no more parts -> comes first
                      if (!i1.hasNext() && i2.hasNext()) {
                          return -1;
                      }
                      //first has more parts than i2 -> comes after
                      if (i1.hasNext() && !i2.hasNext()) {
                          return 1;
                      }
      
                      String data1 = i1.next();
                      String data2 = i2.next();
                      int result;
                      try {
                          //If both parts are numbers, compare them numerically
                          result = Long.compare(Long.valueOf(data1), Long.valueOf(data2));
                          //If the numbers are equal, the longer part (more leading zeros) comes first
                          if (result == 0) {
                              result = -Integer.compare(data1.length(), data2.length());
                          }
                      } catch (NumberFormatException ex) {
                          //compare text case insensitive
                          result = data1.compareToIgnoreCase(data2);
                      }
      
                      if (result != 0) {
                          return result;
                      }
                  }
              }
      
              private List<String> splitStringPreserveDelimiter(String str) {
                  Matcher matcher = splitPattern.matcher(str);
                  List<String> list = new ArrayList<String>();
                  int pos = 0;
                  while (matcher.find()) {
                      list.add(str.substring(pos, matcher.start()));
                      list.add(matcher.group());
                      pos = matcher.end();
                  }
                  list.add(str.substring(pos));
                  return list;
              }
          }
      }
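
To see why a name without a number now sorts first, it may help to look at the parts the splitter produces. The following standalone sketch reuses the same regular expression as the code above; the class name `SplitDemo` and the method name `split` are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitDemo {

    // Same pattern as above: a run of digits, a dot, or a single whitespace character.
    private static final Pattern PARTS = Pattern.compile("\\d+|\\.|\\s");

    static List<String> split(String s) {
        List<String> parts = new ArrayList<>();
        Matcher m = PARTS.matcher(s);
        int pos = 0;
        while (m.find()) {
            parts.add(s.substring(pos, m.start())); // text before the delimiter (may be empty)
            parts.add(m.group());                   // the delimiter itself
            pos = m.end();
        }
        parts.add(s.substring(pos));                // trailing text (may be empty)
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("filename.jpg"));   // [filename, ., jpg]
        System.out.println(split("filename03.jpg")); // [filename, 03, , ., jpg]
    }
}
```

The second parts here are "." and "03". Because "." is not a number, the comparator falls back to a text comparison, and '.' sorts before any digit, so filename.jpg ends up ahead of the numbered names.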
      
