Problem on Tag "Filesystem"

Problem description

For recreational reasons I wrote a PHP class that classifies files with tags instead of hierarchically. The tags are stored in the filename itself in the form +tag1+tag2+tagN+MD5.EXTENSION, and thus I'm stuck with the 255-character limit imposed by the FS/OS. Here is the class:

<?php

class TagFS
{
    public $FS = null;

    function __construct($FS)
    {
        if (is_dir($FS) === true)
        {
            $this->FS = $this->Path($FS);
        }
    }

    function Add($path, $tag)
    {
        if (is_dir($path) === true)
        {
            $files = array_slice(scandir($path), 2);

            foreach ($files as $file)
            {
                $this->Add($this->Path($path) . $file, $tag);
            }

            return true;
        }

        else if (is_file($path) === true)
        {
            $file = md5_file($path);

            if (is_file($this->FS . $file) === false)
            {
                if (copy($path, $this->FS . $file) === false)
                {
                    return false;
                }
            }

            return $this->Link($this->FS . $file, $this->FS . '+' . $this->Tag($tag) . '+' . $file . '.' . strtolower(pathinfo($path, PATHINFO_EXTENSION)));
        }

        return false;
    }

    function Get($tag)
    {
        return glob($this->FS . '*+' . str_replace('+', '{+,+*+}', $this->Tag($tag)) . '+*', GLOB_BRACE);
    }

    function Link($source, $destination)
    {
        if (is_file($source) === true)
        {
            if (function_exists('link') === true)
            {
                return link($source, $destination);
            }

            if (is_file($destination) === false)
            {
                exec('fsutil hardlink create "' . $destination . '" "' . $source . '"');

                if (is_file($destination) === true)
                {
                    return true;
                }
            }
        }

        return false;
    }

    function Path($path)
    {
        if (file_exists($path) === true)
        {
            $path = str_replace('\\', '/', realpath($path));

            if ((is_dir($path) === true) && ($path[strlen($path) - 1] != '/'))
            {
                $path .= '/';
            }

            return $path;
        }

        return false;
    }

    function Tag($string)
    {
        /*
        TODO:
        Remove (on Windows):            . \ / : * ? " < > |
        Remove (on *nix):               . /
        Remove (on TagFS):              + * { }
        Remove (on TagFS - Possibly!)   -
        Max Chars (in Windows)          255
        Max Char (in *nix)              255
        */

        // Lowercase first so that "Logo" and "logo" collapse into a single tag.
        $result = array_filter(array_unique(explode(' ', strtolower($string))));

        if (empty($result) === false)
        {
            if (natcasesort($result) === true)
            {
                return implode('+', $result);
            }
        }

        return false;
    }
}

?>

I believe this system works well for a couple of small tags, but my problem arises when the size of the whole filename exceeds 255 chars. What approach should I take in order to bypass the filename limit? I'm thinking of splitting the tags across several hard links to the same file, but the permutations may kill the system.

Are there any other ways to solve this problem?
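To make the limit concrete, here is a minimal Python sketch (Python because the class is meant to be ported to it) that builds a name in the same +tag1+tagN+MD5.EXTENSION form and checks it against 255. The function name `tagged_name` is hypothetical, plain `sorted` stands in for PHP's `natcasesort`, and measuring the length in UTF-8 bytes is an assumption that matches common Linux filesystems; other filesystems count differently.

```python
NAME_MAX = 255  # assumed per-name limit, measured here in UTF-8 bytes


def tagged_name(tags, md5_hex, extension):
    """Build '+tag1+tagN+MD5.ext' and report whether it fits in NAME_MAX."""
    # Lowercase, de-duplicate and sort the space-separated tags.
    unique = sorted({tag.lower() for tag in tags.split()})
    name = "+" + "+".join(unique) + "+" + md5_hex + "." + extension.lower()
    return name, len(name.encode("utf-8")) <= NAME_MAX


name, fits = tagged_name("geoaki logo", "5f5174c498ffbfd9ae49975ddfa2f6eb", "png")
print(name, fits)
```

The MD5 digest, the separators and the extension already take up part of the 255, so the space left for tags is smaller than the limit itself.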

EDIT - Some usage examples:

<?php

$images = new TagFS('S:');

$images->Add('P:/xampplite/htdocs/tag/geoaki.png', 'geoaki logo');
$images->Add('P:/xampplite/htdocs/tag/cloud.jpg', 'geoaki cloud tag');
$images->Add('P:/xampplite/htdocs/tag/cloud.jpg', 'nuvem azul branco');
$images->Add('P:/xampplite/htdocs/tag/xml-full.gif', 'geoaki auto vin api service xml');
$images->Add('P:/xampplite/htdocs/tag/dunp3d-1.jpg', 'dunp logo');
$images->Add('P:/xampplite/htdocs/tag/d-proposta-04c.jpg', 'dunp logo');

/*
[0] => S:/+api+auto+geoaki+service+vin+xml+29be189cbc98fcb36a44d77acad13e18.gif
[1] => S:/+azul+branco+nuvem+4151ae7900f33788d0bba5fc6c29bee3.jpg
[2] => S:/+cloud+geoaki+tag+4151ae7900f33788d0bba5fc6c29bee3.jpg
[3] => S:/+dunp+logo+0cedeb6f66cbfc3974c6b7ad86f4fbd3.jpg
[4] => S:/+dunp+logo+8b9fcb119246bb6dcac1906ef964d565.jpg
[5] => S:/+geoaki+logo+5f5174c498ffbfd9ae49975ddfa2f6eb.png
*/
echo '<pre>';
print_r($images->Get('*'));
echo '</pre>';

/*
[0] => S:/+azul+branco+nuvem+4151ae7900f33788d0bba5fc6c29bee3.jpg
*/
echo '<pre>';
print_r($images->Get('azul nuvem'));
echo '</pre>';

/*
[0] => S:/+dunp+logo+0cedeb6f66cbfc3974c6b7ad86f4fbd3.jpg
[1] => S:/+dunp+logo+8b9fcb119246bb6dcac1906ef964d565.jpg
[2] => S:/+geoaki+logo+5f5174c498ffbfd9ae49975ddfa2f6eb.png
*/
echo '<pre>';
print_r($images->Get('logo'));
echo '</pre>';

/*
[0] => S:/+dunp+logo+0cedeb6f66cbfc3974c6b7ad86f4fbd3.jpg
[1] => S:/+dunp+logo+8b9fcb119246bb6dcac1906ef964d565.jpg
*/
echo '<pre>';
print_r($images->Get('logo dunp'));
echo '</pre>';

/*
[0] => S:/+geoaki+logo+5f5174c498ffbfd9ae49975ddfa2f6eb.png
*/
echo '<pre>';
print_r($images->Get('geo* logo'));
echo '</pre>';

?>

EDIT: Due to the several suggestions to use a serverless database or some other type of lookup table (XML, flat file, key/value pairs, etc.), I want to clarify the following: although this code is written in PHP, the idea is to port it to Python and make a desktop application out of it - this has nothing to do with PHP (besides the example, of course). Furthermore, if I have to use some kind of lookup table I'll definitely go with SQLite 3, but what I'm looking for is a solution that doesn't involve any additional "technology" besides the filesystem (folders, files and hard links).

You may call me nuts, but I'm trying to accomplish two simple goals here: 1) keep the system "garbage" free (who likes Thumbs.db or .DS_Store, for example?) and 2) keep the files easily identifiable if for some reason the lookup table (in this case SQLite) gets busy, corrupted, lost or forgotten (in backups, for instance).

PS: This is supposed to run on Linux, Mac, and Windows (under NTFS).

Solution

If you can use hard/soft links, then you might look into giving each tag its own directory containing a link to every file that carries that tag. When you are given multiple tags, you can then compare the directories and keep the files found in all of them. The files themselves could be stored in a single folder, each with a unique name of course.
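A minimal Python sketch of this layout, under stated assumptions: the `files` and `tags` folder names and the functions `link_tags` and `find_tagged` are hypothetical, and `os.link` requires the store and the tag directories to live on the same volume.

```python
import os
import shutil


def link_tags(root, source, tags, name):
    """Copy `source` into root/files once, then hard-link it into root/tags/<tag>/."""
    store = os.path.join(root, "files")
    os.makedirs(store, exist_ok=True)
    stored = os.path.join(store, name)
    if not os.path.exists(stored):
        shutil.copyfile(source, stored)
    for tag in set(tags.lower().split()):
        tag_dir = os.path.join(root, "tags", tag)
        os.makedirs(tag_dir, exist_ok=True)
        link = os.path.join(tag_dir, name)
        if not os.path.exists(link):
            os.link(stored, link)  # hard link: same data, no second copy


def find_tagged(root, tags):
    """Return the sorted names of the files that carry every tag in `tags`."""
    result = None
    for tag in set(tags.lower().split()):
        tag_dir = os.path.join(root, "tags", tag)
        names = set(os.listdir(tag_dir)) if os.path.isdir(tag_dir) else set()
        result = names if result is None else result & names
    return sorted(result or [])
```

With this layout the 255 limit applies to each tag and to each file name separately rather than to their concatenation, and a multi-tag query is a set intersection over directory listings.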

I don't know how this would be different from having a meta file named after the tag that lists all the files carrying that tag.
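For comparison, the meta-file variant could look like the Python sketch below. It assumes one plain-text file per tag with one file name per line; the `.tag` suffix and the names `index_add`, `index_read` and `index_find` are hypothetical. Unlike the link-based layout, it adds bookkeeping files to the volume, which the question wants to avoid.

```python
import os


def index_read(root, tag):
    """Return the set of file names listed in the meta file for `tag`."""
    path = os.path.join(root, tag.lower() + ".tag")
    if not os.path.isfile(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def index_add(root, name, tags):
    """Append `name` to the meta file of every tag that does not list it yet."""
    for tag in set(tags.lower().split()):
        if name not in index_read(root, tag):
            path = os.path.join(root, tag + ".tag")
            with open(path, "a", encoding="utf-8") as handle:
                handle.write(name + "\n")


def index_find(root, tags):
    """Return the sorted names of the files listed under every tag in `tags`."""
    sets = [index_read(root, tag) for tag in set(tags.lower().split())]
    return sorted(set.intersection(*sets)) if sets else []
```

Both variants answer a multi-tag query the same way, by intersecting one set of names per tag; they differ only in where those sets are stored.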
