Using a Filesystem (Not a Database!) for Schemaless Data - Best Practices


Problem Description


After reading over my other question, Using a Relational Database for Schema-Less Data, I began to wonder if a filesystem is more appropriate than a relational database for storing and querying schemaless data.

Rather than just building a file system on top of MySQL, why not just save the data directly to the filesystem? Indexing needs to be figured out, but modern filesystems are very stable, have great features like replication, snapshot and backup facilities, and are flexible at storing schema-less data.

However, I can't find any examples of someone using a filesystem instead of a database.

Where can I find more resources on how to implement a schemaless (or "document-oriented") database as a layer on top of a filesystem? Is anyone using a modern filesystem as a schemaless database?

Solution

Yes, a filesystem can be treated as a special case of a NoSQL-like database system. It has some limitations that should be considered in any design decision:

pros:

  • simple, intuitive
  • takes advantage of years of tuning and caching algorithms
  • easy backup, potentially easy clustering
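
To make the "simple, intuitive" point concrete, here is a minimal sketch of a document store layered on a directory, with one JSON file per document. Everything in it (the `FileDocStore` class, its method names, the JSON-per-file layout) is a hypothetical illustration and not something the original answer prescribes. Writes go through a temporary file followed by `os.replace`, so a reader should never observe a half-written document.

```python
import json
import os
import tempfile
from pathlib import Path


class FileDocStore:
    """Hypothetical minimal document store: one JSON file per document."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        # Only letters, digits, '-' and '_' are accepted, so a key
        # cannot escape the root directory.
        if not key.replace("-", "").replace("_", "").isalnum():
            raise ValueError(f"unsupported key: {key!r}")
        return self.root / f"{key}.json"

    def put(self, key, doc):
        # Write to a temporary file in the same directory, then rename
        # it over the target in a single step.
        target = self._path(key)
        fd, tmp = tempfile.mkstemp(dir=self.root, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(doc, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, target)
        except BaseException:
            os.unlink(tmp)
            raise

    def get(self, key):
        with open(self._path(key), encoding="utf-8") as f:
            return json.load(f)

    def find(self, field, value):
        # There is no index: every query reads every document.
        hits = []
        for path in sorted(self.root.glob("*.json")):
            with open(path, encoding="utf-8") as f:
                doc = json.load(f)
            if isinstance(doc, dict) and doc.get(field) == value:
                hits.append(path.stem)
        return hits
```

The `find` method shows the part the question flags as unsolved: without a separate index, any query on document contents is a full scan of the directory.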

things to think about:

  • richness of metadata - what types of data does it store, how does it let you query them, can you have hierarchical or multivalued attributes

  • speed of querying metadata - not all filesystems are particularly well optimized for querying anything other than size and dates

  • inability to join queries (though that's pretty much common to NoSQL)

  • inefficient storage usage (unless the filesystem performs block suballocation, you'll typically consume 4-16 KB per item stored, regardless of its size)

  • may not have the kind of caching algorithm you want for its directory structure
  • tends to be less tunable, etc.
  • backup solutions may have trouble depending on how you store things (too deep a hierarchy, too many items per node, etc.), which might negate an obvious advantage of such a structure
  • locking on a LOCAL filesystem works pretty well, of course, if you call the right routines, but not necessarily on a network-based filesystem (those problems have been solved in various ways, but it's certainly a design issue)
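
The storage-usage point in the list above is easy to quantify. The sketch below assumes a filesystem that allocates data in whole blocks with no suballocation and no inline storage of tiny files; the function names are hypothetical, and the 4096-byte default simply matches the low end of the 4-16K range mentioned in the answer. `os.statvfs` is available on POSIX systems only.

```python
import os


def allocated_size(payload_bytes, block_size=4096):
    """Bytes consumed on disk if space is handed out in whole blocks."""
    if payload_bytes <= 0:
        return 0  # an empty file needs no data blocks (its inode still exists)
    blocks = -(-payload_bytes // block_size)  # ceiling division
    return blocks * block_size


def overhead_ratio(payload_bytes, block_size=4096):
    """Allocated bytes divided by the bytes actually stored."""
    return allocated_size(payload_bytes, block_size) / payload_bytes


def filesystem_block_size(path="."):
    # Fundamental block size of the filesystem holding `path` (POSIX only).
    return os.statvfs(path).f_frsize
```

Under these assumptions a 200-byte document occupies one full 4096-byte block, a ratio of about 20 to 1, so a million such documents would take roughly 4.1 GB of disk for 200 MB of actual data.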
