Yet another dynamic data model question


Question


I have a project that requires user-defined attributes for a particular object at runtime (let's say a Person object in this example). The project will have many different users (1000+), each defining their own unique attributes for their own sets of 'Person' objects.

(E.g. user #1 will have a set of defined attributes, which will apply to all Person objects 'owned' by this user. Multiply this by 1000 users, and that's the minimum number of users the app will work with.) These attributes will be used to query the Person objects and return results.

I think these are the possible approaches I can use. I will be using C# (and any version of .NET 3.5 or 4), and have free rein over what to use for a datastore. (I have MySQL and MSSQL available, although I have the freedom to use any software, as long as it fits the bill.)

Have I missed anything, or made any incorrect assumptions in my assessment?

Out of these choices - what solution would you go for?

  1. Hybrid EAV object model. (Define the database using normal relational model, and have a 'property bag' table for the Person table).

    Downsides: many joins per query. Poor performance. Can hit a limit on the number of joins / tables used in a query.

    I've knocked up a quick sample that has a Subsonic 2.x-esque interface:

    Select().From().Where  ... etc
    

    This generates the correct joins, then filters and pivots the returned data in C#, to return a DataTable configured with the correctly typed data set.

    I have yet to load test this solution. It's based on the EAV advice in this Microsoft whitepaper: SQL Server 2008 RTM Documents Best Practices for Semantic Data Modeling for Performance and Scalability (http://download.microsoft.com/download/d/9/4/d948f981-926e-40fa-a026-5bfcf076d9b9/BPSemanticDBModeling.docx)

  2. Allow the user to dynamically create / alter the object's table at run-time. This solution is what I believe NHibernate does in the background when using dynamic properties, as discussed here:

    http://bartreyserhove.blogspot.com/2008/02/dynamic-domain-mode-using-nhibernate.html

    Downsides:

    As the system grows, the number of columns defined will get very large, and may hit the max number of columns. If there are 1000 users, each with 10 distinct attributes for their 'Person' objects, then we'd need a table holding 10k columns. Not scalable in this scenario.

    I guess I could allow a person attribute table per user, but if there are 1000 users to start, that's 1000 tables plus the other 10 odd in the app.

    I'm unsure if this would be scalable, but it doesn't seem so. Someone please correct me if I am incorrect!

  3. Use a NoSQL datastore, such as CouchDB / MongoDB

    From what I have read, these aren't yet proven in large-scale apps, are string-based, and are very early in their development. If I am incorrect in this assessment, can someone let me know?

    http://www.eflorenzano.com/blog/post/why-couchdb-sucks/

  4. Using XML column in the people table to store attributes

    Drawbacks - no indexing on querying, so every column would need to be retrieved and queried to return a resultset, resulting in poor query performance.

  5. Serializing an object graph to the database.

    Drawbacks - no indexing on querying, so every column would need to be retrieved and queried to return a resultset, resulting in poor query performance.

  6. C# bindings for Berkeley DB

    From what I read here: http://www.dinosaurtech.com/2009/berkeley-db-c-bindings/

    Berkeley Db has definitely proven to be useful, but as Robert pointed out – there is no easy interface. Your entire OO wrapper has to be hand coded, and all of your indices are hand maintained. It is much more difficult than SQL / linq-to-sql, but that’s the price you pay for ridiculous speed.

    Seems a large overhead - however if anyone can provide a link to a tutorial on how to maintain the indices in C# - it could be a goer.

  7. [EDIT - just added this one] SQL / RDF hybrid. Odd that I didn't think of this before. Similar to option 1, but instead of a "property bag" table, just XREF to an RDF store? Querying would then involve 2 steps: query the RDF store for people matching the correct attributes, to return the person object(s), and use the IDs of these person objects in the SQL query to return the relational data. Extra overhead, but could be a goer.
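To make the "filter + pivot in C#" step of option 1 concrete, here is a minimal in-memory sketch. It is not the Subsonic-based sample mentioned above: the `AttributeRow` type, the `EavPivot.Pivot` helper and the sample attribute names are all hypothetical, and the row list stands in for whatever the joined property-bag query returns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

// Hypothetical shape of one row returned by the joined EAV query.
public class AttributeRow
{
    public int PersonId;
    public string Name;   // attribute name, e.g. "Age"
    public string Value;  // raw value as stored in the property-bag table
}

public static class EavPivot
{
    // Pivots name/value rows into one DataTable row per person,
    // converting each value to the type registered for its attribute.
    public static DataTable Pivot(IEnumerable<AttributeRow> rows,
                                  IDictionary<string, Type> attributeTypes)
    {
        var table = new DataTable("Person");
        table.Columns.Add("PersonId", typeof(int));
        foreach (var pair in attributeTypes)
            table.Columns.Add(pair.Key, pair.Value);

        foreach (var person in rows.GroupBy(r => r.PersonId))
        {
            DataRow row = table.NewRow();
            row["PersonId"] = person.Key;
            foreach (AttributeRow attr in person)
            {
                Type type;
                // Attributes without a registered type are ignored.
                if (attributeTypes.TryGetValue(attr.Name, out type))
                    row[attr.Name] = Convert.ChangeType(
                        attr.Value, type, CultureInfo.InvariantCulture);
            }
            table.Rows.Add(row);
        }
        return table;
    }
}

public static class Program
{
    public static void Main()
    {
        var types = new Dictionary<string, Type>
        {
            { "Age", typeof(int) },
            { "City", typeof(string) }
        };
        var rows = new List<AttributeRow>
        {
            new AttributeRow { PersonId = 1, Name = "Age", Value = "34" },
            new AttributeRow { PersonId = 1, Name = "City", Value = "Leeds" },
            new AttributeRow { PersonId = 2, Name = "Age", Value = "51" }
        };

        DataTable people = EavPivot.Pivot(rows, types);
        // Filter on a typed column after pivoting.
        DataRow[] over40 = people.Select("Age > 40");
        Console.WriteLine(over40.Length);
    }
}
```

Attributes a person has not set are left as `DBNull` in the pivoted row, which is one reason the typed `DataTable` is convenient here; filtering in the database before pivoting would normally be preferable to filtering afterwards, as this sketch does.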

I'd really appreciate any input here!

Answer

The ESENT database engine on Windows is used heavily for this kind of semi-structured data. One example is Microsoft Exchange which, like your application, has thousands of users where each user can define their own set of properties (MAPI named properties). Exchange uses a slightly modified version of ESENT.

ESENT has a lot of features that support applications with large metadata requirements: each ESENT table can have about 32K columns defined; tables, indexes and columns can be added at runtime; sparse columns don't take up any record space when not set; and template tables can reduce the space used by the metadata itself. It is common for large applications to have thousands of tables/indexes.

In this case you can have one table per user and create the per-user columns in the table, creating indexes on any columns that you want to query. That would be similar to the way that some versions of Exchange store their data. The downside of this approach is that ESENT doesn't have a query engine so you will have to hand-craft your queries as MakeKey/Seek/MoveNext calls.
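To give a feel for what hand-crafting a query as seek-then-iterate calls involves, here is a minimal in-memory analogue. It does not use the ESENT or ManagedEsent API: `SortedIndex`, `SeekGreaterOrEqual` and `FindEqual` are hypothetical names, and the sorted list merely stands in for an index on one per-user column.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for a secondary index: (attribute value, record id)
// pairs kept sorted by value. Not an ESENT type.
public class SortedIndex
{
    private readonly List<KeyValuePair<string, int>> entries =
        new List<KeyValuePair<string, int>>();

    public void Add(string key, int recordId)
    {
        entries.Add(new KeyValuePair<string, int>(key, recordId));
        // Re-sorting on every insert keeps the sketch short; a real index
        // would insert in place.
        entries.Sort((a, b) => string.CompareOrdinal(a.Key, b.Key));
    }

    // "Seek": binary search for the first entry whose key is >= the search key.
    private int SeekGreaterOrEqual(string key)
    {
        int lo = 0, hi = entries.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(entries[mid].Key, key) < 0) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // "MoveNext" loop: walk forward from the seek position while keys match.
    public List<int> FindEqual(string key)
    {
        var ids = new List<int>();
        for (int i = SeekGreaterOrEqual(key);
             i < entries.Count && entries[i].Key == key; i++)
        {
            ids.Add(entries[i].Value);
        }
        return ids;
    }
}

public static class Program
{
    public static void Main()
    {
        var cityIndex = new SortedIndex();
        cityIndex.Add("Leeds", 1);
        cityIndex.Add("York", 2);
        cityIndex.Add("Leeds", 3);

        // Equivalent of: WHERE City = 'Leeds'
        List<int> matches = cityIndex.FindEqual("Leeds");
        Console.WriteLine(matches.Count);
    }
}
```

Each predicate in a query becomes a loop like `FindEqual`, and combining predicates (AND/OR across attributes) means intersecting or merging these result lists by hand, which is the work a query engine would otherwise do.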

A managed wrapper for ESENT is here:

http://managedesent.codeplex.com/
