Dedicated faceted search engine for dealing with dynamic taxonomies - helps just with performance or also flexibility?

Question

I've been thinking for a while about modeling a typical e-commerce site with an eBay-like taxonomy, where attributes depend on the particular product category.

My first attempt was choosing between EAV and Table Per Class DB inheritance modeling. I chose the latter because of performance, but it meant creating a dedicated table for each specific product category (each leaf in the category tree), with category-specific attributes (like resolution for TVs) modeled as separate columns.

While performant, this setup is not flexible if you need to add attributes to existing categories or add new categories. Each such change requires the following:

  • Alter/create table
  • New form for filtering within such a category by its specific attributes
  • New code for generating DB queries for searching and filtering
  • Some new viewmodels/DTOs and views for presenting products from the new categories

To cope with that complexity, I think some kind of meta representation of those attributes is needed (even outside of the application), in XML or even an Excel file, so that on each change all the code mentioned above could be auto-generated (SQL/ORM queries, application code, templates). That would help with development, but testing and an extra deployment would still be needed.
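To make the idea concrete, here is a minimal sketch of generating the per-category tables from such a meta representation. The `CATEGORY_ATTRIBUTES` dictionary, the `product_<category>` table naming, and the column types are all assumptions made up for illustration; a real generator would also have to emit the forms, queries, and view code listed above.

```python
# Hypothetical attribute dictionary: leaf category -> list of (column, SQL type).
# In practice this could be loaded from an XML file or a spreadsheet export.
CATEGORY_ATTRIBUTES = {
    "tv": [("resolution", "TEXT"), ("screen_inches", "REAL")],
    "laptop": [("ram_gb", "INTEGER"), ("cpu", "TEXT")],
}


def create_table_sql(category, attributes):
    """Render a CREATE TABLE statement for one leaf category."""
    columns = ["product_id INTEGER PRIMARY KEY"]
    columns += [f"{name} {sql_type}" for name, sql_type in attributes]
    return f"CREATE TABLE product_{category} ({', '.join(columns)});"


# Emit one statement per category defined in the dictionary.
for category, attributes in CATEGORY_ATTRIBUTES.items():
    print(create_table_sql(category, attributes))
```

Adding an attribute then means editing the dictionary and regenerating, which is exactly why the testing and deployment step does not go away.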

At that point I learned that eBay doesn't really use a relational DB for search, and that their taxonomy is so flexible that they can add new leaf categories quite quickly. Also, their categories probably aren't nodes of a hierarchical tree modeled in a relational DB, but just search attributes (facets).

After a quick look at the most promising dedicated faceted search setup (a separate Solr instance), I'm not sure whether it would help me stay flexible to taxonomy changes. Usually Solr just mirrors a relational DB somehow, so category-specific attributes would still have to be modeled in the DB as DBMS metadata, and so e.g. dynamically generating UI forms for filtering by attributes would be hard unless:

1) I keep the data in the RDBMS in EAV fashion and overcome its performance problems by using Solr for search (but there would still be problems with EAV messiness, no data integrity enforcement, etc.)

2) I keep just the attribute dictionary (i.e. just the names and types) in the RDBMS and store the specific attribute values in Solr, using it as a kind of non-relational data store in addition to a search facility. I'm not convinced by this solution either (even if it's possible), since the application would be coupled too tightly to Solr (i.e. the product editing admin CRUD would interact with Solr directly).
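For reference, the EAV layout from option 1 can be sketched with an in-memory SQLite database. The table and column names here are invented for the example, not taken from any existing schema. It shows both concerns at once: filtering needs one self-join per attribute, and nothing stops a non-numeric value from landing in a numeric attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE product_attribute (
        product_id INTEGER REFERENCES product(id),
        attribute  TEXT,
        value      TEXT   -- every value is stored as text: no per-attribute typing
    );
""")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(1, "TV A", "tv"), (2, "TV B", "tv")])
conn.executemany("INSERT INTO product_attribute VALUES (?, ?, ?)", [
    (1, "resolution", "1920x1080"), (1, "screen_inches", "42"),
    (2, "resolution", "3840x2160"), (2, "screen_inches", "big"),  # bad value is accepted
])


def find_products(conn, filters):
    """Return ids of products matching every attribute=value pair (one self-join each)."""
    sql = "SELECT p.id FROM product p"
    params = []
    for i, (attribute, value) in enumerate(filters.items()):
        sql += (f" JOIN product_attribute a{i} ON a{i}.product_id = p.id"
                f" AND a{i}.attribute = ? AND a{i}.value = ?")
        params += [attribute, value]
    return [row[0] for row in conn.execute(sql + " ORDER BY p.id", params)]


print(find_products(conn, {"resolution": "1920x1080", "screen_inches": "42"}))
```

New attributes need no schema change here, but every extra filter adds a join, which is the performance problem that option 1 would hand over to the search engine.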

What are your thoughts? Do you think code generation is inevitable for any kind of such (performant) taxonomy flexibility? How would you handle that? Maybe a separate data dictionary in EAV fashion in the DB, just for code generation purposes? I guess I could also use something like MongoDB, but the UI code generation (at runtime or not) would still need some kind of metadata.
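As a rough illustration of the runtime variant, the filter form can be described as plain data derived from the attribute metadata, so a single generic template renders any category. Everything named here (`ATTRIBUTE_DICTIONARY`, `WIDGET_FOR_TYPE`, the attribute types) is a hypothetical shape for that metadata, not an existing API.

```python
# Hypothetical attribute dictionary, as it might be loaded from whatever store holds it.
ATTRIBUTE_DICTIONARY = {
    "tv": [
        {"name": "resolution", "type": "enum", "choices": ["1920x1080", "3840x2160"]},
        {"name": "screen_inches", "type": "number"},
    ],
}

# One widget rule per attribute *type*, not per category.
WIDGET_FOR_TYPE = {"enum": "select", "number": "range", "text": "textbox"}


def build_filter_form(category):
    """Describe the filter form for a category as plain data a template can render."""
    fields = []
    for attr in ATTRIBUTE_DICTIONARY.get(category, []):
        field = {"name": attr["name"], "widget": WIDGET_FOR_TYPE[attr["type"]]}
        if "choices" in attr:
            field["choices"] = attr["choices"]
        fields.append(field)
    return fields


for field in build_filter_form("tv"):
    print(field)
```

With this shape, a new category or attribute is a data change only; new code is needed only when a new attribute type (and therefore a new widget) appears.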

There are a lot of questions here, but I didn't want to break this up into smaller questions, since I'm interested in a general design approach for dealing with a bigger class of such problems.

Answer

I don't claim to have a definitive answer to all of this (it's a rather open-ended question which you should try to break into smaller parts, and it depends on your actual requirements; in fact I'm tempted to vote to close it), but I will comment on a few things:

  1. I would forget about modelling this in an RDBMS. Faceted search just doesn't work in a relational schema.
  2. IMO this is not the right place for code generation. You should design your code so it doesn't change with data changes (I'm not talking about schema changes).
  3. Storing metadata / attributes in an Excel spreadsheet seems like a very bad idea. I'd build a UI to edit this, which would be stored in Solr / MongoDB / CouchDB / whatever you choose to manage this.
  4. Solr does not "just mirror relational DB". In fact, Solr is completely independent of relational databases. One of the most common cases is dumping data from an RDBMS to Solr (denormalizing data in the process), but Solr is flexible enough to work without any relational data source.
  5. Hierarchical faceting in Solr is still an open research issue. Currently there are two separate approaches being researched (SOLR-64, SOLR-792).
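To illustrate points 2 and 4, here is a small sketch of faceting over denormalized documents whose attributes vary per product. It is plain in-memory Python, not Solr, and the documents and field names are invented. The category handling expands each path into all of its ancestor prefixes, which is just one simple way to get hierarchical counts and is not claimed to match either of the approaches tracked in the tickets above.

```python
from collections import Counter

# Hypothetical denormalized documents: one flat record per product,
# carrying only the attributes that apply to it.
DOCS = [
    {"id": 1, "category_path": "Electronics/TV", "resolution": "1920x1080", "brand": "Acme"},
    {"id": 2, "category_path": "Electronics/TV", "resolution": "3840x2160", "brand": "Acme"},
    {"id": 3, "category_path": "Electronics/Laptops", "ram_gb": 16, "brand": "Bolt"},
]


def path_prefixes(path):
    """Expand 'Electronics/TV' into ['Electronics', 'Electronics/TV']."""
    parts = path.split("/")
    return ["/".join(parts[:i + 1]) for i in range(len(parts))]


def facet_counts(docs, field):
    """Count the values of one field over the documents that have it."""
    counts = Counter()
    for doc in docs:
        if field not in doc:
            continue  # attribute does not apply to this product
        if field == "category_path":
            counts.update(path_prefixes(doc[field]))  # count every ancestor level
        else:
            counts[doc[field]] += 1
    return dict(counts)


def filter_docs(docs, **filters):
    """Keep the documents whose fields equal every given filter value."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]


print(facet_counts(DOCS, "category_path"))
print(facet_counts(filter_docs(DOCS, category_path="Electronics/TV"), "resolution"))
```

None of these functions mentions a specific category or attribute, so adding a new leaf category with new attributes only means indexing documents that carry new fields.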
