How to map a multi-level object to indexedDB for best efficiency

Question

My question concerns laying out a data structure within indexedDB. I started out building a small web page feature that grew into a web learning tool and is now closer to a stand-alone progressive web application. Using localStorage has worked well, but since the tool has grown, the 5MB limit may become a problem for some users, so there is a need to switch to indexedDB.

The application is for desktops only and allows the user to build a portfolio of modules and save the data to the hard drive as a JSON string. When the user opens (uploads) the file in the application, the string is parsed and the entire portfolio written to localStorage again but only one module is written to a run-time object at any one time. There isn't a need for a "genuine" database from the perspective of searching for data by different fields and indexing, but only a need for a greater amount of storage because it would be too confusing for the user if each module in a portfolio had to be a separate file.

Most of the data saved to localStorage is from a three-level object, and a key is made based on the object path to save and retrieve the data. For example, object.level_1[key_1].level_2[key_2].level_3[key_3].height = 10 is saved as localStorage.setItem( 'k1.k2.k3.h', 10).
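As an illustration of the key scheme described above, here is a minimal sketch, assuming a plain nested object. The helper name flattenToEntries and the sample data are hypothetical; the real application builds abbreviated keys such as 'k1.k2.k3.h' in its own way.

```javascript
// Hypothetical helper: walks a nested object and returns one
// [dottedKey, value] pair per leaf, mirroring the localStorage layout.
function flattenToEntries(obj, prefix = '') {
  const entries = [];
  for (const [name, value] of Object.entries(obj)) {
    const key = prefix ? `${prefix}.${name}` : name;
    if (value !== null && typeof value === 'object') {
      // Descend one level and extend the key path.
      entries.push(...flattenToEntries(value, key));
    } else {
      entries.push([key, value]);
    }
  }
  return entries;
}

// Sample data, made up for illustration.
const sample = { k1: { k2: { k3: { h: 10, w: 4 } } } };
const flat = flattenToEntries(sample);
// In the browser, each pair would then be saved with
// localStorage.setItem(key, String(value)).
```

Each leaf value becomes its own key/value pair, which is why the number of stored entries grows with the number of data points rather than the number of objects.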

My question is, when moving to indexedDB, which is more efficient: a single objectStore set up much like localStorage, or a separate objectStore for each of the three levels of the portfolio?

If a single objectStore can be viewed as being similar to a two-column table with one row (a key and a value) for each individual data point, the row count would be greater than the sum of the row counts for the three objectStores, where each row is a key and an object of multiple data points; but, to update an individual data point in one of the three objectStores, the database object has to be written to a temporary object, the data point updated, and then written back to the objectStore.

The question is, then, which is more efficient: searching through a single table of many rows for a single unique key pointing to one less-complex value, or searching through one of three tables with fewer rows but having to perform what I think is equivalent to a JSON parse, value update, and JSON stringify to update the same value in the database?
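To make the comparison concrete, here is a rough sketch of the two update paths, assuming stores with out-of-line keys. The store name 'flat' and the function names are hypothetical. One point that is certain: indexedDB stores values with the structured clone algorithm rather than as JSON text, so the round trip is a get, a property change, and a put, not a literal parse and stringify.

```javascript
// Pure step: return a copy of the stored record with one data point changed.
function withProperty(record, prop, value) {
  return { ...record, [prop]: value };
}

// Per-level store: read the whole record, change one property, and
// write it back, all inside one readwrite transaction.
function updateProperty(db, storeName, key, prop, value) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, 'readwrite');
    const store = tx.objectStore(storeName);
    const getReq = store.get(key);
    getReq.onsuccess = () => {
      store.put(withProperty(getReq.result || {}, prop, value), key);
    };
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}

// Single store (assumed name 'flat'): each data point is its own
// record, so an update is one put with no preceding read.
function putDataPoint(db, key, value) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('flat', 'readwrite');
    tx.objectStore('flat').put(value, key);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}
```

Which path is faster in practice would still have to be measured; the sketch only shows how much work each one asks of the database per edit.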

Although no limit is explicitly set, an expected maximum number of level_1 objects in a single portfolio is about 25, where each could likely contain up to 100 level_2 objects, which in turn could each contain a maximum of around 5 level_3 objects. Anything larger than this would most likely lead the user to simply build separate portfolios.

So, the level_1 objectStore would be about 25 rows, the level_2 objectStore about 2500 rows, and the level_3 objectStore about 12,500 rows. Each level_1 object has about 40 data points; each level_2 object has about 100 data points; and each level_3 object has about 20 data points. So, I think a single objectStore would have the equivalent of (25)(40) + (2500)(100) + (12,500)(20) = 501,000 rows.
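The arithmetic above can be checked with a few lines. This is only a worked calculation using the estimates already given; the variable names are made up.

```javascript
// Object counts and data points per object, taken from the estimates above.
const levels = [
  { name: 'level_1', objects: 25, pointsPerObject: 40 },
  { name: 'level_2', objects: 25 * 100, pointsPerObject: 100 },
  { name: 'level_3', objects: 25 * 100 * 5, pointsPerObject: 20 },
];

// Three stores: one row per object.
const threeStoreRows = levels.reduce((sum, l) => sum + l.objects, 0);

// Single store: one row per data point.
const singleStoreRows = levels.reduce(
  (sum, l) => sum + l.objects * l.pointsPerObject,
  0
);
```

With these inputs, threeStoreRows is 15,025 and singleStoreRows is 501,000.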

I'm semi-experienced at extracting data using SQL from very large databases but know absolutely nothing about how a database is set up to locate data by key. If it had to search from top to bottom, checking each of the 501,000 rows until a matching key is found, then one objectStore appears a rather ridiculous choice compared to three objectStores. But if indexedDB employs a more efficient method, then one objectStore could possibly be more efficient, depending on how efficient it is to update a property value in an object of one of the three objectStores.

I am not a programmer by trade, so I apologize if some of my terminology is inexact, and I realize that my question is of a rather basic level; but I have been unable to locate any information addressing how to "map" an object to an object database in an efficient manner.

Thank you for reading my question and for any direction you may be able to provide.

Edit/Update:

Thank you, Josh, for taking the time to respond to my question and for providing a number of items to think about. I had not yet considered how the points at which different types of data are written to browser storage influence the determination of the number of object stores.

There are two large data movements that generally occur only once each during a user's session: the upload from hard disk of a JSON string to be parsed and written to browser storage, and then the reading of browser storage into an object to be stringified and downloaded to hard disk. Users most likely expect these two steps to take at least enough time to require some form of brief progress indicator. The important time items are the time it takes to store data edits and create new data elements.

Following Josh's comments, perhaps, a good way to set up object stores is to consider when and what data gets written to browser storage by screens, for lack of a better term. In my application, only one module (level_1 object in the portfolio) is ever loaded into a run time object at any one time. There is one screen for module-level data. When that screen is exited, any changes in the module-level data are written to storage.

Each level_2 object in a module has its own screen, and as the user navigates between level_2 object screens, the content in the screen input elements is checked against the run-time object's values for changes, and any changes are written to storage.

While on a level_2 object screen, a user adds level_3 objects to specific level_2 elements through calling a window that appears on top of the level_2 screen. When each window is closed, a similar check is performed and any data changes are written to storage.

Creating object stores that align with the data displayed and collected on each screen appears to make sense and, of course, aligns with the object levels. However, it still doesn't answer which data structure would ultimately be the most efficient, providing the best user experience time-wise.

Apart from some type of rule of thumb for database efficiency, the likely best approach for my particular question and circumstance is to code it both ways, fill the portfolio with a larger than expected number of maximum modules, and level_2 and level_3 objects, and test the performance of writing and reading data to indexedDB. The first method of a single object store ought to be fairly easy to code since it is set up almost exactly like localStorage. The second approach of using at least three object stores will take more time, but it will likely be a necessary and worthwhile learning experience for someone with my limited background in these areas.

If I am successful, I will share the results here in the near future. Thank you.

Thanks for the further explanation. I'm not going to be querying the database in that type of manner but am storing data for retrieval based on the unique key only. However, your earlier comments about storing the same data in multiple tables finally registered in my mind and I think greatly simplified my entire question and approach. I was thinking too much from a local storage perspective.

What I think will work well is multiple object stores: one object store that contains one complete object for each module (the level_1 data in the portfolio), and three or four object stores that contain subsets of data for the "active" or loaded module only.

When the user selects a module to load, it will be loaded in its entirety from the module object store in one step, and subsets (different object levels) of that module will be written to a number of different object stores. When the user makes edits to the module data at any level, the edits will be stored in the appropriate subset object store since that will be much quicker.

If the user properly exits/closes a module, then at that time the loaded object will be written in its entirety to the module object store, and the subset object stores will be emptied. The subset object stores are there to preserve the changes in the event that the user fails to exit properly or there is a power or OS failure.

When the application is opened, browser storage will be tested to determine if there is a database and, if so, whether or not the subset object stores are empty. If empty, then a proper close and save of the module was performed. If not empty, then edits to the module did not make it into the module object store for whatever reason, and the user will be prompted to either recover or discard the edits saved in the subset object stores. If the user chooses to recover, then the data in the subset object stores must be gathered together into a complete module and written to the module object store.
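The startup check and recovery step might be sketched as follows. This is only an outline under assumptions: the shape of a saved edit ({ path, value }), the function names, and the list of subset store names are all hypothetical, and the module data is assumed to be plain JSON-compatible data, as it is when saved to disk.

```javascript
// Pure step: fold saved edits back into a copy of the full module object.
// Each edit is assumed to look like
// { path: ['level_2', 'k2', 'height'], value: 10 }.
function applyEdits(moduleData, edits) {
  // The module is JSON-compatible, so a JSON round trip makes a deep copy.
  const result = JSON.parse(JSON.stringify(moduleData));
  for (const { path, value } of edits) {
    let node = result;
    for (const name of path.slice(0, -1)) {
      if (node[name] === undefined) node[name] = {};
      node = node[name];
    }
    node[path[path.length - 1]] = value;
  }
  return result;
}

// Startup check: resolves to true if any subset store still holds
// records, meaning the last session did not close the module properly.
// subsetStores is assumed to be a non-empty array of store names.
function hasUnsavedEdits(db, subsetStores) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(subsetStores, 'readonly');
    let total = 0;
    for (const name of subsetStores) {
      const req = tx.objectStore(name).count();
      req.onsuccess = () => { total += req.result; };
    }
    tx.oncomplete = () => resolve(total > 0);
    tx.onerror = () => reject(tx.error);
  });
}
```

If hasUnsavedEdits resolves to true and the user chooses to recover, the result of applyEdits would be written to the module object store and the subset stores cleared.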

This ought to work fine for the anticipated maximum size of any single module in this application; but if the size of a module were to become too much for the browser when loaded in its entirety, then the subset object stores could be used to populate the screens; and when the user exits the module, the subsets could be gathered together to build a complete set of module data and written to the module object store, just as for a recover.

Of course, there is no way to test during run time if the browser is running too slowly due to an overly large module and change the approach at that time. I just mean that if during my testing of large sample modules, it is observed that the browser runs too slowly, then the second approach will need to be implemented.

I realize that my particular question is not as interesting as the items listed in the response. However, reading about those general concepts helped me to better understand how to address my less interesting use of indexedDB and to avoid a considerable amount of messing about coding unnecessary complexity to a simple problem. Thanks again.

Recommended answer

I think you are on to your own answer, so my response here is only intended to push you along.

The main difference between nosql and a traditional sql database is the lack of query-planning. Query planning is the functionality provided by an sql database, where it accepts your query, parses it, and then converts it into an algorithm that finds matching records and returns them to you in a result set. Query planning involves choosing the most optimal approach, generally by trying to minimize the number of steps involved, the amount of memory involved, or the amount of time that will elapse. On the other hand, you are on your own with nosql. You have to become an overnight query-planning expert.

That's both a boon and a burden. Query planning is a complexity cliff for some, and you can quickly find yourself reading some confusing stuff. But if you are looking for a more technical answer then it would be in this direction, of learning more about how databases do query planning.

To speed that up, I would apply the same conventional knowledge about normalization and denormalization. Boyce-Codd and normal forms 1-5 and all that. nosql is on the extreme denormalization end. The 'logical' structure of the items you store is irrelevant. With nosql your objective is not a nice traditional and intuitive schema. Your objective is to efficiently perform your storage operations, your queries.

So to answer the question you have to start with a simple analysis of your operations. Enumerate the operations your app performs. Which are the most frequent operations? Which do you assume will take the longest to complete? By operations, I am not talking about low level queries here, nor the schema of your db in nosql/sql. That is a level too deep of abstraction. Think more abstractly. Enumerate things like "load the info for all the people that meet these conditions", "delete those people over there". I picked up on some of the queries you mention, but I didn't pick up on a clear list, and this list is important criteria in a proper answer.

Once you have enumerated those operations, then I think you are closer to answering your question. As a toy example, think about updates. Are updates frequent? Frequent updates would suggest one object store is bad, because you have to load a ton of irrelevant things just to change one property of an object. Think about granularity. Do you need all of an object's properties, or only some? Think about what is the most frequent operation? Is it loading a list of objects according to some criteria? Is it deleting or updating things? Think about what things are loaded at the same time (co-location). When you load one instance of a level 2 object, are the other instances typically also loaded? If not, then why store them together? Step away from your normalized schema and just forget about it. You want a denormalized schema where you are storing data in a manner so as to optimize your queries. The end result may be nothing like what you imagine.

Maybe a good thought experiment would be this. Pseudocode the function that would do the actual heavy lifting. You will run straight into the problems and identify the parts of the function that will probably be really slow. The answer to your question then is essentially what data structure would really speed those parts up, or at least slow them down less than other data structures.

One little follow-up. A rather counterintuitive feature of nosql databases and denormalization is that you may end up storing data multiple times. Sometimes it makes sense to store the same data in multiple places, because it speeds up queries. And yes, it introduces room for inconsistencies, and violates the no-functional-dependencies rule of sql. But you can enforce data integrity (consistency) through the use of multi-store transactions and a bit of care. To elaborate further, the stores you want might just be the literal results of the queries you plan to perform. Yes. Create an object store for each query you plan to perform. Store data redundantly among all of them. Yes, that sounds nutty and extreme. And it is a tad exaggerated. But this approach is common, and promoted, when using nosql.
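A multi-store transaction of the kind mentioned above could look roughly like this. The store names 'modules' and 'level2', the record shape, and the function names are assumptions made for the sketch, not anything from the question.

```javascript
// Pure step: list every write needed to keep the redundant copies in step.
// Assumes a module record shaped like { id, level_2: { key: {...} } }.
function planWrites(moduleRecord) {
  const writes = [
    { store: 'modules', key: moduleRecord.id, value: moduleRecord },
  ];
  for (const [k2, level2] of Object.entries(moduleRecord.level_2 || {})) {
    writes.push({
      store: 'level2',
      key: `${moduleRecord.id}.${k2}`,
      value: level2,
    });
  }
  return writes;
}

// All puts share one transaction spanning every store involved, so the
// copies are committed together or the whole transaction is aborted.
function writeAll(db, writes) {
  return new Promise((resolve, reject) => {
    const storeNames = [...new Set(writes.map((w) => w.store))];
    const tx = db.transaction(storeNames, 'readwrite');
    for (const w of writes) {
      tx.objectStore(w.store).put(w.value, w.key);
    }
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}
```

Keeping the planning step separate from the transaction makes it easy to check, without a database, that every redundant copy is accounted for.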

And here is a rough first attempt, just brainstorming a bit. This is an attempt to give you a more concrete answer based on guessing what you are actually trying to do.

What you want is an object store called 'settings'. Each object in the store represents a Settings object. A single settings object has the properties like settings id, settings property name, settings property value, level 1 property, level 2 property, level 3 property.

Your basic read queries might look like SELECT * from Settings WHERE level1 = 'a' && level2 = 'b'.

Taking this further, you could then optimize for certain views, using indices. We could create an index on the level1 property, an index on the level2 property, and an index on the level1+level2 properties combined.

Let's say your most frequent operation, that needs to be fastest, is to load all settings belonging to a particular combination of levels 1, 2, and 3. Create an index on all 3, and then it is just a matter of iterating over that index.
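In indexedDB terms, an index over all three levels might be set up and queried roughly like this. The database name 'portfolio', the index name 'by_levels', the 'id' key path, and the property names level1, level2, level3 are assumptions for the sketch.

```javascript
const LEVEL_INDEX = 'by_levels';

// Builds the array key that the compound index is compared against.
function levelsKey(level1, level2, level3) {
  return [level1, level2, level3];
}

// Opens the database and, on first use, creates the 'settings' store
// with a compound index over the three level properties.
function openSettingsDb(name = 'portfolio') {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open(name, 1);
    req.onupgradeneeded = () => {
      const store = req.result.createObjectStore('settings', { keyPath: 'id' });
      store.createIndex(LEVEL_INDEX, ['level1', 'level2', 'level3']);
    };
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Rough equivalent of:
// SELECT * FROM settings WHERE level1 = ? AND level2 = ? AND level3 = ?
function loadSettings(db, level1, level2, level3) {
  return new Promise((resolve, reject) => {
    const index = db
      .transaction('settings', 'readonly')
      .objectStore('settings')
      .index(LEVEL_INDEX);
    const req = index.getAll(levelsKey(level1, level2, level3));
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}
```

In a browser, usage would be along the lines of openSettingsDb().then((db) => loadSettings(db, 'a', 'b', 'c')).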

The schema in this brainstorming example is a single object store, along with some indices to speed certain queries up. Given that indices are basically derived object stores, you could make the conceptual argument you are practically using multiple stores although you are actually only using one. Anyway that might be getting pedantic. The point of this example is just to demonstrate that the schema of your object store has nothing at all to do with how you conceptualize the hierarchy of portfolios and levels. It only has to do with making the queries you need to perform fast.
