数据结构在过去时间更新值和查询值的状态 [英] Data structure for updating values and querying the state of values at a time in the past

查看:136
本文介绍了数据结构在过去时间更新值和查询值的状态的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设你有兴趣在一堆独立的随时间变化值,其中每一个重新presents事物的当前状态。该值不上任何固定时间表的变化和新的价值无法从旧的pdicted $ P $。为了一个具体的例子,假设你有一堆股票,你有兴趣在保持其价值的轨道,你会得到大约只要交易是由在该股票一只个股的最新情况。 (我实际的问题是不是股市,而是希望他们做什么,我得到的更多的理解。)

Suppose you're interested in a bunch of independent time-varying values, each of which represents the current state of something. The values don't change on any fixed schedule and new values can't be predicted from old ones. For the sake of a concrete example, let's say you have a bunch of stocks and you're interested in keeping track of their values, and you get an update about an individual stock whenever a trade is made on that stock. (My actual problem isn't about stocks, but hopefully they make what I'm getting at more understandable.)

除了只知道每只股票的当前价格,你也希望能够挑选一个任意时间点过去,得到一个快照这跟你有什么最新的交易价格为每只股票是那时候。因此,举例来说,你应该能够说那是每一个股票,我跟踪为下午4时53分,上周二的最新值?并获得precise有效回答。

In addition to just knowing the current price of each stock, you'd also like to be able to pick an arbitrary point in the past and get a "snapshot" that told you what the most recent trading price for each stock was at that time. So for instance, you should be able to say "What was the most recent value of every stock I'm tracking as of 4:53PM last Tuesday?" and get a precise answer efficiently.

我能想到的三种方式来做到这一点,但我不是很高兴与任何人。

I can think of three ways to do this, but I'm not very happy with any of them.

1。随身携带一本日记。维护时间序列顺序排列的所有交易的列表。更新只是加入到列表中,并且查询是线性扫描时间向后从第一个条目,其时间戳记上​​或早于指定时间标记。这将使更新了一定时间的操作,但你可能潜在地扫描整个杂志找到各行各业的值,所以更新为O(1)和快照是O(U)其中,u是更新的总数。内存要求为O(U),原因是显而易见的。

1. Keep a journal. Maintain a list of all trades in time-sequence order. Update is just adding to the list, and query is a linear scan backwards in time starting with the first entry whose timestamp is on or earlier than the specified time stamp. This would make update a constant time operation, but you might potentially have to scan the entire journal to find a value for all trades, so Update is O(1) and Snapshot is O(u) where u is the total number of updates. Memory required is O(u) for obvious reasons.

2。写检查点。 像以前一样保持一个单一的杂志,但不是包含的每个条目只新股的价格,更新包含在当前的价格(作为更新)进行的每次的股票。这是廉价的计算:自上一次更新了所有这些信息也是如此,你只是全国各地复制除一只股票的价格却变化。现在快照可以用O(日志U)操作来完成(使用日志二进制搜索来查找来之前或在指定的时间标记中的最后一项)。不过更新变成了O(S),其中S是股票的系统的数量,进而所需的总内存变为由O(U)的第一战略,O(S * U) - 这两者都是问题,如果s和妳大。

2. Write checkpoints. Maintain a single journal as before, but instead of each entry containing just new stock price, the update contains the current price (as of that update) for every stock. This is cheap to calculate: since the last update has all that information too, you just copy it all over except for the one stock whose price actually changed. Now Snapshot can be done with an O(log u) operation (using binary search on the journal to locate the last entry that comes before or on the specified timestamp). However Update becomes O(s) where s is the number of stocks in the system, and furthermore the total memory required goes from O(u) in the first strategy to O(s*u) --- both of which are problems if both s and u are large.

3。分居期刊。 为每一个股票的独立杂志,并按照时间顺序写的每只股票更新到自己的日记,再次。要快照,请在每个轴颈和使用二进制搜索找到正确的更新。它需要O(U)的内存,更新复杂度为O(1)操作,快照可以在O完成(S *登录U)的时间。这是我最喜欢的三个方法,但我觉得喜欢它也许可以加以改进,因为它忽略了在不同的股票更新的时间之间的关系。

3. Separated journals. Maintain a separate journal for each stock, and write updates for each stock into its own journal, again in chronological order. To snapshot, go over each journal and use a binary search to find the correct update. It requires O(u) memory, Update is an O(1) operation, and Snapshot can be done in O(s * log u) time. This is my favorite method of the three, but I feel like it could probably be improved upon, since it ignores any relationship between the timing of updates across different stocks.

有没有更好的办法,我失踪?这是已被研究,并具有普遍接受的解决方案?

Is there a better way I'm missing? Is this a problem that has been studied and has a generally-accepted solution?

推荐答案

有一个在文学上的永久数据结构的。特别是,这个早期纸的描述的施工持续二叉搜索树,它维护对数的操作,但可以在任何版本(在时间上如点)进行访问。访问未在某些特定版本更新的部分结构自然地看向最后preceding版本。所以,你将有你的自然行动为O(log S)的时间,并且结构可以占用O(U)空间,如果你知道所有你的钥匙了前面,从来没有再平衡,或O(U *日志S)的空间,如果每次更新修改为O(log S)的指针。

Have a look at the literature on Persistent Data Structures. In particular, this early paper describes the construction of a persistent binary search tree that maintains logarithmic operations, but can be accessed at any version (e.g. point in time). Accesses to parts of the structure that weren't updated in some particular version naturally look to the last preceding version. So, you would have your natural operations in O(log s) time, and the structure could occupy O(u) space if you know all your keys up front and never have to rebalance, or O(u * log s) space if every update modified O(log s) pointers.

<一个href="http://www.google.com/url?sa=t&source=web&cd=4&ved=0CC8QFjAD&url=http%3A%2F%2Fcourses.csail.mit.edu%2F6.854%2F06%2Fscribe%2Fs2-persistent.pdf&ei=cipWTL_sDZGmnQeVoIz8Ag&usg=AFQjCNE__Z_R6DaU4M0HqZ-OnVypwhSntg&sig2=TutzqPrsqm3MwDA6sy71VQ"相对=nofollow>这些课堂笔记似乎说明什么,你会需要实现相当简单的方面。

These class notes seem to describe what you would need to implement in fairly simple terms.

这篇关于数据结构在过去时间更新值和查询值的状态的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆