How to remove duplicate items from a queue within a time frame?


Problem description

I would like to remove duplicate entries from a queue in an efficient way. The queue holds instances of a custom class that has a DateTime, a FullPath, and a few other fields:

private Queue<MyCustomClass> SharedQueue;

The DateTime in the class is the timestamp taken when the item is inserted into the queue. The logic I would like to use is as follows: remove an item as a duplicate if its FullPath is identical to that of another item added within a 4-second window (i.e. if it was added to the queue within 4 seconds of a duplicate FullPath). I subscribe only to the events I want to watch, but a few duplicates will still arrive, and that is OK.

I am using C# 2.0 with the FileSystemWatcher class and a worker queue.

There are several ways to do this: trim the queue each time an item is added to it, or, while working through the queue, skip the processing of the current item if it is a duplicate.

Or should I use a 'global private' variable of type Dictionary<String, DateTime> so that I can search it quickly? Or a local copy of the queue? Perhaps it is best to limit the local queue to 100 items in case there are many file events? In my case there should be only relatively few files to monitor in a folder, but things always change...

Thanks for any help.

Edit (Feb 10, 8:54 EST): I decided to implement what is, as far as I can tell, a good, simple solution. I don't think I am holding on to the Dict keys for too long...

Edit (Feb 10, 9:53 EST): Updated, because my Dictionary cannot contain duplicate keys.

   public void QueueInput(HotSynchUnit.RcdFSWFile rcd)
// Start the worker thread when the program starts.
// Call Terminate.Set() in the program's exit routine, close handler, etc.
{
  // lock shared queue
  lock (SharedQueue)
  {
    if (!IsDuplicateQueueInput(rcd))  // only add unique values to queue
    {
      SharedQueue.Enqueue(rcd);
      SomethingToDo.Set();
    }
  }
} // public void QueueInput

private bool IsDuplicateQueueInput(HotSynchUnit.RcdFSWFile rcd)
/* Return true if the object is a duplicate object.
 * Pseudo Code:
 * 
 * isDuplicate = false
 * Lock Dictionary
 * -If lastTimeStamp > 4 seconds ago then       // Optimization: save lastTimeStamp
 *    if Dict.Count > 0 then clear Dictionary
 *    return isDuplicate
 * -If not Dict.TryGetValue(sPath, dtTimeStamp) then
 *    Dict.AddKey()
 * -Else
 *    Compare key timestamp to Currenttime
 *    if key timestamp is <= 4 seconds ago then
 *       IsDuplicate = True
 *
 *    Dict.RemoveKey()
 *    Dict.AddKey()
 * 
 * return isDuplicate
*/
{
  // put real code here
}
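
The pseudocode above could be fleshed out roughly as follows. This is only a sketch of one reading of it, not the code from the original post: the class name DuplicateFilter, the explicit `now` parameter (passed in so the logic can be exercised without waiting on the clock), and the case-insensitive path comparison are all assumptions. Unlike the pseudocode, it also records the current path right after clearing the dictionary, so that a repeat arriving a moment later is still caught. It avoids language features newer than C# 2.0.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original post): remembers when each
// path was last seen and reports repeats inside a fixed time window.
public class DuplicateFilter
{
    private readonly TimeSpan window;

    // Assumption: paths are compared case-insensitively, as on Windows.
    private readonly Dictionary<string, DateTime> lastSeen =
        new Dictionary<string, DateTime>(StringComparer.OrdinalIgnoreCase);

    // Time of the most recent call, used for the "clear everything" shortcut.
    private DateTime lastEvent = DateTime.MinValue;

    public DuplicateFilter(TimeSpan window)
    {
        this.window = window;
    }

    // Returns true when fullPath was already seen within the window before 'now'.
    public bool IsDuplicate(string fullPath, DateTime now)
    {
        lock (lastSeen)
        {
            // Optimization from the pseudocode: if nothing at all arrived
            // within the window, every stored entry is stale.
            if (now - lastEvent > window)
            {
                lastSeen.Clear();
            }
            lastEvent = now;

            DateTime previous;
            bool isDuplicate = lastSeen.TryGetValue(fullPath, out previous)
                               && now - previous <= window;

            // Refresh the timestamp on every sighting (RemoveKey + AddKey
            // in the pseudocode); the indexer overwrites an existing key.
            lastSeen[fullPath] = now;
            return isDuplicate;
        }
    }
}
```

Inside QueueInput this would be called as something like `filter.IsDuplicate(rcd.FullPath, DateTime.Now)`, assuming the record exposes its path as FullPath. Because the timestamp is refreshed on every sighting, a steady stream of events for one file keeps being suppressed until there is a gap longer than the window.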

Solution

I would think about using a collection that behaves like a generic hashtable, such as Dictionary. Something like this:

Dictionary<string, YourClass> dict = new Dictionary<string, YourClass>();

// Let's assume you want to add/check for "c:\demo.txt",
// and obj1 is the newly created object for that path.

if (!dict.ContainsKey(@"c:\demo.txt"))
{
   // add items to dict by passing fullPath as key and your object as value
   dict.Add(@"c:\demo.txt", obj1);
}
else if (dict[@"c:\demo.txt"].CheckForInterval(obj1))
{
   // obj1 is more than 4 seconds newer than the stored object:
   // replace the current object in the dictionary with the new one, if you want to,
   // or just do what you want to
}

Edit: your custom class could offer some functionality like this:

class YOURCUSTOMCLASS
{
    private DateTime creationTime;

    public DateTime CreationTime
    { get { return creationTime; } }

    public YOURCUSTOMCLASS(parametersGoesHere xyz)
    {
        creationTime = DateTime.Now;
    }

    // Returns true if otherObject was created more than
    // 4 seconds after this object.
    public bool CheckForInterval(YOURCUSTOMCLASS otherObject)
    {
        TimeSpan diff = otherObject.CreationTime.Subtract(creationTime);

        // You may replace 4 with any other number, or better still
        // take it from a const / static field.
        return diff.TotalSeconds > 4;
    }

    // all the other stuff you need ...
}

Of course you will lose the functionality of a queue, but you will get a massive improvement in run-time performance if your queue contains many elements.
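
If FIFO ordering still matters, one option is to keep the queue and use the dictionary only as a gatekeeper in front of it. The following is a minimal sketch under stated assumptions, not part of the original answer: DedupPathQueue, TryEnqueue, Prune and TrackedPaths are hypothetical names, the queue stores plain path strings instead of the custom class, and the caller is assumed to do the locking (as QueueInput in the question already does).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper (not from the original answer): a FIFO queue of
// paths that rejects a path already accepted within the last few seconds.
public class DedupPathQueue
{
    private readonly TimeSpan window;
    private readonly Queue<string> items = new Queue<string>();
    private readonly Dictionary<string, DateTime> lastEnqueued =
        new Dictionary<string, DateTime>();

    public DedupPathQueue(TimeSpan window)
    {
        this.window = window;
    }

    public int Count { get { return items.Count; } }

    // Number of paths currently remembered by the dictionary.
    public int TrackedPaths { get { return lastEnqueued.Count; } }

    // Returns false (and enqueues nothing) for a repeat inside the window.
    public bool TryEnqueue(string fullPath, DateTime now)
    {
        DateTime previous;
        if (lastEnqueued.TryGetValue(fullPath, out previous)
            && now - previous <= window)
        {
            return false;
        }
        lastEnqueued[fullPath] = now;
        items.Enqueue(fullPath);
        return true;
    }

    public string Dequeue()
    {
        return items.Dequeue();
    }

    // Drops entries older than the window so the key set stays small.
    public void Prune(DateTime now)
    {
        // Collect first: a Dictionary must not be modified while enumerating it.
        List<string> stale = new List<string>();
        foreach (KeyValuePair<string, DateTime> entry in lastEnqueued)
        {
            if (now - entry.Value > window)
            {
                stale.Add(entry.Key);
            }
        }
        foreach (string key in stale)
        {
            lastEnqueued.Remove(key);
        }
    }
}
```

Here a rejected duplicate does not refresh the stored timestamp, so the window is measured from the last accepted item. Prune could be called from the worker thread whenever the queue runs empty, which keeps the dictionary from holding on to keys for long.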

hth
