Looking for a table-like data structure

Question

I have 2 sets of data. Let's say one is people and the other is groups. A person can be in multiple groups, while a group can have multiple people. My operations will basically be CRUD on groups and people, plus a method that makes sure a list of people are all in different groups (which gets called a lot).

Right now I'm thinking of making a table of binary 0's and 1's, with one row per person and one column per group.

I can perform the method in O(n) time by adding up the binary rows of the listed people and comparing the sum with the bitwise "or" of the same rows. The two are equal exactly when no column has more than one 1.

For example:

Group   A    B    C    D
ppl1    1    0    0    1
ppl2    0    1    1    0
ppl3    0    0    1    0
ppl4    0    1    0    0

check (ppl1, ppl2) = (1001 + 0110) == (1001 | 0110)
               = 1111 == 1111
               = true

check (ppl2, ppl3) = (0110 + 0010) == (0110 | 0010)
               = 1000 == 0110
               = false

I'm wondering if there is a data structure that already does something similar, so I don't have to write my own and can keep the O(n) runtime.
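For reference, the bit-table idea above maps fairly directly onto java.util.BitSet, which ships with the JDK. The following is only a minimal sketch under assumptions: the class name MembershipTable and the helpers row and allInDifferentGroups are hypothetical, and each person's row is stored as one BitSet with bit 0 standing for group A, bit 1 for group B, and so on. Instead of comparing a sum against an "or", it checks that each new row does not intersect the bits seen so far, which tests the same condition.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not from the original post: one BitSet per person,
// one bit per group (index 0 = group A, 1 = group B, ...).
class MembershipTable {

    // Builds a row from a string such as "1001" (leftmost character = group A).
    static BitSet row(String bits) {
        BitSet set = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                set.set(i);
            }
        }
        return set;
    }

    // Returns true when no two of the given people share a group.
    static boolean allInDifferentGroups(List<BitSet> people) {
        BitSet seen = new BitSet();
        for (BitSet groups : people) {
            if (seen.intersects(groups)) {
                return false; // a group already holds one of these people
            }
            seen.or(groups); // remember this person's groups
        }
        return true;
    }

    public static void main(String[] args) {
        BitSet ppl1 = row("1001");
        BitSet ppl2 = row("0110");
        BitSet ppl3 = row("0010");
        System.out.println(allInDifferentGroups(Arrays.asList(ppl1, ppl2)));
        System.out.println(allInDifferentGroups(Arrays.asList(ppl2, ppl3)));
    }
}
```

With the rows from the example table, the first call covers ppl1 and ppl2 (no common column) and the second covers ppl2 and ppl3 (both in group C). The loop touches each person's row once, so the check stays linear in the number of people passed in.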

Recommended answer

I don't know all of the details of your problem, but my gut instinct is that you may be overthinking things here. How many objects are you planning on storing in this data structure? If you have really large amounts of data to store, I would recommend that you use an actual database instead of an in-memory data structure. The operations you are describing are classic examples of what relational databases are good at. MySQL and PostgreSQL are examples of large-scale relational databases that could do this sort of thing in their sleep. If you'd like something lighter-weight, SQLite would probably be of interest.

If you do not have large amounts of data to store in this data structure, I'd recommend keeping it simple and only optimizing once you are sure it won't be fast enough for what you need to do. As a first shot, I'd just recommend using Java's built-in List interface to store your people and a Map to store groups. You could do something like this:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Person is assumed to be a simple class with a constructor that takes a name.

// Use a List to keep track of people
List<Person> myPeople = new ArrayList<>();
Person steve = new Person("Steve");
myPeople.add(steve);
myPeople.add(new Person("Bob"));

// Use a Map to track groups, keyed by group name
Map<String, List<Person>> groups = new HashMap<>();
groups.put("Everybody", myPeople);
groups.put("Developers", Arrays.asList(steve));

// Does a group contain everybody?
groups.get("Everybody").containsAll(myPeople); // returns true
groups.get("Developers").containsAll(myPeople); // returns false

This definitely isn't the fastest option available, but if you do not have a huge number of people to keep track of, you probably won't even notice any performance issues. If you do have some special conditions that would make regular Lists and Maps too slow, please post them and we can make suggestions based on those.

Edit:

After reading your comments, it appears that I misread your issue on the first run through. It looks like you're not so much interested in mapping groups to people, but instead mapping people to groups. What you probably want is something more like this:

Map<Person, List<String>> associations = new HashMap<Person, List<String>>();

Person steve = new Person("Steve");
Person ed = new Person("Ed");

associations.put(steve, Arrays.asList("Everybody", "Developers"));
associations.put(ed, Arrays.asList("Everybody"));

// This is the tricky part: the method returns true only when
// no two of the listed people share a group
boolean noSharedGroups = checkForSharedGroups(associations, Arrays.asList(steve, ed));

So how do you implement the checkForSharedGroups method? In your case, since the numbers surrounding this are pretty low, I'd just try out the naive method and go from there.

public boolean checkForSharedGroups(
                    Map<Person, List<String>> associations, 
                    List<Person> peopleToCheck){
    List<String> groupsThatHaveMembers = new ArrayList<String>();
    for(Person p : peopleToCheck){
        List<String> groups = associations.get(p);
        for(String s : groups){
            if(groupsThatHaveMembers.contains(s)){
                // We've already seen this group, so we can return
                return false;
            } else {
                groupsThatHaveMembers.add(s);
            }
        }
    }
    // If we've made it to this point, nobody shares any groups.
    return true;
}

This method probably doesn't have great performance on large datasets, but it is very easy to understand. Because it's encapsulated in its own method, it should also be easy to update if it turns out you need better performance. If you do need to increase performance, I would start by overriding both equals and hashCode on Person, since a HashMap relies on both for key lookups in the associations map. From there you could also use a custom type instead of String for groups, again with equals and hashCode overridden, and keep the groups seen so far in a HashSet rather than an ArrayList. That last change would considerably speed up the contains check used above.
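If it does come to that, a faster variant might look something like the sketch below. It is not the method from the answer above but a swapped-in version under assumptions: the Person class shown here is a guess at its shape (the original never shows it), SharedGroups is a hypothetical holder class, and the seen groups are tracked in a HashSet so each membership test is a constant-time hash lookup on average instead of a scan of a list.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Assumed shape of Person; equals and hashCode are both based on the name
// so that instances work reliably as HashMap keys.
final class Person {
    private final String name;

    Person(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Person && Objects.equals(name, ((Person) other).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

// Hypothetical holder class for the check.
class SharedGroups {

    // Returns true when no two of the given people share a group,
    // matching the return convention of the naive version.
    static boolean checkForSharedGroups(Map<Person, List<String>> associations,
                                        List<Person> peopleToCheck) {
        Set<String> groupsThatHaveMembers = new HashSet<>();
        for (Person p : peopleToCheck) {
            // People without an entry are treated as belonging to no groups.
            List<String> groups = associations.getOrDefault(p, Collections.emptyList());
            for (String group : groups) {
                // add() returns false if the group was already in the set.
                if (!groupsThatHaveMembers.add(group)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

The signature is kept the same as the naive version so the caller does not need to change. Treating a person with no map entry as having no groups is a design choice made here for the sketch; the naive version would throw a NullPointerException in that case.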

The reason why I'm not too concerned about performance is that the numbers you've mentioned aren't really that big as far as algorithms are concerned. Because this method returns as soon as it finds two matching groups, in the very worst case you will call ArrayList.contains roughly as many times as there are groups. In the very best case, it only needs to be called twice. Performance will likely only be an issue if you call checkForSharedGroups very, very often, in which case you might be better off finding a way to call it less often instead of optimizing the method itself.
