Should I choose a hash, an object or an array to represent a data instance in Perl?


Question


I was always wondering about this, but never really looked thoroughly into it.

The situation is like this: I have a relatively large set of data instances. Each instance has the same set of properties, e.g.:

# a child instance
name
age
height
weight
hair_color
favorite_color
list_of_hobbies

Usually I would represent a child as a hash and keep all children together in a hash of hashes (or an array of hashes).
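For concreteness, here is a minimal sketch of that hash-of-hashes layout. The field names come from the list above and James's values from the example further down; Norah's values are made up purely for illustration.

```perl
use strict;
use warnings;

# Outer hash: child name => inner hash of properties.
# Norah's values are hypothetical sample data.
my %children = (
    James => {
        name            => 'James',
        age             => 12,
        height          => 1.62,
        weight          => 73,
        hair_color      => 'dark brown',
        favorite_color  => 'blue',
        list_of_hobbies => [ 'playing football', 'eating ice-cream' ],
    },
    Norah => {
        name            => 'Norah',
        age             => 10,
        height          => 1.40,
        weight          => 35,
        hair_color      => 'black',
        favorite_color  => 'green',
        list_of_hobbies => [ 'reading' ],
    },
);

# Getting property x of instance y is a two-level lookup by name.
my $height = $children{James}{height};
print "James's height is $height\n";

# Every inner hash carries the same seven keys.
my @fields = sort keys %{ $children{Norah} };
print "Fields: @fields\n";
```

The lookups read naturally, which is the convenience side of the trade-off; the repeated keys in every inner hash are what the space concern below is about.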

What always bothered me about this approach is that I don't really use the fact that all children (inner hashes) have the same structure. It seems like it might be wasteful memory-wise if the data is really large: if every inner hash stores its keys from scratch, the key names could take far more space than the data itself. Also note that when I build such data structures I often nstore them to disk.

I wonder if creating a child object makes more sense from that perspective, even though I don't really need OO. Will it be more compact? Will it be faster to query?

Or perhaps representing each child as an array makes sense? e.g.:

# Seven fields, so the indices run from 0 to 6.
my ($name, $age, $height, $weight, $hair_color, $favorite_color, $list_of_hobbies) = 0 .. 6;
my $children_h = {
  James => ["James", 12, 1.62, 73, "dark brown", "blue", ["playing football", "eating ice-cream"]], 
  Norah => [...], 
  Billy => [...]
};
print "James height is $children_h->{James}[$height]\n";
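A possible variation on the array idea, sketched under the assumption that the field order never changes: declare the indices with the core `constant` pragma instead of scalar variables, so they cannot be reassigned by accident. The constant names below are my own choice, not part of the original example.

```perl
use strict;
use warnings;

# Field positions as compile-time constants (names are illustrative).
use constant {
    NAME            => 0,
    AGE             => 1,
    HEIGHT          => 2,
    WEIGHT          => 3,
    HAIR_COLOR      => 4,
    FAVORITE_COLOR  => 5,
    LIST_OF_HOBBIES => 6,
};

# One array reference per child, fields stored by position.
my $children_h = {
    James => [ 'James', 12, 1.62, 73, 'dark brown', 'blue',
               [ 'playing football', 'eating ice-cream' ] ],
};

my $james = $children_h->{James};
printf "James's height is %s\n", $james->[HEIGHT];

# Nested data is reached the same way.
my @hobbies = @{ $james->[LIST_OF_HOBBIES] };
print "Hobbies: ", join(', ', @hobbies), "\n";
```

The cost is readability: a dumped record is just a list of values, and adding or reordering a field means updating every index.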

Recall my main concerns are space efficiency (RAM or disk when stored), time efficiency (i.e. loading a stored data-set then getting the value of property x from instance y) and ... convenience (code readability etc.).
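To make the store-then-load scenario concrete, here is a rough sketch of a round trip through the core `Storable` module. The temporary file stands in for a real data file, and the sample record is trimmed to three fields; both are assumptions for illustration.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Trimmed-down sample data.
my %children = (
    James => { name => 'James', age => 12, height => 1.62 },
);

# A temporary file stands in for the real data file (assumption).
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# nstore writes the structure in network (portable) byte order.
nstore(\%children, $file) or die "cannot store to $file";

# retrieve returns a reference to a fresh copy of the structure.
my $loaded = retrieve($file);

printf "Stored file is %d bytes\n", -s $file;
printf "James's age is %s\n", $loaded->{James}{age};
```

Comparing the file size for the hash layout and the array layout on a realistic data set would be one way to measure the on-disk difference rather than guess at it.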

Thanks!

Answer

  1. Perl is smart enough to share keys among hashes. If you have 100,000 hashes with the same five keys, perl stores those five key strings once and keeps a hundred thousand references to each of them. Worrying about the space taken up by key names is not worth your time.

  2. Hash-based objects are the most common kind and the easiest to work with, so you should use them unless you have a damn good reason not to.

  3. You should save yourself a lot of trouble, start using Moose, and stop worrying about the internals of your objects (although, just between you and me, Moose objects are hash-based unless you use special extensions to make them otherwise -- and once again, you shouldn't do that without a really good reason.)
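As an illustration of point 2, a hash-based object in plain Perl might look like the sketch below. The `Child` class and its accessor names are hypothetical, and only a few of the fields are shown.

```perl
use strict;
use warnings;

package Child;

# Constructor: the object is simply a blessed hash reference.
sub new {
    my ($class, %args) = @_;
    my $self = {
        name            => $args{name},
        age             => $args{age},
        height          => $args{height},
        list_of_hobbies => $args{list_of_hobbies} || [],
    };
    return bless $self, $class;
}

# Read-only accessors over the underlying hash.
sub name    { $_[0]{name} }
sub age     { $_[0]{age} }
sub height  { $_[0]{height} }
sub hobbies { @{ $_[0]{list_of_hobbies} } }

package main;

my $james = Child->new(
    name            => 'James',
    age             => 12,
    height          => 1.62,
    list_of_hobbies => [ 'playing football', 'eating ice-cream' ],
);

printf "%s's height is %s\n", $james->name, $james->height;
```

With Moose, the constructor and accessors would be generated from `has` declarations instead of being written by hand, which is the trouble point 3 suggests saving.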
