ndarray field names for both row and column?


Question

I'm a computer science teacher trying to create a little gradebook for myself using NumPy. But I think it would make my code easier to write if I could create an ndarray that uses field names for both the rows and columns. Here's what I've got so far:

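import numpy as np
num_stud = 23
num_assign = 2
# Structured array: one record per student, one int16 field per assignment
grades = np.zeros(num_stud, dtype=[('assign 1','i2'), ('assign 2','i2')]) # etc.
# Plain 2-D int16 view of the same memory, shape (num_stud, num_assign)
gv = grades.view(dtype='i2').reshape(num_stud,num_assign)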

So, if my first student gets a 97 on 'assign 1', I can write either of:

grades[0]['assign 1'] = 97
gv[0][0] = 97

Also, I can do the following:

np.mean( grades['assign 1'] ) # class average for assignment 1
np.sum( gv[0] ) # total points for student 1

This all works. But what I can't figure out how to do is use a student id number to refer to a particular student (assume that two of my students have student ids as shown):

grades['123456']['assign 2'] = 95
grades['314159']['assign 2'] = 83

...or maybe create a second view with the different field names?

np.sum( gview2['314159'] ) # total points for the student with the given id

I know that I could create a dict mapping student ids to indices, but that seems fragile and crufty, and I'm hoping there's a better way than:

id2i = { '123456': 0, '314159': 1 }
np.sum( gv[ id2i['314159'] ] )

I'm also willing to re-architect things if there's a cleaner design. I'm new to NumPy, and I haven't written much code yet, so starting over isn't out of the question if I'm Doing It Wrong.

I will need to sum all the assignment points for over a hundred students once a day, as well as compute standard deviations and other stats. Plus, I'll be waiting on the results, so I'd like it to run in only a couple of seconds.

Thanks in advance for any suggestions.

Recommended answer

From your description, you'd be better off using a different data structure than a standard numpy array. ndarrays aren't well suited to this... They're not spreadsheets.

However, there has been extensive recent work on a type of numpy array that is well suited to this use. Here's a description of the recent work on DataArrays. It will be a while before this is fully incorporated into numpy, though...

One of the projects that the upcoming numpy DataArrays are (sort of) based on is "larry" (short for "Labeled Array"). This project sounds like exactly what you're wanting to do... (Have named rows and columns but otherwise act transparently as a numpy array.) It should be stable enough to use (and from my limited playing around with it, it's pretty slick!), but keep in mind that it will probably be replaced by a built-in numpy class eventually.

Nonetheless, you can make good use of the fact that (simple) indexing of a numpy array returns a view into that array, and make a class that provides both interfaces...
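As a rough illustration of that idea, here is a minimal sketch of such a class. The class name LabeledGrades and its methods are made up for this example (they are not part of numpy or larry), and it assumes student ids and assignment names are unique strings. It keeps one plain 2-D array and two small dicts that translate labels into positions, so row and column lookups return views rather than copies:

```python
import numpy as np


class LabeledGrades:
    """Hypothetical wrapper: a 2-D array addressable by student id and assignment name."""

    def __init__(self, student_ids, assignments, dtype="i2"):
        # One row per student, one column per assignment.
        self.data = np.zeros((len(student_ids), len(assignments)), dtype=dtype)
        # Label -> position lookups, built once from the same lists that size the array.
        self._row = {sid: i for i, sid in enumerate(student_ids)}
        self._col = {name: j for j, name in enumerate(assignments)}

    def student(self, sid):
        # Basic integer indexing returns a view of that student's row.
        return self.data[self._row[sid]]

    def assignment(self, name):
        # Slicing a single column also returns a view.
        return self.data[:, self._col[name]]

    def __getitem__(self, key):
        sid, name = key
        return self.data[self._row[sid], self._col[name]]

    def __setitem__(self, key, value):
        sid, name = key
        self.data[self._row[sid], self._col[name]] = value


grades = LabeledGrades(["123456", "314159"], ["assign 1", "assign 2"])
grades["123456", "assign 2"] = 95
grades["314159", "assign 2"] = 83
grades["314159", "assign 1"] = 97

total_314159 = int(grades.student("314159").sum())           # total points for one student
mean_assign2 = float(grades.assignment("assign 2").mean())   # class average for one assignment
```

Because student() and assignment() hand back views, ordinary numpy functions such as np.std work on them directly, and writing through a view updates the underlying array. The underlying array is still available as grades.data for whole-class statistics.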

Alternatively, @unutbu's suggestion above is another (simpler and more direct) way of handling it, if you decide to roll your own.
