Create numpy array from matrix declared inside .m matlab file

Question

A coworker left some data files that I want to analyze with NumPy.

Each file is a MATLAB file, say data.m, with the following format (but with many more rows and columns):

values = [-24.92 -23.66 -22.55 ;
-24.77 -23.56 -22.45 ;
-24.54 -23.64 -22.56 ;
];

This is the typical explicit matrix creation syntax used by MATLAB.

My question is: what would be the most practical way to create a numpy array from these files?

I can think of a "brute force" or "quick and dirty" solution, but if there is a more straightforward one, such as a standard function from numpy or from another module, I would much rather use it.

I noticed that my files may contain NaN values, so I will most likely adapt the answers given to use numpy.genfromtxt instead of numpy.loadtxt. I plan to include my final code as soon as I have it.

Thanks for your help!

I ended up with the following code, which uses a regular expression to grab everything between the square brackets and then builds a numpy array with genfromtxt in order to handle NaN. A shorter solution would be to use the fromstring method, which does not need StringIO, but it cannot handle NaN, and my data has NaN :oP

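#!/usr/bin/env python
# coding: utf-8

import re
from io import StringIO

import numpy

with open('data.m') as f:
    # Grab everything between the first '[' and the last ']'
    s = re.search(r'\[(.*)\]', f.read(), re.DOTALL).group(1)

# Drop the ';' row terminators so they are not read as an extra column
buf = StringIO(s.replace(';', ''))
a = numpy.genfromtxt(buf, missing_values='NaN', filling_values=numpy.nan)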
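The regex-plus-genfromtxt approach described above can also be sketched as a small function that works on text already in memory, which makes it easy to try on a sample containing NaN. This is only a sketch: the name parse_matlab_matrix and the sample string are made up for illustration, and it assumes the file holds a single bracketed matrix of plain numbers.

```python
import re
from io import StringIO

import numpy as np


def parse_matlab_matrix(text):
    """Return the bracketed MATLAB matrix literal in `text` as a 2-D array.

    Hypothetical helper: assumes one [...] literal with whitespace-separated
    numbers and ';' as the row terminator.
    """
    match = re.search(r'\[(.*)\]', text, re.DOTALL)
    if match is None:
        raise ValueError('no [...] matrix literal found')
    # Turn each ';' into a newline so rows may also share a physical line.
    body = match.group(1).replace(';', '\n')
    # genfromtxt skips the resulting blank lines and fills NaN tokens with nan.
    arr = np.genfromtxt(StringIO(body), missing_values='NaN',
                        filling_values=np.nan)
    return np.atleast_2d(arr)


# Made-up sample in the same layout as data.m, with one NaN entry.
sample = """values = [-24.92 NaN -22.55 ;
-24.77 -23.56 -22.45 ;
];"""
a = parse_matlab_matrix(sample)
print(a.shape)
```

Reading the file first (for example with open('data.m') and f.read()) and passing the text in keeps the parsing step separate from file handling.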

Recommended answer

Here are a couple of options, although neither is built in.

This solution probably falls into your "quick and dirty" category, but it leads into the next solution.

Remove the values = [ prefix, remove the last line (];), and globally replace every ; with nothing to get:

-24.92 -23.66 -22.55 
-24.77 -23.56 -22.45 
-24.54 -23.64 -22.56 

Then you can use numpy's loadtxt as follows.

>>> import numpy as np
>>> A = np.loadtxt('data.m')

>>> A
array([[-24.92, -23.66, -22.55],
       [-24.77, -23.56, -22.45],
       [-24.54, -23.64, -22.56]])

A solution you might find acceptable

In this solution, we write a function that coerces the input data into a form that numpy's loadtxt accepts (the same form as above, actually).

from io import StringIO

import numpy as np

def convert_m(fname):
    """Return the matrix body of a .m file as a file-like object for loadtxt."""
    with open(fname, 'r') as fin:
        arrstr = fin.read()
    arrstr = arrstr.split('[', 1)[-1] # remove the content up to the first '['
    arrstr = arrstr.rsplit(']', 1)[0] # remove the content after the last ']'
    arrstr = arrstr.replace(';', '\n') # replace ';' with newline
    return StringIO(arrstr)

With that function in place, do the following.

>>> np.loadtxt(convert_m('data.m'))
array([[-24.92, -23.66, -22.55],
       [-24.77, -23.56, -22.45],
       [-24.54, -23.64, -22.56]])
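Since the question mentions several such files, the same conversion can be applied to a whole directory. The following is a minimal sketch under a few assumptions: load_all is a made-up name, every .m file in the directory is assumed to contain exactly one matrix in the format shown, and convert_m is repeated here in a Python 3 form so the block runs on its own.

```python
import glob
import os
from io import StringIO

import numpy as np


def convert_m(fname):
    """Return the matrix body of a .m file as a file-like object for loadtxt."""
    with open(fname, 'r') as fin:
        arrstr = fin.read()
    arrstr = arrstr.split('[', 1)[-1]   # drop everything up to the first '['
    arrstr = arrstr.rsplit(']', 1)[0]   # drop everything after the last ']'
    return StringIO(arrstr.replace(';', '\n'))


def load_all(directory):
    """Map each .m file name in `directory` to the array it declares.

    Hypothetical helper, not part of numpy.
    """
    arrays = {}
    for path in sorted(glob.glob(os.path.join(directory, '*.m'))):
        # ndmin=2 keeps single-row matrices two-dimensional.
        arrays[os.path.basename(path)] = np.loadtxt(convert_m(path), ndmin=2)
    return arrays
```

Calling load_all('.') would then return a dictionary such as {'data.m': array(...)} for the files in the current directory. If the files contain NaN or missing entries, np.genfromtxt could be swapped in for np.loadtxt.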
