Use MATLAB to extract data beyond "Data starts on next line:" in text-file


Question

I'm trying to extract 4 columns of data from a .txt file using Matlab (using Matlab is non-negotiable in this case); however, a variable amount of header text precedes the data of interest. The line just above the data always reads

Theta(deg) Phi(deg) Amp Phase Data starts on next line:

For more context, the transition from header to data looks like this...:

Amp/Phase drift   =  -1.11  dB,  2.7  deg


Theta(deg)  Phi(deg)    Amp     Phase   Data starts on next line:
 -180.000   -90.000    16.842  -116.986
 -179.000   -90.000    16.837  -126.651
 -178.000   -90.000    16.549  -137.274

What is the best approach? Also, is there a method that might save time by only searching the first, say, 200 lines of text for the phrase Data starts on next line:?

Recommended answer

You can always open the file and loop through it line by line until you find Data starts on next line:. Once you're there, you can read the values that follow into a matrix. You can use a combination of fopen, strfind, fgetl, textscan, cell2mat and fclose to help you do that.

Something like this:

f = fopen('data.txt', 'r'); % Replace the filename with whatever you're looking at

% Go through each line in the text file until we find "Data starts on next line".
% fgetl returns -1 (a number, not text) at the end of the file, so ischar
% doubles as the end-of-file check.
tline = fgetl(f);
while ischar(tline) && isempty(strfind(tline, 'Data starts on next line'))
    tline = fgetl(f);
end

% File pointer is now advanced to this point.  Grab the data
if ischar(tline)
    data = cell2mat(textscan(f, '%f %f %f %f'));
else
    disp('Could not find data to parse');
end

fclose(f); % Close file

The code speaks for itself. However, to be verbose, let's go through it line by line.

The first line opens your data file for reading. Next, we grab the first line of the text file, then keep reading from that point onwards until we find an instance of 'Data starts on next line' on a line. We put this logic in a while loop, and strfind returns the starting positions at which a particular pattern occurs in some text. The text we're searching is the current line from the file, and the pattern we want is 'Data starts on next line'. If the pattern isn't present, strfind returns an empty array, so the loop keeps going until strfind returns something non-empty.
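To make the strfind behaviour concrete, here is a small sketch that runs it on two lines copied from the sample in the question. The variable names (pattern, headerLine, otherLine, idxHit, idxMiss) are only illustrative.

```matlab
% Pattern to look for, and two lines taken from the question's sample
pattern    = 'Data starts on next line';
headerLine = 'Theta(deg)  Phi(deg)    Amp     Phase   Data starts on next line:';
otherLine  = 'Amp/Phase drift   =  -1.11  dB,  2.7  deg';

% strfind returns the starting index of every match, or [] if there is none
idxHit  = strfind(headerLine, pattern);
idxMiss = strfind(otherLine, pattern);

% isempty is what turns the result into a yes/no test for the loop
foundInHeader = ~isempty(idxHit);   % true: the marker is on this line
foundInOther  = ~isempty(idxMiss);  % false: keep reading
```

Wrapping the call in isempty is what lets the while loop treat "no match" as "keep going".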

I've placed some additional checks where if we don't find 'Data starts on next line' we don't do anything. If we reach the end of the file, fgetl will return -1. If we encounter a -1, that means there is no data to be parsed, and so we'll just leave things the way they are.
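As a rough illustration of that end-of-file behaviour, the sketch below writes a three-line temporary file (the middle line is blank) and reads it back until fgetl stops returning text. The temporary file and the names fname, tline and count are made up for this example. Testing with ischar rather than comparing against -1 is a choice made here, since a blank line comes back as an empty char array and a comparison against -1 on text yields an array rather than a single true/false.

```matlab
% Create a small throwaway file: two text lines with a blank line between them
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'first\n\nthird\n');
fclose(fid);

% Read line by line; fgetl returns char data for every line (even blank ones)
% and the number -1 once the end of the file is reached
f = fopen(fname, 'r');
count = 0;
tline = fgetl(f);
while ischar(tline)
    count = count + 1;
    tline = fgetl(f);
end
fclose(f);
delete(fname);   % clean up the throwaway file
```

After the loop, tline holds the -1 sentinel and count holds the number of lines read, blank line included.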

If we do end up finding this string, the file pointer has advanced to the point where the valid numerical data begins. We use textscan to read the lines of text past this point and, since there are four columns of data, we use four %f specifiers separated by spaces to denote that there are 4 floating point numbers per line. The result is a 4 element cell array where each element is a column of data. To convert the result to a numerical array, you'll need to use cell2mat to do this conversion. This data is stored in a variable called data. Finally, we close the file as we don't need it anymore.
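The textscan and cell2mat step can be tried on its own, without a file, because textscan also accepts text directly. This is a minimal sketch using two rows from the question's sample; txt, C and M are illustrative names.

```matlab
% Two data rows from the question, joined into one piece of text
txt = sprintf(' -180.000   -90.000    16.842  -116.986\n -179.000   -90.000    16.837  -126.651\n');

% Four %f specifiers: textscan returns a 1x4 cell array, one column per cell
C = textscan(txt, '%f %f %f %f');

% cell2mat concatenates the four column vectors into a 2x4 numeric matrix
M = cell2mat(C);
```

Each cell of C holds one column (Theta, Phi, Amp, Phase in the question's file), which is why cell2mat is needed to get a plain matrix.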

When I run the above code and place your sample text data into a file called data.txt, this is what I get:

>> data

data =

 -180.0000  -90.0000   16.8420 -116.9860
 -179.0000  -90.0000   16.8370 -126.6510
 -178.0000  -90.0000   16.5490 -137.2740
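On the second part of the question, limiting the search to the first 200 lines, one possible variation is to replace the while loop with a bounded for loop. The sketch below is an assumption-laden example rather than part of the answer above: it first writes the question's sample to a temporary file so that it runs on its own, and the names maxHeaderLines, marker, found and tline are illustrative.

```matlab
% Write the sample from the question to a throwaway file (for illustration only)
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Amp/Phase drift   =  -1.11  dB,  2.7  deg\n\n\n');
fprintf(fid, 'Theta(deg)  Phi(deg)    Amp     Phase   Data starts on next line:\n');
fprintf(fid, ' -180.000   -90.000    16.842  -116.986\n');
fprintf(fid, ' -179.000   -90.000    16.837  -126.651\n');
fprintf(fid, ' -178.000   -90.000    16.549  -137.274\n');
fclose(fid);

maxHeaderLines = 200;                  % assumed cap on how far to search
marker = 'Data starts on next line';

f = fopen(fname, 'r');
found = false;
for k = 1:maxHeaderLines
    tline = fgetl(f);
    if ~ischar(tline)                  % end of file reached before the cap
        break;
    end
    if ~isempty(strfind(tline, marker))
        found = true;                  % file pointer now sits on the first data row
        break;
    end
end

if found
    data = cell2mat(textscan(f, '%f %f %f %f'));
else
    data = [];                         % marker not in the first maxHeaderLines lines
end
fclose(f);
delete(fname);                         % clean up the throwaway file
```

Since the original loop already stops at the first match, the cap mostly helps when the marker is missing: the search then gives up after maxHeaderLines lines instead of scanning the whole file.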
