MATLAB: Create table with NaNs inserted based on Date column
Question
My data is every three days, but in my cell array, there are sometimes missing days. How can I make the matrix add dates when it skips a day and put a NaN into the Sample Measurement cell?
Here's an example. I put 2 lines from each of the 4 sites. There aren't any empty rows between the different sites - they are just there for clarity.
Latitude    Longitude    SiteID     Date Local   Sample Measurement
43.435      -88.527778   027-0007   4/12/2007    4.3
43.435      -88.527778   027-0007   4/15/2007    9.3

43.060975   -87.913504   079-0026   4/12/2007    7.9
43.060975   -87.913504   079-0026   4/15/2007    11.3

45.203885   -90.600123   119-8001   4/12/2007    3.3
45.203885   -90.600123   119-8001   4/18/2007    9.5

43.020075   -88.21507    133-0027   4/12/2007    7.3
43.020075   -88.21507    133-0027   4/18/2007    5.6
Here is sort of what I want: NaNs where there are missing days. As you can see, there are different SiteIDs, so I will maybe need to use unique to run through the sites separately.
Latitude    Longitude    SiteID     Date Local   Sample Measurement
43.435      -88.527778   027-0007   4/12/2007    4.3
43.435      -88.527778   027-0007   4/15/2007    9.3
43.060975   -87.913504   079-0026   4/12/2007    7.9
43.060975   -87.913504   079-0026   4/15/2007    11.3
45.203885   -90.600123   119-8001   4/12/2007    3.3
45.203885   -90.600123   119-8001   4/15/2007    NaN
43.020075   -88.21507    133-0027   4/12/2007    7.3
43.020075   -88.21507    133-0027   4/15/2007    NaN
I started with something like this:
Set = datenum(2007,4,12):2:datenum(2007,10,15);
B = cat(2,PM25data(:,1:2), PM25data(:,6), PM25data(:,12), PM25data(:,16)); % Pull out only the columns needed
% B = {'Lat', 'Lon', 'SiteID', 'Date', 'Data'};
E = zeros(63, 5);
i = 1;
j = 1;
k = 1;
while i <= length(PM25site) && j <= length(E) && k <= length(B) % i = 1:4, j = 1:63, k = 1:32
    if datenum(B(j,4)) ~= datenum(Set(j))
        C = datenum(Set(j));
        D = NaN;
        E(j,:) = cat(2, str2double(B(j,1:3)), C, D);
        j = j+1;
    else
        E(j,:) = str2double(B(k,:));
        k = k+1;
        j = j+1;
    end
    E(:,3) = PM25site(i);
    i = i+1;
end
This code is not advancing correctly. I think I'm not indexing it correctly and the else branch is not correct. It puts down what I want, but only replaces the zeros for the first few rows and then leaves zeros all the way down.
Here is a sample section of the output:
45.203885 -90.600123 NaN 733144 3.3
45.203885 -90.600123 NaN 733146 NaN
45.203885 -90.600123 NaN 733148 NaN
45.203885 -90.600123 NaN 733150 NaN
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
I don't know if this is the best way to approach it. I just want to add NaN's where there is no data based on the dates.
Recommended answer
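I don't think you need to iterate through with a while-loop. It will be slow, and doesn't utilise MATLAB's matrix capabilities. (As far as I can tell, your loop stops early because i is incremented on every pass, so the condition i <= length(PM25site) fails after four iterations, which is why only the first few rows get filled.) Here's how I would do it.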
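% Step of 3 to match the 3-day sampling schedule (63 dates in total)
all_dates = datenum(2007,4,12):3:datenum(2007,10,15);
% First, generate a list of all siteIDs. IDs such as '027-0007' are not numeric,
% so each row is represented by its index into uID (the third output of unique)
[uID,ia,siteIdx] = unique(PM25data(:,6));
% Build a numeric matrix: Lat, Lon, site index, Date, Data
% (assumes PM25data is a cell array of strings, as in the question)
% Note that we take the datenum of the date column here now
B = [str2double(PM25data(:,1:2)), siteIdx(:), datenum(PM25data(:,12)), str2double(PM25data(:,16))];
% Now, preallocate the result matrix.
% Use NaNs, since we will overwrite all non-nan values in the final matrix
E = nan(length(all_dates)*length(uID),5);
% Set the date column (as a column vector, so the date list repeats once per site)
E(:,4) = repmat(all_dates',length(uID),1);
% Set the lat, long and ID columns
E(:,1) = reshape(repmat(B(ia,1)',length(all_dates),1),[],1);
E(:,2) = reshape(repmat(B(ia,2)',length(all_dates),1),[],1);
E(:,3) = reshape(repmat(B(ia,3)',length(all_dates),1),[],1);
% Find the rows of E which we have data for
[found,loc] = ismember(B(:,3:4),E(:,3:4),'rows');
% And then set the data values; uID(E(:,3)) recovers the original ID strings
E(loc(found),5) = B(found,5);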
Most of this should be pretty clear, but I'll just clarify a few points.
The second output of unique generates an index vector which can be used to find the unique results in the original matrix. This means that B(ia,3) generates a list of all the unique siteIDs. Additionally, B(ia,1) will generate a list of the latitudes for these siteIDs, and similarly B(ia,2) for the longitudes.
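As a small illustration of how the outputs of unique relate to each other, here is a sketch on made-up toy data (the IDs and latitudes below are hypothetical, not taken from PM25data):

```matlab
% Toy data (hypothetical): three rows from two sites
siteID = {'079-0026'; '027-0007'; '079-0026'};
lat    = [43.06; 43.44; 43.06];

% uID: sorted unique IDs
% ia : for each unique ID, the row of one occurrence in siteID
% ic : for each row of siteID, its index into uID
[uID, ia, ic] = unique(siteID);

% One latitude per unique site, in the same order as uID
uLat = lat(ia);
```

Because siteID(ia) equals uID, indexing any other column with ia picks out one value per site.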
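The repmat call that fills the date column repeats the list of all of the dates as many times as we have siteIDs. Essentially, we're making sure that we have a list containing all date+siteID combinations. The date list has to be a column vector for this to work, so that the result is one long column in which the dates cycle once per site.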
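A minimal sketch of that repetition, using a shortened date range and a hypothetical site count so the result is easy to inspect:

```matlab
% Three dates on the 3-day schedule, as a row vector
all_dates = datenum(2007,4,12):3:datenum(2007,4,18);
nSites = 2;   % hypothetical number of sites

% Transpose to a column first, so the whole date list repeats once per site
dateCol = repmat(all_dates', nSites, 1);
% dateCol is 6-by-1, ordered [d1; d2; d3; d1; d2; d3]
```

Without the transpose, repmat would return a 2-by-3 matrix instead of a single column.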
The reshape(repmat(...)) expression used for the lat, long and ID columns is a neat little one-liner that will generate a list in which each site's value repeats in blocks, like [1;1;1;2;2;2;3;3;3;...] instead of [1;2;3;1;2;3;1;2;3;...].
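The difference between the two orderings can be seen on a tiny example (ids and nDates are illustrative values only):

```matlab
ids    = [1; 2; 3];   % hypothetical numeric site indices
nDates = 3;

% Stack nDates copies of the row vector, then read the matrix column by column
blocked = reshape(repmat(ids', nDates, 1), [], 1);   % [1;1;1;2;2;2;3;3;3]

% Repeating the column vector directly cycles through the ids instead
cycled  = repmat(ids, nDates, 1);                    % [1;2;3;1;2;3;1;2;3]
```

The blocked ordering is what pairs each site with a full run of dates when the date column cycles once per site.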
Finally, we use the 'rows' option to get ismember to search for a combination of date and siteID. Using this, we determine which date and siteID combinations we have data for, and copy this data to our final matrix. Any date+siteID combinations for which we do not have data will be left as NaN.
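Here is a hedged sketch of that matching step on toy numbers (fullGrid, obs and values are hypothetical names, and the dates are small integers for readability). It uses the two-output form of ismember, which is a variation that does not rely on the observed rows being sorted in the same order as the full grid:

```matlab
% Full grid of [siteIndex, date] pairs: 2 sites x 3 dates
fullGrid = [1 10; 1 13; 1 16; 2 10; 2 13; 2 16];
% Observed rows: [siteIndex, date, measurement]
obs = [1 10 4.3; 1 13 9.3; 2 10 3.3; 2 16 9.5];

% Start with NaN everywhere, then fill in what was actually measured
values = nan(size(fullGrid,1), 1);
% found(k) is true if obs(k,1:2) appears in fullGrid; loc(k) is the matching row
[found, loc] = ismember(obs(:,1:2), fullGrid, 'rows');
values(loc(found)) = obs(found, 3);
% values is [4.3; 9.3; NaN; 3.3; NaN; 9.5]
```

Observations whose date is not on the grid are simply skipped, because found is false for them.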