Last observation carried forward by group over multiple columns
Problem description
I have a dataset with observations of multiple patients and their diagnoses over time. There are 9 different dummy variables, each representing a specific diagnosis, named e.g. L40, L41, K50, M05 and so on.
Where there are missing values in the dummy variables, I want to carry forward the last non-missing value by patient, so that once a patient receives a diagnosis, it will follow through to subsequent observations.
I started with this, using the na.locf function from the zoo package.
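# Keep the columns whose names start with patient, ar, edatum, K, L or M
diagdata <- originaldata[, grep("^patient|^ar|^edatum|^K|^L|^M", colnames(originaldata))]
library(zoo)
library(data.table)
diagnosis <- data.table(diagdata)
# Carry the last non-missing L40 value forward within each patient;
# na.rm = FALSE keeps leading NAs so each group keeps its original length
diagnosis[, L40 := na.locf(L40, na.rm = FALSE), by = patient]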
This achieves what I am looking for, but only on the column in question (L40). Is there any way of applying the above to all the relevant diagnosis columns, i.e. columns starting with K, L and M?
Recommended answer
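# Names of all diagnosis columns (those starting with K, L or M)
cols = grep("^K|^L|^M", names(diagnosis), value = TRUE)
# Fill every one of them forward within each patient; na.rm = FALSE keeps leading NAs
diagnosis[, (cols) := na.locf(.SD, na.rm = FALSE), by = patient, .SDcols = cols]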
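To see what this does, here is a minimal sketch on made-up data. The table `toy` and its values are hypothetical and only stand in for the real `diagnosis` table; the call itself is the same as above.

```r
library(data.table)
library(zoo)

# Hypothetical toy data: two patients, three visits each, two diagnosis dummies
toy <- data.table(
  patient = c(1, 1, 1, 2, 2, 2),
  L40     = c(NA, 1, NA, 0, NA, NA),
  K50     = c(0, NA, 1, NA, NA, 1)
)

# Select the diagnosis columns by name pattern
cols <- grep("^K|^L|^M", names(toy), value = TRUE)

# Fill forward within each patient; values never leak from one patient to the next
toy[, (cols) := na.locf(.SD, na.rm = FALSE), by = patient, .SDcols = cols]

print(toy)
```

A missing value that comes before a patient's first recorded value stays NA (patient 1's first L40, patient 2's first two K50 values), because there is nothing earlier within that patient to carry forward.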
See also: Efficient locf by groups in a single R data.table.
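If speed matters, one possible alternative (not part of the answer above) is data.table's own `nafill()`, which avoids the zoo dependency. This is only a sketch: it assumes data.table 1.12.4 or later and that the dummy columns are numeric (integer or double). The table `toy` is again hypothetical.

```r
library(data.table)

# Hypothetical toy data with integer 0/1 dummies
toy <- data.table(
  patient = c(1L, 1L, 1L, 2L, 2L, 2L),
  L40     = c(NA, 1L, NA, 0L, NA, NA),
  K50     = c(0L, NA, 1L, NA, NA, 1L)
)

cols <- grep("^K|^L|^M", names(toy), value = TRUE)

# nafill(type = "locf") carries the last non-missing value forward;
# given a data.table it returns a list with one filled vector per column
toy[, (cols) := nafill(.SD, type = "locf"), by = patient, .SDcols = cols]

print(toy)
```

As with `na.locf(..., na.rm = FALSE)`, leading NAs within a patient are left as NA.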