Casting multiple columns from one factor variable


Problem description

I have picked up an awful public data set that needs a lot of work before it is useful. Here is a simplified version:

Molten <- data.frame(
  ID      = round(runif(100, 0, 50), 0),
  Element = c(rep("Au", 20), rep("Fe", 10), rep("Al", 30), rep("Cu", 20), rep("Au", 20)),
  Measure = rnorm(100),
  Units   = c(rep("ppm", 10), rep("pct", 10), rep("ppb", 80))
)

# Combined element/unit label
Molten$UnitElement <- paste(Molten$Element, Molten$Units, sep = "_")

# Keep only the first record for each ID/element pair
Molten <- Molten[!duplicated(Molten[, c("ID", "Element")]), ]

Using dcast, I have arrived at a data frame with the IDs and a separate column for each element:

library(reshape2)
Cast<-dcast(Molten, ID~Element, value.var="Measure" )

But the same element can be measured in different units, so I need an extra column for each element indicating the unit that record was measured in. For example, a column called "GoldUnit" that holds the measurement unit for every record with a gold measurement and NA for every record without one. I'm not sure how to go about this. Any help would be appreciated!
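To make the unit problem concrete, here is a minimal sketch that cross-tabulates elements against units and lists the elements reported in more than one unit. The data frame uses made-up values in the same long layout as Molten, and the names unit_counts and mixed are illustrative only.

```r
# Toy data (hypothetical values) in the same long layout as Molten
Molten <- data.frame(
  ID      = c(1, 2, 3, 4, 5),
  Element = c("Au", "Au", "Au", "Fe", "Al"),
  Measure = c(0.5, 1.2, 30, 4.1, 2.7),
  Units   = c("ppm", "pct", "ppb", "ppb", "ppb"),
  stringsAsFactors = FALSE
)

# Count records for each element/unit combination
unit_counts <- table(Molten$Element, Molten$Units)

# Elements that appear with more than one unit
mixed <- rownames(unit_counts)[rowSums(unit_counts > 0) > 1]
print(mixed)
```

In this toy data only gold is reported in several units, which is why a single Au column of measurements is ambiguous without a matching unit column.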

Example of what I want:

  ID, Al, Al_unit, Au, Au_unit, Cu, Cu_unit, Fe, Fe_unit
  5, NA, NA, NA, NA, 1, "ppb", NA, NA
  7, NA, NA, NA, NA, NA , NA, 6, "ppb"
  3, 3, "ppb", 4, "ppm", NA, NA, NA, NA


Recommended answer

This should return what you're looking for:

library(reshape2)

Element <- c(rep("Au", 20), rep("Fe", 10),rep("Al", 30),rep("Cu", 20),rep("Au", 20))
Measure <- rnorm(100)
ID <- round(runif(100, 0, 50),0)
Units <- c(rep("ppm",10), rep("pct",10), rep("ppb", 80))

Molten <- cbind.data.frame(Element, Measure, ID, Units)
Molten <- Molten[!duplicated(Molten[,c("ID", "Element")]),]

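# Cast the measurements and the units separately, one column per element
Cast1 <- dcast(Molten, ID ~ Element, value.var = "Measure")
Cast2 <- dcast(Molten, ID ~ Element, value.var = "Units")

# Drop the duplicate ID column and suffix the unit columns with "_unit"
Cast2$ID <- NULL
names(Cast2) <- paste(names(Cast2), "unit", sep = "_")

# Both casts use the same formula and data, so their rows line up by ID
Cast <- cbind(Cast1, Cast2)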
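
The combined data frame lists all measurement columns first and all unit columns after them, whereas the layout requested in the question pairs each element with its unit. A possible way to interleave the columns is sketched below. It reuses the answer's steps on a small hand-made data frame (hypothetical values chosen to mirror the requested example), and the names elements and ordered_cols are illustrative only.

```r
library(reshape2)

# Small deterministic stand-in for Molten (hypothetical values)
Molten <- data.frame(
  ID      = c(3, 3, 5, 7),
  Element = c("Al", "Au", "Cu", "Fe"),
  Measure = c(3, 4, 1, 6),
  Units   = c("ppb", "ppm", "ppb", "ppb"),
  stringsAsFactors = FALSE
)

# Same steps as the answer: cast values and units, then bind them
Cast1 <- dcast(Molten, ID ~ Element, value.var = "Measure")
Cast2 <- dcast(Molten, ID ~ Element, value.var = "Units")
Cast2$ID <- NULL
names(Cast2) <- paste(names(Cast2), "unit", sep = "_")
Cast <- cbind(Cast1, Cast2)

# Interleave: each element column followed by its unit column.
# rbind() stacks the two name vectors as rows; as.vector() reads the
# resulting matrix column by column, giving Al, Al_unit, Au, Au_unit, ...
elements <- names(Cast1)[-1]
ordered_cols <- c("ID", as.vector(rbind(elements, paste(elements, "unit", sep = "_"))))
Cast <- Cast[, ordered_cols]

print(Cast)
```

Reordering by name rather than by position keeps the pairing correct even if the set of elements changes.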
