How to make the factor levels the same in all columns of a data frame?
Problem description
I have a data frame with 5 different columns:
        Test1 Test2 Test3 Test4 Test5
Sample1 PASS  PASS  FAIL  WARN  WARN
Sample2 PASS  PASS  FAIL  PASS  WARN
Sample3 PASS  FAIL  FAIL  PASS  WARN
Sample4 PASS  FAIL  FAIL  PASS  WARN
Sample5 PASS  WARN  FAIL  WARN  WARN
In each column, the levels are assigned different integer codes. In column 1, "PASS" is 1. In column 2, "PASS" is 2 and "FAIL" is 1. In column 3, "FAIL" is 1. In column 4, "PASS" is 1 and "WARN" is 2. In column 5, "WARN" is 1.
R assigns the codes in alphabetical order. I need "PASS" to be 1 in all columns, "WARN" to be 2 in all columns, and "FAIL" to be 3 in all columns, so that I can then convert the data frame into a matrix and turn it into a heatmap.
Currently, the codes are assigned depending on which levels show up in a specific column, and in alphabetical order within that column.
How can I keep the coding constant throughout the entire data frame?
Recommended answer
You could put the levels of every column of the dataset "df" in the same order by looping over the columns (lapply), converting each one to a factor again with the specified levels, and assigning the result back to the corresponding columns.
lvls <- c('PASS', 'WARN', 'FAIL')
df[] <- lapply(df, factor, levels=lvls)
str(df)
# 'data.frame': 5 obs. of 5 variables:
# $ Test1: Factor w/ 3 levels "PASS","WARN",..: 1 1 1 1 1
# $ Test2: Factor w/ 3 levels "PASS","WARN",..: 1 1 3 3 2
# $ Test3: Factor w/ 3 levels "PASS","WARN",..: 3 3 3 3 3
# $ Test4: Factor w/ 3 levels "PASS","WARN",..: 2 1 1 1 2
# $ Test5: Factor w/ 3 levels "PASS","WARN",..: 2 2 2 2 2
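Since the stated goal is a matrix for a heatmap, the next step might look like the sketch below. It assumes a small two-row data frame made up for illustration (not the full dataset) and uses base R's data.matrix(), which replaces factors by their internal integer codes. The plotting step itself is left out.

```r
# Common level order: PASS = 1, WARN = 2, FAIL = 3
lvls <- c("PASS", "WARN", "FAIL")

# Hypothetical two-sample subset, only for illustration
df <- data.frame(
  Test1 = c("PASS", "PASS"),
  Test2 = c("PASS", "FAIL"),
  Test4 = c("WARN", "PASS"),
  row.names = c("Sample1", "Sample2"),
  stringsAsFactors = TRUE
)

# Re-create every column as a factor with the same level order
df[] <- lapply(df, factor, levels = lvls)

# Factors are replaced by their integer codes; row and column names are kept
m <- data.matrix(df)
m
```

With the shared levels in place, the same value gets the same code in every column, so the matrix entries are comparable across tests.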
If you opt to use data.table:
library(data.table)
setDT(df)[, names(df):= lapply(.SD, factor, levels=lvls)]
setDT converts the "data.frame" to a "data.table", and := assigns the reconverted factor columns (lapply(..)) to the column names of the dataset. .SD denotes "Subset of Datatable".
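One caveat applies to both variants: factor(x, levels = lvls) turns any value that is not listed in lvls into NA. The sketch below illustrates this with a made-up vector containing a hypothetical misspelling ("pass"), and shows one possible way to check for unexpected values before converting.

```r
lvls <- c("PASS", "WARN", "FAIL")

# Hypothetical column with a misspelled value ("pass")
x <- factor(c("PASS", "pass", "FAIL"))

# Values present in the data but missing from lvls
unexpected <- setdiff(levels(x), lvls)
unexpected

# After conversion the unlisted value becomes NA
y <- factor(x, levels = lvls)
as.integer(y)
```

Running a check like setdiff() on each column first makes it easier to spot such values instead of silently getting NA cells in the matrix.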
# Example data used above
df <- structure(list(
  Test1 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "PASS", class = "factor"),
  Test2 = structure(c(2L, 2L, 1L, 1L, 3L), .Label = c("FAIL", "PASS", "WARN"), class = "factor"),
  Test3 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "FAIL", class = "factor"),
  Test4 = structure(c(2L, 1L, 1L, 1L, 2L), .Label = c("PASS", "WARN", "FAIL"), class = "factor"),
  Test5 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "WARN", class = "factor")),
  .Names = c("Test1", "Test2", "Test3", "Test4", "Test5"),
  row.names = c("Sample1", "Sample2", "Sample3", "Sample4", "Sample5"),
  class = "data.frame")