如何使所有列中的因子数据框架的级别相同? [英] How to make the levels of a factor data frame the same in all columns?

查看:143
本文介绍了如何使所有列中的因子数据框架的级别相同?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含5个不同列的数据框:

I have a data frame with 5 different columns:

         Test1   Test2   Test3  Test4  Test5 
Sample1  PASS    PASS    FAIL    WARN   WARN
Sample2  PASS    PASS    FAIL    PASS   WARN
Sample3  PASS    FAIL    FAIL    PASS   WARN
Sample4  PASS    FAIL    FAIL    PASS   WARN
Sample5  PASS    WARN    FAIL    WARN   WARN

在每一列中,每个级别都被分配一个不同的因素。
在第1列中,PASS为1.
在第2列中,PASS为2,FAIL为1。
在第3列中,FAIL为1.
在第4栏, PASS 为1且 WARN 是2
在5列中, WARN 为1。

In each column, each level is assigned a different factor. In column 1, "PASS" is 1. In column 2, "PASS" is 2 and "FAIL is 1. In column 3, "FAIL" is 1. In column 4, "PASS" is 1 and "WARN" is 2. In column 5, "WARN" IS 1.

它是这样做按字母顺序
我需要所有列中的PASS为1,所有列中的WARN为2,所有列中为FAIL3,因此可以将其转换为矩阵,并将其转换为热图。

It is doing it by alphabetical order I need "PASS" be 1 in all columns, "WARN" to be 2 in all columns, and "FAIL" 3 in all columns, so that I can then convert into a matrix and turn it into a heatmap.

目前它被分配给根据水平上哪些在一个特定的列显示出来,并且通过按字母顺序的因素。

Currently it is assigning the factors to the levels depending on which ones show up in a specific column, and by alphabetical order.

如何在整个数据框架中保持不变?

How can I keep it constant throughout the entire data frame?

推荐答案

您可以更改级别通过循环( lapply )将数据集df按照相同的顺序,并再次转换为因子指定的水平并为其分配回相应的列。

You could change the levels of the dataset "df" to be in the same order by looping (lapply) and convert to factor again with the specified levels and assign it back to the corresponding columns.

lvls <- c('PASS', 'WARN', 'FAIL')
df[] <-  lapply(df, factor, levels=lvls)
str(df)
# 'data.frame': 5 obs. of  5 variables:
# $ Test1: Factor w/ 3 levels "PASS","WARN",..: 1 1 1 1 1
# $ Test2: Factor w/ 3 levels "PASS","WARN",..: 1 1 3 3 2
# $ Test3: Factor w/ 3 levels "PASS","WARN",..: 3 3 3 3 3
# $ Test4: Factor w/ 3 levels "PASS","WARN",..: 2 1 1 1 2
# $ Test5: Factor w/ 3 levels "PASS","WARN",..: 2 2 2 2 2

如果您选择使用 data.table / p>

If you opt to use data.table

library(data.table)
setDT(df)[, names(df):= lapply(.SD, factor, levels=lvls)]

setDT 转换成 data.frame 到 data.table,分配(:= )的数据集的列名的重新转换因子列( lapply(..))。 .SD 表示Datatable of Datatable。

setDT converts to "data.frame" to "data.table", assign (:=) the column names of the dataset to the reconverted factor columns (lapply(..)). .SD denotes "Subset of Datatable".

df <- structure(list(Test1 = structure(c(1L, 1L, 1L, 1L, 1L), 
.Label = "PASS", class = "factor"), 
  Test2 = structure(c(2L, 2L, 1L, 1L, 3L), .Label = c("FAIL", 
 "PASS", "WARN"), class = "factor"), Test3 = structure(c(1L, 
 1L, 1L, 1L, 1L), .Label = "FAIL", class = "factor"), Test4 = 
 structure(c(2L, 1L, 1L, 1L, 2L), .Label = c("PASS", "WARN", "FAIL"), 
 class = "factor"), Test5 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = 
"WARN", class = "factor")), .Names = c("Test1", 
"Test2", "Test3", "Test4", "Test5"), row.names = c("Sample1", 
"Sample2", "Sample3", "Sample4", "Sample5"), class = "data.frame")

这篇关于如何使所有列中的因子数据框架的级别相同?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆