How to drop unused levels in table with data.table?

Published: 2020/10/15 19:36:55. Tags: r, data.table


Consider the following data.table:

library(data.table)

x <- data.table(
          x=sample(letters[1:5],10,replace=TRUE),
          y=factor(sample(letters[1:5],10,replace=TRUE), levels=letters))

This situation comes up often when working with data.tables: some of the factor columns have unused levels.

Now, if we call table() on it:

table(x)

A giant table with all the unused levels shows up. Is there a way, either in table() or in data.table, to drop the unused levels?
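To make the problem concrete, here is a small sketch that measures it. It uses a hypothetical table `dt` built the same way as `x` above, with an arbitrary seed added only so the run is repeatable; `nlevels()`, `levels()` and `setdiff()` are base R, and `uniqueN()` comes from data.table.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only for repeatability (not in the original)
dt <- data.table(
  x = sample(letters[1:5], 10, replace = TRUE),
  y = factor(sample(letters[1:5], 10, replace = TRUE), levels = letters)
)

# y declares all 26 letters as levels but can use at most 5 of them
dt[, nlevels(y)]                           # number of declared levels
dt[, uniqueN(y)]                           # number of levels that actually occur
dt[, setdiff(levels(y), as.character(y))]  # the unused levels

# table() builds one column per declared level, used or not
dim(table(dt))
```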

I know that the following is possible:

x$y <- factor(x$y)

But this is not useful, because I don't want to save each of the sub-tables to a different variable.

You can use droplevels as follows:

x[,y:=droplevels(y)]

This overwrites y with droplevels(y) by reference.

Result:

> table(x)
   y
x   b c d e
  a 1 1 1 2
  b 0 1 0 0
  c 1 0 0 0
  d 1 0 0 0
  e 0 0 2 0
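If modifying the table in place is not wanted either, one alternative is to drop the levels only inside the call that builds the table. The sketch below is not part of the original answer. It uses a hypothetical table `dt` with the same shape as `x`, an arbitrary seed for repeatability, and only base R's `droplevels()` and `table()` evaluated inside `j`.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only for repeatability (not in the original)
dt <- data.table(
  x = sample(letters[1:5], 10, replace = TRUE),
  y = factor(sample(letters[1:5], 10, replace = TRUE), levels = letters)
)

# Drop unused levels on the fly; dt itself is left untouched.
# Inside j, x and y refer to the columns of dt.
tab <- dt[, table(x = x, y = droplevels(y))]
tab

nlevels(dt$y)  # still 26: the stored column keeps all its levels
ncol(tab)      # only the levels that occur in the data
```

Because nothing is assigned back into `dt`, this suits one-off summaries of subsets, while `:=` with `droplevels()` is the better fit when the trimmed factor should persist.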
