Get a histogram plot of factor frequencies (summary)


Question

I've got a factor with many different values. If you execute summary(factor), the output is a list of the different values and their frequencies, like so:

A B C D
3 3 1 5

I'd like to make a histogram of the frequency values, i.e. the x-axis contains the different frequencies that occur and the y-axis the number of factor levels that have that particular frequency. What's the best way to accomplish something like that?

Edit: thanks to the answer below, I figured out that I can turn the frequencies from the table into a factor, tabulate that in turn, and plot the result. If f is the factor, that looks like:

plot(factor(table(f)))
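
To make the edit concrete, here is a minimal sketch using the four-level example from the question. The vector f is rebuilt from the summary output shown above, so it is an assumption about the real data, and the names level_counts and count_of_counts are only illustrative.

```r
## Rebuild a factor with the counts from the question: A = 3, B = 3, C = 1, D = 5
f <- factor(rep(c("A", "B", "C", "D"), times = c(3, 3, 1, 5)))

level_counts <- table(f)                 # frequency of each level
count_of_counts <- table(level_counts)   # how many levels share each frequency

print(count_of_counts)

## Bar chart of the same information that plot(factor(table(f))) displays
barplot(count_of_counts,
        xlab = "Frequency of level occurrence",
        ylab = "Number of levels")
```

Here the frequency 3 is shared by two levels (A and B), while the frequencies 1 and 5 each belong to a single level.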

Recommended answer

Update in light of the clarified question

set.seed(1)
dat2 <- data.frame(fac = factor(sample(LETTERS, 100, replace = TRUE)))
hist(table(dat2), xlab = "Frequency of Level Occurrence", main = "")

Here we just apply hist() directly to the result of table(dat2). table(dat2) gives the frequency of each level of the factor, and hist() produces the histogram of those frequencies.
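
One thing to watch: hist() picks its bins automatically, and with integer counts and integer break points two neighbouring frequencies can end up in the same bar. If one bar per distinct frequency is wanted, a possible sketch is to put the breaks at half-integers. The names fac, counts, breaks and h are illustrative, not part of the original answer.

```r
set.seed(1)
fac <- factor(sample(LETTERS, 100, replace = TRUE))
counts <- as.vector(table(fac))   # plain integer vector of per-level frequencies

## One bin per integer frequency: breaks sit halfway between neighbouring integers
breaks <- seq(min(counts) - 0.5, max(counts) + 0.5, by = 1)
h <- hist(counts, breaks = breaks,
          xlab = "Frequency of Level Occurrence", main = "")
```

The returned object h holds the bar heights in h$counts, which is handy for checking that every level has been counted exactly once.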

Original answer

There are several possibilities. Your data:

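## factor() keeps fac a factor on R >= 4.0.0, where data.frame() no longer converts strings
dat <- data.frame(fac = factor(rep(LETTERS[1:4], times = c(3,3,1,5))))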

Here are three, listed in plotting order (column one, top to bottom):

  • The default plot method for class "table", which plots the data as histogram-like bars
  • A bar plot, which is probably what you meant by histogram. Notice the low information-to-ink ratio here
  • A dot plot or dot chart; shows the same information as the other plots but uses far less ink per unit of information. Preferred.

The code to produce them:

layout(matrix(1:4, ncol = 2))
plot(table(dat), main = "plot method for class \"table\"")
barplot(table(dat), main = "barplot")
tab <- as.numeric(table(dat))
names(tab) <- names(table(dat))
dotchart(tab, main = "dotchart or dotplot")
## or just this
## dotchart(table(dat))
## and ignore the warning
layout(1)

If you just have your data in a variable named factor (a bad name choice, by the way, since it is also the name of the base R function factor()), then table(factor) can be used rather than table(dat) or table(dat$fac) in my code examples.

For completeness, package lattice is more flexible when it comes to producing the dot plot, as we can get the orientation you want:

require(lattice)
with(dat, dotplot(fac, horizontal = FALSE))

A ggplot2 version:

require(ggplot2)
p <- ggplot(data.frame(Freq = tab, fac = names(tab)), aes(fac, Freq)) + 
    geom_point()
p
