Summary shown in R is short, many terms shown as "Other"
Question
How can I display the complete output summary, without classifying any values as "Other"?
summary(d)
     Date.of.Sale    City           Department       Product
 1/18/2015 :  149   A:5290   Footwear Mens  : 538   13245  :  255
 1/25/2015 :  149   B:2078   Home Furnishing:1937   15350  :  255
 11/23/2014:  149   C:5088   Infant W-Wear  : 992   15352  :  255
 11/30/2014:  149            Ladies Lower   :1735   15353  :  255
 12/14/2014:  149            Ladies Upper   :1805   15355  :  255
 12/21/2014:  149            Mens Lower     :2039   15356  :  255
 (Other)   :11562            Mens Upper     :3410   (Other):10926
            Sale       Predicted.Sale        Flag
 0            :3963   0      :3279   Forecast: 1341
 Not Available:1341   1      :1951   History :11115
 1            :1145   2      : 946
 2            : 797   3      : 700
 3            : 557   4      : 572
 4            : 498   5      : 438
 (Other)      :4155   (Other):4570
Answer
Aside: It looks like some columns that should be numeric (Sale and Predicted.Sale) are stored as factors. Sale contains the text value "Not Available", which is probably the cause. You may want to fix that, as it can cause problems in later analysis.
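A minimal sketch of that conversion, using a made-up factor `sale` that mimics the Sale column (the values and the "Not Available" marker are assumptions based on the summary output above). Going through as.character() matters, because as.numeric() applied directly to a factor returns the internal level codes rather than the printed values.

```r
# Hypothetical stand-in for a column like Sale: numbers stored as factor
# levels, with "Not Available" used as a text marker for missing values.
sale <- factor(c("0", "3", "Not Available", "12", "3"))

# Misleading: this returns the integer codes of the levels, not the values.
as.numeric(sale)

# Mark the text placeholder as missing first, then convert via character.
sale_chr <- as.character(sale)
sale_chr[sale_chr == "Not Available"] <- NA
sale_num <- as.numeric(sale_chr)

sale_num
summary(sale_num)
```

If the data came from a file, passing na.strings = "Not Available" to read.csv() would likely avoid the factor columns in the first place.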
As far as your call to summary() goes, you can adjust the maxsum argument. We find in help(summary) that this can be used to change the amount of information shown in the summary:
maxsum - integer, indicating how many levels should be shown for factors.
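As a small illustration of why the cut-off appears where it does: in current versions of R the data frame method defaults to maxsum = 7 (six levels plus "(Other)"), while the factor method defaults to 100. The sketch below reads those defaults and shows the folding on a toy factor `f`, which is made up for this example.

```r
# Default maxsum for the data frame method and for the factor method.
formals(summary.data.frame)$maxsum
formals(summary.factor)$maxsum

# Toy factor with three levels: a (3 times), b (2 times), c (once).
f <- factor(c("a", "b", "c", "a", "b", "a"))

# With maxsum = 2, only the most frequent level is kept and the
# remaining levels are combined into "(Other)".
folded <- summary(f, maxsum = 2)
folded
```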
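So let's have a look at this at work with a two-column data frame example:
set.seed(12)
# stringsAsFactors = TRUE is needed in R >= 4.0.0, where data.frame() no
# longer converts character columns to factors by default.
# The counts shown below are from the original answer; R >= 3.6.0 uses a
# different default sampling algorithm, so your numbers may differ.
df <- data.frame(
  a = sample(letters[1:8], 1e3, replace = TRUE),
  b = sample(letters[1:10], 1e3, replace = TRUE),
  stringsAsFactors = TRUE
)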
Calling summary() with no other arguments, we get "Other" listed at the bottom of each column summary.
summary(df)
#        a              b
#  d      :132   g      :118
#  c      :131   b      :108
#  f      :131   e      :106
#  a      :123   f      :104
#  g      :123   d      :103
#  e      :122   j      :103
#  (Other):238   (Other):358
Now if we set maxsum to the largest number of unique values found in any column, every value is listed.
summary(df, maxsum = max(lengths(lapply(df, unique))))
#  a       b
#  a:123   a: 94
#  b:120   b:108
#  c:131   c: 99
#  d:132   d:103
#  e:122   e:106
#  f:131   f:104
#  g:123   g:118
#  h:118   h: 92
#          i: 73
#          j:103
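Note that maxsum = length(Reduce(union, df)) also works. It counts the distinct values across all columns combined, so it is never smaller than the largest per-column count. Both expressions assume that you are working with a data frame.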
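The same idea can be wrapped in a small helper. This is only a sketch: `full_summary` is a hypothetical name, not a base R function, and it uses nlevels() on the factor columns rather than unique(), which also counts levels that are defined but unused.

```r
# Hypothetical helper (not part of base R): summarise a data frame with
# maxsum high enough that no factor level is folded into "(Other)".
full_summary <- function(df, ...) {
  is_fac <- vapply(df, is.factor, logical(1))
  # Largest number of levels among the factor columns; fall back to the
  # data frame default of 7 when there are no factor columns.
  n_max <- if (any(is_fac)) max(vapply(df[is_fac], nlevels, integer(1))) else 7L
  summary(df, maxsum = max(n_max, 7L), ...)
}

set.seed(12)
df <- data.frame(
  a = factor(sample(letters[1:8], 1e3, replace = TRUE)),
  b = factor(sample(letters[1:10], 1e3, replace = TRUE))
)

out <- full_summary(df)
out
```

For the data in the question, the call would be full_summary(d), assuming d is a data frame whose categorical columns are factors.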