如何找到R数据帧的分组汇总统计信息?
为了比较不同的组,我们需要每个组的摘要统计信息。它有助于我们观察两组之间的差异。摘要统计信息提供最小值,第一四分位数,中位数,第三四分位数和最大值。因此,我们可以比较每个组的这些值。要找到R数据帧的逐组汇总统计信息,我们可以使用tapply函数。
示例
请看以下数据帧-
> set.seed(99)
> x1<-sample(1:100,50,replace=TRUE)
> x2<-rep(c("G1","G2","G3","G4","G5"),times=10)
> df<-data.frame(x1,x2)
> head(df,20)
x1 x2
1 48 G1
2 33 G2
3 44 G3
4 22 G4
5 99 G5
6 62 G1
7 98 G2
8 32 G3
9 13 G4
10 20 G5
11 100 G1
12 31 G2
13 68 G3
14 9 G4
15 82 G5
16 88 G1
17 30 G2
18 86 G3
19 84 G4
20 32 G5查找每个组的x1的摘要统计量-
> tapply(df$x1, df$x2, summary) $G1 Min. 1st Qu. Median Mean 3rd Qu. Max. 14.0 55.0 72.0 67.8 86.5 100.0 $G2 Min. 1st Qu. Median Mean 3rd Qu. Max. 4.0 31.5 60.5 52.4 69.5 98.0 $G3 Min. 1st Qu. Median Mean 3rd Qu. Max. 14.0 33.5 41.0 46.9 64.5 86.0 $G4 Min. 1st Qu. Median Mean 3rd Qu. Max. 9.00 23.75 53.00 53.30 82.75 97.00 $G5 Min. 1st Qu. Median Mean 3rd Qu. Max. 7.00 31.25 32.00 42.40 44.75 99.00
让我们再看一个例子-
> y1<-rep(c(letters[1:5]),times=5) > y2<-rep(c(14,25,13,12,41,52,44,28,17,30),times=c(2,5,3,3,1,5,1,2,2,1)) > df_y<-data.frame(y1,y2) > head(df_y,20) y1 y2 1 a 14 2 b 14 3 c 25 4 d 25 5 e 25 6 a 25 7 b 25 8 c 13 9 d 13 10 e 13 11 a 12 12 b 12 13 c 12 14 d 41 15 e 52 16 a 52 17 b 52 18 c 52 19 d 52 20 e 44 > tapply(df_y$y2, df_y$y1, summary) $a Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 14.0 25.0 26.2 28.0 52.0 $b Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 14.0 25.0 26.2 28.0 52.0 $c Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 13.0 17.0 23.8 25.0 52.0 $d Min. 1st Qu. Median Mean 3rd Qu. Max. 13.0 17.0 25.0 29.6 41.0 52.0 $e Min. 1st Qu. Median Mean 3rd Qu. Max. 13.0 25.0 30.0 32.8 44.0 52.0
热门推荐
10 圣诞祝福语简短小学
11 祖国七十华诞简短祝福语
12 老师送的祝福语简短
13 生日祝福语大全女生简短
14 祝女性生日祝福语简短
15 牛年女神节祝福语简短
16 情人表白祝福语简短大气
17 老公开业祝福语简短
18 官宣新年祝福语简短