How to find the sum of an integer column based on two different character columns in R?


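Finding the sum of an integer column based on two different character columns simply means building a two-way table of sums from the available data (similar in shape to a contingency table, but holding totals rather than counts). We can do this with the with and tapply functions. For example, if we have a data frame df that contains two categorical columns named gender and ethnicity and an integer column named Package, the table can be created as follows: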

with(df,tapply(Package,list(gender,ethnicity),sum))
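To make the one-liner above concrete, here is a minimal sketch on a tiny data frame. The values in df are invented purely for illustration and are not taken from the article; only the column names follow the description above.

```r
# Hypothetical data: these values are made up for illustration only.
df <- data.frame(
  gender    = c("F", "M", "F", "M", "F"),
  ethnicity = c("A", "A", "B", "B", "A"),
  Package   = c(10L, 20L, 30L, 40L, 50L)
)

# Rows of the result are the gender values, columns are the ethnicity values.
# Each cell holds the sum of Package for that (gender, ethnicity) combination.
totals <- with(df, tapply(Package, list(gender, ethnicity), sum))

# totals["F", "A"] is 60 (10 + 50), totals["F", "B"] is 30,
# totals["M", "A"] is 20 and totals["M", "B"] is 40.
print(totals)
```

The with() call only saves typing df$ in front of each column; tapply() does the grouping and summing.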

Example

Consider the following data frame:

set.seed(777)
Class <- sample(c("First", "Second", "Third"), 20, replace = TRUE)
Group <- sample(c("GP1", "GP2", "GP3", "GP4"), 20, replace = TRUE)
Rate <- sample(0:10, 20, replace = TRUE)
df1 <- data.frame(Class, Group, Rate)
df1

Output

    Class Group Rate
1   First   GP1    7
2  Second   GP2    1
3  Second   GP4    1
4  Second   GP4    0
5   Third   GP2   10
6  Second   GP2    8
7   First   GP1    7
8   First   GP4    4
9  Second   GP1    4
10  Third   GP3    8
11 Second   GP2    8
12  First   GP2    4
13  Third   GP2    6
14  Third   GP4    4
15  Third   GP4    5
16 Second   GP1    2
17 Second   GP1    9
18 Second   GP3    2
19 Second   GP3    1
20  Third   GP4   10

Example

str(df1)
'data.frame': 20 obs. of 3 variables:
$ Class: chr "First" "Second" "Second" "Second" ...
$ Group: chr "GP1" "GP2" "GP4" "GP4" ...
$ Rate : int 7 1 1 0 10 8 7 4 4 8 ...

Finding the sum of Rate based on Class and Group:

with(df1,tapply(Rate,list(Class,Group),sum))
       GP1 GP2 GP3 GP4
First   14   4  NA   4
Second  15  17   3   1
Third   NA  16   8  19
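The NA cells in this table mark combinations that have no rows in the data (for example, First with GP3), not sums that happen to be missing. If zeros are preferred, the sketch below shows two possible approaches on a small invented data frame: the default argument of tapply (available since R 3.4.0) and xtabs. The data frame scores and its values are hypothetical.

```r
# Hypothetical data with one empty combination: no row has Class "First" and Group "GP2".
scores <- data.frame(
  Class = c("First", "First", "Second", "Second"),
  Group = c("GP1", "GP1", "GP1", "GP2"),
  Rate  = c(7L, 7L, 4L, 1L)
)

# Default behaviour: the empty cell is reported as NA.
with_na <- with(scores, tapply(Rate, list(Class, Group), sum))

# Option 1: ask tapply to fill empty cells with 0 instead of NA.
with_zero <- with(scores, tapply(Rate, list(Class, Group), sum, default = 0))

# Option 2: xtabs() sums the left-hand side of the formula per cell
# and reports 0 for combinations that do not occur.
xt <- xtabs(Rate ~ Class + Group, data = scores)

print(with_na)
print(with_zero)
print(xt)
```

Whether NA or 0 is the better choice depends on the analysis: NA keeps "no observations" distinct from "observations that sum to zero".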

Let's look at another example:

Example

Gender <- sample(c("Male", "Female"), 20, replace = TRUE)
Centering <- sample(c("Yes", "No"), 20, replace = TRUE)
Percentage <- sample(1:100, 20)
df2 <- data.frame(Gender, Centering, Percentage)
df2

Output

   Gender Centering Percentage
1    Male        No         28
2    Male        No         89
3  Female       Yes         38
4    Male        No         78
5    Male       Yes         19
6  Female        No         46
7  Female       Yes         94
8    Male        No          4
9    Male       Yes         92
10   Male        No         90
11   Male       Yes         66
12 Female        No         57
13 Female        No         74
14 Female        No         48
15 Female       Yes         20
16   Male       Yes         51
17   Male        No         82
18   Male        No          7
19   Male        No         53
20   Male        No         55

Finding the sum of Percentage based on Gender and Centering:

with(df2,tapply(Percentage,list(Gender,Centering),sum))
        No Yes
Female 225 152
Male   486 228
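tapply returns a wide matrix with one cell per combination. When a long result with one row per combination is more convenient (for example, for merging or plotting), aggregate from base R is one possible alternative. The sketch below uses a small invented data frame named survey; its values are hypothetical and are not taken from df2.

```r
# Hypothetical data: values are made up for illustration only.
survey <- data.frame(
  Gender     = c("Male", "Male", "Female", "Female", "Male"),
  Centering  = c("No", "Yes", "No", "No", "No"),
  Percentage = c(28L, 19L, 46L, 57L, 89L)
)

# One row per (Gender, Centering) combination that actually occurs,
# with the sum of Percentage in the third column.
long_totals <- aggregate(Percentage ~ Gender + Centering, data = survey, FUN = sum)

# Combinations with no rows (here Female / Yes) are simply absent from the result.
print(long_totals)
```

Unlike the tapply table, this form never contains NA placeholders, because empty combinations produce no row at all.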

Updated on: 2020-10-17