How to create a categorical variable from a data frame column in R?


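If a variable is numeric, it can be converted to a categorical variable by defining lower and upper limits (break points). For example, ages from 21 to 25 can be grouped into a single category labelled 21-25. To convert a numeric column of an R data frame into a categorical variable, use the `cut` function, which returns a factor.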
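As a minimal sketch of the age example, the code below bins a small vector of ages into two groups. The data frame `people`, the column names, and the second group 26-30 are assumptions made up for illustration. Note that `cut` builds intervals that are open on the left and closed on the right by default, so the break at 20 places age 21 in the first group.

```r
# Hypothetical ages, invented for this illustration
people <- data.frame(age = c(21, 23, 25, 26, 30))

# Breaks 20, 25, 30 give the intervals (20,25] and (25,30];
# labels replaces the default interval names with readable ones
people$age_group <- cut(people$age,
                        breaks = c(20, 25, 30),
                        labels = c("21-25", "26-30"))
people
```

With integer ages, (20,25] covers exactly 21 to 25, which is why the lower break is 20 rather than 21.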

Example 1


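Consider the following data frame:

set.seed(141)                # make the random draws reproducible
x1 <- rnorm(20, 2, 0.3)      # 20 normal values with mean 2 and sd 0.3
x2 <- LETTERS[1:20]          # the letters A to T
df1 <- data.frame(x1, x2)
df1

Output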

x1 x2
1 2.154308 A
2 1.966167 B
3 2.019302 C
4 1.803427 D
5 2.150517 E
6 1.749425 F
7 1.797508 G
8 1.949084 H
9 2.147742 I
10 1.895026 J
11 1.922780 K
12 1.755871 L
13 2.410873 M
14 2.580489 N
15 1.910219 O
16 1.805713 P
17 2.166996 Q
18 2.074431 R
19 1.749257 S
20 2.004867 T

Create a categorical column for x1 in df1:

Example

# Bin x1 into the intervals (0,1], (1,2] and (2,3]
df1$x1_category <- cut(df1$x1, c(0, 1, 2, 3))
df1

Output

x1 x2 x1_category
1 2.154308 A (2,3]
2 1.966167 B (1,2]
3 2.019302 C (2,3]
4 1.803427 D (1,2]
5 2.150517 E (2,3]
6 1.749425 F (1,2]
7 1.797508 G (1,2]
8 1.949084 H (1,2]
9 2.147742 I (2,3]
10 1.895026 J (1,2]
11 1.922780 K (1,2]
12 1.755871 L (1,2]
13 2.410873 M (2,3]
14 2.580489 N (2,3]
15 1.910219 O (1,2]
16 1.805713 P (1,2]
17 2.166996 Q (2,3]
18 2.074431 R (2,3]
19 1.749257 S (1,2]
20 2.004867 T (2,3]
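The interval names such as (1,2] are the default labels that `cut` generates. If descriptive names are preferred, the `labels` argument can supply them. The sketch below reuses the first four x1 values from the output above; the names "low", "medium" and "high" are assumptions chosen only for illustration.

```r
# First four x1 values copied from the output above
x1 <- c(2.154308, 1.966167, 2.019302, 1.803427)

# One label per interval: (0,1] -> low, (1,2] -> medium, (2,3] -> high
x1_category <- cut(x1,
                   breaks = c(0, 1, 2, 3),
                   labels = c("low", "medium", "high"))
x1_category

# Count how many values fall into each category
table(x1_category)
```

The level "low" is kept even though no value falls into it, because the levels of the factor come from the breaks rather than from the data.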

Example 2


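# Random age groups and Poisson counts (no seed is set, so values vary between runs)
y1 <- sample(c("Child", "Teen", "Adult", "Old"), 20, replace = TRUE)
y2 <- rpois(20, 5)
df2 <- data.frame(y1, y2)
df2

Output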

y1 y2
1 Old 6
2 Teen 3
3 Old 2
4 Teen 5
5 Adult 6
6 Teen 6
7 Old 5
8 Adult 6
9 Child 5
10 Child 3
11 Child 9
12 Old 8
13 Teen 2
14 Teen 2
15 Teen 5
16 Adult 7
17 Adult 4
18 Teen 4
19 Adult 2
20 Child 8

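Create a categorical column for y2 in df2:

Example

# Bin y2 into unit-width intervals from (0,1] to (9,10]
df2$y2_category <- cut(df2$y2, c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
df2

Output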

y1 y2 y2_category
1 Old 6 (5,6]
2 Teen 3 (2,3]
3 Old 2 (1,2]
4 Teen 5 (4,5]
5 Adult 6 (5,6]
6 Teen 6 (5,6]
7 Old 5 (4,5]
8 Adult 6 (5,6]
9 Child 5 (4,5]
10 Child 3 (2,3]
11 Child 9 (8,9]
12 Old 8 (7,8]
13 Teen 2 (1,2]
14 Teen 2 (1,2]
15 Teen 5 (4,5]
16 Adult 7 (6,7]
17 Adult 4 (3,4]
18 Teen 4 (3,4]
19 Adult 2 (1,2]
20 Child 8 (7,8]
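Two details of this example may be worth a short sketch. Evenly spaced breaks can be generated with `seq` instead of being typed out, and a value equal to the lowest break falls outside the first interval (0,1], so it becomes NA unless `include.lowest = TRUE` is set. This matters for count data, since a Poisson draw can be 0. The vector below is hypothetical and chosen to show both edges.

```r
# Hypothetical counts, including the two boundary values 0 and 10
y2 <- c(0, 2, 5, 10)

# Same breaks as c(0, 1, ..., 10), generated rather than typed
breaks <- seq(0, 10, by = 1)

# Default behaviour: 0 lies outside (0,1], so it becomes NA
default_bins <- cut(y2, breaks = breaks)
default_bins

# include.lowest = TRUE closes the first interval, giving [0,1]
closed_bins <- cut(y2, breaks = breaks, include.lowest = TRUE)
closed_bins
```

Values above the largest break also become NA, so the break range should cover the whole range of the column.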

Updated on: 9 February 2021
