How to extract values from an R data frame column that neither start nor end with certain characters?


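Sometimes we want to extract values from a data frame column of strings based on how they begin and end, for example when some values were recorded with extra trailing characters and we only want the ones that were not. To do this, we can negate grepl and use the resulting logical vector inside single square brackets. With a pattern such as "^A|a$", grepl flags every value that starts with A or ends with a, so the negation keeps only the values that do neither.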
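To make the mechanics concrete before the full examples, here is a minimal sketch on a small vector. The vector `v` and the names `hit` and `kept` are hypothetical and used only for illustration; `grepl` and the `^` and `$` anchors are standard base R.

```r
# Hypothetical vector, used only for illustration
v <- c("Apple", "banana", "Cherry", "Alaska")

# TRUE where the value starts with "A" OR ends with "a"
hit <- grepl("^A|a$", v)

# Negate the logical vector and subset with single square brackets,
# keeping only values that neither start with "A" nor end with "a"
kept <- v[!hit]

print(hit)
print(kept)
```

The same logical vector can be placed in the row position of a data frame, which is what the examples do.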

Example

Consider the following data frame:

> x2 <- c("Alabama", "Alaska", "American Samoa", "Arizona", "Arkansas",
    "California", "Colorado", "Connecticut", "Delaware", "District of Columbia",
    "Florida", "Georgia", "Guam", "Hawaii", "Idaho", "Illinois", "Indiana",
    "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
    "Massachusetts", "Michigan", "Minnesota", "Minor Outlying Islands",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York", "North Carolina",
    "North Dakota", "Northern Mariana Islands", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Puerto Rico", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "U.S. Virgin Islands", "Utah",
    "Vermont", "Virginia", "Washington", "West Virginia", "Wisconsin",
    "Wyoming")
> # stringsAsFactors = TRUE reproduces the factor output shown below on
> # R >= 4.0.0, where the default changed to FALSE
> df2 <- data.frame(x2, stringsAsFactors = TRUE)
> head(df2, 20)

Output

x2
1 Alabama
2 Alaska
3 American Samoa
4 Arizona
5 Arkansas
6 California
7 Colorado
8 Connecticut
9 Delaware
10 District of Columbia
11 Florida
12 Georgia
13 Guam
14 Hawaii
15 Idaho
16 Illinois
17 Indiana
18 Iowa
19 Kansas
20 Kentucky

Finding the states that neither start with A nor end with a:

> df2[!grepl("^A|a$",df2$x2),]

Output

[1] Colorado Connecticut Delaware
[4] Guam Hawaii Idaho
[7] Illinois Kansas Kentucky
[10] Maine Maryland Massachusetts
[13] Michigan Minor Outlying Islands Mississippi
[16] Missouri New Hampshire New Jersey
[19] New Mexico New York Northern Mariana Islands
[22] Ohio Oregon Puerto Rico
[25] Rhode Island Tennessee Texas
[28] U.S. Virgin Islands Utah Vermont
[31] Washington Wisconsin Wyoming
57 Levels: Alabama Alaska American Samoa Arizona Arkansas ... Wyoming
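The result above is printed as a factor vector rather than a data frame, because subsetting a one-column data frame with `[rows, ]` drops the result to a vector by default. If a data frame is wanted instead, one option is to pass `drop = FALSE`. The sketch below uses a small hypothetical data frame `states` (not the full `df2`) to show the difference.

```r
# Small hypothetical one-column data frame, used only for illustration
states <- data.frame(x2 = c("Alabama", "Colorado", "Florida", "Guam"),
                     stringsAsFactors = FALSE)

# TRUE for rows that neither start with "A" nor end with "a"
keep <- !grepl("^A|a$", states$x2)

# Default behaviour: a single column is dropped to a plain vector
as_vector <- states[keep, ]

# drop = FALSE keeps the data frame structure and the original row names
as_frame <- states[keep, , drop = FALSE]

print(as_vector)
print(as_frame)
```

With more than one column, as in the next example, the result stays a data frame without `drop = FALSE`.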

Let's look at another example:

> x1 <- c("Indiaaa", "Chinaaa", "Russiaa", "Canadaaa", "Indonesiaaa",
    "Croatiaaa", "Mauritaniaaa", "Albaniaaa", "Angolaaa", "Armeniaaa",
    "Malaysiaaa", "Maltaaa", "Boliviaaa", "Burmaaa", "Panama", "Romaniaa",
    "Saudi-Arabia", "Serbiaaa", "Syriaaa", "Tongaaa", "Koreaaa", "Libya")
> # y1 is drawn at random, so the values will differ from run to run
> y1 <- sample(1:10, 22, replace = TRUE)
> df1 <- data.frame(x1, y1)
> df1

Output

x1 y1
1 Indiaaa 6
2 Chinaaa 1
3 Russiaa 9
4 Canadaaa 7
5 Indonesiaaa 7
6 Croatiaaa 3
7 Mauritaniaaa 6
8 Albaniaaa 2
9 Angolaaa 10
10 Armeniaaa 10
11 Malaysiaaa 7
12 Maltaaa 3
13 Boliviaaa 2
14 Burmaaa 10
15 Panama 1
16 Romaniaa 10
17 Saudi-Arabia 10
18 Serbiaaa 8
19 Syriaaa 10
20 Tongaaa 5
21 Koreaaa 7
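22 Libya 8

Rows whose x1 value neither starts with A nor ends with aa (the commands that follow change only the starting letter):

> df1[!grepl("^A|aa$",df1$x1),]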

Output

x1 y1
15 Panama 1
17 Saudi-Arabia 10
22 Libya 8
> df1[!grepl("^S|aa$",df1$x1),]

Output

x1 y1
15 Panama 1
22 Libya 8
> df1[!grepl("^B|aa$",df1$x1),]

Output

x1 y1
15 Panama 1
17 Saudi-Arabia 10
22 Libya 8
> df1[!grepl("^P|aa$",df1$x1),]

Output

x1 y1
17 Saudi-Arabia 10
22 Libya 8
> df1[!grepl("^L|aa$",df1$x1),]

Output

x1 y1
15 Panama 1
17 Saudi-Arabia 10
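
Note that the pattern "^A|aa$" excludes a value when either condition holds. If the goal were instead to exclude only values that both start with A and end with aa, the two tests would need to be combined with `&`. The sketch below contrasts the two readings using base R's `startsWith` and `endsWith` (available since R 3.3.0, and requiring character rather than factor input). The vector `countries` and the result names are hypothetical.

```r
# Hypothetical character vector, used only for illustration
countries <- c("Indiaaa", "Panama", "Saudi-Arabia", "Albaniaaa", "Libya", "Angola")

starts_A <- startsWith(countries, "A")   # fixed-string prefix test
ends_aa  <- endsWith(countries, "aa")    # fixed-string suffix test

# Neither starts with "A" nor ends with "aa" (same as !grepl("^A|aa$", ...))
neither <- countries[!(starts_A | ends_aa)]

# Excludes only values that start with "A" AND end with "aa"
not_both <- countries[!(starts_A & ends_aa)]

print(neither)
print(not_both)
```

Because `startsWith` and `endsWith` match fixed strings, they avoid having to escape regular expression metacharacters in the prefix or suffix.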

Updated on: 2020-09-07
