How to remove part of a string after a special character in R?


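Sometimes we do not need an entire string for analysis, especially when part of it complicates the analysis or adds nothing meaningful. In such cases, we can remove the parts of the string that we consider unnecessary. For example, suppose we have the string ID:00001-1 but do not want the -1 in it. We can remove it with the gsub function, by replacing the special character and everything after it with an empty string.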
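As a minimal sketch of the idea, using only the single ID from the paragraph above (the variable names `id` and `cleaned` are chosen here for illustration), the pattern can be read piece by piece:

```r
# The single ID mentioned in the text
id <- "ID:00001-1"

# "-"  matches the hyphen itself
# ".*" matches every character that follows it
# Replacing the whole match with "" leaves only the part before the hyphen
cleaned <- gsub("-.*", "", id)
print(cleaned)
```

Because `.*` is greedy, the match runs from the first hyphen to the end of the string, so `sub` and `gsub` give the same result for this pattern.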

Example

> x1<-c("ID:00001-1","ID:00100-1","ID:00201-4","ID:014700-3","ID:12045-5","ID:00012-2","ID:10078-3")
> gsub("\\-.*","",x1)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x2<-c("ID:00001/1","ID:00100/1","ID:00201/4","ID:014700/3","ID:12045/5","ID:00012/2","ID:10078/3")
> gsub("\\/.*","",x2)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x3<-c("ID:00001_1","ID:00100_1","ID:00201_4","ID:014700_3","ID:12045_5","ID:00012_2","ID:10078_3")
> gsub("\\_.*","",x3)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x4<-c("ID:00001@1","ID:00100@1","ID:00201@4","ID:014700@3","ID:12045@5","ID:00012@2","ID:10078@3")
> gsub("\\@.*","",x4)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x5<-c("ID:00001*1","ID:00100*1","ID:00201*4","ID:014700*3","ID:12045*5","ID:00012*2","ID:10078*3")
> gsub("\\*.*","",x5)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x6<-c("ID:00001#1","ID:00100#1","ID:00201#4","ID:014700#3","ID:12045#5","ID:00012#2","ID:10078#3")
> gsub("\\#.*","",x6)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x7<-c("ID:00001()1","ID:00100()1","ID:00201()4","ID:014700()3","ID:12045()5","ID:00012()2","ID:10078()3")
> gsub("\\(\\).*","",x7)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x8<-c("ID:00001<>1","ID:00100<>1","ID:00201<>4","ID:014700<>3","ID:12045<>5","ID:00012<>2","ID:10078<>3")
> # "<" and ">" need no escaping; "\\<" would be read as a start-of-word anchor and match nothing
> gsub("<>.*","",x8)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x9<-c("ID:00001&1","ID:00100&1","ID:00201&4","ID:014700&3","ID:12045&5","ID:00012&2","ID:10078&3")
> gsub("\\&.*","",x9)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x10<-c("ID:00001;1","ID:00100;1","ID:00201;4","ID:014700;3","ID:12045;5","ID:00012;2","ID:10078;3")
> gsub("\\;.*","",x10)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
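The examples above each need a hand-written pattern, and some delimiters must be escaped while others must not. One possible alternative, sketched here under the assumption that the delimiter should always be treated as literal text, is a small helper built on `regexpr` with `fixed = TRUE`. The function name `remove_after` and the sample vector `ids` are hypothetical and do not come from the original examples.

```r
# Hypothetical helper: drop the first occurrence of `delim` and everything after it
remove_after <- function(x, delim) {
  # fixed = TRUE treats `delim` as literal text, so no regex escaping is needed;
  # as.integer() drops the extra attributes that regexpr() attaches
  pos <- as.integer(regexpr(delim, x, fixed = TRUE))
  # regexpr() returns -1 when the delimiter is absent; keep those strings whole
  last <- ifelse(pos == -1L, nchar(x), pos - 1L)
  substr(x, 1, last)
}

# Illustrative inputs, including one with no delimiter at all
ids <- c("ID:00001*1", "ID:00100*1", "ID:10078")
star_removed <- remove_after(ids, "*")
angle_removed <- remove_after("ID:00201<>4", "<>")
print(star_removed)
print(angle_removed)
```

With this approach the same call works for `*`, `()` or `<>` without deciding case by case whether a backslash is required, at the cost of giving up regular-expression features in the delimiter.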

Updated on: 2020-08-12
