How to split a long string into a vector of equal-sized substrings in R?


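A vector is sometimes recorded by mistake as one single string, or a data file fails to separate its strings properly, so the values must be split into the correct format before any further analysis. This typically happens when the levels of a factor variable all have names of the same length and are written without separators. In that case, the substring function can split the string into a vector of equal-sized substrings.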
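As a minimal sketch of that idea, the splitting can be wrapped in a small helper. The name split_fixed and its arguments are hypothetical and do not come from the original text; the helper assumes a single character string and relies on substring() truncating an end position that lies beyond the end of the string.

```r
# Hypothetical helper (not from the original text): split one string into
# consecutive chunks of n characters; the last chunk may be shorter.
split_fixed <- function(x, n) {
  stopifnot(is.character(x), length(x) == 1, n >= 1)
  if (nchar(x) == 0) return(character(0))
  starts <- seq(1, nchar(x), by = n)
  # End positions are derived from the starts, so both vectors always
  # have the same length and nothing is recycled.
  substring(x, starts, starts + n - 1)
}

split_fixed("aabbccdd", 2)   # "aa" "bb" "cc" "dd"
split_fixed("abcdefg", 3)    # "abc" "def" "g"
```

Deriving the end positions from the start positions is a design choice made here to avoid building two separate sequences that can differ in length.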

Example

The following examples show how the substring function splits a string into a vector of substrings:

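# Start positions 1, 3, 5, ... paired with end positions 2, 4, 6, ...
# yield consecutive two-character substrings.
Factor <- "aabbccddabacadbabcbdcacbcddadbdc"
substring(Factor, seq(1, nchar(Factor), 2), seq(2, nchar(Factor), 2))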

Output

[1] "aa" "bb" "cc" "dd" "ab" "ac" "ad" "ba" "bc" "bd" "ca" "cb" "cd" "da" "db"
[16] "dc"
The next examples use x1, which has 25 characters (the letter "n" is absent):

> x1 <- "abcdefghijklmopqrstuvwxyz"
> substring(x1, seq(1, nchar(x1), 2), seq(2, nchar(x1), 2))
[1] "ab" "cd" "ef" "gh" "ij" "kl" "mo" "pq" "rs" "tu" "vw" "xy" ""
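The empty string at the end of this result appears to come from argument recycling: the start sequence has one more element than the end sequence, so substring() reuses the first end position for the last start position. The sketch below only inspects the two sequences; the variable names starts and ends are introduced here for illustration.

```r
x1 <- "abcdefghijklmopqrstuvwxyz"    # 25 characters, as in the example
starts <- seq(1, nchar(x1), 2)       # 1, 3, ..., 25
ends   <- seq(2, nchar(x1), 2)       # 2, 4, ..., 24

length(starts)   # 13
length(ends)     # 12

# The 13th start (25) is paired with the recycled 1st end (2).
# Because the start lies after the end, the result is an empty string,
# and the final "z" is lost.
substring(x1, starts[13], ends[1])   # ""
```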
> substring(x1, seq(1, nchar(x1), 2), seq(3, nchar(x1), 2))
[1] "abc" "cde" "efg" "ghi" "ijk" "klm" "mop" "pqr" "rst" "tuv" "vwx" "xyz"
[13] ""
> substring(x1, seq(1, nchar(x1), 3), seq(3, nchar(x1), 3))
[1] "abc" "def" "ghi" "jkl" "mop" "qrs" "tuv" "wxy" ""
> substring(x1, seq(1, nchar(x1), 4), seq(3, nchar(x1), 4))
[1] "abc" "efg" "ijk" "mop" "rst" "vwx" ""
> substring(x1, seq(1, nchar(x1), 4), seq(4, nchar(x1), 4))
[1] "abcd" "efgh" "ijkl" "mopq" "rstu" "vwxy" ""
> substring(x1, seq(1, nchar(x1), 4), seq(5, nchar(x1), 4))
[1] "abcde" "efghi" "ijklm" "mopqr" "rstuv" "vwxyz" ""
> substring(x1, seq(1, nchar(x1), 5), seq(5, nchar(x1), 5))
[1] "abcde" "fghij" "klmop" "qrstu" "vwxyz"
> substring(x1, seq(1, nchar(x1), 10), seq(5, nchar(x1), 10))
[1] "abcde" "klmop" "vwxyz"
> substring(x1, seq(1, nchar(x1), 10), seq(10, nchar(x1), 10))
[1] "abcdefghij" "klmopqrstu" ""
> substring(x1, seq(1, nchar(x1), 10), seq(2, nchar(x1), 10))
[1] "ab" "kl" "vw"
> substring(x1, seq(1, nchar(x1), 10), seq(3, nchar(x1), 10))
[1] "abc" "klm" "vwx"
Letting the end sequence run from the substring width w up to nchar(x1) + w - 1, with the same step as the start sequence, gives both sequences the same length. No trailing "" is produced, and a window that runs past the end of the string is truncated (for example "z"):

> substring(x1, seq(1, nchar(x1), 2), seq(2, nchar(x1) + 2 - 1, 2))
[1] "ab" "cd" "ef" "gh" "ij" "kl" "mo" "pq" "rs" "tu" "vw" "xy" "z"
> substring(x1, seq(1, nchar(x1), 4), seq(4, nchar(x1) + 4 - 1, 4))
[1] "abcd" "efgh" "ijkl" "mopq" "rstu" "vwxy" "z"
> substring(x1, seq(1, nchar(x1), 3), seq(4, nchar(x1) + 4 - 1, 3))
[1] "abcd" "defg" "ghij" "jklm" "mopq" "qrst" "tuvw" "wxyz" "z"
> substring(x1, seq(1, nchar(x1), 5), seq(4, nchar(x1) + 4 - 1, 5))
[1] "abcd" "fghi" "klmo" "qrst" "vwxy"
> substring(x1, seq(1, nchar(x1), 2), seq(4, nchar(x1) + 4 - 1, 2))
[1] "abcd" "cdef" "efgh" "ghij" "ijkl" "klmo" "mopq" "pqrs" "rstu" "tuvw"
[11] "vwxy" "xyz" "z"
> substring(x1, seq(1, nchar(x1), 2), seq(5, nchar(x1) + 5 - 1, 2))
[1] "abcde" "cdefg" "efghi" "ghijk" "ijklm" "klmop" "mopqr" "pqrst" "rstuv"
[10] "tuvwx" "vwxyz" "xyz" "z"
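Several of the calls above use a substring width that differs from the step between start positions, which gives overlapping substrings (width larger than step) or skipped characters (width smaller than step). A rough sketch that makes the two quantities explicit follows; the function name windows and its parameters width and step are assumptions made for illustration, not part of the original examples.

```r
# Hypothetical helper: substrings of `width` characters whose start
# positions advance by `step`. With step == width the chunks are
# consecutive; a smaller step overlaps, a larger step skips characters.
windows <- function(x, width, step = width) {
  stopifnot(is.character(x), length(x) == 1, width >= 1, step >= 1)
  if (nchar(x) == 0) return(character(0))
  starts <- seq(1, nchar(x), by = step)
  substring(x, starts, starts + width - 1)
}

windows("abcdefg", 3)             # "abc" "def" "g"
windows("abcdefg", 3, step = 2)   # "abc" "cde" "efg" "g"
windows("abcdefg", 2, step = 3)   # "ab" "de" "g"
```

For the 25-character x1 used above, windows(x1, 4, step = 2) should match the call that pairs seq(1, nchar(x1), 2) with seq(4, nchar(x1) + 4 - 1, 2).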

Updated on: 21 August 2020
