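How to create separate columns for first name and last name in R?
In data analysis, people's first and last names are often stored together in a single column, so we need to split them to make the data easier to read and work with. To create separate columns for first name and last name in R, we can use the extract function from the tidyr package.
See the examples below to understand how it works.
Example 1
The following snippet creates a sample data frame: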
Names<-c("John Jones","Steve Smith","Pat Cummins","David Warner","Andrew Flintoff","Aaron Finch","Mitchell Starc","Nathan Lyon","Mathew Wade","Adam Zampa","Adam Gilchrist","Ricky Ponting","Glenn McGrath","Ben Cutting","John Cena","Brock Williams","Rubel Hussain","Soumya Sarkar","Mehidy Hasan","Liton Das")
df1<-data.frame(Names)
This creates the following data frame, df1:
             Names
1       John Jones
2      Steve Smith
3      Pat Cummins
4     David Warner
5  Andrew Flintoff
6      Aaron Finch
7   Mitchell Starc
8      Nathan Lyon
9      Mathew Wade
10      Adam Zampa
11  Adam Gilchrist
12   Ricky Ponting
13   Glenn McGrath
14     Ben Cutting
15       John Cena
16  Brock Williams
17   Rubel Hussain
18   Soumya Sarkar
19    Mehidy Hasan
20       Liton Das
To load the tidyr package and create separate columns for first name and last name in df1, add the following code to the snippet above:
library(tidyr)
extract(df1,Names,c("First_Name","Last_Name"), "([^ ]+) (.*)")
Output
Running all of the snippets above as a single program produces the following output:
   First_Name Last_Name
1        John     Jones
2       Steve     Smith
3         Pat   Cummins
4       David    Warner
5      Andrew  Flintoff
6       Aaron     Finch
7    Mitchell     Starc
8      Nathan      Lyon
9      Mathew      Wade
10       Adam     Zampa
11       Adam Gilchrist
12      Ricky   Ponting
13      Glenn   McGrath
14        Ben   Cutting
15       John      Cena
16      Brock  Williams
17      Rubel   Hussain
18     Soumya    Sarkar
19     Mehidy     Hasan
20      Liton       Das
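The regular expression "([^ ]+) (.*)" has two capture groups: the first takes everything before the first space, and the second takes everything after it. The sketch below shows how that plays out for names that are not a simple two-word pair. The data frame df_demo and its contents are hypothetical and are not part of the example above.

```r
library(tidyr)

# Hypothetical sample data (an assumption for illustration only)
df_demo <- data.frame(
  Names = c("John Jones", "Jean Claude Van Damme", "Madonna"),
  stringsAsFactors = FALSE
)

# Group 1: everything before the first space.
# Group 2: everything after the first space.
result <- extract(df_demo, Names, c("First_Name", "Last_Name"), "([^ ]+) (.*)")

# extract() returns a new data frame; df_demo itself is unchanged
print(result)
```

With this pattern, a name containing several spaces keeps everything after the first space in Last_Name, and a name with no space at all does not match, so both new columns are NA for that row.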
Example 2
The following snippet creates a sample data frame:
Names<-c("Kane Williamson","Devon Conway","Trent Boult","Ross Taylor","Martin Guptill","Tim Southee","James Neesham","Lockie Ferguson","Ish Sodhi","Matt Henry","Tom Latham","Mark Chapman","Henry Nicholos","Tom Bundell","Sachin Tendulkar","Rahul Dravid","Chris Gayle","Tabraiz Shamsi","Aiden Makram","David Miller")
df2<-data.frame(Names)
This creates the following data frame, df2:
              Names
1   Kane Williamson
2      Devon Conway
3       Trent Boult
4       Ross Taylor
5    Martin Guptill
6       Tim Southee
7     James Neesham
8   Lockie Ferguson
9         Ish Sodhi
10       Matt Henry
11       Tom Latham
12     Mark Chapman
13   Henry Nicholos
14      Tom Bundell
15 Sachin Tendulkar
16     Rahul Dravid
17      Chris Gayle
18   Tabraiz Shamsi
19     Aiden Makram
20     David Miller
To create separate columns for first name and last name in df2, add the following code to the snippet above:
extract(df2,Names,c("First_Name","Last_Name"), "([^ ]+) (.*)")
Output
Running all of the snippets above as a single program produces the following output:
   First_Name  Last_Name
1        Kane Williamson
2       Devon     Conway
3       Trent      Boult
4        Ross     Taylor
5      Martin    Guptill
6         Tim    Southee
7       James    Neesham
8      Lockie   Ferguson
9         Ish      Sodhi
10       Matt      Henry
11        Tom     Latham
12       Mark    Chapman
13      Henry   Nicholos
14        Tom    Bundell
15     Sachin  Tendulkar
16      Rahul     Dravid
17      Chris      Gayle
18    Tabraiz     Shamsi
19      Aiden     Makram
20      David     Miller
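In both examples the result of extract is only printed, and the original Names column is dropped from it. If the split columns are needed later, one option is to assign the result to a variable and, if desired, keep the original column by passing remove = FALSE. The following is a minimal sketch under those assumptions; the names df_demo and split_names are hypothetical and reuse only a few rows from the example.

```r
library(tidyr)

# Hypothetical shortened data reusing a few names from the example
Names <- c("Kane Williamson", "Devon Conway", "Trent Boult")
df_demo <- data.frame(Names, stringsAsFactors = FALSE)

# Store the result instead of only printing it;
# remove = FALSE keeps the original Names column next to the new ones
split_names <- extract(
  df_demo, Names, c("First_Name", "Last_Name"), "([^ ]+) (.*)",
  remove = FALSE
)

print(split_names)
```

The original data frame is not modified, so df_demo still contains only the Names column after the call.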