MySQL - 刪除重複記錄



MySQL 刪除重複記錄

資料庫(包括 MySQL)中的重複記錄非常常見。MySQL 資料庫以包含行和列的表的形式儲存資料。現在,當資料庫表中的兩行或多行具有相同的值時,該記錄被認為是重複的。

這種冗餘可能由於各種原因而發生:

  • 該行可能被插入兩次。
  • 從外部來源匯入原始資料時。
  • 資料庫應用程式中可能存在錯誤。

無論原因是什麼,刪除這種冗餘對於提高資料準確性、減少錯誤或提高資料庫效能效率都非常重要。

查詢重複值

在刪除重複記錄之前,我們必須找出它們是否存在於表中。可以使用以下方法:

  • GROUP BY 子句

  • COUNT() 方法

示例

讓我們首先建立一個名為“CUSTOMERS”的表,其中包含重複值:

CREATE TABLE CUSTOMERS(
   ID int,
   NAME varchar(100)
);

使用以下 INSERT 查詢,將一些記錄插入到“CUSTOMERS”表中。在這裡,我們添加了“John”作為重複記錄 3 次:

INSERT INTO CUSTOMERS 
VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John');

獲得的 CUSTOMERS 表如下所示:

id name
1 John
2 Johnson
3 John
4 John

現在,我們使用 COUNT() 方法和 GROUP BY 子句檢索表中重複的記錄,如下面的查詢所示:

SELECT NAME, COUNT(NAME) FROM CUSTOMERS
GROUP BY NAME HAVING COUNT(NAME) > 1;

輸出

獲得的輸出如下所示:

NAME COUNT(NAME)
John 3

刪除重複記錄

要從資料庫表中刪除重複記錄,我們可以使用 DELETE 命令。但是,此 DELETE 命令可以使用兩種方法從表中刪除重複項:

  • 使用 DELETE... JOIN

  • 使用 ROW_NUMBER() 函式

使用 DELETE... JOIN

為了使用 DELETE... JOIN 命令從表中刪除重複記錄,我們對其自身執行內部連線。這適用於並非完全相同的案例。

例如,假設客戶記錄中存在客戶詳細資訊的重複,但序列號不斷遞增。在這裡,即使 ID 不相同,記錄也是重複的。

示例

在下面的查詢中,我們使用前面建立的 CUSTOMERS 表來使用 DELETE... JOIN 命令刪除重複記錄:

DELETE t1 FROM CUSTOMERS t1
INNER JOIN CUSTOMERS t2
WHERE t1.id < t2.id AND t1.name = t2.name;

輸出

獲得的輸出如下所示:

Query OK, 2 rows affected (0.01 sec)

驗證

我們可以使用以下 SELECT 語句驗證是否已刪除重複記錄:

SELECT * FROM CUSTOMERS;

我們可以從獲得的表中看到,該查詢刪除了重複項,並在表中保留了不同的記錄:

ID NAME
2 Johnson
4 John

使用 ROW_NUMBER() 函式

MySQL 中的 ROW_NUMBER() 函式用於為從查詢獲得的結果集中的每一行分配一個從 1 開始的順序號。

使用此函式,MySQL 允許您檢測重複行,可以使用 DELETE 語句將其刪除。

示例

在這裡,我們將 ROW_NUMBER() 函式應用於在“NAME”列中具有重複值的 CUSTOMERS 表。我們將使用以下查詢基於“NAME”列在分割槽內分配行號:

SELECT id, ROW_NUMBER()
OVER (PARTITION BY name ORDER BY name) AS row_num
FROM CUSTOMERS;

獲得的輸出如下所示:

id row_num
1 1
3 2
4 3
2 1

現在,使用以下語句刪除重複行(行號大於 1 的行):

DELETE FROM CUSTOMERS WHERE id IN(
   SELECT id FROM (SELECT id, ROW_NUMBER()
   OVER (PARTITION BY name ORDER BY name) AS row_num
   FROM CUSTOMERS) AS temp_table WHERE row_num>1
);

我們得到如下所示的輸出:

Query OK, 2 rows affected (0.00 sec)

要驗證是否已刪除重複記錄,請使用以下 SELECT 查詢:

SELECT * FROM CUSTOMERS;

產生的結果如下所示:

ID NAME
1 John
2 Johnson

使用客戶端程式刪除重複記錄

我們還可以使用客戶端程式刪除重複記錄。

語法

要透過PHP程式刪除重複記錄,需要使用**mysqli**函式**query()**執行包含“DELETE”命令的內連線,如下所示:

$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
$mysqli->query($sql);

要透過JavaScript程式刪除重複記錄,需要使用**mysql2**庫的**query()**函式執行包含“DELETE”命令的內連線,如下所示:

sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
con.query(sql)

要透過Java程式刪除重複記錄,需要使用**JDBC**函式**execute()**執行包含“DELETE”命令的內連線,如下所示:

String sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
statement.execute(sql);

要透過Python程式刪除重複記錄,需要使用**MySQL Connector/Python**的**execute()**函式執行包含“DELETE”命令的內連線,如下所示:

delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)

示例

以下是程式示例:

$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = 'password';
$db = 'TUTORIALS';
$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
if ($mysqli->connect_errno) {
    printf("Connect failed: %s
", $mysqli->connect_error); exit(); } //printf('Connected successfully.
'); //let's create a table $sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))"; if($mysqli->query($sql)){ printf("DuplicateDeleteDemo table created successfully...!\n"); } //now lets insert some duplicate records; $sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John')"; if($mysqli->query($sql)){ printf("First record inserted successfully...!\n"); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(2,'Johnson')"; if($mysqli->query($sql)){ printf("Second record inserted successfully...!\n"); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(3,'John')"; if($mysqli->query($sql)){ printf("Third records inserted successfully...!\n"); } $sql = "INSERT INTO DuplicateDeleteDemo VALUES(4,'John')"; if($mysqli->query($sql)){ printf("Fourth record inserted successfully...!\n"); } //display the table records $sql = "SELECT * FROM DuplicateDeleteDemo"; if($result = $mysqli->query($sql)){ printf("Table records(before deleting): \n"); while($row = mysqli_fetch_array($result)){ printf("ID: %d, NAME %s", $row['ID'], $row['NAME']); printf("\n"); } } //now lets count duplicate records $sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1"; if($result = $mysqli->query($sql)){ printf("Duplicate records: \n"); while($row = mysqli_fetch_array($result)){ print_r($row); } } //lets delete dupliacte records $sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"; if($mysqli->query($sql)){ printf("Duplicate records deleted successfully...!\n"); } $sql = "SELECT ID, NAME FROM DuplicateDeleteDemo"; if($result = $mysqli->query($sql)){ printf("Table records after deleting: \n"); while($row = mysqli_fetch_row($result)){ print_r($row); } } if($mysqli->error){ printf("Error message: ", $mysqli->error); } $mysqli->close();

輸出

獲得的輸出結果如下所示:

DuplicateDeleteDemo table created successfully...!
First record inserted successfully...!
Second record inserted successfully...!
Third records inserted successfully...!
Fourth record inserted successfully...!
Table records(before deleting):
ID: 1, NAME John
ID: 2, NAME Johnson
ID: 3, NAME John
ID: 4, NAME John
Duplicate records:
Array
(
    [0] => John
    [NAME] => John
    [1] => 3
    [COUNT(NAME)] => 3
)
Duplicate records deleted successfully...!
Table records after deleting:
Array
(
    [0] => 2
    [1] => Johnson
)
Array
(
    [0] => 4
    [1] => John
)    

var mysql = require('mysql2');
var con = mysql.createConnection({
    host: "localhost",
    user: "root",
    password: "Nr5a0204@123"
});

// Connecting to MySQL
con.connect(function (err) {
    if (err) throw err;
    console.log("Connected!");
    console.log("--------------------------");

    // Create a new database
    sql = "Create Database TUTORIALS";
    con.query(sql);

    sql = "USE TUTORIALS";
    con.query(sql);

    sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100));"
    con.query(sql);

    sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John'),(2,'Johnson'),(3,'John'),(4,'John');"
    con.query(sql);

    sql = "SELECT * FROM DuplicateDeleteDemo;"
    con.query(sql, function(err, result){
      if (err) throw err
      console.log("**Records of DuplicateDeleteDemo Table:**");
      console.log(result);
      console.log("--------------------------");
    });

    //Fetching records that are duplicated in the table
    sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1;"
    con.query(sql, function(err, result){
      if (err) throw err
      console.log("**Records that are duplicated in the table:**");
      console.log(result);
      console.log("--------------------------");
    });

    sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
    con.query(sql);

    sql = "SELECT * FROM DuplicateDeleteDemo;"
    con.query(sql, function(err, result){
      if (err) throw err
      console.log("**Records after deleting Duplicates:**");
      console.log(result);
    });
});    

輸出

獲得的輸出結果如下所示:

 
Connected!
--------------------------
**Records of DuplicateDeleteDemo Table:**
[
  { ID: 1, NAME: 'John' },
  { ID: 2, NAME: 'Johnson' },
  { ID: 3, NAME: 'John' },
  { ID: 4, NAME: 'John' }
]
--------------------------
**Records that are duplicated in the table:**
[ { NAME: 'John', 'COUNT(NAME)': 3 } ]
--------------------------
**Records after deleting Duplicates:**
[ { ID: 2, NAME: 'Johnson' }, { ID: 4, NAME: 'John' } ]
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
public class DeleteDuplicates {
    public static void main(String[] args) {
        String url = "jdbc:mysql://:3306/TUTORIALS";
        String user = "root";
        String password = "password";
        ResultSet rs;
        try {
            Class.forName("com.mysql.cj.jdbc.Driver");
            Connection con = DriverManager.getConnection(url, user, password);
            Statement st = con.createStatement();
            //System.out.println("Database connected successfully...!");
            String sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))";
            st.execute(sql);
            System.out.println("Table DuplicateDeleteDemo created successfully...!");
            //let's insert some records into it...
            String sql1 = "INSERT INTO DuplicateDeleteDemo VALUES (1,'John'),  (2,'Johnson'), (3,'John'), (4,'John')";
            st.execute(sql1);
            System.out.println("Records inserted successfully....!");
            //print table records
            String sql2 = "SELECT * FROM DuplicateDeleteDemo";
            rs = st.executeQuery(sql2);
            System.out.println("Table records(before deleting the duplicate rcords): ");
            while(rs.next()) {
                String id = rs.getString("id");
                String name = rs.getString("name");
                System.out.println("Id: " + id + ", Name: " + name);
            }
            //let delete duplicate records using delete join
            String sql3 = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
            st.execute(sql3);
            System.out.println("Duplicate records deleted successfully....!");
            String sql4 = "SELECT * FROM DuplicateDeleteDemo";
            rs = st.executeQuery(sql4);
            System.out.println("Table records(after deleting the duplicate rcords): ");
            while(rs.next()) {
                String id = rs.getString("id");
                String name = rs.getString("name");
                System.out.println("Id: " + id + ", Name: " + name);
            }
        }catch(Exception e) {
            e.printStackTrace();
        }
    }
}    

輸出

獲得的輸出結果如下所示:

Table DuplicateDeleteDemo created successfully...!
Records inserted successfully....!
Table records(before deleting the duplicate rcords): 
Id: 1, Name: John
Id: 2, Name: Johnson
Id: 3, Name: John
Id: 4, Name: John
Duplicate records deleted successfully....!
Table records(after deleting the duplicate rcords): 
Id: 2, Name: Johnson
Id: 4, Name: John
import mysql.connector
# Establishing the connection
connection = mysql.connector.connect(
    host='localhost',
    user='root',
    password='password',
    database='tut'
)
# Creating a cursor object
cursorObj = connection.cursor()
# Creating the table 'DuplicateDeleteDemo'
create_table_query = '''CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100))'''
cursorObj.execute(create_table_query)
print("Table 'DuplicateDeleteDemo' is created successfully!")
# Inserting records into 'DuplicateDeleteDemo' table
sql = "INSERT INTO DuplicateDeleteDemo (ID, NAME) VALUES (%s, %s);"
values = [(1, 'John'), (2, 'Johnson'), (3, 'John'), (4, 'John')]
cursorObj.executemany(sql, values)
print("Values inserted successfully")
# Display table
display_table = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table)
# Printing the table 'DuplicateDeleteDemo'
results = cursorObj.fetchall()
print("\nDuplicateDeleteDemo Table:")
for result in results:
    print(result)
# Retrieve the duplicate records
duplicate_records_query = """
SELECT NAME,
COUNT(NAME)
FROM DuplicateDeleteDemo
GROUP BY NAME
HAVING COUNT(NAME) > 1;
"""
cursorObj.execute(duplicate_records_query)
dup_rec = cursorObj.fetchall()
print("\nDuplicate records:")
for record in dup_rec:
    print(record)
# Delete duplicate records
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)
print("Duplicate records deleted successfully")
# Verification
display_table_after_delete = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table_after_delete)
results_after_delete = cursorObj.fetchall()
print("\nDuplicateDeleteDemo Table (After Delete):")
for result in results_after_delete:
    print(result)
# Closing the cursor and connection
cursorObj.close()
connection.close()

輸出

獲得的輸出結果如下所示:

Table 'DuplicateDeleteDemo' is created successfully!
Values inserted successfully

DuplicateDeleteDemo Table:
(1, 'John')
(2, 'Johnson')
(3, 'John')
(4, 'John')

Duplicate records:
('John', 3)
Duplicate records deleted successfully

DuplicateDeleteDemo Table (After Delete):
(2, 'Johnson')
(4, 'John')    
廣告