MySQL - Find Duplicate Records



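Duplicate records in a table can reduce the efficiency of a MySQL database by increasing execution time, occupying unnecessary space, and so on. Locating duplicates is therefore necessary for using the database effectively.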

We can prevent users from entering duplicate values in the first place by adding constraints, such as PRIMARY KEY and UNIQUE, on the relevant columns.
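The effect of such constraints can be sketched in a few lines of Python. This is a minimal illustration, not a MySQL session: it assumes an in-memory SQLite database (via the standard `sqlite3` module) as a stand-in for a MySQL server, and the `EMAIL` column and the helper `try_insert` are hypothetical names introduced only for this example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMERS ("
    " ID INTEGER PRIMARY KEY,"
    " NAME VARCHAR(20) NOT NULL,"
    " EMAIL VARCHAR(50) UNIQUE)"  # EMAIL is a hypothetical column
)
conn.execute("INSERT INTO CUSTOMERS VALUES (1, 'Ramesh', 'ramesh@example.com')")


def try_insert(row):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO CUSTOMERS VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


dup_email = try_insert((2, "Khilan", "ramesh@example.com"))  # repeats EMAIL
dup_id = try_insert((1, "Kaushik", "kaushik@example.com"))   # repeats ID
fresh = try_insert((3, "Komal", "komal@example.com"))        # no conflict
print(dup_email, dup_id, fresh)  # False False True
```

Both conflicting rows are rejected by the database itself, so no duplicate-finding query is needed for the constrained columns.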

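Even so, duplicates can still enter a database for various reasons, such as human error, application bugs, or data pulled from external sources. There are several ways to find such records. Using the **SQL GROUP BY** and **HAVING** clauses is one of the common ways to filter out the records that are duplicated.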

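Finding Duplicate Records

Before looking for duplicate records in a table, we need to define the criteria by which records count as duplicates. This can be done in two steps:

  • First, use the GROUP BY clause to group all rows by the columns that should be checked for duplication.

  • Then, use the HAVING clause with the COUNT() function to check whether any of the resulting groups contains more than one row.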
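The two steps above can be mimicked in plain Python to make the logic concrete. This is only an analogy under stated assumptions: the list `salaries` is illustrative sample data, and `collections.Counter` plays the role of GROUP BY with COUNT().

```python
from collections import Counter

# Illustrative sample values for one column (assumption for this sketch).
salaries = [2000.00, 1500.00, 2000.00, 6500.00, 8500.00, 4500.00, 10000.00]

# Step 1: group equal values together and count each group (GROUP BY + COUNT).
counts = Counter(salaries)

# Step 2: keep only groups with more than one member (HAVING COUNT(...) > 1).
duplicates = {value: n for value, n in counts.items() if n > 1}

print(duplicates)  # {2000.0: 2}
```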

Example

First, let us create a table named CUSTOMERS using the following query:

CREATE TABLE CUSTOMERS (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),
   PRIMARY KEY (ID)
);

Now, let us use the INSERT INTO statement to add some records to the table created above. Two of the customers deliberately share the same SALARY and AGE values:

INSERT INTO CUSTOMERS VALUES
(1, 'Ramesh', 23, 'Ahmedabad', 2000.00),
(2, 'Khilan', 25, 'Delhi', 1500.00),
(3, 'Kaushik', 23, 'Kota', 2000.00),
(4, 'Chaitali', 25, 'Mumbai', 6500.00),
(5, 'Hardik', 27, 'Bhopal', 8500.00),
(6, 'Komal', 22, 'Hyderabad', 4500.00),
(7, 'Muffy', 24, 'Indore', 10000.00);

The table is created as follows:

| ID | NAME | AGE | ADDRESS | SALARY |
|----|------|-----|---------|--------|
| 1 | Ramesh | 23 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | Kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | Hyderabad | 4500.00 |
| 7 | Muffy | 24 | Indore | 10000.00 |

In the following query, we use the MySQL COUNT() function to return how many records share each SALARY value:

SELECT SALARY, COUNT(SALARY) 
AS "COUNT" FROM CUSTOMERS
GROUP BY SALARY 
ORDER BY SALARY;

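Output

The output of the above query is as shown below:

| SALARY | COUNT |
|--------|-------|
| 1500.00 | 1 |
| 2000.00 | 2 |
| 4500.00 | 1 |
| 6500.00 | 1 |
| 8500.00 | 1 |
| 10000.00 | 1 |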

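Using the HAVING Clause

The **HAVING** clause in MySQL filters groups of rows by a condition. Here, we use the HAVING clause together with the COUNT() function to find duplicate values in one or more columns of a table.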

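Duplicate Values in a Single Column

The following steps find duplicate values in a single column of a table:

**Step 1:** First, use the GROUP BY clause to group all rows by the column that should be checked for duplicates.

**Step 2:** Then, to find the duplicate groups, use the COUNT() function in the HAVING clause to check whether any group has more than one element.

Example

Using the following query, we can find all SALARY values that occur more than once in the CUSTOMERS table: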

SELECT SALARY, COUNT(SALARY) 
FROM CUSTOMERS
GROUP BY SALARY
HAVING COUNT(SALARY) > 1;

Output

The output is as follows:

| SALARY | COUNT(SALARY) |
|--------|---------------|
| 2000.00 | 2 |
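The query above returns only the duplicated value and its count. One common way to see the full rows behind it is to feed the grouped query into an IN subquery. The sketch below assumes an in-memory SQLite database as a stand-in for MySQL and a trimmed-down CUSTOMERS table (ID, NAME, SALARY only); the subquery pattern itself is standard SQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the table is trimmed
# down to the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, SALARY REAL)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
    [
        (1, "Ramesh", 2000.00),
        (2, "Khilan", 1500.00),
        (3, "Kaushik", 2000.00),
        (4, "Chaitali", 6500.00),
    ],
)

# The inner query finds duplicated SALARY values; the outer query
# returns every row that carries one of those values.
rows_with_dup_salary = conn.execute(
    """
    SELECT ID, NAME, SALARY
    FROM CUSTOMERS
    WHERE SALARY IN (
        SELECT SALARY FROM CUSTOMERS
        GROUP BY SALARY
        HAVING COUNT(*) > 1
    )
    ORDER BY ID
    """
).fetchall()

print(rows_with_dup_salary)  # [(1, 'Ramesh', 2000.0), (3, 'Kaushik', 2000.0)]
```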

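Duplicate Values in Multiple Columns

To find rows that are duplicated across several columns, we group by all of those columns and can combine COUNT() conditions with the AND operator in the HAVING clause. Rows are considered duplicates only when the combination of column values repeats.

Example

In the following query, we look for rows of the CUSTOMERS table that have duplicate values in the combination of the SALARY and AGE columns: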

SELECT SALARY, COUNT(SALARY),
AGE, COUNT(AGE)
FROM CUSTOMERS
GROUP BY SALARY, AGE
HAVING  COUNT(SALARY) > 1
AND COUNT(AGE) > 1;

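Output

The output is as follows:

| SALARY | COUNT(SALARY) | AGE | COUNT(AGE) |
|--------|---------------|-----|------------|
| 2000.00 | 2 | 23 | 2 |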
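The difference between checking one column and checking a combination can be sketched as follows. The rows here are illustrative and not the tutorial's table: three rows share a SALARY, but only two of them also share an AGE. An in-memory SQLite database is assumed as a stand-in for MySQL, and a single `COUNT(*) > 1` condition is used because every row in a group already shares all grouped columns.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, AGE INT, SALARY REAL)")
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?)",
    [(1, 23, 2000.00), (2, 25, 1500.00), (3, 23, 2000.00), (4, 25, 2000.00)],
)

# Grouping by SALARY alone: three rows share 2000.00.
by_salary = conn.execute(
    "SELECT SALARY, COUNT(*) FROM CUSTOMERS "
    "GROUP BY SALARY HAVING COUNT(*) > 1"
).fetchall()

# Grouping by the (SALARY, AGE) combination: only two rows match on both.
by_salary_and_age = conn.execute(
    "SELECT SALARY, AGE, COUNT(*) FROM CUSTOMERS "
    "GROUP BY SALARY, AGE HAVING COUNT(*) > 1"
).fetchall()

print(by_salary)          # [(2000.0, 3)]
print(by_salary_and_age)  # [(2000.0, 23, 2)]
```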

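The ROW_NUMBER() Function with PARTITION BY

In MySQL, the ROW_NUMBER() function with the PARTITION BY clause can also be used to find duplicate records (window functions are available from MySQL 8.0). The PARTITION BY clause divides the rows into partitions based on one or more columns, and ROW_NUMBER() assigns a sequential number to each row within its partition. Any row whose row number is greater than 1 shares its partition values with an earlier row and is therefore a duplicate.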

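Example

In the following query, we assign a row number to each row within the partitions formed by the SALARY and AGE columns: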

SELECT *, ROW_NUMBER() OVER (
   PARTITION BY SALARY, AGE
   ORDER BY SALARY, AGE
) AS row_numbers
FROM CUSTOMERS;

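Output

The output of the above query is as shown below:

| ID | NAME | AGE | ADDRESS | SALARY | row_numbers |
|----|------|-----|---------|--------|-------------|
| 2 | Khilan | 25 | Delhi | 1500.00 | 1 |
| 1 | Ramesh | 23 | Ahmedabad | 2000.00 | 1 |
| 3 | Kaushik | 23 | Kota | 2000.00 | 2 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 | 1 |
| 5 | Hardik | 27 | Bhopal | 8500.00 | 1 |
| 6 | Komal | 22 | Hyderabad | 4500.00 | 1 |
| 7 | Muffy | 24 | Indore | 10000.00 | 1 |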
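To list only the surplus copies, the numbered result can be wrapped in a derived table and filtered on the row number. The sketch below is a minimal illustration under assumptions: it uses an in-memory SQLite database in place of MySQL (window functions need SQLite 3.25 or later), a trimmed-down CUSTOMERS table, and `ORDER BY ID` inside the window so that the lowest ID in each partition is treated as the original.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for window functions) stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INT, SALARY REAL)"
)
conn.executemany(
    "INSERT INTO CUSTOMERS VALUES (?, ?, ?, ?)",
    [
        (1, "Ramesh", 23, 2000.00),
        (2, "Khilan", 25, 1500.00),
        (3, "Kaushik", 23, 2000.00),
        (4, "Chaitali", 25, 6500.00),
    ],
)

# Number the rows inside each (SALARY, AGE) partition, then keep only
# rows numbered 2 or higher: these repeat an earlier row's values.
surplus_rows = conn.execute(
    """
    SELECT ID, NAME
    FROM (
        SELECT ID, NAME,
               ROW_NUMBER() OVER (
                   PARTITION BY SALARY, AGE
                   ORDER BY ID
               ) AS row_num
        FROM CUSTOMERS
    ) AS numbered
    WHERE row_num > 1
    ORDER BY ID
    """
).fetchall()

print(surplus_rows)  # [(3, 'Kaushik')]
```

Ordering the window by a unique column such as ID makes the choice of which row counts as the original deterministic.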

Finding Duplicate Records Using a Client Program

We can also find duplicate records using a client program.

Syntax

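To find duplicate records through a PHP program, we group the rows by the relevant column with the GROUP BY clause and count each group with the COUNT() function. To do so, we execute the SELECT statement using the **mysqli** function **query()**, as follows:

$sql = "SELECT SALARY, COUNT(SALARY) AS 'COUNT' FROM CUSTOMERS GROUP BY SALARY ORDER BY SALARY";
$mysqli->query($sql);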

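To find duplicate records through a JavaScript program, we group the rows by the relevant column with the GROUP BY clause and count each group with the COUNT() function. To do so, we execute the SELECT statement using the **query()** function of the **mysql2** library, as follows:

sql = "SELECT SALARY, COUNT(SALARY) AS 'COUNT' FROM CUSTOMERS GROUP BY SALARY ORDER BY SALARY";
con.query(sql);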

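To find duplicate records through a Java program, we group the rows by the relevant column with the GROUP BY clause and count each group with the COUNT() function. To do so, we execute the SELECT statement using the **JDBC** function **executeQuery()**, as follows:

String sql = "SELECT SALARY, COUNT(SALARY) AS 'COUNT' FROM CUSTOMERS GROUP BY SALARY ORDER BY SALARY";
statement.executeQuery(sql);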

To find duplicate records through a Python program, we group the rows by the relevant column with the GROUP BY clause and count each group with the COUNT() function. To do so, we execute the SELECT statement using the **execute()** function of **MySQL Connector/Python**, as follows:

duplicate_records_query = "SELECT SALARY, COUNT(SALARY) AS 'COUNT' FROM CUSTOMERS GROUP BY SALARY ORDER BY SALARY"
cursorObj.execute(duplicate_records_query)
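Since the same GROUP BY and HAVING pattern recurs in every client, it can be wrapped in a small helper. The sketch below is one possible shape, not part of any library: `build_duplicate_query` and `find_duplicates` are hypothetical names, and the demo runs against an in-memory SQLite connection as a stand-in. It relies only on the DB-API calls `cursor()`, `execute()`, `fetchall()`, and `close()`, which a MySQL Connector/Python connection also provides. Table and column names cannot be passed as query parameters, so they are checked against a conservative pattern before being placed in the SQL text.

```python
import re
import sqlite3

# Conservative whitelist for identifiers, since they cannot be bound as parameters.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_duplicate_query(table, columns):
    """Build a GROUP BY / HAVING query listing duplicated column combinations."""
    if not columns:
        raise ValueError("at least one column is required")
    for name in [table, *columns]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    cols = ", ".join(columns)
    return (
        f"SELECT {cols}, COUNT(*) AS dup_count FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )


def find_duplicates(conn, table, columns):
    """Run the duplicate query on any DB-API style connection."""
    cur = conn.cursor()
    cur.execute(build_duplicate_query(table, columns))
    rows = cur.fetchall()
    cur.close()
    return rows


# Demo with in-memory SQLite (assumption: stands in for a MySQL connection).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (ID INT, DOG_NAME TEXT, AGE INT, OWNER_NAME TEXT)")
conn.executemany(
    "INSERT INTO Pets VALUES (?, ?, ?, ?)",
    [
        (1, "Fluffy", 1, "Micheal"),
        (1, "Fluffy", 1, "Micheal"),
        (2, "Harry", 2, "Jack"),
        (3, "Sheero", 1, "Rose"),
        (3, "Sheero", 1, "Rose"),
        (3, "Sheero", 1, "Rose"),
    ],
)
dups = find_duplicates(conn, "Pets", ["ID", "DOG_NAME", "AGE", "OWNER_NAME"])
print(dups)  # [(1, 'Fluffy', 1, 'Micheal', 2), (3, 'Sheero', 1, 'Rose', 3)]
```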

Example

The following are the programs:

<?php
$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = 'password';
$db = 'TUTORIALS';
$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}
// printf("Connected successfully.\n");

// Create the table. It has no PRIMARY KEY or UNIQUE constraint,
// so INSERT IGNORE does not skip the duplicate row below.
$sql = "CREATE TABLE Pets (ID int, DOG_NAME varchar(30) not null, AGE int not null, OWNER_NAME varchar(30) not null)";
if ($mysqli->query($sql)) {
    printf("Pets table created successfully...!\n");
}

// Insert some records, including one duplicate
$sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(1, 'Fluffy', 1, 'Micheal')";
if ($mysqli->query($sql)) {
    printf("First record inserted successfully...!\n");
}
$sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(1, 'Fluffy', 1, 'Micheal')";
if ($mysqli->query($sql)) {
    printf("Second record inserted successfully...!\n");
}
$sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(2, 'Harry', 2, 'Jack')";
if ($mysqli->query($sql)) {
    printf("Third record inserted successfully...!\n");
}
$sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(3, 'Sheero', 1, 'Rose')";
if ($mysqli->query($sql)) {
    printf("Fourth record inserted successfully...!\n");
}
$sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(4, 'Simba', 2, 'Rahul')";
if ($mysqli->query($sql)) {
    printf("Fifth record inserted successfully...!\n");
}

// Display the table records
$sql = "SELECT * FROM Pets";
if ($result = $mysqli->query($sql)) {
    printf("Table records: \n");
    while ($row = mysqli_fetch_array($result)) {
        printf("ID: %d, DOG_NAME: %s, AGE: %d, OWNER_NAME: %s\n",
            $row['ID'], $row['DOG_NAME'], $row['AGE'], $row['OWNER_NAME']);
    }
}

// Group identical rows and count them; every selected column is also grouped
$sql = "SELECT ID, DOG_NAME, AGE, OWNER_NAME, COUNT(*) AS 'Count' FROM Pets GROUP BY ID, DOG_NAME, AGE, OWNER_NAME ORDER BY ID";
if ($result = $mysqli->query($sql)) {
    printf("Table duplicate records: \n");
    while ($row = mysqli_fetch_array($result)) {
        printf("ID: %d, DOG_NAME: %s, AGE: %d, OWNER_NAME: %s, COUNT: %d\n",
            $row['ID'], $row['DOG_NAME'], $row['AGE'], $row['OWNER_NAME'], $row['Count']);
    }
}
if ($mysqli->error) {
    printf("Error message: %s\n", $mysqli->error);
}
$mysqli->close();

Output

The output obtained is as shown below:

Pets table created successfully...!
First record inserted successfully...!
Second record inserted successfully...!
Third record inserted successfully...!
Fourth record inserted successfully...!
Fifth record inserted successfully...!
Table records:
ID: 1, DOG_NAME: Fluffy, AGE: 1, OWNER_NAME: Micheal
ID: 1, DOG_NAME: Fluffy, AGE: 1, OWNER_NAME: Micheal
ID: 2, DOG_NAME: Harry, AGE: 2, OWNER_NAME: Jack
ID: 3, DOG_NAME: Sheero, AGE: 1, OWNER_NAME: Rose
ID: 4, DOG_NAME: Simba, AGE: 2, OWNER_NAME: Rahul
Table duplicate records:
ID: 1, DOG_NAME: Fluffy, AGE: 1, OWNER_NAME: Micheal, COUNT: 2
ID: 2, DOG_NAME: Harry, AGE: 2, OWNER_NAME: Jack, COUNT: 1
ID: 3, DOG_NAME: Sheero, AGE: 1, OWNER_NAME: Rose, COUNT: 1
ID: 4, DOG_NAME: Simba, AGE: 2, OWNER_NAME: Rahul, COUNT: 1

var mysql = require('mysql2');
var con = mysql.createConnection({
    host: "localhost",
    user: "root",
    password: "password"
});

// Connecting to MySQL
con.connect(function (err) {
    if (err) throw err;
    console.log("Connected!");
    console.log("--------------------------");

    // Create a new database and select it
    var sql = "CREATE DATABASE TUTORIALS";
    con.query(sql);

    sql = "USE TUTORIALS";
    con.query(sql);

    // Create the Pets table (no PRIMARY KEY or UNIQUE constraint, so duplicates are accepted)
    sql = "CREATE TABLE Pets (ID int, DOG_NAME varchar(30) not null, AGE int not null, OWNER_NAME varchar(30) not null);";
    con.query(sql);

    // Insert records, including duplicates
    sql = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(1,'Fluffy', 1, 'Micheal'),(1,'Fluffy', 1, 'Micheal'),(2,'Harry', 2, 'Jack'),(3,'Sheero', 1, 'Rose'),(4,'Simba', 2, 'Rahul'),(3,'Sheero', 1, 'Rose'),(3,'Sheero', 1, 'Rose');";
    con.query(sql);

    // Display all records
    sql = "SELECT * FROM Pets;";
    con.query(sql, function (err, result) {
      if (err) throw err;
      console.log("**Records in Pets Table**");
      console.log(result);
      console.log("--------------------------");
    });

    // Group identical rows and count them
    sql = "SELECT ID, DOG_NAME, OWNER_NAME, COUNT(*) AS 'Count' FROM Pets GROUP BY ID, DOG_NAME, OWNER_NAME ORDER BY ID";
    con.query(sql, function (err, result) {
      if (err) throw err;
      console.log("**Count of duplicate records:**");
      console.log(result);
      // Close the connection after the last query has returned
      con.end();
    });
});

Output

The output obtained is as shown below:

Connected!
--------------------------
**Records in Pets Table**
[
  { ID: 1, DOG_NAME: 'Fluffy', AGE: 1, OWNER_NAME: 'Micheal' },
  { ID: 1, DOG_NAME: 'Fluffy', AGE: 1, OWNER_NAME: 'Micheal' },
  { ID: 2, DOG_NAME: 'Harry', AGE: 2, OWNER_NAME: 'Jack' },
  { ID: 3, DOG_NAME: 'Sheero', AGE: 1, OWNER_NAME: 'Rose' },
  { ID: 4, DOG_NAME: 'Simba', AGE: 2, OWNER_NAME: 'Rahul' },
  { ID: 3, DOG_NAME: 'Sheero', AGE: 1, OWNER_NAME: 'Rose' },
  { ID: 3, DOG_NAME: 'Sheero', AGE: 1, OWNER_NAME: 'Rose' }
]
--------------------------
**Count of duplicate records:**
[
  { ID: 1, DOG_NAME: 'Fluffy', OWNER_NAME: 'Micheal', Count: 2 },
  { ID: 2, DOG_NAME: 'Harry', OWNER_NAME: 'Jack', Count: 1 },
  { ID: 3, DOG_NAME: 'Sheero', OWNER_NAME: 'Rose', Count: 3 },
  { ID: 4, DOG_NAME: 'Simba', OWNER_NAME: 'Rahul', Count: 1 }
]

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FindDuplicates {
  public static void main(String[] args) {
    String url = "jdbc:mysql://localhost:3306/TUTORIALS";
    String user = "root";
    String password = "password";
    ResultSet rs;
    try {
      Class.forName("com.mysql.cj.jdbc.Driver");
      Connection con = DriverManager.getConnection(url, user, password);
      Statement st = con.createStatement();
      //System.out.println("Database connected successfully...!");
      // Create the table (no PRIMARY KEY or UNIQUE constraint, so duplicates are accepted)
      String sql = "CREATE TABLE Pets (ID int, DOG_NAME varchar(30) not null, AGE int not null, OWNER_NAME varchar(30) not null)";
      st.execute(sql);
      System.out.println("Table Pets created successfully...!");
      // Insert some records, including one duplicate
      String sql1 = "INSERT IGNORE INTO Pets(ID, DOG_NAME, AGE, OWNER_NAME) VALUES(1, 'Fluffy', 1, 'Micheal'), (1, 'Fluffy', 1, 'Micheal'), (3, 'Sheero', 1, 'Rose'), (4, 'Simba', 2, 'Rahul')";
      st.execute(sql1);
      System.out.println("Records inserted successfully....!");
      // Display all records
      String sql2 = "SELECT * FROM Pets";
      rs = st.executeQuery(sql2);
      System.out.println("Table records: ");
      while (rs.next()) {
        String id = rs.getString("ID");
        String dog_name = rs.getString("DOG_NAME");
        String age = rs.getString("AGE");
        String owner_name = rs.getString("OWNER_NAME");
        System.out.println("Id: " + id + ", Dog_name: " + dog_name + ", Age: " + age + ", Owner_name: " + owner_name);
      }
      // Group identical rows and count them; every selected column is also grouped
      String sql3 = "SELECT ID, DOG_NAME, AGE, OWNER_NAME, COUNT(*) AS 'Count' FROM Pets GROUP BY ID, DOG_NAME, AGE, OWNER_NAME ORDER BY ID";
      rs = st.executeQuery(sql3);
      System.out.println("Table records are(with duplicate counts): ");
      while (rs.next()) {
        String id = rs.getString("ID");
        String dog_name = rs.getString("DOG_NAME");
        String age = rs.getString("AGE");
        String owner_name = rs.getString("OWNER_NAME");
        String t_count = rs.getString("Count");
        System.out.println("Id: " + id + ", Dog_name: " + dog_name + ", Age: " + age + ", Owner_name: " + owner_name + ", T_count: " + t_count);
      }
      // Release the database resources
      rs.close();
      st.close();
      con.close();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Output

The output obtained is as shown below:

Table Pets created successfully...!
Records inserted successfully....!
Table records: 
Id: 1, Dog_name: Fluffy, Age: 1, Owner_name: Micheal
Id: 1, Dog_name: Fluffy, Age: 1, Owner_name: Micheal
Id: 3, Dog_name: Sheero, Age: 1, Owner_name: Rose
Id: 4, Dog_name: Simba, Age: 2, Owner_name: Rahul
Table records are(with duplicate counts): 
Id: 1, Dog_name: Fluffy, Age: 1, Owner_name: Micheal, T_count: 2
Id: 3, Dog_name: Sheero, Age: 1, Owner_name: Rose, T_count: 1
Id: 4, Dog_name: Simba, Age: 2, Owner_name: Rahul, T_count: 1

import mysql.connector

# Establishing the connection
connection = mysql.connector.connect(
    host='localhost',
    user='root',
    password='password',
    database='TUTORIALS'
)
# Creating a cursor object
cursorObj = connection.cursor()
# Creating the table 'Pets' (no PRIMARY KEY or UNIQUE constraint, so duplicates are accepted)
create_table_query = '''
CREATE TABLE Pets (
ID int,
DOG_NAME varchar(30) not null,
AGE int not null,
OWNER_NAME varchar(30) not null
);
'''
cursorObj.execute(create_table_query)
print("Table 'Pets' is created successfully!")
# Inserting records into 'Pets' table, including duplicates
sql = "INSERT IGNORE INTO Pets (ID, DOG_NAME, AGE, OWNER_NAME) VALUES (%s, %s, %s, %s);"
values = [
    (1, 'Fluffy', 1, 'Micheal'),
    (1, 'Fluffy', 1, 'Micheal'),
    (2, 'Harry', 2, 'Jack'),
    (3, 'Sheero', 1, 'Rose'),
    (4, 'Simba', 2, 'Rahul'),
    (3, 'Sheero', 1, 'Rose'),
    (3, 'Sheero', 1, 'Rose')
]
cursorObj.executemany(sql, values)
# Persist the inserted rows
connection.commit()
print("Values inserted successfully")
# Display table
display_table = "SELECT * FROM Pets;"
cursorObj.execute(display_table)
# Printing the table 'Pets'
results = cursorObj.fetchall()
print("\nPets Table:")
for result in results:
    print(result)
# Return the count of each group of identical records
duplicate_records_query = """
SELECT ID, DOG_NAME, OWNER_NAME, COUNT(*) AS Count FROM Pets
GROUP BY ID, DOG_NAME, OWNER_NAME
ORDER BY ID;
"""
cursorObj.execute(duplicate_records_query)
dup_rec = cursorObj.fetchall()
print("\nDuplicate records:")
for record in dup_rec:
    print(record)
# Closing the cursor and connection
cursorObj.close()
connection.close()

Output

The output obtained is as shown below:

Table 'Pets' is created successfully!
Values inserted successfully

Pets Table:
(1, 'Fluffy', 1, 'Micheal')
(1, 'Fluffy', 1, 'Micheal')
(2, 'Harry', 2, 'Jack')
(3, 'Sheero', 1, 'Rose')
(4, 'Simba', 2, 'Rahul')
(3, 'Sheero', 1, 'Rose')
(3, 'Sheero', 1, 'Rose')

Duplicate records:
(1, 'Fluffy', 'Micheal', 2)
(2, 'Harry', 'Jack', 1)
(3, 'Sheero', 'Rose', 3)
(4, 'Simba', 'Rahul', 1)