<tutorialjinni.com/>

Remove duplicate rows in MySQL

Posted Under: Database, MySQL, Tutorials on Mar 13, 2020
Remove duplicate rows in MySQL
This tutorial provide snippet on how to remove duplicate rows that may have created over time. This may be a result of a spam attack or by mistakenly repeating entries. They consume precious resources of the server and decrease over all performance of any system. There are many ways to remove duplicates rows form a MySQL server. They can be removed based on a single column or measuring uniqueness based of multiple columns. Whatever you do before doing any alteration to your database first take BACKUP. Remove all Duplicate Rows based on a single column
ALTER IGNORE TABLE _TABLE_NAME_ ADD UNIQUE KEY idx1(_COLUMN_NAME_); 
Remove all Duplicate Rows based on a multiple columns
ALTER IGNORE TABLE _TABLE_NAME_ ADD UNIQUE KEY idx1(_COLUMN_NAME_1,_COLUMN_NAME_2,_COLUMN_NAME_3,...); 
This command will remove all duplicate rows and prevent any future duplication entries too. There is a issue with this command though, it only works for MySQL version <=5.6.

Another method is to delete duplicate entries is by using DELETE JOINS.
DELETE T1 FROM _TABLE_NAME_ T1, _TABLE_NAME_ T2 WHERE T1.ID > T2.ID AND T1.COL_1 = T2.COL_1
This command will preserve the oldest values.
DELETE T1 FROM _TABLE_NAME_ T1, _TABLE_NAME_ T2 WHERE T1.ID < T2.ID AND T1.COL_1 = T2.COL_1
This command will retain the newest values while deleting all other. You can add multiple columns for comparison to refine the results e.g AND T1.COL_2 = T2.COL_2 AND AND T1.COL_3 = T2.COL_3 and soo on. Remember T1 and T2 are the instances of the same table. Removing duplicates using a temporary tables. This method will create a temporary tables, which is the copy of the target table, with just an unique index on the fields we want to filter on.
-- create temporary table
CREATE TABLE _TEMP_TABLE_ LIKE _SOURCE_TABLE_;

-- add a unique index to desired columns    
ALTER TABLE _TEMP_TABLE_ ADD UNIQUE(_COL1_,_COL2_,_COL3_);

-- copy values in temporary table
INSERT IGNORE INTO _TEMP_TABLE_ SELECT * FROM _SOURCE_TABLE_;

--rename tables
RENAME TABLE _SOURCE_TABLE_ TO _SOURCE_TABLE_BACKUP_, _TEMP_TABLE_ TO _SOURCE_TABLE_;


imgae