How To Get & Count Duplicate Records In MYSQL

Welcome to a quick tutorial on how to get and count duplicate records in MySQL. So you want to filter out all the duplicate records? Or maybe get a count of the number of duplicates?

Assuming that only one table and column is involved, duplicate records can be retrieved with a simple query – SELECT `COLUMN` FROM `TABLE` GROUP BY `COLUMN` HAVING COUNT(*)>1

That covers the quick basics, but read on for detailed examples!

ⓘ I have included a zip file with all the source code at the start of this tutorial, so you don’t have to copy-paste everything… Or if you just want to dive straight in.

 

 

QUICK SLIDES

 

TABLE OF CONTENTS

Download & Notes Duplicate Records Useful Bits & Links
The End

 

DOWNLOAD & NOTES

Firstly, here is the download link to the example code as promised.

 

EXAMPLE CODE DOWNLOAD

Click here to download all the example source code, I have released it under the MIT license, so feel free to build on top of it or use it in your own project.

 

QUICK NOTES

If you spot a bug, please feel free to comment below. I try to answer questions too, but it is one person versus the entire world… If you need answers urgently, please check out my list of websites to get help with programming.

 

 

DEALING WITH DUPLICATE RECORDS

All right, let us now get into the examples of getting and counting duplicate records in MySQL.

 

DUMMY DATABASE TABLE

1-food.sql
CREATE TABLE `food` (
  `id` int(11) NOT NULL,
  `name` varchar(255) NOT NULL,
  `qty` int(11) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 
INSERT INTO `food` (`id`, `name`, `qty`) VALUES
(1, 'Candy', 57),
(2, 'Pasta', 123),
(3, 'Apple', 24),
(4, 'Orange', 67),
(5, 'Pasta', 8),
(6, 'Melon', 64),
(7, 'Apple', 31),
(8, 'Pasta', 12),
(9, 'Rice', 87),
(10, 'Candy', 247);
 
ALTER TABLE `food`
  ADD PRIMARY KEY (`id`);
 
ALTER TABLE `food`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=11; 

For this example, we will be using a simple food table with only 3 fields:

  • id Primary key, auto-increment.
  • name Name of the food.
  • qty Food quantity.

For the uninitiated, this table is not exactly bad if it is a warehouse “stock tracking” table. But if this is the “main products table” for an eCommerce site, it gets an immediate red flag – The product name should be unique to prevent duplicate entries.

 

 

GET DUPLICATE ENTRIES

2-get-dup.sql
SELECT `name`, COUNT(*) AS `rows`
FROM `food`
GROUP BY `name`
HAVING `rows` > 1

So which foods have duplicate rows? As in the above introduction, a SELECT - GROUP BY - HAVING COUNT > 1 query is all we need to filter out the duplicate records.

 

GET FULL DUPLICATE ENTRIES

3-get-dup.php
SELECT *
FROM `food`
WHERE `name` IN (
  SELECT `name`
  FROM `food`
  GROUP BY `name`
  HAVING COUNT(*) > 1
)

The previous example only gave the name of the foods and the number of duplicate rows they have. Just how do we get all the duplicate rows? Very simply – We nest the previous “get duplicates” query into another SELECT * query.

 

 

SUM OF DUPLICATE RECORDS

4-sum.sql
SELECT `name`, SUM(`qty`) AS `total`
FROM `food`
GROUP BY `name`

Want to get the sum of the quantity of all the duplicate entries? This should be pretty self-explanatory.

 

REMOVING DUPLICATES

Lastly, this is one for you guys who are looking to “fix” and remove the duplicate entries. Some other tutorials give complicated crazy methods, I only have a simple one. The first and golden rule, be non-destructive, create a new “staging” table.

5a-remove.sql
-- (A) CREATE NEW "STAGING" TABLE
CREATE TABLE `food-stage` (
  `id` int(11) NOT NULL,
  `name` varchar(255) NOT NULL,
  `qty` int(11) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 
ALTER TABLE `food-stage`
  ADD PRIMARY KEY (`id`),
  ADD UNIQUE KEY `name` (`name`);
 
ALTER TABLE `food-stage`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;

This is the same food table, except that name is now unique. Next, we write a simple script to “copy” the unique entries over.

5b-remove.php
<?php
// (A) CONNECT TO DATABASE
$pdo = new PDO(
  "mysql:host=localhost;dbname=test;charset=utf8",
  "root", "", [PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC]
);

// (B) GET SUM OF FOODS
$stmtA = $pdo->prepare("SELECT `name`, SUM(`qty`) AS `total` FROM `food` GROUP BY `name`");
$stmtA->execute();
while ($row = $stmtA->fetch()) {
  $stmtB = $pdo->prepare("INSERT INTO `food-stage` (`name`, `qty`) VALUES (?,?)");
  echo $row['name'] ." - ". $row['total'] . " - ";
  echo $stmtB->execute([$row['name'], $row['total']]) ? "OK" : "ERROR" ;
  echo "<br>";
}

This is in PHP, but it can really be in any programming language of your choice… So long as it connects to MySQL. Lastly, rename food to food-old and food-stage to food.

 

 

USEFUL BITS & LINKS

That’s all for the tutorial, and here is a small section on some extras and links that may be useful to you.

 

REMOVING DUPLICATES IS SUBJECTIVE

Before the trolls get angry, here’s a disclaimer – The process of removing duplicates is subjective to the existing database structure… It can get very complicated with foreign keys and even restructuring the entire database. So yes, which is why I recommend the non-destructive approach, and also create a backup copy of the database before you commit the changes.

 

LINKS & REFERENCES

 

INFOGRAPHIC CHEAT SHEET

Get & Count Duplicates In MYSQL (click to enlarge)

 

THE END

Thank you for reading, and we have come to the end. I hope that it has helped you to better understand, and if you want to share anything with this guide, please feel free to comment below. Good luck and happy coding!

Leave a Comment

Your email address will not be published. Required fields are marked *