Get & Count Duplicate Records In MYSQL (Simple Examples)

Welcome to a quick tutorial on how to get and count duplicate records in MySQL. So you want to filter out all the duplicate records? Or maybe get a count of the number of duplicates?

Assuming that only one table and column are involved, duplicate records can be retrieved with a simple query – SELECT `COLUMN` FROM `TABLE` GROUP BY `COLUMN` HAVING COUNT(*)>1

That covers the quick basics, but read on for detailed examples!

 

 

TABLE OF CONTENTS

 

DOWNLOAD & NOTES

Here is the download link to the example code, so you don’t have to copy-paste everything.

 

EXAMPLE CODE DOWNLOAD

Source code on GitHub Gist

Just click on “download zip” or do a git clone. I have released it under the MIT license, so feel free to build on top of it or use it in your own project.

 

SORRY FOR THE ADS...

But someone has to pay the bills, and sponsors are paying for it. I insist on not turning Code Boxx into a "paid scripts" business, and I don't "block people with Adblock". Every little bit of support helps.

Buy Me A Coffee Code Boxx eBooks

 

DEALING WITH DUPLICATE RECORDS IN MYSQL

All right, let us now get into the examples of getting and counting duplicate records in MySQL.

 

PART 1) DUMMY DATABASE TABLE

1-food.sql
-- (A) FOOD TABLE
CREATE TABLE `food` (
  `id` bigint(20) NOT NULL,
  `name` varchar(255) NOT NULL,
  `qty` bigint(20) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
 
ALTER TABLE `food`
  ADD PRIMARY KEY (`id`);
 
ALTER TABLE `food`
  MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
 
-- (B) DUMMY ENTRIES
INSERT INTO `food` (`id`, `name`, `qty`) VALUES
(1, 'Candy', 57),
(2, 'Pasta', 123),
(3, 'Apple', 24),
(4, 'Orange', 67),
(5, 'Pasta', 8),
(6, 'Melon', 64),
(7, 'Apple', 31),
(8, 'Pasta', 12),
(9, 'Rice', 87),
(10, 'Candy', 247);

For this example, we will be using a simple food table with only 3 fields:

  • id Primary key.
  • name Name of the food.
  • qty Food quantity.

For those who are new, this table structure is not exactly bad if it is for “warehouse stock tracking”. But if this is the “main products table” for an eCommerce site, it gets an immediate red flag – The product name should be unique to prevent duplicate entries.

 

 

PART 2) GET DUPLICATE RECORDS WITH COUNT

2-dup-count.sql
SELECT `name`, COUNT(*) AS `rows`
FROM `food`
GROUP BY `name`
HAVING `rows` > 1

So which foods have duplicate rows? As in the above introduction, a SELECT - GROUP BY - HAVING COUNT > 1 query is all we need to filter out the duplicate records.

 

PART 3) GET ALL RECORDS WITH DUPLICATES

3-dup-get.sql
SELECT *
FROM `food`
WHERE `name` IN (
  SELECT `name`
  FROM `food`
  GROUP BY `name`
  HAVING COUNT(*) > 1
)

The previous example only gave the name of the foods and the number of duplicate rows. Just how do we get all rows that have duplicates? Very simply – We nest the previous “get duplicates” query into another SELECT * query.

 

 

PART 4) SUMMING DUPLICATE RECORDS

4-dup-sum.sql
SELECT `name`, SUM(`qty`) AS `total`
FROM `food`
GROUP BY `name`

Want to get the sum of the quantity of all the duplicate entries? This should be pretty self-explanatory.

 

PART 5) REMOVING DUPLICATES

5A) CREATE A STAGING TABLE

5a-dup-remove.sql
-- (A) CREATE NEW "STAGING" TABLE
CREATE TABLE `food-stage` (
  `id` bigint(20) NOT NULL,
  `name` varchar(255) NOT NULL,
  `qty` bigint(20) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
 
ALTER TABLE `food-stage`
  ADD PRIMARY KEY (`id`),
  ADD UNIQUE KEY `name` (`name`);
 
ALTER TABLE `food-stage`
  MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT;

This is one for you guys who are looking to “fix” and remove the duplicate entries. Some other tutorials give complicated crazy methods, I only have a simple one. The first and golden rule, be non-destructive and create a new “staging” table.

This is the same food table, except that name is now unique. Next, we write a simple script to “copy” the unique entries over.

 

 

5B) SCRIPT TO REMOVE DUPLICATES

5b-dup-remove.php
<?php
// (A) CONNECT TO DATABASE
$pdo = new PDO(
  "mysql:host=localhost;dbname=test;charset=utf8",
  "USER", "PASSWORD", [
  PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
  PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC
]);

// (B) GET SUM OF FOODS
$stmtA = $pdo->prepare("SELECT `name`, SUM(`qty`) AS `total` FROM `food` GROUP BY `name`");
$stmtA->execute();
while ($row = $stmtA->fetch()) {
  $stmtB = $pdo->prepare("INSERT INTO `food-stage` (`name`, `qty`) VALUES (?,?)");
  echo $row['name'] ." - ". $row['total'] . " - ";
  echo $stmtB->execute([$row['name'], $row['total']]) ? "OK" : "ERROR" ;
  echo "<br>";
}

This is in PHP, but it can really be in any programming language of your choice… So long as it connects to MySQL. After running this script, the last step is to rename the tables:

  • food to food-old.
  • food-stage to food.

Done – You still have the old table as backup and can switch back if anything goes wrong.

 

 

EXTRAS

That’s all for the tutorial, and here is a small section on some extras and links that may be useful to you.

 

REMOVING DUPLICATES IS SUBJECTIVE

Before the trolls get angry, here’s a disclaimer – The process of removing duplicates is subjective to the existing database structure… It can get very complicated with foreign keys and even restructuring the entire database. So yes, which is why I recommend the non-destructive approach, and also create a backup copy of the database before you commit the changes.

 

LINKS & REFERENCES

 

THE END

Thank you for reading, and we have come to the end. I hope that it has helped you to better understand, and if you want to share anything with this guide, please feel free to comment below. Good luck and happy coding!

1 thought on “Get & Count Duplicate Records In MYSQL (Simple Examples)”

Leave a Comment

Your email address will not be published. Required fields are marked *