select rows where column contains same data in more than one record
I have a table that has a column called
article_title. Let's say the table name is
articles. I need to find out the records where the
article_titledata is the same on more than one record.
Here's what I've got:
select a.* from articles a where a.article_title = (select article_title from articles where article_title = a.article_title AND a.id <> articles.id)
HAVING is a great aggregate filter. (http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html) For example, select the article_titles with more than on occurrence:
SELECT count(*), article_title FROM articles GROUP BY article_title HAVING COUNT(*) > 1;
Adding columns to the SELECT and GROUP BY clauses allow you to locate duplicates based on a composite key of multiple columns.
@jkushner: but beware of the quirks (or bugs as some may see it) in MySQL's implementation of `GROUP BY`: http://www.mysqlperformanceblog.com/2006/09/06/wrong-group-by-makes-your-queries-fragile/
Your problem can be solved with this query:
SELECT * FROM article WHERE article_title IN (SELECT * FROM (SELECT article_title FROM article GROUP BY article_title HAVING COUNT(article_title) > 1) AS a);
have a table that has a column called article_title. Let's say the table name is articles. I need to find out the records where the article_title data is the same on more than one record.
Sound like to me you also need to have the id because you want to find records based on
article_titlebecause you have duplicates
Basic MIN/MAX with GROUP BY (you will miss id's when more then 2 duplicates)
SELECT MIN(id) -- for FIFO id's (first id by duplicate) , MAX(id) -- for LIFO id's (last id by duplicate) , article_title , COUNT(*) FROM articles WHERE -- Maybe to filter out '' or IS NOT NULL article_title != '' AND article_title IS NOT NULL GROUP BY article_title ASC HAVING COUNT(*) >= 2 ;
Or back to denormalisation to generate an CSV for LIFO id's (older id's by duplicates) but you know all id's here..
SELECT GROUP_CONCAT(id ORDER BY ASC SEPARATOR ',') -- change to DESC if want the last record first , article_title , COUNT(*) FROM articles GROUP BY article_title ASC HAVING COUNT(*) >= 2 ;