Thursday, May 16, 2013

SELECT DISTINCT, a code smell

I was about to write why we should avoid SELECT DISTINCT, as it looks like a red flag that there’s some wrong in logic of the query. DISTINCT looks like a case of Hope-Driven-Development, hoping that DISTINCT won’t kill performance, and hoping it’s sufficient enough to eliminate the error in the logic of the query; and certainly, Hope Is Not a Strategy. We tend to use DISTINCT, either there’s something wrong in the logic of the query and/or when the database is not properly normalized. Fortunately, there are already plenty of articles that explains this SQL anomaly very well





* In 9 out of 10 cases when I encountered a "Select distinct", the programmer tried to remove duplicate rows caused by joins along multiple too-many associations (AKA cartesian product). This not only is bad programming style, it might also kill performance. -- http://arnosoftwaredev.blogspot.com.au/2011/01/is-select-distinct-code-smell.html




* A SELECT DISTINCT query is frequently a "code smell"; it indicates that something is not quite right. That is because using the DISTINCT keyword means that redundant data is being pulled from the database and then discarded. Usually it is better to rewrite the query's FROM and WHERE clauses to use a subquery to filter the data correctly so that you only get back what you want. -- http://www.databasejournal.com/features/postgresql/article.php/3437821/SELECT-DISTINCT-A-SQL-Case-Study.htm




* Use of “SELECT DISTINCT” to mask a join problem (Nick Harrison) -- https://www.simple-talk.com/blogs/2010/11/22/listing-common-sql-code-smells/




* The SELECT DISTINCT is rarely used in a properly designed schema. It is a “code smell”, symptom and not proof -- http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=159603



Good read: http://c2.com/cgi/wiki?SelectDistinctIsaCodeSmell



Happy Coding! ツ

No comments:

Post a Comment