Thursday, May 16, 2013

SELECT DISTINCT, a code smell

I was about to write why we should avoid SELECT DISTINCT, as it looks like a red flag that there’s some wrong in logic of the query. DISTINCT looks like a case of Hope-Driven-Development, hoping that DISTINCT won’t kill performance, and hoping it’s sufficient enough to eliminate the error in the logic of the query; and certainly, Hope Is Not a Strategy. We tend to use DISTINCT, either there’s something wrong in the logic of the query and/or when the database is not properly normalized. Fortunately, there are already plenty of articles that explains this SQL anomaly very well





* In 9 out of 10 cases when I encountered a "Select distinct", the programmer tried to remove duplicate rows caused by joins along multiple too-many associations (AKA cartesian product). This not only is bad programming style, it might also kill performance. -- http://arnosoftwaredev.blogspot.com.au/2011/01/is-select-distinct-code-smell.html




* A SELECT DISTINCT query is frequently a "code smell"; it indicates that something is not quite right. That is because using the DISTINCT keyword means that redundant data is being pulled from the database and then discarded. Usually it is better to rewrite the query's FROM and WHERE clauses to use a subquery to filter the data correctly so that you only get back what you want. -- http://www.databasejournal.com/features/postgresql/article.php/3437821/SELECT-DISTINCT-A-SQL-Case-Study.htm




* Use of “SELECT DISTINCT” to mask a join problem (Nick Harrison) -- https://www.simple-talk.com/blogs/2010/11/22/listing-common-sql-code-smells/




* The SELECT DISTINCT is rarely used in a properly designed schema. It is a “code smell”, symptom and not proof -- http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=159603



Good read: http://c2.com/cgi/wiki?SelectDistinctIsaCodeSmell



Happy Coding! ツ

Wednesday, May 15, 2013

Database convention that prevents confusions

Don't do it like this:
Order:
    OrderId     PK
    Date
    Description



LineItem:
    LineItemId  PK  
    OrderId     FK
    ProductId   FK
    Qty
    UnitPrice
    Amount


To the uninitiated, the LineItem table will look like an standalone table, wherein OrderId will seems like a ProductId column too. Meaning, LineItem's OrderId seems something that can be changed by the user (e.g. via dropdown). Where in actual application, the OrderId on LineItem is something that is set by the app and not being set by the user, not visible to user.



To emphasize that the LineItem only makes sense within the domain of Order and is not a standalone table, and consequently emphasizing that the LineItem's OrderId is not a user-selectable value, move the OrderId before the LineItemId primary key. That is, any foreign key that is before the primary key, should be treated as immutable foreign key.


LineItem table will now look like this:
LineItem:
    OrderId     FK
    LineItemId  PK  
    ProductId   FK
    Qty
    UnitPrice
    Amount             





Placing the immutable foreign key before the primary key is also especially useful when you have a third table(e.g. Preference) on which the defined relationship is not readily apparent. For example:
Product:
    ProductId       PK
    ProductName
    
    
Reseller:
    ResellerId      PK
    ResellerName 
    

Preference:
    PreferenceId    PK
    ProductId       FK
    ResellerId      FK
    PreferenceLevel


If you have a third table that has an unclear name and renaming the table is not an option (e.g. brownfield projects), you can at least move the owning FK before the primary key to emphasize the ownership of Preference table.


For example, if you are a supplier of products and you want to keep track your preferred resellers on each of your product, it's good to structure your Preference table like this:
Preference
    ProductId       FK
    PreferenceId    PK
    ResellerId      FK
    PreferenceLevel


In that case, moving the ProductId before the PreferenceId primary key indicates that the owning entity of the Preference table is the Product table. Hence the PreferenceLevel on that table design, indicates your most preferred resellers for your given product down to the least preferred.



Likewise, if you move the ResellerId before the PreferenceId primary key:
Preference
    ResellerId      FK
    PreferenceId    PK  
    ProductId       FK
    PreferenceLevel


The PreferenceLevel value on that design indicates the most preferred products by a given reseller to buy from you, down to their least preferred. The owning entity on that Preference table is the Reseller


This is just one of the many simple database conventions you can apply to your database, and it can surely help anyone who will read and need to understand your domain entities' relationships.




Lastly, if you have a third table like this:
BookAuthor:
     BookAuthorId
     BookId
     AuthorId


It can also be named as AuthorBook:
AuthorBook:
     AuthorBookId
     AuthorId
     BookId



The third table can be named either way, this is usually called a junction table, this defines a many-to-many relationship, there is no one single owner of the junction table. The junction table is owned by both Book and Author entity. And the columns can be arranged either way too.


To emphasize that there is no single ownership of the junction table. Design your own convention to emphasize non-single ownership. First you can define the third table primary key be the first column of the table, and the two other columns as unique composite. You can also designate some naming convention, e.g. you can name the table like this:

Author_assoc_Book:
     Author_assoc_Book_Id
     AuthorId
     BookId

It's ok to give name your junction table a weird name, as junction tables without payload are usually hidden and not mapped to your ORM classes. On the example above, we put the word assoc between Author and Book, you can choose your own naming convention to emphasize the junction-ness of the table, just be consistent on the naming convention. On the order of entities on the junction table name, since there is no one single owner of the junction table, we can opt to just arrange the name of the two tables on the junction table name based on their alphabetical order to emphasize non-sole-ownership. So Author comes before Book, giving the junction table a name of Author_assoc_Book, not Book_assoc_Author. Likewise with the their columns arrangement, it's alphabetically arranged too, hence AuthorId comes before BookId.


Happy Coding! ツ

Sunday, May 12, 2013

NuGet on other platforms

How to use NuGet on other platforms, e.g. Mac, Linux, FreeBSD, etc

http://www.codedistillers.com/rui/2013/01/09/net-web-dev-on-mac-nuget-monodevelop-nancy/


Other platforms are good for developers who are moonlighting as astronauts, or if you are just simply a disenfranchised Windows developer


And oh, another Linux device is powering a satellite

Friday, May 10, 2013

Brownfield Projects Woes

Maintaining brownfield projects is a myth, especially if it reeks of technical debts. So they tend to be rewritten, and voila! a greenfield project!



5 reasons to use AngularJS in the corporate app world in order to avoid rewriting brownfield projects



And this is the coolness B-) On-the-fly unit test:



“By using Karma Test Runner given to you by the AngularJS generators in Yeoman, you can have your unit tests running all the time in the background, that way if you change anything and hit save, your tests will automatically re-run and you will know if you are breaking something, where you are breaking it, and if the tests are built correctly maybe even why you are breaking it.”

Saturday, May 4, 2013

Earworm, I kept humming the instrumental intro of Eat Bulaga

These past few weeks I keep humming the instrumental intro of Eat Bulaga while I'm coding, which only recently I was able to unlock the reason why. The instrumental intro of Eat Bulaga is eerily similar to Sing a Song.


Here's the line I keep humming on Sing a Song:


"Don't worry if it's not good enough" -- http://www.metrolyrics.com/sing-lyrics-lea-salonga.html


Very similar to Eat Bulaga's instrumental intro.


Subconciously, I'm singing a very developer-centric song. So the next time you hear someone humming Eat Bulaga, that doesn't automatically translate to them being an Eat Bulaga fan(not that I have a problem on people being a partisan of a given primetime show or TV network :)). For all we know, those developers are subconsciously seeking enlightenment on the quality of their work.


Code quality should be tempered with the reality of the deadlines. Code quality is important, but it's not a be-all and end-all of an application development. There's also this thing called Time-to-Market.


"Quality in development is important, but it has to be scaled appropriately" -- http://haacked.com/archive/2010/08/26/not-paid-to-write-code.aspx


Of course, code quality and time-to-market (that's deadline for us who are not on the frontlines) are not mutually exclusive. Both are achievable, when torn between those two, just remember that perfect is the enemy of the good. And there's the Pareto Principle that could guide us on achieving both.
And don't even tell you're a perfectionist. That's a humblebrag masquerading as a virtue. There are more people way more perfectionist than you, and that would reduce you to a mere humblebragger



So there's the feel-good tune of Eat Bulaga, the feel of the music resonates with developers ;-)


Happy Coding! ツ

XY Problem. If someone is looking for a fix on Y, ask them first: "What's the X?"

Nope, XY problem doesn't mean men are troubles lol


There are certain class of developers that are too enamored with their codes, that when they ask for help they will proudly show their codes first and ask help on those codes, rather than stating the problem itself first


So what is XY problem? That's the terminology for the scenario above.


"The X-Y Problem, as it is sometimes called, is a mental block which leads to enormous amounts of wasted time and energy, both on the part of people asking for help, and on the part of those providing help. It often goes something like this:

* User wants to do X.
* User doesn't know how to do X, but thinks they can fumble their way to a solution if they can just manage to do Y.
* User doesn't know how to do Y either.
* User asks for help with Y.
* Others try to help user with Y, but are confused because Y seems like a strange problem to want to solve.
* After much interaction and wasted time, it finally becomes clear that the user really wants help with X, and that Y wasn't even a suitable solution for X.
" -- http://mywiki.wooledge.org/XyProblem


So the next time around, before you even show how awesome your coding skills are, remember first that other people's time and energy are important, don't let it go wasted. Tell your X first, before you even show your Y.


Friends don't let friends waste time or energy on fixing the Y when a Y could even be discarded and go for a better solution when X is stated first.


The best code is no code at all. Code, especially the longer ones, distracts us from stating the intent/problem of the application.



Some developers just love to provide solution, even if it meant not writing a single line of code. They'll just purchase a Microsoft Word and sell it to you with a nice markup for their troubles heading to a software shop, and sip Pina Coladas on the beach the rest of the day. Ok, a lot of context is missing here, that's why this paragraph looks weird. See the XY problem connection on this paragraph? Y often looks weird if X is not stated. For the X, head to this blog post: http://haacked.com/archive/2010/08/26/not-paid-to-write-code.aspx



Other good sources on XY problem:

http://blog.spitfiremanagement.com/2013/04/10/avoid-the-xy-problem/


http://daniel-lange.com/archives/69-The-XY-problem-in-IT-management.html


"The XY problem is when you need to do X, and you think you can use Y to do X, so you ask about how to do Y, when what you really should do is state what your X problem is. There may be a Z solution that is even better than Y, but nobody can suggest it if X is never mentioned." -- http://meta.stackoverflow.com/questions/66377/what-is-the-xy-problem

Thursday, May 2, 2013

The last vestige of me thinking in imperative manner

SQL is a declarative language, but there are some imperative code lurking in an SQL query. Case in point:

select d.*, p.product_name
from order_detail d
join product p on p.product_id = d.product_id

Did you spot the imperative code there?

I still think of the second table and first table in terms of inner loop being compared to an outer loop, hence p.product_id = d.product_id. I should forget how things work under-the-hood, forget the loops, and be comfortable with a more declarative-looking code: d.product_id = p.product_id