Sunday, May 27, 2012

The specified Tomcat installation directory does not exist

If you encountered this error on Eclipse:

The specified Tomcat installation directory does not exist.

Check for CATALINA_HOME value on commandline. Type set on commandline, you can then see the value of CATALINA_HOME, mine is /Library/Tomcat/Home

Go to Eclipse's Preferences. Go to Server > Runtime Environments. Then click Add, then select Apache Tomcat, click Next; on Tomcat installation directory, click Browse..., then paste /Library/Tomcat/Home to the Open dialog textbox; if you are using OS X press command+shift+G first, then paste /Library/Tomcat/Home in the Go to the folder's textbox, then click Go, then click Open

Then click Finish, then click OK

You can now create Dynamic Web Project

Wednesday, May 23, 2012

Why I like Postgresql? It has first-class boolean

What would you rather read. This?

select personId
from personDailyDiet
group by personId
having count(case when fruit = 'apple' then 1 end) = count(*)


Or this?

select personId
from personDailyDiet
group by personId
having every(fruit = 'apple');

And every automatically short-circuit if it find a row that doesn't meet the condition. Not only it is readable, it is fast too.

EVERY is an alias for bool_and, you can use bool_and to make your query sounds computer-sciencey ;-)
select personId
from personDailyDiet
group by personId
having bool_and(fruit = 'apple');


It's unfortunate that ANY keyword is already taken, the bool_or could be aliased as ANY.

bool_and = EVERY
bool_or = no english-like keyword

Running total. Here and now

Moved to: http://www.ienablemuch.com/2012/05/running-total-here-and-now.html

Friday, May 18, 2012

Operator Precedence 101

#include <stdio.h>
 
int main() {
 
 
    int a = 1;
    int b = 0;
 
    // Following two statements returns the same output
    printf("\nhah %d", a != 0 & b != 0); // returns 1
    printf("\nhah %d", a != 0 && b != 0); // returns 1
 
    // Despite the common misconception that the statement 1 when written explicitly
    // is this...
    printf("\nmeh %d", (a != (0 & b)) != 0); // returns 0
    // ..., statement 1's AND(&) operator still retain the same operator precedence as its short-circuit cousin(&&)
 
 
    printf("\n");
 
    const int ALT_KEY = 2;
    int input = 1;
     
    // should return 0, it returns 0:
    printf("\nhah %d", (input & ALT_KEY) == ALT_KEY);
 
    // despite the expectation that this "should" return 0, this doesn't return 0:
    printf("\nmeh %d", input & ALT_KEY == ALT_KEY);
 
    // So it means, despite the introduction of short-circuit operator,
    // the non-short-circuit logical/bitwise operator (&,|) still retain their 
    // operator precedence.
    // Hence, the unparenthesized expression (input & ALT_KEY == ALT_KEY), when written explicitly is still evaluated as:
    printf("\nhah %d", input & (ALT_KEY == ALT_KEY) ); // returns 1
 
    // Similar with operator precedence of logical operator:
    printf("\nhah %d", input && ALT_KEY == ALT_KEY ); // returns 1
     
    // Logical operator when written explicitly
    printf("\nhah %d", input && (ALT_KEY == ALT_KEY) ); // returns 1
     
 
 
    printf("\n");
}


You might notice that I mentioned the phrase still retain two times. There was a time that there was no short-circuit operators, && and ||. Programmers uses & and | for logical expression. Despite the introduction of their short-circuit equivalent, the non-short-circuit logical/bitwise operator precedence was not changed, the non-short-circuit operators' precedence are still equivalent with the short-circuit ones.


Long before, when I'm making a DOS(hahah showing my age) program to detect a hotkey, I tried it this way first:

   if (input & ALT_KEY == ALT_KEY) {
      printf("You pressed the shortcut combo!");
   }


Expecting it to evaluate the same with arithmetic operator precedence:
   if (a + b == C) {
      printf("True true");
   }


But alas, C is not free with old cruft, it still has B-ness in it. So you must write it this way:
   if ( (input & ALT_KEY) == ALT_KEY ) {
      printf("You pressed the shortcut combo!");
   }


If the unparenthesized expression(input & ALT_KEY == ALT_KEY) is written explicitly, it is this:
   if ( input & (ALT_KEY == ALT_KEY) ) {
      printf("You pressed the shortcut combo!");
   }

Disappointing isn't it? Not symmetrical with arithmetic operator precedence.


Backward compatibility at its finest: http://cm.bell-labs.com/cm/cs/who/dmr/chist.html

Output:
hah 0
hah 0
meh 1

hah 0
meh 1
hah 1
hah 1
hah 1

See the live example: http://ideone.com/allR6

Thursday, May 17, 2012

Java said, C# said. Scope consistency

Whereas in Java, this illegal code would become legal when you move the int i = 20 before the last line, in C# it is still illegal


class TestLoop{
   public static void main(String[] args){
 

        int i = 20;                 
        
        for (int i = 0; i < 10; i++) System.out.print(i + " ");  
        for (int i = 10; i > 0; i--) System.out.print(i + " ");  


        System.out.print(i + " ");   // last line
   }
}


Moving int i = 20; after those loops that uses the same variable name i would become legal in Java; whereas in C#, it is still illegal:

class TestLoop{
   public static void main(String[] args){
 
        
        for (int i = 0; i < 10; i++) System.out.print(i + " ");  
        for (int i = 10; i > 0; i--) System.out.print(i + " ");  

        int i = 20;                 

        System.out.print(i + " ");   // last line
   }
}

Wednesday, May 16, 2012

C# said, Java said. I'm a B

Checking the type on C#

class A {}
class B : A {}
class C : B {}
class D : C {}

class TestInheritance {
    public static void Main(string[] args) { 
        object o = new C();

        if (o is B)
            System.Console.WriteLine("I'm a B");

        if (o.GetType() == typeof(B))
            System.Console.WriteLine("I'm the B");


        if (o is C)
            System.Console.WriteLine("I'm a C");

        if (o.GetType() == typeof(C))
            System.Console.WriteLine("I'm the C");

    }
}


Checking the type on Java

class A {}
class B extends A {}
class C extends B {}
class D extends C {}

class TestInheritance {
    public static void main(String[] args) { 
        Object o = new C();

        if (o instanceof B)
            System.out.println("I'm a B");

        if (o.getClass() == B.class)
            System.out.println("I'm the B");


        if (o instanceof C)
            System.out.println("I'm a C");      

        if (o.getClass() == C.class)
            System.out.println("I'm the C");

    }
}


Both languages has this output:
I'm a B
I'm a C
I'm the C

Monday, May 14, 2012

Java, moving furnitures around is a no-no. Refactoring-happy fellow be warned

class A { void m() { System.out.println("outer"); }}

public class TestInners {

 public static void main(String[] args) {
  new TestInners().go();
 }

 void go() {  
  new A().m();

  class A { void m() { System.out.println("inner"); } }  
 }

 class A { void m() { System.out.println("middle"); } }
}

You might have assumed the output is inner, since the class is in the same scope as the method. I don't know what are the Java language designers are thinking, it's more intuitive to choose the inner regardless of what line the class is declared. Java has some quirks on rules for resolving classes scope and accessibility, and this is one of those.

That code output is middle.


Move the class A at first line of go(), the output is inner:

void go() { 
 class A { void m() { System.out.println("inner"); } }  
 
 new A().m();
}

Sunday, May 13, 2012

Java said, C# said. Static nested class


Java's static nested class, is mostly used for scoping needs:

// package-level is the default, there's no keyword to explicitly say so. This is analogous to C#'s internal
class Car {

    public int i = 7;

    Car() {
        Tire t = new Tire("Goodyear");
        Tire x = new Tire("Firemint");

        x.run();
        t.run();


    }

    private static class Tire {


        // package-level is the default. package-level is analogous to C#'s internal
        private String _s;

        // package-level is the default. there's no keyword to explicitly say so
        Tire(String s) {
            _s = s;
        }

        void run() {
            System.out.println("Hello " + _s);

            // cannot access i:
            // System.out.println(i);
        }
    }
}

The equivalent in C# :

// internal is the default, no need to explicitly say so
class Car {
    public int i = 7;
    
    // private is the default, explicitly say internal. analogous to java's package-level
    internal Car() {
        
        var t = new Tire("Goodyear");
        var x = new Tire("Firemint");


        x.Run();
        t.Run();

    }
    
    
    // internal is the default, explicit say private
    private class Tire {
        // private is the default no need to explicitly say so
        string _s; 
        
        // private is the default, explicitly say internal. analogous to java's package-level
        internal Tire(string s) {
            _s = s;
        }
        
        internal void Run() {
            Console.WriteLine ("Hello " + _s);
            
            // cannot access i:
            // Console.WriteLine (i);

        }
    }
}

Friday, May 11, 2012

Accepting types from lowerbound(derived) to up(super) in Java generics

I wish Java had implemented generics first, so C# could copy this feature


import java.util.List;

import java.util.ArrayList;




public class Test {


    public static void main(String[] args) {

        // test addLot
        {
            List<Asset> assets = new ArrayList<Asset>();
            addLot(assets);

            List<Lot> lots = new ArrayList<Lot>();
            addLot(lots);

            List<CommercialLot> commercialLots = new ArrayList<CommercialLot>();
            // you cannot do this, compile-time error:
            // addLot(commercialLots);
        }

        // test addCommercialLot
        {
            List<Asset> assets = new ArrayList<Asset>();
            addCommercialLot(assets);

            List<Lot> lots = new ArrayList<Lot>();
            addCommercialLot(lots);

            List<CommercialLot> commercialLots = new ArrayList<CommercialLot>();
            addCommercialLot(commercialLots);
        }

    }

    public static void addLot(List<? super Lot> lots) {

        // you cannot do this, compile-time error:
        // lots.add(new Asset());

        // you can do this:
        lots.add(new Lot());

        // and so is this
        lots.add(new CommercialLot());

        ////////

        for(Asset a : (List<Asset>) lots) {
        }

        for(Lot l : (List<Lot>) lots) {
        }

        // you cannot do this, compile-time error:
        /*for(CommercialLot c : (List<CommercialLot>) lots) {
        } */
    }

    public static void addCommercialLot(List<? super CommercialLot> commercialLots) {

        // you cannot do this, compile-time error:
        // commercialLots.add(new Asset());

        // you cannot do this, compile-time error:
        // commercialLots.add(new Lot());

        // but of course, you can do this:
        commercialLots.add(new CommercialLot());

        ////////

        for(Asset a : (List<Asset>) commercialLots) {
        }

        for(Lot l : (List<Lot>) commercialLots) {
        }

        for(CommercialLot c : (List<CommercialLot>) commercialLots) {
        }
    }
}

class Asset {
}

class Lot extends Asset  {
}

class CommercialLot extends Lot {
}



Live test: http://ideone.com/dbgCU


The disadvantage of this approach though is it necessitates adding a new method with lowerbounded parameter whenever there's a new class that subclass the superclass. For alternative approach, read this: http://www.anicehumble.com/2012/05/c-said-java-said-oop-for-generics.html


Thursday, May 10, 2012

C# said, Java said. OOP for generics

C# said

using System.Collections.Generics;

class Program {

    public static void Main (string[] args) {
    
        IList<Animal> animals = new List<Animal>();
        AddAnimal(animals);
        foreach (Animal animal in animals) {
            Console.WriteLine (animal);
        }
    
        IList<Dog> dogs = new List<Dog>();
        AddAnimal(dogs);
        foreach (Animal animal in dogs) {
            Console.WriteLine (animal);
        }
        
    }
    
    public static void AddAnimal<T>(IList<T> animals) where T : Animal, new() {
                
        foreach(Animal a in animals) {
        }
                
        T x = new T();
        animals.Add(x);
    }   

}


class Animal {
}

class Dog : Animal {
}

class Plant {
}

Java said

import java.util.*;

public class Program {
    public static void main(String[] args) 
        throws InstantiationException, IllegalAccessException 
    {

        List<Animal> animals = new ArrayList<Animal>();
        addAnimal(animals, Animal.class);
        for (Animal animal : animals) {
            System.out.println (animal);
        }
    
        List<Dog> dogs = new ArrayList<Dog>();
        addAnimal(dogs, Dog.class);
        for (Animal animal : dogs) {
            System.out.println (animal);
        }

    }
    
    
    public static <T extends Animal> void addAnimal(
        List<T> animals, Class<T> clz)
        throws InstantiationException, IllegalAccessException
    {

        for(Animal a : animals) {
        }

        T x = clz.newInstance();
        animals.add(x);
        
    }
        
}

class Animal {
}

class Dog extends Animal {
}

class Plant {
}

The notable difference is between their instantiation of the generic's type. Whereas in C# you just say new T(), the same thing can't be done as straightforward as it be in Java. As Java generics employs type erasure, you need to explicitly pass to the function the type you want to instantiate.


The extends on our sample Java generics code is called upperbound wildcard, mostly used for getting values

The super on generics is called lowerbound wildcard, mostly used for putting values.


Though in our example, we use extends for both getting value and putting value. In order to put the value, we pass(a necessity on Java,Java erases the type of generic's parameter T) the class' type to the function.




To read something about get-put principle: http://www.ibm.com/developerworks/java/library/j-jtp07018/index.html


Another approach for using the generic on method parameter(e.g. good for collection) is to use lowerbounded generic parameters, read it at: http://www.anicehumble.com/2012/05/lowerbound-generics-in-java.html

Tuesday, May 8, 2012

C# said, Java said. Constraining generic's type

C# version:

public class InterfaceDemo {
    public static void Main(string[] args) { 
        
        Geek john = new Geek("John");
        Geek paul = new Geek("Paul");

        Hipster george = new Hipster("George");
        Hipster ringo = new Hipster("Ringo");

        var geekPersonalitiesIntroducer = new PersonalitiesIntroducer<Geek>();  
        var hipsterPersonalitiesIntroducer = new PersonalitiesIntroducer<Hipster>();

        geekPersonalitiesIntroducer.Introduce(paul,john);
        hipsterPersonalitiesIntroducer.Introduce(ringo,george);

        // won't work, you can only introduce geeks to each other.
        // error can be caught during compile-time:
        // geekPersonalitiesIntroducer.Introduce(paul,george);

        // won't work, you can only introduce hipsters to each other.
        // error can be caught during compile-time:
        // hipsterPersonalitiesIntroducer.Introduce(ringo,john);

    } 
}


interface IGreetable<T> {
    string Hello(T another);
}


class Geek : IGreetable<Geek> {

    string _name;
    public Geek(string name) {
        _name = name;
    }


    public string Hello(Geek another) {
        return "Howdy, " + another + ". I'm " + _name;
    }   


    override public string ToString() {
        return _name;
    }
}


class Hipster : IGreetable<Hipster> {
    
    string _name;
    public Hipster(string name) {
        _name = name;       
    }

    
    public string Hello(Hipster another) {
        return "Hola, " + another + ". I'm " + _name;
    }   


    override public string ToString() {
        return _name;
    }   
}


class PersonalitiesIntroducer<T> where T : IGreetable<T> {

    public void Introduce(T thisPerson, T toThatPerson) {
        
        System.Console.WriteLine(thisPerson + " meet " + toThatPerson);
        System.Console.WriteLine(thisPerson.Hello(toThatPerson));
        System.Console.WriteLine();
    }
}


Java version:

public class InterfaceDemo {
    public static void main(String[] args) {

        Geek john = new Geek("John");
        Geek paul = new Geek("Paul");

        Hipster george = new Hipster("George");
        Hipster ringo = new Hipster("Ringo");

        PersonalitiesIntroducer<Geek> geekPersonalitiesIntroducer = new PersonalitiesIntroducer<Geek>();
        PersonalitiesIntroducer<Hipster> hipsterPersonalitiesIntroducer = new PersonalitiesIntroducer<Hipster>();

        geekPersonalitiesIntroducer.introduce(paul,john);
        hipsterPersonalitiesIntroducer.introduce(ringo,george);

        // won't work, you can only introduce geeks to each other.
        // error can be caught during compile-time:
        // geekPersonalitiesIntroducer.introduce(paul,george);

        // won't work, you can only introduce hipsters to each other.
        // error can be caught during compile-time:
        // hipsterPersonalitiesIntroducer.introduce(ringo,paul);

    }
}


interface Greetable<T> {
    String hello(T another);
}


class Geek implements Greetable<Geek> {

    String _name;
    public Geek(String name) {
        _name = name;
    }


    public String hello(Geek another) {
        return "Howdy, " + another + ". I'm " + _name;
    }


    @Override public String toString() {
        return _name;
    }
}


class Hipster implements Greetable<Hipster> {

    String _name;
    public Hipster(String name) {
        _name = name;
    }


    public String hello(Hipster another) {
        return "Hola, " + another + ". I'm " + _name;
    }


    @Override public String toString() {
        return _name;
    }
}


class PersonalitiesIntroducer<T extends Greetable<T>> {

    public void introduce(T fromPerson, T toAnotherPerson) {

        System.out.println(fromPerson + " meet " + toAnotherPerson);
        System.out.println(fromPerson.hello(toAnotherPerson));
        System.out.println();
    }
}

Output:

Paul meet John
Howdy, John. I'm Paul

Ringo meet George
Hola, George. I'm Ringo

The only part where Java and C# significantly differs(if at all) in these codes are in constraining the generic's type.

Contrast C#'s code:

class PersonalitiesIntroducer<T> where T : IGreetable<T> {

    public void Introduce(T thisPerson, T toThatPerson) {
  
        System.Console.WriteLine(thisPerson + " meet " + toThatPerson);
        System.Console.WriteLine(thisPerson.Hello(toThatPerson));
        System.Console.WriteLine();
    }
}

To Java's code:

class PersonalitiesIntroducer<T extends Greetable<T>> {

    public void introduce(T fromPerson, T toAnotherPerson) {

        System.out.println(fromPerson + " meet " + toAnotherPerson);
        System.out.println(fromPerson.hello(toAnotherPerson));
        System.out.println();
     }
}


C#'s syntax for constraining the generic type has more affinity with how the generics-using class will be used(instantiated). So you say in C# (and Java too):

PersonalitiesIntroducer<Geek> pi = new PersonalitiesIntroducer<Geek>(); // same syntax on both Java and C
// In C#, you can shorten it to:
var pi = new PersonalitiesIntroducer<Geek>(); 
// In Java 7, you can shorten it to:
PersonalitiesIntroducer<Geek> pi = new PersonalitiesIntroducer<>();

Though in Java, the class declaration is like this:
class PersonalitiesIntroducer<T extends Greetable<T>> 

You will not say in Java:
PersonalitiesIntroducer<Geek extends Greetable<Geek>> pi = new PersonalitiesIntroducer<Geek extends Greetable<Geek>>();

In Java, you instantiate generics the same way as C#.

Minor syntactic differences aside, C# and Java are very similar. You could even replace the Java code in movie Independence Day with C# and people won't even notice it

C# said, Java said: Generic instantiation

C# said

public class GenericDemo {
    public static void Main(string[] args) {
        First f = Everybody.Want<First>();
        System.Console.WriteLine(f);
    }
}


class Everybody {
    public static T Want<T>() where T : new() {
        return new T();
    }
}


class First {
 override public string ToString() {
  return "Hello";
 }
}

Java said

public class GenericDemo {
    public static void main(String[] args) throws IllegalAccessException, InstantiationException  {
       First f = Person.Wants(First.class);
       System.out.println(f);
    }
}


class Person {
    public static <T> T Wants(Class<T> t) throws IllegalAccessException, InstantiationException {
        return t.newInstance();
    }
}


class First {
 @Override public String toString() {
  return "Hello";
 }
}

Capping negative numbers to zero

select 
   ( col1 + abs(col1) ) / 2 as capped_to_zero
from tbl

is faster than:


select 
   case when col1 >= 0 then col1 else 0 end as capped_to_zero
from tbl


Monday, May 7, 2012

Java generic type erasure advantages and disadvantages

Java's type erasure on its generics allows the components produced on generics-using JVM be instantly available on non-generics JVM


An example, let's say we have a Greetable interface

interface Greetable<T> {
   String hello(T another);
}

On non-generics-capable Java, e.g. developers using JVM 1.4 or earlier, shall see Greetable's hello method's parameter as:



Type erasure has some problems though. Let's say you have two classes Geek and Hipster that implements the same interface, yet they can be friendly only among their own kind, this is where type erasure can be a problem. Read on.


class Hipster implements Greetable<Hipster>
{
    String _name;

    Hipster(String name) {
        _name = name;
    }

    @Override
    public String toString() {
        return _name;
    }

    @Override
    public String hello(Hipster another) {
        return "Hi I'm " + _name + ", hola mi amigo " + another + "!";
    }

}

class Geek implements  Greetable<Geek>
{
    String _name;

    Geek(String name) {
        _name = name;
    }

    @Override
    public String toString() {
        return _name;
    }

    @Override
    public String hello(Geek another) {
        return "Hi I'm " + _name + ", charie " + another + "!";
    }
}

You can see that both of them implements method hello, yet their accepted parameter is confined only to their own kind. This is where generics shine, error(type mismatches) can be caught during compile-time, not when it is costly to fix an error(during runtime)

Given this:

Hipster george = new Hipster("George");
Geek paul = new Geek("Paul");


Then you do this:
george.hello(paul); // compile error

That will not be allowed by the compiler, that will result to compile-time error. You use generics if you want to enforce types compatibility. Compiler can catch this error

// this is allowed. Hipster is derived from Greetable
Greetable<hipster> george = new Hipster("Pete"); 


Geek paul = new Geek("Paul");


// but this will not be allowed.
george.hello(paul); 


george.hello method signature is hello(Hipster another);. Paul is not a Hipster, he's a Geek, hence the compiler can catch that Paul doesn't matched George's friend preference of Hipster.


With type erasure, type mismatches are not caught by the compiler:


// this is allowed. Hipster is derived from Greetable
Greetable george = new Hipster("George"); 

george.hello(paul); // this will be allowed by the compiler. yet this will throw an exception during runtime

With type erasure, the type on george's hello method became an untyped one, it has method signature of hello(Object another). This has the consequence of the compiler not being able to catch type mismatches for you, hence the type incompatibility just arises during runtime only, the runtime will throw an exception; which is bad, it's better to fix errors earlier.

Thursday, May 3, 2012

SQL COUNT: A Computer Program Does What You Tell It To Do, Not What You Want It To Do

Given this data:

test=# select * from ios_app order by app_code, date_released;
.ios_app_id | app_code | date_released | price  
------------+----------+---------------+--------
          1 | AB       | 2010-01-01    | 1.0000
          3 | AB       | 2010-01-03    | 3.0000
          4 | AB       | 2010-01-04    | 4.0000
          2 | TR       | 2010-01-02    | 2.0000
          5 | TR       | 2010-01-05    | 5.0000
(5 rows)

And this data:
test=# select * from android_app order by app_code, date_released;
.android_app_id | app_code | date_released |  price  
----------------+----------+---------------+---------
              1 | AB       | 2010-01-06    |  6.0000
              2 | AB       | 2010-01-07    |  7.0000
              7 | MK       | 2010-01-07    |  7.0000
              3 | TR       | 2010-01-08    |  8.0000
              4 | TR       | 2010-01-09    |  9.0000
              5 | TR       | 2010-01-10    | 10.0000
              6 | TR       | 2010-01-11    | 11.0000
(7 rows)

I made this query...

select x.app_code, 
 count(i.date_released) as ios_release_count, 
 count(a.date_released) as android_release_count
from app x
left join ios_app i on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
group by x.app_code
order by x.app_code


...and I wanted that to display this...

.app_code | ios_release_count | android_release_count 
----------+-------------------+-----------------------
 AB       |                 3 |                     2
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 2 |                     4
(4 rows)


...but why it displays this?

.app_code | ios_release_count | android_release_count 
----------+-------------------+-----------------------
 AB       |                 6 |                     6
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 8 |                     8
(4 rows)


Why ios_release_count says 6 when it should be 3 instead?

The same with TR, why its ios_release_count displays 8 when it should be 2 instead?

Do you notice something? The incorrect result 6 on your query is a multiplication of ios_release_count of 3 and android_release_count of 2. Likewise with result 8 on your query, it is a multiplication of ios_release_count of 2 and android_release_count of 4.


To visualize, try to remove the grouping and count aggregation on your query...

select x.app_code, i.date_released as ios_release_date, a.date_released as android_release_date
from app x
left join ios_app i on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
order by x.app_code, ios_release_date, android_release_date;


This will be the results of your query:


.app_code | ios_release_date | android_release_date 
----------+------------------+----------------------
 AB       | 2010-01-01       | 2010-01-06
 AB       | 2010-01-01       | 2010-01-07
 AB       | 2010-01-03       | 2010-01-06
 AB       | 2010-01-03       | 2010-01-07
 AB       | 2010-01-04       | 2010-01-06
 AB       | 2010-01-04       | 2010-01-07
 MK       |                  | 2010-01-07
 PM       |                  | 
 TR       | 2010-01-02       | 2010-01-08
 TR       | 2010-01-02       | 2010-01-09
 TR       | 2010-01-02       | 2010-01-10
 TR       | 2010-01-02       | 2010-01-11
 TR       | 2010-01-05       | 2010-01-08
 TR       | 2010-01-05       | 2010-01-09
 TR       | 2010-01-05       | 2010-01-10
 TR       | 2010-01-05       | 2010-01-11
(16 rows)


You notice something? the ios_release_date keeps on repeating. The reason why? For every AB row, it get paired with two rows on android. How many rows are there in AB? Three, right? So when you multiply three by two, you get 6, so that's where the count of 6 comes from! Likewise for every TR row in ios, there's four rows on android; there are two TR rows in ios. You multiply two by four, you get 8!


So it's time to correct the query.

With the advent of CTE, flattening the results is way neater and you can give them their own name; afterwards, you can join them to the master table:

with ios_app_release_count_list as
(
 select app_code, count(date_released) as ios_release_count
 from ios_app
 group by app_code
)
,android_release_count_list as
(
 select app_code, count(date_released) as android_release_count 
 from android_app 
 group by app_code  
)
select 
 x.app_code, 
 coalesce(i.ios_release_count,0) as ios_release_count, 
 coalesce(a.android_release_count,0) as android_release_count
from app x
left join ios_app_release_count_list i on i.app_code = x.app_code
left join android_release_count_list a on a.app_code = x.app_code
order by x.app_code;

Here's the output for that:

.app_code | ios_release_count | android_release_count 
----------+-------------------+-----------------------
 AB       |                 3 |                     2
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 2 |                     4
(4 rows)

It's already correct, isn't it?


CTE lets you divide-and-conquer the problem pretty well, make a query of something, make a good name for it, then join it to master table.

So with absence of CTE, this is how your query shall be written like:

select x.app_code, 
 coalesce(i.ios_release_count,0) as ios_release_count, 
 coalesce(a.android_release_count,0) as android_release_count
from app x
left join
(
 select app_code, count(date_released) as ios_release_count
 from ios_app
 group by app_code
) i on i.app_code = x.app_code
left join
(
 select app_code, count(date_released) as android_release_count 
 from android_app 
 group by app_code   
) a on a.app_code = x.app_code
order by x.app_code

It's not as intuitive and neat as the CTE approach. Nevertheless, it works!





Find skills requirement using Postgresql

I saw this question on Stackoverflow

I have a question about a Database technique to store dependencies. I know there are out there a lot of them, but I cannot place them easily in the scheme I need. I have created simple image:


Skills example


As you can see what I need is to create a tree of skills (like in games) which depends on each other. So for example if someone goes and want to have skill 8 I can tell him that he needs to first have skill 1,2 and 5.

This might be fine for hirearchial data in database, but what I cannot quite figure out is how to do this model dynamically. The problem I have is, that skills will be added all the time in all possible places in the tree. Never ending. An skills can be added on any level.

Now after this first question there is one more complication. I need the skills to have levels as well. So for example skill 1 can have 10 levels. and you can achieve skill 2 only after achieving skill 1 level 5.

For people who play games like World of Warcraft should be understandable.

One more note, Skills can be added anytime, but cannot be changed after adding. On normal basis. Just in case some of the skill would be extremely bad or something like that, then it would be removed, but that would occur just very rarely.

Thank you for suggestions, links or any other materials!




Note: On Skill #7 with skills required of 1,2,3,4,9 it should be 1,2,3,4,6,9.


I'm challenged enough to solve it, but I'm not challenged enough to solve it without CTE. With that in mind, here's my Postgresql query for that:

with recursive 
skill_list(skill_id) as
(
    select distinct skill_id from skill_req 
    where req is not null
    union
    select distinct req from skill_req
    where req is not null 
)
,skill_tree(skill_group, depend_on) as
(
    select skill_id, skill_id -- seed
    from skill_list 
 
    union 
 
    select st.skill_group, sr.req
    from skill_req sr
    join skill_tree st 
    on sr.skill_id = st.depend_on 
)
,skills_required as
(
    select skill_group, depend_on
    from skill_tree
    where skill_group <> depend_on -- remove seeds 
)
select 
 
    sl.skill_id, 
    array_agg(sr.depend_on order by depend_on) as array_version,
    array_to_string(array_agg(sr.depend_on order by depend_on), ',') as  group_concat_version

from skill_list sl
left join skills_required sr on sr.skill_group = sl.skill_id
group by sl.skill_id   

Output:
 skill_id | array_version | group_concat_version 
----------+---------------+----------------------
        1 | {NULL}        | 
        2 | {1}           | 1
        3 | {NULL}        | 
        4 | {3}           | 3
        5 | {1}           | 1
        6 | {1,2,3,4}     | 1,2,3,4
        7 | {1,2,3,4,6,9} | 1,2,3,4,6,9
        8 | {1,2,5}       | 1,2,5
        9 | {3}           | 3
       10 | {1,3,4,5,9}   | 1,3,4,5,9
(10 rows)


Live test: http://www.sqlfiddle.com/#!1/77894/1

Get fluctuating price on MySQL

Given this data:

CREATE TABLE fluctuate
    (Date datetime, Company varchar(10), Price int);

INSERT INTO fluctuate
    (Date, Company, Price)
VALUES
    ('2012-01-04 00:00:00', 'Apple', 458),
    ('2012-01-03 00:00:00', 'Apple', 462),
    ('2012-01-02 00:00:00', 'Apple', 451),
    ('2012-01-01 00:00:00', 'Apple', 450),
    ('2012-01-01 00:00:00', 'Microsoft', 1),
    ('2012-01-03 00:00:00', 'Microsoft', 7),
    ('2012-01-05 00:00:00', 'Microsoft', 5),
    ('2012-01-07 00:00:00', 'Microsoft', 8),
    ('2012-01-08 00:00:00', 'Microsoft', 12);

You want to get the fluctuation in price even the date is non-contiguous:

DATE                       COMPANY             PRICE               DAY_CHANGE
January, 04 2012           Apple               458                 -4
January, 03 2012           Apple               462                 11
January, 02 2012           Apple               451                 1
January, 01 2012           Apple               450                 0
January, 08 2012           Microsoft           12                  4
January, 07 2012           Microsoft           8                   3
January, 05 2012           Microsoft           5                   -2
January, 03 2012           Microsoft           7                   6
January, 01 2012           Microsoft           1                   0


Since there's no windowing function on MySQL yet, we will settle for some MySQL-ism:

select 

date, 
company, 
price, 
day_change

from
(    
  select 

     case when company <> @original_company then
         -- new company detected,
         -- reset the original_price base on the new company
         @original_price := price
     end,
    f.*,
    price - @original_price as day_change,
    (@original_price := price),
    (@original_company := company)


  from fluctuate f

  cross join
  (
    select 
     @original_price := price,
     @original_company := company
     from fluctuate 
     order by company, date limit 1
  )
  as zzz

  order by company, date 

) as yyy
order by company, date desc


How does it work, first we need to get the starting price from the table, we use cross join to get the initial values. Then from there, we subtract the price from the original price and alias it as day_CHANGE, then save the state of the current price to original price, then apply the new original price to next price, and so on.

Now on new company, to reset the state of original price based on that company, on the first expression of our SELECT statement, we detect if the current company is not equal to the original_company; if it isn't, we reset the original_price based on the current company


Happy Coding! :-)

Live test: http://www.sqlfiddle.com/#!2/e16ec/1

Wednesday, May 2, 2012

Tuple matching is neat. How to simulate one if isn't available

Depending on the requirement, tuple-style query is the best query style everyone should write.

So if your database facilitates tuple matching, write your query in that form instead. To wit, list the countries with the highest population on each region, this query works even there's multiple countries with same population on each region:


First step, find the highest population:

select region, max(population)
from bbc
group by region


Final step, use tuple in your WHERE clause:

select region, name, population 
from bbc 
where (region, population) in
      (select region, max(population)
       from bbc
       group by region)
order by region, name

Great, isn't it? Your query investment on the first step can be integrated seamlessly to another query. Everything works out of the box


Contrast that with database that doesn't facilitate tuple test:

select region, name, population 
from bbc z
where exists
      (select null -- neutral. doesn't invoke Cargo Cult Programming ;-)
       from bbc
       where region = z.region 
       group by region
       having z.population = max(population) )
order by region, name

It's not easy to deduce the intent of the query, sometimes it takes another look to deduce the intent of that query.

There's a way to simulate tuple on your query, put it in join condition instead of in where condition. Remember the first query on this post? It will not go to waste, we will use that and make it seamlessly integrate to another query.

select z.region, z.name, z.population 
from bbc z
join (select region, max(population) as maxpop 
      from bbc 
      group by region) x
      on z.region = x.region and z.population = x.maxpop
order by z.region, z.name


Simulate the query here: http://sqlzoo.net/0.htm



Tuesday, May 1, 2012

Find the path between source and target using recursive CTE

Given this sample data:

CREATE TABLE tbl
 (source int, target int, strength int);

INSERT INTO tbl
 (source, target, strength)
VALUES
 (1, 2, 1),
 (1, 3, 1),
 (2, 4, 1),
 (2, 5, 1),
 (3, 6, 1),
 (3, 7, 1),
 (4, 8, 1),
 (4, 9, 1),
 (5, 10, 1),
 (5, 11, 1);


What's the best query to find the path between source = 2 and target = 9?

The output would be this:
SOURCE   TARGET  PATH
2        4       2.4
4        9       2.4.9



If you have to start with source(2), that would entail traversing all its child nodes, only to discard some of them later on, this is inefficient.

With this in mind, we shall construct our query to start at the target then work its way up to find the source.


We shall divide-and-conquer the problem, first, let's do the traversal from bottom to top:

with recursive find_parent(source, target, recentness) as
(
    select source, target, 0 
    from tbl
    where target = 9
  
    union all
  
    select i.source, i.target, fp.recentness + 1
    from tbl i
    join find_parent fp on i.target = fp.source 
)
select * from find_parent

Output:
SOURCE TARGET RECENTNESS
4      9      0
2      4      1
1      2      2

Live test for query above: http://www.sqlfiddle.com/#!1/73210/37


The query is working, but it has no terminating condition to stop searching its way back to the source yet.


We just need to add a condition on join clause to facilitate this scenario:
with recursive find_parent(source, target, recentness) as
(
    select source, target, 0 
    from tbl
    where target = 9
  
    union all
  
    select i.source, i.target, fp.recentness + 1
    from tbl i
    join find_parent fp on i.target = fp.source 
         -- despite the name, this target is another one's source
         and i.target <> 2
)
select * from find_parent


Output:
SOURCE TARGET RECENTNESS
4      9      0
2      4      1


Live test for query above: http://www.sqlfiddle.com/#!1/73210/38


Final step, construct the path:

with recursive find_parent(source, target, recentness) as
(
    select source, target, 0 
    from tbl
    where target = 9
  
    union all
  
    select i.source, i.target, fp.recentness + 1
    from tbl i
    join find_parent fp on i.target = fp.source 
         -- despite the name, this target is another one's source
         and i.target <> 2
)
,construct_path(source, target, recentness, path) as
(
    select source, target, recentness, source || '.' || target
    from find_parent 
    where recentness = (select max(recentness) from find_parent)
    
    union
    
    select dd.source, dd.target, dd.recentness, cp.path || '.' || dd.target
    from find_parent dd
    join construct_path cp on dd.recentness = cp.recentness - 1  

)
select source, target, path
from construct_path
order by recentness desc


SOURCE  TARGET   PATH
2       4        2.4
4       9        2.4.9

Live test for query above: http://www.sqlfiddle.com/#!1/73210/40


Basically that's it. First, we traverse the path from child to parent using find_parent, then we construct user-readable path using construct_path


So what will happen, if we didn't remove the and i.target<> on join condition in find_parent?

This will be the output:

SOURCE TARGET PATH
1      2      1.2
2      4      1.2.4
4      9      1.2.4.9


Live test: http://www.sqlfiddle.com/#!1/73210/42


Solution for this problem: https://stackoverflow.com/questions/10392567/postgresql-pass-data-from-recursive-cte-onto-function