Thursday, September 3, 2015

Fast text search in PostgreSQL

This works in Postgres:


select * from location where location_id = any(array[1,6,8])

However, if you tried to match it against an array on another query, it won't work, even though the another query returns exactly one row only.

select * from location where location_id = any(select location_ids from province p where p.province_name = 'Metro Manila' limit 1)

Output:
ERROR:  operator does not exist: integer = integer[]
LINE 1: select * from location where location_id = any
                                                 ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.


To correct it, unnest the array:

select * from location where location_id = any(select unnest(location_ids) from province p where p.province_name = 'Metro Manila')

Might as well use IN:

select * from location where location_id in (select unnest(location_ids) from province p where p.province_name = 'Metro Manila')


There's a drawback on the queries above, it uses equal operator. In Postgres, indexers are bound to operator, and GIN, the indexing technology for indexing array, doesn’t support equal operator. See the detail here: http://stackoverflow.com/questions/4058731/can-postgresql-index-array-columns/29245753#29245753


So to optimize the query, use either of the following:


-- This query works same as the query at the bottom. The query at the bottom is more readable though.
-- The advantage of this query is if the province matches many rows, it would still work, say the province_name uses ILIKE operator.

select  *
from    location l
where   exists
        (
            select  p.location_ids
            from    province p
            where   p.province_name = 'Metro Manila'
                    and array[l.location_id] && p.location_ids
        );

The query uses overlap operator, &&. See the use of supported operators on arrays: http://www.postgresql.org/docs/9.1/static/functions-array.html

-- If province returns exactly one row, this is more readable than above
select   *
from     location l
where    array[l.location_id] && ( select location_ids p from province p where p.province_name = 'Metro Manila' );


Live code: http://sqlfiddle.com/#!15/6ac38/5


Happy Coding!

No comments:

Post a Comment