## Tuesday, May 1, 2012

### Find the path between source and target using recursive CTE

Given this sample data:

```CREATE TABLE tbl
(source int, target int, strength int);

INSERT INTO tbl
(source, target, strength)
VALUES
(1, 2, 1),
(1, 3, 1),
(2, 4, 1),
(2, 5, 1),
(3, 6, 1),
(3, 7, 1),
(4, 8, 1),
(4, 9, 1),
(5, 10, 1),
(5, 11, 1);

```

What's the best query to find the path between source = 2 and target = 9?

The output would be this:
```SOURCE   TARGET  PATH
2        4       2.4
4        9       2.4.9
```

If you have to start with source(2), that would entail traversing all its child nodes, only to discard some of them later on, this is inefficient.

With this in mind, we shall construct our query to start at the target then work its way up to find the source.

We shall divide-and-conquer the problem, first, let's do the traversal from bottom to top:

```with recursive find_parent(source, target, recentness) as
(
select source, target, 0
from tbl
where target = 9

union all

select i.source, i.target, fp.recentness + 1
from tbl i
join find_parent fp on i.target = fp.source
)
select * from find_parent
```

Output:
```SOURCE TARGET RECENTNESS
4      9      0
2      4      1
1      2      2
```

Live test for query above: http://www.sqlfiddle.com/#!1/73210/37

The query is working, but it has no terminating condition to stop searching its way back to the source yet.

We just need to add a condition on join clause to facilitate this scenario:
```with recursive find_parent(source, target, recentness) as
(
select source, target, 0
from tbl
where target = 9

union all

select i.source, i.target, fp.recentness + 1
from tbl i
join find_parent fp on i.target = fp.source
-- despite the name, this target is another one's source
and i.target <> 2
)
select * from find_parent
```

Output:
```SOURCE TARGET RECENTNESS
4      9      0
2      4      1
```

Live test for query above: http://www.sqlfiddle.com/#!1/73210/38

Final step, construct the path:

```with recursive find_parent(source, target, recentness) as
(
select source, target, 0
from tbl
where target = 9

union all

select i.source, i.target, fp.recentness + 1
from tbl i
join find_parent fp on i.target = fp.source
-- despite the name, this target is another one's source
and i.target <> 2
)
,construct_path(source, target, recentness, path) as
(
select source, target, recentness, source || '.' || target
from find_parent
where recentness = (select max(recentness) from find_parent)

union

select dd.source, dd.target, dd.recentness, cp.path || '.' || dd.target
from find_parent dd
join construct_path cp on dd.recentness = cp.recentness - 1

)
select source, target, path
from construct_path
order by recentness desc
```

```SOURCE  TARGET   PATH
2       4        2.4
4       9        2.4.9
```

Live test for query above: http://www.sqlfiddle.com/#!1/73210/40

Basically that's it. First, we traverse the path from child to parent using find_parent, then we construct user-readable path using construct_path

So what will happen, if we didn't remove the and i.target<> on join condition in find_parent?

This will be the output:

```SOURCE TARGET PATH
1      2      1.2
2      4      1.2.4
4      9      1.2.4.9
```

Live test: http://www.sqlfiddle.com/#!1/73210/42

Solution for this problem: https://stackoverflow.com/questions/10392567/postgresql-pass-data-from-recursive-cte-onto-function