Recommendation discovery via graph traversal

I am quite excited about graph computing these days. It represents relational data such as customer behaviour naturally and otherwise complicated problems break down to simple pattern matching algorithm. Take recommendation system, for example. One way to do it is by machine learning as Wikipedia suggests. But if we represent the data in a property graph, a simplistic solution surfaces intuitively.

Picture this. If Bob likes item A; Cathy likes both item A and item B; then we can make the commutative link of item B for Bob.

Let's try it out in Neo4j using this pre-built web console example. You should see this graph with 4 person and 5 food items.

simple graph

Using this Cypher query, we get a list of all users and what food they like.

START   user = node:node_auto_index(type = "user") 
MATCH   person-[:IS_A]->user, person-[:LIKE]->x
RETURN  person.name, x.name

The second line is where we match the pattern that person is a user and that person like x. This query reads almost like the question which we want to ask.

We return all the person and those food they like, x:

+------------------------+
| person.name | x.name   |
+------------------------+
| "Andy"      | "apple"  |
| "Andy"      | "orange" |
| "Andy"      | "bread"  |
| "Bob"       | "apple"  |
| "Bob"       | "bread"  |
| "Cat"       | "apple"  |
| "Cat"       | "orange" |
| "Cat"       | "bread"  |
| "Cat"       | "fish"   |
| "Doug"      | "apple"  |
| "Doug"      | "orange" |
+------------------------+

Taking this a step further, we can find all the top common x and y that people like together in the above graph.

START   food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user") 
MATCH   food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user,
        person-[:LIKE]->y-[:IS_A]->food
WHERE   NOT x = y
RETURN  x.name, y.name, count(*) as cnt 
ORDER BY cnt DESC 
LIMIT 10;

Resulting in:

+---------------------------+
| x.name   | y.name   | cnt |
+---------------------------+
| "apple"  | "orange" | 3   |
| "apple"  | "bread"  | 3   |
| "orange" | "apple"  | 3   |
| "bread"  | "apple"  | 3   |
| "bread"  | "orange" | 2   |
| "orange" | "bread"  | 2   |
| "fish"   | "apple"  | 1   |
| "fish"   | "bread"  | 1   |
| "apple"  | "fish"   | 1   |
| "bread"  | "fish"   | 1   |
+---------------------------+

So we find that a lot uf users that likes apple also likes orange or bread. We can then pick out all the people that likes apple but not orange yet to suggest (read: spam) orange to them.

START   apple = node:node_auto_index(name = "apple"), orange = node:node_auto_index(name = "orange")
MATCH   person-[:LIKE]->apple
WHERE   NOT(person-[:LIKE]->orange)
RETURN  person.name

+-------------+
| person.name |
+-------------+
| "Bob"       |
+-------------+

Easy, yes?

Paul Lam

Clojure Developer. Enterprise Startups. Being Useful.

Paul Lam

Clojure Developer. Enterprise Startups. Being Useful.

Recommendation discovery via graph traversal

Next articles