I am quite excited about graph computing these days. It represents relational data such as customer behaviour naturally and otherwise complicated problems break down to simple pattern matching algorithm. Take recommendation system, for example. One way to do it is by machine learning as Wikipedia suggests. But if we represent the data in a property graph, a simplistic solution surfaces intuitively.
Picture this. If Bob likes item A; Cathy likes both item A and item B; then we can make the commutative link of item B for Bob.
Let's try it out in Neo4j using this pre-built web console example. You should see this graph with 4 person and 5 food items.
Using this Cypher query, we get a list of all users and what food they like.
START user = node:node_auto_index(type = "user")
MATCH person-[:IS_A]->user, person-[:LIKE]->x
RETURN person.name, x.name
The second line is where we match
the pattern that person is a user
and that person like x
. This query reads almost like the question which we want to ask.
We return all the person and those food they like, x
:
+------------------------+
| person.name | x.name |
+------------------------+
| "Andy" | "apple" |
| "Andy" | "orange" |
| "Andy" | "bread" |
| "Bob" | "apple" |
| "Bob" | "bread" |
| "Cat" | "apple" |
| "Cat" | "orange" |
| "Cat" | "bread" |
| "Cat" | "fish" |
| "Doug" | "apple" |
| "Doug" | "orange" |
+------------------------+
Taking this a step further, we can find all the top common x
and y
that people like together in the above graph.
START food = node:node_auto_index(type = "food"), user = node:node_auto_index(type = "user")
MATCH food<-[:IS_A]-x<-[:LIKE]-person-[:IS_A]->user,
person-[:LIKE]->y-[:IS_A]->food
WHERE NOT x = y
RETURN x.name, y.name, count(*) as cnt
ORDER BY cnt DESC
LIMIT 10;
Resulting in:
+---------------------------+
| x.name | y.name | cnt |
+---------------------------+
| "apple" | "orange" | 3 |
| "apple" | "bread" | 3 |
| "orange" | "apple" | 3 |
| "bread" | "apple" | 3 |
| "bread" | "orange" | 2 |
| "orange" | "bread" | 2 |
| "fish" | "apple" | 1 |
| "fish" | "bread" | 1 |
| "apple" | "fish" | 1 |
| "bread" | "fish" | 1 |
+---------------------------+
So we find that a lot uf users that likes apple also likes orange or bread. We can then pick out all the people that likes apple
but not orange
yet to suggest (read: spam) orange
to them.
START apple = node:node_auto_index(name = "apple"), orange = node:node_auto_index(name = "orange")
MATCH person-[:LIKE]->apple
WHERE NOT(person-[:LIKE]->orange)
RETURN person.name
+-------------+
| person.name |
+-------------+
| "Bob" |
+-------------+
Easy, yes?