August 12, 2009

Graph database on Hadoop

Below is the problem list of the recent trends of graph data in my Insight.

- Very large (e.g. Web linked data, Social network, ..., etc)
- Diversified attributes of node and edge
- Requires real-time processing (for exampe, finding the shortest path based on attributes in Google Map)

So, I'm thinking the graph database on hadoop as described below:


HDFS Hama, Map/Reduce Hamburg
graph data -> graph partitioning for locality -> real-time processing


The large graph data can be stored on Hadoop/Hbase and, communication cost can be reduced by partitioning step as bulk processing. Then, finally we can perform the real-time graph processing. What do you think? ;)

No comments:

Post a Comment