June 22, 2009

Pregel, Google's large scale graph computing framework

According to google research blog, They made the Pregel for performing large-scale garph computing and uses it for PageRank calculations, shortest path, ..., etc.

In order to achieve that, we have created scalable infrastructure,
named Pregel, to mine a wide range of graphs. In Pregel, programs
are expressed as a sequence of iterations. In each iteration,
a vertex can, independently of other vertices, receive messages
sent to it in the previous iteration, send messages to other vertices,
modify its own and its outgoing edges' states, and mutate the
graph's topology (experts in parallel processing will recognize that
the Bulk Synchronous Parallel Model inspired Pregel).

Maybe the most important things are the locality of adjacent vertices and the dynamic programming. I talked with Hyunsik, a memeber of Heart project about this, We thought it's other distributed programming model instead of map/reduce, but same the shared-nothing architecture for the better performance. I guess it's not much different from the map/reduce framework.

See also:
- Distributing a database for parallelism

Updated - 26 August, 2009 : See also, Inference anatomy of the Google Pregel

No comments:

Post a Comment