Comments on Edward J. Yoon's Blog: Inference anatomy of the Google Pregel

Edward J. Yoon, 2010-04-22T19:00:28.475-07:00:
Interesting! Thanks, I'll take a look at this.

Julien Couvreur, 2010-04-17T21:42:42.758-07:00:
I couldn't find any more specific information on Pregel, but it does seem to be a rather trivial specialization of MapReduce, in which the key for both vertex data and messages is the vertex ID.
It is disappointing that the whitepaper does not describe the API (and potential optimizations), since that is the most interesting part of a framework geared towards usability.

I wonder if the API has one or two steps:
- one step: Vertex Process(Vertex oldVertex, Inbox, Outbox), which reads from the Inbox and oldVertex, returns the new Vertex, and writes messages to the Outbox
- two steps: Vertex ProcessIncoming(Vertex, Message[] Inbox) and Message[] SendMessages(Vertex)

Vertex would contain the list of neighboring vertices and other state information, as well as a "done" flag (i.e. no new "superstep" iteration needed).

The advantage of two steps is that ProcessIncoming can be used to optimize the re-partitioning on vertex ID: it can be applied recursively, thereby reducing network traffic. The whitepaper only mentions a single user-defined function (Compute) but also mentions "handler" functions...

Vertices could be created and deleted by sending messages. These standard messages could be processed by handlers with default implementations.
Also, using a MapReduce trick, a dummy vertex with ID 0 can be used to collect the count of all vertices and all completed vertices.
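
The one-step and two-step API shapes speculated about in the comment can be compared with a small sketch. This is not Pregel's actual API, which the comment notes is not described: the Python names (`process`, `process_incoming`, `send_messages`, `run_one_step`, `run_two_step`) and the `fresh` field are hypothetical, and propagating the maximum value through the graph is chosen here only as an illustrative workload.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Vertex:
    id: int
    value: int
    neighbors: list
    fresh: bool = True   # hypothetical: value not yet announced to neighbors
    done: bool = False   # the "done" flag: no further superstep needed


# One-step shape: read inbox and old vertex, return new vertex, write to outbox.
def process(old, inbox, outbox):
    value = max([old.value] + inbox)
    send = old.fresh or value > old.value
    if send:
        for n in old.neighbors:
            outbox[n].append(value)
    return Vertex(old.id, value, old.neighbors, fresh=False, done=not send)


# Two-step shape: folding the inbox and emitting messages are separate functions.
def process_incoming(v, inbox):
    value = max([v.value] + inbox)
    changed = value > v.value
    return Vertex(v.id, value, v.neighbors, fresh=changed, done=not changed)


def send_messages(v):
    return [(n, v.value) for n in v.neighbors] if v.fresh else []


def run_one_step(vertices):
    inboxes = defaultdict(list)
    while True:
        outbox = defaultdict(list)
        vertices = {vid: process(v, inboxes[vid], outbox)
                    for vid, v in vertices.items()}
        if all(v.done for v in vertices.values()):
            return vertices
        inboxes = outbox  # messages are delivered in the next superstep


def run_two_step(vertices):
    while True:
        inboxes = defaultdict(list)
        for v in vertices.values():
            for target, msg in send_messages(v):
                inboxes[target].append(msg)
        vertices = {vid: process_incoming(v, inboxes[vid])
                    for vid, v in vertices.items()}
        if all(v.done for v in vertices.values()):
            return vertices


def example_graph():
    # A three-vertex chain 1 - 2 - 3 with undirected edges.
    return {1: Vertex(1, 3, [2]), 2: Vertex(2, 6, [1, 3]), 3: Vertex(3, 2, [2])}


if __name__ == "__main__":
    for run in (run_one_step, run_two_step):
        print(run.__name__, {vid: v.value for vid, v in run(example_graph()).items()})
```

In the two-step variant `process_incoming` only folds messages into a vertex, so for an associative operation such as `max` the same fold could also be applied to partial inboxes before messages cross the network, which is roughly the traffic reduction the comment alludes to.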