Comments on Edward J. Yoon's Blog: Summary of the Google Pregel

Anonymous (2011-09-04 14:48):

I found the answer to my question: http://horicky.blogspot.com/2010/07/graph-processing-in-map-reduce.html

- Ali

Anonymous (2011-09-04 13:53):

Hi,
It has been said that the "M/R framework isn't ideal for graph algorithms because **it does not support communications among nodes**," but isn't it possible to use M/R to model communication like this:

- Start with (srcNode, (destNode)*) tuples.
- On mapping srcNode, emit (destNode, (srcNode, msg)) to send msg from srcNode to destNode.
- On reducing, destNode receives all emitted (srcNode, msg) tuples.

Any idea?

- Ali

Edward J. Yoon (2010-07-05 04:07):

Hi,
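Ali's scheme above can be sketched in a few lines. This is a minimal stand-in for the map and reduce phases using plain Python (the graph data, the message contents, and names like map_phase/reduce_phase are illustrative assumptions, not part of any real MapReduce API):

```python
from collections import defaultdict

# Adjacency tuples (srcNode, [destNode, ...]) plus each node's current value.
graph = {"A": ["B", "C"], "B": ["C"], "C": []}
values = {"A": 0, "B": 7, "C": 9}

def map_phase(graph, values):
    """For each srcNode, emit (destNode, (srcNode, msg)) pairs."""
    for src, dests in graph.items():
        for dest in dests:
            # Here msg is just the sender's value; a real algorithm
            # (e.g. SSSP) would emit value + edge weight instead.
            yield dest, (src, values[src])

def reduce_phase(emitted):
    """Group messages by destNode, as the shuffle/reduce step would."""
    inbox = defaultdict(list)
    for dest, msg in emitted:
        inbox[dest].append(msg)
    return dict(inbox)

inbox = reduce_phase(map_phase(graph, values))
print(inbox["C"])  # → [('A', 0), ('B', 7)]
```

One map/reduce pass moves each message exactly one hop, so an iterative graph algorithm needs a full MapReduce job per round, with the materialization overhead discussed below.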
I'm reading that paper and seeing a few design patterns.

I agree with you that we should measure the number of iterations. And, as you said, there is still I/O overhead involved in reading and writing materialized data on every iteration, even when the shuffle and sort of the reduce phase is avoided.

IMO, BSP will communicate only those vertices that can't be solved locally, and I'm sure that the number of iterations will be less than or equal to the (M/R based) Schimmy approach.

I hope we can compare them using Hama BSP soon.

Felix Halim (2010-06-16 05:28):

The authors of Pregel weren't aware of a technique on MapReduce that speeds up graph algorithms by up to 3x:

http://developer.yahoo.com/events/hadoopsummit2010/agenda.html#43

Link to the paper:

http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Schatz_MLG2010.pdf

Basically, the technique is to keep the graph structure out of the shuffling phase. By doing this, the improvement is around 3x.

It's not clear from the Pregel paper what the performance improvement is versus MapReduce-based SSSP.

I'm also interested in the number of supersteps needed to complete SSSP. If the diameter of the graph is large, the number of supersteps will also be large (for a simple BFS). Since Pregel graph algorithms work in supersteps, it would be wiser to report the number of supersteps plus the runtime of each superstep. IMHO, graph algorithms on Pregel or MapReduce should be measured in the number of supersteps needed (not the total runtime).

I think it's quite OK to say that the runtime of each superstep is more or less constant.
That's because we need to read the whole graph anyway (since it is not stored entirely in memory). Providing random access to the graph is not an option here; it's better to read it all sequentially (and skip some vertices).
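Felix's point that the superstep count tracks the graph's diameter can be seen in a toy superstep-style SSSP loop. This is a sketch of the idea only (the loop structure and names are mine, not Pregel's API): each superstep, frontier vertices send distance + 1 to their neighbors, and the loop ends when no vertex improves.

```python
def sssp_supersteps(adj, source):
    """Unweighted SSSP in BFS-like supersteps; returns (distances, #supersteps)."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    frontier = {source}
    supersteps = 0
    while frontier:
        # Message phase: each frontier vertex offers dist + 1 to its neighbors.
        messages = {}
        for u in frontier:
            for v in adj[u]:
                d = dist[u] + 1
                if d < messages.get(v, float("inf")):
                    messages[v] = d
        # Update phase: only vertices that improved stay active.
        frontier = {v for v, d in messages.items() if d < dist[v]}
        for v in frontier:
            dist[v] = messages[v]
        if frontier:
            supersteps += 1
    return dist, supersteps

# A chain 0-1-2-3-4 has diameter 4, so SSSP from 0 needs 4 supersteps.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
dist, steps = sssp_supersteps(chain, 0)
print(steps)  # → 4
```

On a chain of n vertices the count is n - 1, which is why reporting supersteps separately from per-superstep runtime, as Felix suggests, gives a fairer picture for large-diameter graphs.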