Fault-tolerance in Hama

Recently, Hama core committers Suraj Menon and Thomas Jungblut are working on Fault-tolerant BSP system. And I am trying to read the source code. Their design describes the new BSP computing system and API enabling checkpoint-based recovery. Furthermore, describes the confined recovery, which can be used to improve the cost and latency of recovery.

I didn't fully understand and test yet but quite nice!


Popular posts from this blog

일본만화 추천 100선

구글링하는(googling) 방법

AWS re:Invent 2017 참관기