August 24, 2010

FW: How will Hama BSP differ from Pregel?

Firstly, why did we use HBase?

Until last year, we were researching a distributed matrix/graph computing package based on Map/Reduce.

As you know, Hadoop consists of HDFS, which is designed for commodity servers as a shared-nothing model (also termed a data partitioning model), and a distributed programming model called Map/Reduce. Map/Reduce is a high-performance parallel data processing engine, to be sure, but it's not good for complex numerical/relational processing that requires many iterations or a lot of inter-node communication. So, we used HBase as shared storage (a shared memory model).

Why BSP instead of Map/Reduce and HBase?

However, there were still problems, as listed below:

OS overhead of running shared storage software (HBase)
Limitations of HBase's capabilities (especially the size of a column qualifier)
Growth of code complexity

Therefore, we started to consider a message-passing model, and decided to adopt the BSP (Bulk Synchronous Parallel) model, inspired by Pregel, which was introduced on the Google Research Blog.

What is Pregel?


According to my understanding, Pregel is graph-specific: a large-scale graph computing framework based on the BSP model.
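
To make the idea concrete, here is a minimal single-process sketch of a vertex-centric BSP computation in the style usually used to explain Pregel: every vertex ends up holding the largest value in the graph. It is only an illustration under my own assumptions, not Pregel's or Hama's actual API; the function name max_value_propagation and the dict-based graph are hypothetical, and the "barrier" is simply the moment the outbox becomes the next inbox.

```python
from collections import defaultdict


def max_value_propagation(graph, values):
    """Simulate vertex-centric BSP supersteps in one process.

    graph:  dict mapping each vertex to a list of its neighbors
            (assumption: every vertex appears as a key).
    values: dict mapping each vertex to its initial integer value.
    Returns the final values and the number of supersteps executed.
    """
    values = dict(values)  # do not mutate the caller's data

    # Superstep 0: every vertex sends its own value to its neighbors.
    inbox = defaultdict(list)
    for vertex, neighbors in graph.items():
        for neighbor in neighbors:
            inbox[neighbor].append(values[vertex])
    supersteps = 1

    # Later supersteps: only vertices that received messages do work.
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                # The value changed, so tell the neighbors about it.
                values[vertex] = best
                for neighbor in graph[vertex]:
                    outbox[neighbor].append(best)
            # Otherwise the vertex stays silent ("votes to halt").
        # Barrier: messages sent in this superstep become visible
        # only in the next one.
        inbox = outbox
        supersteps += 1

    return values, supersteps


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    final_values, steps = max_value_propagation(graph, {"a": 3, "b": 6, "c": 2})
    print(final_values, steps)
```

The computation stops when no messages are in flight, which is the usual termination condition in this style of model.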

How will Hama BSP differ from Pregel?

Hama BSP is a computing engine based on the BSP model, like Pregel, and it will be compatible with existing HDFS clusters, and with any FileSystem or database in the future. However, we believe that the BSP computing model is not limited to graph problems; it can be used for a wide range of distributed software, much like Map/Reduce. Beyond graphs, there are many other algorithms that run into problems similar to those of graph processing when implemented with Map/Reduce. In fact, the BSP model has also been researched for many years in the field of matrix computation.

Therefore, we're trying to implement a more generalized BSP computing solution. Hama will consist of the BSP computing engine and a small set of examples (e.g., matrix inversion, PageRank, BFS, etc.).
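
As a rough illustration of BSP outside of graphs, the sketch below simulates a matrix-vector product y = Ax in two supersteps: each peer computes the rows it owns, sends the results to a master peer, and after the barrier the master assembles the output. This is a toy single-process simulation under my own assumptions, not Hama's API; the name bsp_matvec, the round-robin row partitioning, and the choice of peer 0 as master are all hypothetical.

```python
def bsp_matvec(matrix, vector, num_peers):
    """Simulate y = A x as a two-superstep BSP program in one process.

    matrix:    list of rows, each a list of numbers.
    vector:    list of numbers with one entry per matrix column.
    num_peers: number of simulated peers (assumption: at least 1).
    """
    num_rows = len(matrix)

    # Partition the rows round-robin across the simulated peers.
    rows_of_peer = [
        [i for i in range(num_rows) if i % num_peers == peer]
        for peer in range(num_peers)
    ]

    # Superstep 1: each peer computes dot products for its own rows
    # and sends (row index, value) messages to the master (peer 0).
    master_inbox = []
    for peer in range(num_peers):
        for i in rows_of_peer[peer]:
            dot = sum(a * x for a, x in zip(matrix[i], vector))
            master_inbox.append((i, dot))

    # Barrier: in a real BSP system all peers would synchronize here,
    # and the messages would become visible to the master afterwards.

    # Superstep 2: the master assembles the result vector.
    result = [0] * num_rows
    for i, value in master_inbox:
        result[i] = value
    return result


if __name__ == "__main__":
    print(bsp_matvec([[1, 2], [3, 4], [5, 6]], [1, 1], num_peers=2))
```

Nothing in this pattern is graph-specific: the only ingredients are local computation, message passing, and a barrier.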

You can test your BSP program locally using the TRUNK version of the Hama project.
Please subscribe to the mailing list or comment here if you have any questions, suggestions, or objections about our project.

August 16, 2010

BBC's 50 Places to Visit Before You Die

Which Ones Have You Visited? 

1 The Grand Canyon USA
2 Great Barrier Reef Australia
3 Florida USA
4 South Island New Zealand
5 Cape Town South Africa
6 Golden Temple India
7 Las Vegas USA
8 Sydney Australia
9 New York USA
10 Taj Mahal India
11 Canadian Rockies Canada
12 Uluru Australia
13 Chichen Itza Mexico
14 Machu Picchu Peru
15 Niagara Falls Canada / USA
16 Petra Jordan
17 The Pyramids Egypt
18 Venice Italy
19 Maldives Maldives
20 Great Wall China
21 Victoria Falls Zambia / Zimbabwe
22 Hong Kong Hong Kong
23 Yosemite National Park USA
24 Hawaii USA
25 Auckland New Zealand
26 Iguassu Falls Argentina / Brazil
27 Paris France
28 Alaska USA
29 Angkor Wat Cambodia
30 Himalayas Nepal / Tibet
31 Rio de Janeiro Brazil
32 Masai Mara Kenya
33 Galapagos Islands Ecuador
34 Luxor Egypt
35 Rome Italy
36 San Francisco USA
37 Barcelona Spain
38 Dubai United Arab Emirates
39 Singapore Singapore
40 La Digue Seychelles
41 Sri Lanka Sri Lanka
42 Bangkok Thailand
43 Barbados Barbados
44 Iceland Iceland
45 Terracotta Army China
46 Zermatt Switzerland
47 Angel Falls Venezuela
48 Abu Simbel Egypt
49 Bali Indonesia
50 French Polynesia French Polynesia 

August 11, 2010

Vinay Deolalikar's P ≠ NP preliminary paper

http://www.hpl.hp.com/personal/Vinay_Deolalikar/Papers/pnp_preliminary.pdf

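A while back, some professor at Chonbuk National University claimed that P = NP, and I heard that the whole research process and result turned out to be "bogus" (I don't really know the details) ... The verification results for this paper aren't out yet, but it seems that many people already believed P ≠ NP. I don't care if it's bogus, just show me the brave new world of P = NP :/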