Showing posts from December, 2010

Now I'm a user of HBase.

Today, I switched the database of my toy project, a URL shortener service, from MySQL to Apache HBase. It seems to run well with my real-time Java web application.

Actually, it's not my first time. I contributed HQL (HBase Query Language) a long, long time ago. If they hadn't tackled me, I could have made something like Pig or Hive. Haha, anyway.

My HBase cluster has 5 nodes. There's no big data yet, but it is now used by some Twitter clients and a number of web sites, and the rows are increasing at almost 30 per second.

Later, I'll use them for some of my research work, e.g., information-flow analysis, user propensity analysis, web structure mining, and trend mining with Apache Hama. :-)

[Note] Several problems where Apache Hama can be used

Web Graph Structure Mining
Social Network Analysis
Information Flow On Social Network (finding top-K influential nodes)
Evolution Of Social Network
High Level Machine Learning (Bioinformatics, Chemical informatics, etc.)
... and many others.

Serialize Printing of "Hello BSP" with Apache Hama

Serialize Printing of "Hello BSP"

Each BSP task in the Hama cluster prints the string "Hello BSP" in serial order. This example will help you understand the concepts of the BSP computing model.

Each task gets its own hostname (hostname:port pair) and a sorted list containing the hostnames of all the other peers.
Each task prints the LOG string "Hello BSP" only when its turn comes, at intervals of 5 seconds.

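
The turn-taking idea can also be tried without a Hama cluster. The following is only a rough sketch in plain Java, not the Hama API: threads stand in for BSP tasks, a `CyclicBarrier` stands in for the synchronization between supersteps, and the class name `SerializePrintingSketch`, the method `run`, and the host names are made up for illustration. It assumes the peer names are unique.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical stand-in for a BSP job: plain threads, no Hama classes involved.
public class SerializePrintingSketch {

    /** Runs one thread per peer and returns the lines in the order they were printed. */
    public static List<String> run(List<String> peerNames, long printIntervalMillis)
            throws InterruptedException {
        if (peerNames.isEmpty()) {
            return new ArrayList<>();
        }
        // Every peer works from the same sorted list, so all agree on whose turn it is.
        List<String> sortedPeers = new ArrayList<>(peerNames);
        Collections.sort(sortedPeers);

        List<String> printed = Collections.synchronizedList(new ArrayList<>());
        // The barrier plays the role of the synchronization at the end of each superstep.
        CyclicBarrier barrier = new CyclicBarrier(sortedPeers.size());

        List<Thread> threads = new ArrayList<>();
        for (String self : sortedPeers) {
            Thread t = new Thread(() -> {
                try {
                    // One superstep per peer: only the peer whose turn it is prints.
                    for (String turn : sortedPeers) {
                        if (turn.equals(self)) {
                            Thread.sleep(printIntervalMillis);
                            String line = "Hello BSP from " + self;
                            System.out.println(line);
                            printed.add(line);
                        }
                        // Nobody moves on until every peer has finished this superstep.
                        barrier.await();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return new ArrayList<>(printed);
    }

    public static void main(String[] args) throws InterruptedException {
        // The original example waits 5000 ms between prints; 500 ms keeps the demo short.
        run(List.of("host2:61000", "host1:61000", "host3:61000"), 500);
    }
}
```

Because each thread waits at the barrier after every turn, the output always follows the sorted peer order, no matter how the threads are scheduled.
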
BSP implementation of Serialize Printing of "Hello BSP"

public class SerializePrinting {

  public static class HelloBSP extends BSP {
    public static final Log LOG = LogFactory.getLog(HelloBSP.class);
    private Configuration conf;
    private final static int PRINT_INTERVAL = 5000;

    @Override
    public void bsp(BSPPeer bspPeer) throws IOException, KeeperException,
        InterruptedException {
      int num = Integer.parseInt(conf.get("bsp.peers.num"));

      int i = 0;
      for (String otherPeer : bspPeer.getAllPe…

Hide the Maven Target Directory from Open Resource Shortcut

I use m2eclipse and I am a big Maven fan. However, I don't want resources from my target directory to show up when I use the Open Resource shortcut. So how do we get around this? Simply right-click the target folder, click Properties, check the Derived checkbox, and hit the OK button.