April 20, 2009

Computing the transpose of a matrix using Hama

The transpose of a matrix is another matrix in which the rows and columns have been swapped: entry (i, j) of the original becomes entry (j, i) of the result. It will be used for SVD (Singular Value Decomposition).

+             +      +             +
| a11 a12 a13 |      | a11 a21 a31 |
| a21 a22 a23 |  =>  | a12 a22 a32 |
| a31 a32 a33 |      | a13 a23 a33 |
+             +      +             +

- A map task receives a row index n as its key and the vector of that row as its value
- It emits (reversed index, the entry with the given index) for each entry
- A reduce task sets the values at the reversed indices
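
The steps above can be sketched in a few lines of plain Python. This is only a single-process illustration of the map/reduce idea, not the Hama or Hadoop API: the function names (map_row, reduce_column, transpose) are made up for this example, and the shuffle between map and reduce is simulated with a dictionary.

```python
from collections import defaultdict


def map_row(row_index, row):
    """Map step: for entry a[i][j], emit key j with value (i, a[i][j])."""
    for col_index, value in enumerate(row):
        yield col_index, (row_index, value)


def reduce_column(entries, width):
    """Reduce step: build one row of the transpose from the grouped entries."""
    out = [0] * width
    for row_index, value in entries:
        out[row_index] = value
    return out


def transpose(matrix):
    """Run map, a simulated shuffle, and reduce over a list-of-rows matrix."""
    groups = defaultdict(list)  # stands in for the shuffle/sort phase
    for i, row in enumerate(matrix):
        for key, pair in map_row(i, row):
            groups[key].append(pair)
    n_rows = len(matrix)
    # Each original column index becomes a row index of the result.
    return [reduce_column(groups[j], n_rows) for j in sorted(groups)]
```

Calling transpose([[1, 2, 3], [4, 5, 6]]) groups the entries by column index and returns three rows of two values each. In a real cluster each reduce call would write its row to the output matrix instead of returning a list.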

Transposing a 5,000 x 5,000 dense matrix took 12 minutes using Hadoop/Hama on 10 nodes. Why store the result? Because a stored transpose provides data locality for the next steps, such as multiplication.

April 10, 2009

The ASF is ten years old

The Apache Software Foundation turned 10 years old two weeks ago. Ten years ago I was a university student in Korea, and I never dreamed that I would someday work with Apache. I have been quite busy at work recently, but I'll do my best for the ASF.

- https://blogs.apache.org/foundation/entry/the_asf_is_ten_years

April 1, 2009

Spam Filtering using Personalized Ontologies

This paper introduces a user-customized spam filter that models user preferences and emails as personalized ontologies.

- http://imsc-dmim.usc.edu/publications/2009_SAC_SWA_final.pdf