August 31, 2007

Girls' Generation Visits NHN Corp.!!!




But I don't know why they visited us. -_-a

Girls' Generation, also known as SNSD (the acronym of So Nyeo Shi Dae), is a nine-member girl group formed in 2007 by SM Entertainment. The group debuted on SBS Inkigayo on August 5, 2007, performing their first single, "Into the New World". The members are (in order of official announcement) YoonA, Tiffany, YuRi, HyoYeon, SooYoung, SeoHyun, TaeYeon (the leader), Jessica, and Sunny. They are said to be a multilingual group. Aside from Korean, they are said to know English, Chinese, and Japanese. Both Jessica and Tiffany were raised in America, HyoYeon traveled to Beijing in 2004, and SooYoung started out in the Japanese entertainment business in the girl group Route-O in 2004.



August 30, 2007

The World's Largest Matrix Computation

One of the reasons why Google is such an effective search engine is the PageRank™ algorithm, developed by Google's founders, Larry Page and Sergey Brin, when they were graduate students at Stanford University. PageRank is determined entirely by the link structure of the Web. It is recomputed about once a month and does not involve any of the actual content of Web pages or of any individual query. Then, for any particular query, Google finds the pages on the Web that match that query and lists those pages in the order of their PageRank.

Imagine surfing the Web, going from page to page by randomly choosing an outgoing link from one page to get to the next. This can lead to dead ends at pages with no outgoing links, or cycles around cliques of interconnected pages. So, a certain fraction of the time, simply choose a random page from anywhere on the Web. This theoretical random walk of the Web is a Markov chain or Markov process. The limiting probability that a dedicated random surfer visits any particular page is its PageRank. A page has high rank if it has links to and from other pages with high rank.

Let W be the set of Web pages that can be reached by following a chain of hyperlinks starting from a page at Google and let n be the number of pages in W. The set W actually varies with time, but in May 2002, n was about 2.7 billion. Let G be the n-by-n connectivity matrix of W, that is, $g_{i,j}$ is 1 if there is a hyperlink from page i to page j and 0 otherwise. The matrix G is huge, but very sparse; its number of nonzeros is the total number of hyperlinks in the pages in W.

Let $c_j$ and $r_i$ be the column and row sums of G.

$c_j = \sum_i g_{i,j}, \quad r_i = \sum_j g_{i,j}$

The quantities $c_k$ and $r_k$ are the indegree and outdegree of the k-th page. Let p be the fraction of time that the random walk follows a link. Google usually takes p = 0.85. Then 1-p is the fraction of time that an arbitrary page is chosen. Let A be the n-by-n matrix whose elements are

$a_{i,j} = p \, g_{i,j} / c_j + \delta$, where $\delta = (1-p)/n$.

The matrix A is not sparse, but it is a rank one modification of a sparse matrix. Most of the elements of A are equal to the small constant $\delta$. When $n = 2.7 \cdot 10^9$, $\delta = 5.5 \cdot 10^{-11}$.
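
The "rank one modification" can be written out explicitly. The following is only a sketch under the assumption that every column sum of G is nonzero; the diagonal matrix D and the all-ones vector e are notation introduced here for illustration and do not appear in the original article.

```latex
A = p\, G\, D^{-1} + \delta\, e\, e^{T},
\qquad D = \operatorname{diag}(c_1, \dots, c_n),
\qquad e = (1, \dots, 1)^{T}

A x = p\, G \left( D^{-1} x \right) + \delta \left( e^{T} x \right) e
```

Written this way, a product with A needs only one product with the sparse matrix G plus a single scalar added to every component, so the dense matrix never has to be stored.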

The matrix A is the transition probability matrix of the Markov chain. Its elements are all strictly between zero and one and its column sums are all equal to one. An important result in matrix theory, the Perron-Frobenius Theorem, applies to such matrices. It tells us that the largest eigenvalue of A is equal to one and that the corresponding eigenvector, which satisfies the equation

x = Ax,

exists and is unique to within a scaling factor. When this scaling factor is chosen so that

$\sum_i x_i = 1$

then x is the state vector of the Markov chain. The elements of x are Google's PageRank.
If the matrix were small enough to fit in MATLAB, one way to compute the eigenvector x would be to start with a good approximate solution, such as the PageRanks from the previous month, and simply repeat the assignment statement

x = Ax

until successive vectors agree to within a specified tolerance. This is known as the power method and is about the only possible approach for very large n. I'm not sure how Google actually computes PageRank, but one step of the power method would require one pass over a database of Web pages, updating weighted reference counts generated by the hyperlinks between pages.
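
To make the iteration concrete, here is a minimal Python sketch of the power method applied to the matrix A exactly as the formulas above define it. It is an illustration for small examples, not a description of how Google computes PageRank. The function name `pagerank_power`, the representation of G as a list of (i, j) pairs, the default tolerance, and the choice to reject inputs with a zero column sum (where the formula for A is undefined) are all assumptions made here.

```python
def pagerank_power(n, links, p=0.85, tol=1e-12, max_iter=1000):
    """Repeat x = A x until successive vectors agree to within tol.

    links is an iterable of (i, j) pairs, each meaning g[i][j] = 1.
    A is never formed: a[i][j] = p * g[i][j] / c[j] + delta.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    links = set(links)              # each nonzero of G counted once
    c = [0] * n                     # column sums c_j of G
    for i, j in links:
        c[j] += 1
    if min(c) == 0:
        # Assumption: the formula divides by c_j, so reject this case
        raise ValueError("every column of G needs at least one nonzero")

    delta = (1.0 - p) / n
    x = [1.0 / n] * n               # uniform starting vector
    for _ in range(max_iter):
        # Dense part: delta * sum(x), added to every component
        new = [delta * sum(x)] * n
        # Sparse part: one pass over the nonzeros of G
        for i, j in links:
            new[i] += p * x[j] / c[j]
        # Rescale so the components sum to one
        s = sum(new)
        new = [v / s for v in new]
        change = sum(abs(a - b) for a, b in zip(new, x))
        x = new
        if change < tol:
            break
    return x


if __name__ == "__main__":
    # A tiny three-page example with hypothetical links
    example_links = [(0, 2), (1, 0), (2, 0), (2, 1)]
    print(pagerank_power(3, example_links))
```

Each pass touches every nonzero of G once, which mirrors the "one pass over a database of Web pages" described above. Starting from last month's ranks instead of the uniform vector would only change the initial value of `x`.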

August 28, 2007

Yahoo! sued over disclosure of Chinese citizens' identities

Mark Tran
Tuesday August 28, 2007
Guardian Unlimited



The internet company Yahoo! has become embroiled in a legal battle with a human rights group over a decision to disclose the identity of Chinese citizens, leading to their arrests.
Yahoo! is being sued by the World Organisation for Human Rights, based in Washington, on behalf of Wang Xiaoning and his wife, Yu Ling.

He is serving a 10-year prison sentence for advocating democratic reform in articles circulated on the internet.

The group is also suing Yahoo! on behalf of Shi Tao, a journalist serving a 10-year sentence for sending an email summarising a Chinese government communiqué on how reporters should handle the 15th anniversary of the 1989 crackdown on the pro-democracy movement.

The suit alleges that these people - and others yet to be identified - were tortured or subjected to inhumane treatment at the hands of the Chinese authorities because of information that Yahoo!, Yahoo! China or Alibaba.com, a Chinese company in which Yahoo! has a minority stake, had passed on to the government.
Shi's case has been taken up by the British human rights group Amnesty International. The group says he is kept under tight control, with family visits requiring special approval from the prison manager, and is not allowed to receive printed matter, including books or newspapers.

Last November, he was awarded the Golden Prize of Freedom by the World Association of Newspapers.

Amnesty has also criticised Yahoo! for providing information to the authorities that led to the arrests and, more generally, the involvement of the company in the practice of government censorship.

In a 40-page defence filed in Oakland, California yesterday, the internet firm argued that US courts were not the place for political grievances against the Chinese government.

"This is a political and diplomatic issue, not a legal one," Kelley Benander, a Yahoo! spokeswoman, told the Los Angeles Times. "The real issue here is the plaintiffs' outrage at the behaviour and laws of the Chinese government. The US court system is not the forum for addressing these political concerns."

Yahoo! does not dispute turning over information in response to Chinese government demands, but argues there was little connection between that information and the arrest, prosecution and conviction of the prisoners.

In its court filings, the company said it "deeply sympathises" with the plaintiffs and their families and does not condone the suppression of their liberties. However, it also argues that the company has no control over Chinese laws or their enforcement.

Human rights groups have criticised other internet companies over their dealings with China.

Google has come under fire for its decision to censor its search services on subjects such as the 1989 Tiananmen Square massacre in order to gain greater access to China's fast-growing market.

See my habitat through Google Earth

http://www.panoramio.com/photo/1986856
"Bundang First Tower" lat=37.383034, lon=127.121532

August 27, 2007

Bottom Line for Successful Collaboration

Remarks by Mitch Friedman, Director, Conservation Northwest
to the American Forest Resource Council
at their Annual Conference, April 18, 2006

Visit Grist for Mitch's full article on this important issue

Ladies and Gentlemen,

As flattered as I am by Will’s kind words, I want the record to note that I attended today under assertions that this event was just some folks gathering for a round of golf. I’ve taken note of the nearest exits and am wearing running shoes.

Actually, it was with pleasure that I accepted the invitation to sit up here with Will and Russ, both of whom I think very highly of, to see if I could get a rise out of you all.

Let's acknowledge up front that there are a lot of battle scars in the room today. I have no intention of living down my past, even if you would allow me. I was among the very first tree-sitters and organized the first spotted owl protests. My organization has long been among the list of usual suspects on appeals and litigation. The fact is that most of us feel that there are things that warrant the waging of war. For folks like me, old growth, roadless areas, and a future for wildlife like lynx and, yes, owls, make the list. But with age we realize that war is not its own reward, and we look for better ways to achieve our objectives.

I know your bottom line: In providing needed wood products and jobs, your companies have to be profitable in a financial climate rocked by regulation, globalization, and other factors.

On my side, the bottom line is preserving the systems and fabric of life across our region and planet in a climate clouded by greenhouse gases, the vast appetite of surging human populations, and other factors. I believe that our federal landscape should provide a sufficient network of reserves to sustain even the most demanding wildlife species and outside of those reserves should be a model of forest practices that are sustainable for stands, soils and stream life as well as for local rural communities. I further want to see solutions to the increasingly transient ownership and instability on our region's private timber lands, both large and small.
Our respective objectives can conflict but are not necessarily exclusive of one another. One way to look at this is that if you are among the companies that no longer have an economic stake in logging old growth or wildlands, or you would tend to agree that the time is past when that type of logging is socially acceptable, then collaboration is a way to expedite trends for which you are already positioned. Furthermore, when we resolve our differences collaboratively and to mutual benefit, it improves life for the people that share our communities. They too hunger for solutions that can sustain both economic prosperity and natural heritage.

Fortunately new mill technologies, new market opportunities and advances in silviculture give us more decision space within which to find common ground. That and five bucks will get you a latte grande.

In my limited experience, I've found that successful collaboration requires more than common ground. We must also accomplish that biggest of human challenges: Getting along.

Getting along doesn’t have to mean male bonding, arranged marriages between Bellingham and Forks, or even buying a Subaru and going vegan. It does mean exhibiting the leadership to engage and sustain relationships despite water bars in the road.

Challenges that I have observed through Conservation Northwest's experience, primarily on the Gifford Pinchot, Olympic and Colville National Forests, involve building trust, yielding turf, reaching agreement on field prescriptions, enshrining new protocols into boilerplate contracts, and more. Collaboration fills the timber pipeline more like a hand-operated water pump in a traditional campsite than like breaching the dams on the Lower Snake. It requires patience.

Leadership must be exhibited on all sides: timber, agency, and conservation. There are other "sides" too, such as contractors, tribes and community representatives. But the three key legs of this stool are Forest Service district and forest leadership, people like me, and people like you. Each will find pressure from both inside his organization and from among her peers to abandon collaboration and return to the trenches. We all know people who are most comfortable being something familiar rather than doing something unfamiliar.

My friend David Syre, of Trillium Corporation, and I have been catching flak just for working together to revitalize the waterfront of downtown Bellingham. Imagine how temperatures rise when old adversaries work together to cut down trees! But if you know your objectives, have the courage to explore new ways to achieve them, and have the patience to overcome obstacles, then you have a moral and business obligation to try collaboration irrespective of what your middle managers or neighbors might say.

If you have fortitude, your collaboration can withstand the wedges driven - often by public employees - in an effort to perpetuate Balkanized positions and preserve power where it doesn't belong.

The return on investment in collaboration can be very satisfying. The Gifford Pinchot National Forest's timber pipeline is no longer blocked up in litigation. The Survey and Manage injunction had slight effect on that Forest and nothing else is even under current appeal. On the GP, conservationists are now partners in seeking funding to plan and implement the new generation of timber and stewardship projects. We are also active partners in exploring efficient ways to fulfill the purposes of the National Environmental Policy Act, which already contains enough flexibility to fully inform decisions on potentially damaging federal projects while not wrapping benign or beneficial projects in red tape.

On the GP, it took a while to prime that hand pump, but now the pipeline is starting to fill up from thinning sales that are ecologically beneficial and socially acceptable. I want to recognize and give thanks to AFRC's own Bob Dick for his hard work and thoughtful leadership in the Pinchot Partnership and elsewhere. I've enjoyed an amicable relationship with Bob for twenty years, only partially because I know he hangs out with a tough crowd of Harley riders. It's no accident that Bob's efforts are yielding fruit, or more specifically, lots of little stumps.
Conservationists want to sustain this flow from the tens of thousands of acres of plantation stands on which habitat can be improved by thinning over the next few decades.
Our experience on the Colville is even more gratifying, if only because of the greater resource and political challenges of that landscape. I'm confident that the eventual outcomes from our efforts there will jointly increase timber predictability, wilderness protection, habitat restoration, community safety, and even political harmony.

Advancing a positive and common vision has its amusing moments. Picture Russ and me sitting in the office of Representative Cathy McMorris, with me lobbying for increased logging of small diameter trees and Russ pitching Wilderness protection. In fact, I sense an opportunity here to make the front page of the Oregonian if you'll all join me in reciting the word "wilderness" three times.

Russ already pointed out some of the steps that have led to the progress we have experienced. A few additional lessons from the Colville include:

1. Walk before you run. Don’t be too ambitious in early steps.
2. Prioritize the work to real and agreed-upon urgent needs, such as community safety, rather than those upon which positions are most likely to differ.
3. I reiterate Russ’ advice to focus on interests, not positions. This conceptual tool was provided by an outside consultant. On both the Colville and GP we found that outside consultants and facilitators were pivotal at key points.
4. Relationships are everything, and they are built by listening and by solving problems together.
5. Technical tools, maps, and jargon do not advance relationships. They are peripheral, not central, to good collaboration.
6. Trust is built by the time-tested means of people honoring their word.
7. Sustainability is found in the common interests of business and conservation, not in the competition for who will be the last one standing.

It seems likely that collaboration will work better in some places than others. Perhaps the easiest experiences will be where mill capacity least exceeds available volume. Yet there are enough positive examples across the West, from Oregon's Fremont to New Mexico's Gila National Forest, to prove powerful potential. Why not shoot for the moon in testing this model? With leadership and effort, perhaps we can avoid the next Biscuit-like showdown in southwestern Oregon.

The past is behind us. Comparing scars is way more fun than comparing wounds. So in this world of problems, let's see how many we can solve through collaboration.
Thanks for your attention.

Job life story

I spend my working life alongside a bad colleague on my team.
He is clever in a superficial way and always takes credit for my work.

But the Hbase Shell, at least, will be a beautiful success.

August 25, 2007

Huge Hole Found in the Universe

The universe has a huge hole in it that dwarfs anything else of its kind. The discovery caught astronomers by surprise.

The hole is nearly a billion light-years across. It is not a black hole, which is a small sphere of densely packed matter. Rather, this one is mostly devoid of stars, gas and other normal matter, and it's also strangely empty of the mysterious "dark matter" that permeates the cosmos. Other space voids have been found before, but nothing on this scale.

Astronomers don't know why the hole is there.

"Not only has no one ever found a void this big, but we never even expected to find one this size," said researcher Lawrence Rudnick of the University of Minnesota.

Rudnick's colleague Liliya R. Williams also had not anticipated this finding.

"What we've found is not normal, based on either observational studies or on computer simulations of the large-scale evolution of the universe," said Williams, also of the University of Minnesota.

The finding will be detailed in the Astrophysical Journal.

The universe is populated with visible stars, gas and dust, but most of the matter in the universe is invisible. Scientists know something is there, because they can measure the gravitational effects of the so-called dark matter. Voids exist, but they are typically relatively small.

The gargantuan hole was found by examining observations made using the Very Large Array (VLA) radio telescope, funded by the National Science Foundation.

There is a "remarkable drop in the number of galaxies" in a region of sky in the constellation Eridanus, Rudnick said.

The region had previously been dubbed the "WMAP Cold Spot," because it stood out in a map of the Cosmic Microwave Background (CMB) radiation made by NASA's Wilkinson Microwave Anisotropy Probe (WMAP) satellite. The CMB is an imprint of radiation left from the Big Bang, the theoretical beginning of the universe.

"Although our surprising results need independent confirmation, the slightly colder temperature of the CMB in this region appears to be caused by a huge hole devoid of nearly all matter roughly 6 to 10 billion light-years from Earth," Rudnick said.

Photons of the CMB gain a small amount of energy when they pass through normal regions of space with matter, the researchers explained. But when the CMB passes through a void, the photons lose energy, making the CMB from that part of the sky appear cooler.

August 24, 2007

Lee, Moving to JBoss, a Division of Red Hat

A few days ago, he left NHN Corp.
I heard he was moving to JBoss, a division of Red Hat.
I am envious. -0-

Trustin Lee is a member of the Apache Software Foundation, a PMC (Project Management Committee) chair, committer, and the founder of the Apache MINA project, who is involved in various open source projects. He has been developing high-performance network applications including a massive SMS gateway, a lightweight ESB, and ApacheDS LDAP server in Java for more than 4 years. Please look around his blog or his résumé to find out more about him.

Yahoo, Microsoft asked to censor Chinese blogs

Yahoo Inc., Microsoft Corp. and other providers of blogging technology in China agreed to try to sign up users under their real names and to censor their posts, a journalism advocacy group that condemns the accord said Thursday.

Under the accord with the Internet Society of China, an offshoot of the Information Industry Ministry, the companies are "encouraged" to register users under their real names, Reporters Without Borders said in a statement. The companies may be forced to censor content or identify bloggers, the Paris-based group said.

The agreement is detrimental to free speech because service providers would be forced to divulge bloggers' identities or be punished by the government, Reporters Without Borders said. The companies also are required to "delete illegal and bad information" from blogs, the group said.

"As they already did with website hosting services, the authorities have given themselves the means to identify those posting 'subversive' content by imposing a self-discipline pact," the group said.

The accord stopped short of banning anonymous blogging, a technique Chinese Internet users have used to criticize the government for fear of reprisal. China had 162 million users in June, second only to the U.S.

Microsoft said it wouldn't ask users to reveal their identities.

"The document makes some recommendations that Microsoft does not support," Adam Sohn, director of the company's online services group, said in a statement.

"We will not implement real-name registration for blogging in our Windows Live Spaces service."

Yahoo spokeswoman Linda Du referred questions to Alibaba.com Corp., which runs Yahoo's site in China. Porter Erisman, a spokesman for Alibaba.com, didn't immediately comment.

Other blog providers that agreed to the accord include Sohu.com Inc. and Qianlong Wang, Reporters Without Borders said.

Naver, The Google Of South Korea

"Crowd's wisdom helps South Korean search engine beat Google and Yahoo," from the New York Times, describes South Korea's most popular search engine, Naver.
Naver currently has a 77 percent share of all searches from within South Korea. Daum.net follows with 10.8 percent, Yahoo with just 4.4 percent and Google with a tiny 1.7 percent of Korean Web searches.
Why does Google fall short in South Korea? Wayne Lee, an analyst at Woori Investment and Securities, said "No matter how powerful Google's search engine may be, it doesn't have enough Korean-language data to trawl to satisfy South Korean customers."
Naver's founders realized that when searching in Korean, there was hardly anything to be found. So they set out to create the content and databases, so that when you would search in Korean, you would find quality content. Naver set up "Knowledge iN" in 2002, enabling Koreans to help each other in a type of real-time question-and-answer platform. On average, 44,000 questions are posted each day with about 110,000 returned answers.
The company is now the most profitable in South Korea and employs "27,000 workers, posted 299 billion won, or $325 million, in profit out of 573 billion won in sales last year. It has a market value of nearly 8 trillion won," says the New York Times article.
Google and Yahoo are making efforts to catch up in this market. Google, for example, recently announced an answers service for Russia that could also come to South Korea (see Google Launches "Question and Answers" In Russia). Google also recently has tried to jazz up its Korean home page (see Google's New 'Animated' Home Page In Korea).

Hbase Shell

Hbase Shell is a basic, command-line, and interactive 'shell' for manipulating tables in Hbase. It has support for a small set of SQL-inspired operations. Results are presented in an ASCII-table format.

The Hbase Shell aims to be to Hbase what the mysql client command-line tool is to mysqld, and what sqlplus is to Oracle.

Hbase Shell was first added to TRUNK in July, 2007.

Google Sky Gives a Close-Up View of the Universe

Armchair explorers will now have the entire universe at their fingertips, thanks to Google's latest venture, Google Sky, a new free feature that's an application in the popular Google Earth program.
Starting today, anyone with a computer can view a close-up of about 100 million galaxies and 200 million stars.
To access Google Sky, available today, download the new Google Earth at http://earth.google.com.
"This is an application that allows you to see the sky at very, very high resolution, as if you were just flying through the universe and seeing and visiting galaxies," said Chikai Ohazama, a Google product manager who has worked to gather data from astronomical organizations around the world.
Google has stitched together real photographs of the universe into one giant database.
"Basically you're seeing imagery that you have to have a very, very high-powered telescope to look at and we're placing that in the database," Ohazama said. "You can zoom in very, very close and see the actual spiral, a galaxy and the clusters around it."
Google already allows users to see Earth at a level of detail many spy agencies would envy. The program's satellite and street-level imagery is so advanced it has generated alarm from privacy advocates.
One of the unique features of Google Sky is that you can plug in your address and the program shows you what the sky above your home looks like.
Google Sky allows users to bookmark constellations, rotate the whole sky and zoom in to see details of black holes and stars.
It is an awe-inspiring look at the universe, not to mention a whole new way to waste time at work.

FT: Yahoo!'s bet on Hadoop

One of the most important announcements at Oscon last week was Yahoo!'s commitment to support Hadoop. We've been writing about Hadoop on radar for a while, so it's probably not news to you that we think Hadoop is important.
Yahoo's involvement wasn't actually news either, because Yahoo! had hired Doug Cutting, the creator of hadoop, back in January. But Doug's talk at Oscon was kind of a coming out party for Hadoop, and Yahoo! wanted to make clear just how important they think the project is. In fact, I even had a call from David Filo to make sure I knew that the support is coming from the top.
Jeremy Zawodny's post about hadoop on the Yahoo! developer network does a great job of explaining why Yahoo! considers hadoop important:
For the last several years, every company involved in building large web-scale systems has faced some of the same fundamental challenges. While nearly everyone agrees that the "divide-and-conquer using lots of cheap hardware" approach to breaking down large problems is the only way to scale, doing so is not easy.
The underlying infrastructure has always been a challenge. You have to buy, power, install, and manage a lot of servers. Even if you use somebody else's commodity hardware, you still have to develop the software that'll do the divide-and-conquer work to keep them all busy.
It's hard work. And it needs to be commoditized, just like the hardware has been...
To build the necessary software infrastructure, we could have gone off to develop our own technology, treating it as a competitive advantage, and charged ahead. But we've taken a slightly different approach. Realizing that a growing number of companies and organizations are likely to need similar capabilities, we got behind the work of Doug Cutting (creator of the open source Nutch and Lucene projects) and asked him to join Yahoo to help deploy and continue working on the [then new] open source Hadoop project.
Let me unpack the two parts of this news: hadoop as an important open source project, and Yahoo!'s involvement. On the first front, I've been arguing for some time that free and open source developers need to pay more attention to Web 2.0. Web 2.0 software-as-a-service applications built on top of the LAMP stack now generate several orders of magnitude more revenue than any companies seeking to directly monetize open source. And most of the software used by those Web 2.0 companies above the commodity platform layer is proprietary. Not only that, Web 2.0 is siphoning developers and buzz away from open source.
But there are open source projects that are tackling important Web 2.0 problems "up the stack." Brad Fitzpatrick's LiveJournal scaling tools memcached, perlbal, and mogileFS come to mind, as well as OpenID. Hadoop is another critical piece of Web 2.0 infrastructure now being duplicated in open source. (I'm sure there are others, and we'd love to hear from you about them in the comments.)
OK -- but why is Yahoo!'s involvement so important? First, it indicates a kind of competitive tipping point in Web 2.0, where a large company that is a strong #2 in a space (search) realizes that open source is a great competitive weapon against their dominant competitor. It's very much the same reason why IBM got behind Eclipse, as a way of getting competitive advantage against Sun in the Java market. (If you thought they were doing it out of the goodness of their hearts rather than clear-sighted business logic, think again.) If Yahoo! is realizing that open source is an important part of their competitive strategy, you can be sure that other big Web 2.0 companies will follow. In particular, expect support of open source projects that implement software that Google treats as proprietary. (See the long discussion thread on my post about Microsoft's submission of their shared source licenses to OSI for my arguments as to why "being on the right side of history" will ultimately drive Microsoft to open source.)
Supporting Hadoop and other Apache projects not only gets Yahoo! deeply involved in open source software projects they can use, it helps give them renewed "geek cred." And of course, attracting great people is a huge part of success in the computer industry (and for that matter, any other.)
Second, and perhaps equally important, Yahoo! gives hadoop an opportunity to be tested out at scale. Some years ago, I was on the board of Doug's open source search engine effort, Nutch. Where the project foundered was in not having a large enough data set to really prove out the algorithms. Having more than a couple of hundred million pages in the index was too expensive for a non-profit open source project to manage. One of the important truths of Web 2.0 is that it ain't the personal computer era any more, Eben Moglen's arguments to the contrary notwithstanding. A lot of really important software can't even be exercised properly without very large networks of machines, very large data sets, and heavy performance demands. Yahoo! provides all of these. This means that Hadoop will work for the big boys, and not just for toy projects. And as Jeremy pointed out in his post (linked and quoted above), today's big boy may be everyday folks a few years from now, as the size and scale of Web 2.0 applications continue to increase.
BTW, in followup conversations with Doug, he pointed out that web search is not actually the killer app for hadoop, despite the fact that it is in part an implementation of the MapReduce technique made famous by Google. After all, Yahoo! has been doing web search for years without this kind of general purpose scaling platform. "Where Hadoop really shines," says Doug, "is in data exploration." Many problems, including tuning ad systems, personalization, learning what users need -- and for that matter, corporate or government data mining -- involve finding signal in a lot of noise. Doug pointed me to an interesting article on Amazon Web Services Developer Connection: Running Hadoop MapReduce on Amazon EC2 and Amazon S3. Doug said in email:
It provides an example of using Hadoop to mine one's [logfile] data.
Another trivial application for log data that's very valuable is reconstructing and analyzing user sessions. If you've got logs for months or years from hundreds of servers and you want to look at individual user sessions, e.g., how often do users visit, how long are their sessions, how do they move around the site, how often do they re-visit the same places, etc. This is a single MapReduce operation over all the logs, blasting through, sorting and collating all your logs at the transfer rate of all the drives in your cluster. You don't have to re-structure your database to measure something new. It's really as easy as 'grep sort uniq'.
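
The session reconstruction described above can be sketched in a few lines of Python. This is a single-process imitation of the map, group-by-key, and reduce steps, not Hadoop's API. The log line format ("timestamp user url"), the 30-minute session gap, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict

SESSION_GAP = 30 * 60  # seconds of inactivity that end a session (assumed)


def map_record(line):
    """Map step: turn one log line into a (user, (timestamp, url)) pair."""
    ts, user, url = line.split()
    return user, (int(ts), url)


def reduce_user(user, events, gap=SESSION_GAP):
    """Reduce step: sort one user's events by time and split into sessions."""
    sessions = []
    for ts, url in sorted(events):
        # Continue the current session if the previous event is recent enough
        if sessions and ts - sessions[-1][-1][0] <= gap:
            sessions[-1].append((ts, url))
        else:
            sessions.append([(ts, url)])
    return user, sessions


def sessionize(lines):
    """Run map, group by user (the shuffle), then reduce, all in memory."""
    groups = defaultdict(list)
    for line in lines:
        user, event = map_record(line)
        groups[user].append(event)
    return dict(reduce_user(user, events) for user, events in groups.items())


if __name__ == "__main__":
    logs = [
        "100 alice /home",
        "5000 alice /cart",
        "160 alice /search",
        "120 bob /home",
    ]
    for user, sessions in sessionize(logs).items():
        print(user, len(sessions), "session(s)")
```

In a real cluster the grouping step is what the framework does between map and reduce, so only the two small functions would need to be written. Measuring something new, such as session length, would mean changing `reduce_user` rather than restructuring any database.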
Also, here are the slides from my talk.