Predicting The Future and The Past

July 24, 2010

When I first stumbled over Kaggle I thought it was just a page offering a single data analysis competition. I was wrong: it is a platform for hosting data-related competitions. Companies, researchers, governments and others can open their research questions to analysts worldwide in search of the best possible solution. This approach is much more likely to yield an innovative solution to problems that seem impossible to solve, as a single company or organization rarely has the perfect team or technique for every obstacle. Publishing a data analysis assignment via such a platform not only improves the chances of getting the most suitable results but also gives academic researchers a way to interact with the business world. They can test and improve their new methods by applying them to industrial challenges.
Of course, the idea of companies putting up a competition for business-related topics is not new. For example, in 2006 Netflix, a DVD rental provider, offered $1m to the analyst who could improve its recommendation algorithm by 10%. $1m seems like a huge prize, but according to Netflix CEO Reed Hastings, an improvement of 10 per cent was worth “well in excess of $1m”.
Something that really amazed me is that hosting a contest on Kaggle is free! They take care of the competition's privacy and provide the infrastructure, and it is quite easy to set up a contest.

Here are the three steps to host a competition on Kaggle:

STEP 1: The competition host posts contest details
The competition host frames the contest and uploads relevant data. Framing the contest involves setting a deadline, outlining the competition’s objectives, describing any data, providing submission instructions and specifying the criteria by which the winner will be selected. Visit Post a Contest to step through the process.

STEP 2: Competitors upload their predictions
Competitors upload their submissions. For predicting-the-past competitions, submissions are evaluated on-the-fly (against a solution file uploaded during STEP 1). For predicting-the-future competitions, submissions are evaluated once the relevant event has taken place (but the competition host can make use of the forecasts in the meantime).

STEP 3: Predictions are evaluated based on their accuracy
Once the deadline passes, the winners are selected. In some competitions, the prize may not be awarded until a satisfactory explanation is received.
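
To make the "predicting-the-past" case more concrete, here is a minimal sketch in Python of how submissions could be scored against a solution file and ranked. Everything in it is an assumption for illustration: the column names (id, actual, prediction), the competitor names, the tiny inline data sets, and the choice of root mean squared error as the accuracy measure. It is not Kaggle's actual evaluation code; each competition host defines its own criteria.

```python
import csv
import io
import math


def load_values(csv_text, value_column):
    """Read CSV text with an 'id' column and return {id: float(value)}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: float(row[value_column]) for row in reader}


def rmse_score(solution, submission):
    """Root mean squared error of a submission against the solution (lower is better)."""
    missing = set(solution) - set(submission)
    if missing:
        raise ValueError(f"submission is missing ids: {sorted(missing)}")
    squared_errors = [(submission[i] - solution[i]) ** 2 for i in solution]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical solution file, uploaded by the host in STEP 1.
SOLUTION_CSV = "id,actual\n1,3.0\n2,5.0\n3,4.0\n"

# Hypothetical submissions, uploaded by competitors in STEP 2.
SUBMISSIONS = {
    "alice": "id,prediction\n1,3.0\n2,4.0\n3,4.0\n",
    "bob": "id,prediction\n1,1.0\n2,5.0\n3,6.0\n",
}

solution = load_values(SOLUTION_CSV, "actual")

# Score every submission and sort so the lowest error comes first.
leaderboard = sorted(
    (rmse_score(solution, load_values(text, "prediction")), name)
    for name, text in SUBMISSIONS.items()
)

for score, name in leaderboard:
    print(f"{name}: RMSE {score:.3f}")
```

For a predicting-the-future competition the same scoring could run unchanged; the only difference is that the solution file does not exist until the relevant event has taken place.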

I love the idea!

Here is a great interview with the CEO of Kaggle.

What about you?
Would you enter a competition on Kaggle?

Data Mining Competition: Win up to 5,000 USD

June 23, 2010

This year's ICDM Data Mining Contest task is to predict city traffic for the purpose of intelligent driver navigation and improved journey planning. There are three independent tasks: traffic, jams and GPS. Each of them approaches the problem of traffic prediction from a different perspective and involves different types of data. You can either participate in all of them or just choose any task you like. The competition started on June 22nd, 2010 (registration is still open) and will run until September 6th, 2010. It is organized on the TunedIT Challenges data mining platform. The best solutions will be awarded prizes worth 5,000 USD in total.

ICDM has established itself as the world’s premier research conference in data mining. This year it will be held in Sydney, Australia, on December 14-17.