Hans Rosling - The Joy of Stats

December 26, 2010

“I kid you not, statistics are now the sexiest subject on the planet.”

Hans Rosling

Hans Rosling is my idol; his online lectures are eye-opening, mind-expanding and funny. He is not only the developer of Gapminder, a really cool information visualization software for animating statistics, but also an internationally known medical doctor, academic, statistician and public speaker.
In his videos ‘The Joy of Stats’ he shows exactly why some people (including me) have a passion for statistics: it is exciting and fun to find the story behind the data, tell it in a visually appealing way and make sense of the world.

You can find many more great lectures on TED or in The Joy of Stats series.

UPDATE: The Joy of Stats is now available in its entirety on Gapminder.



Online book: Introduction to data mining

December 26, 2010

What a great online book about data mining! Thanks to the authors for providing this book for free, but honestly I would buy it if it were available in print (now I need to do a lot of printing).

What is it that makes this book so great? The structure and visualization are what caught my attention. Have a look at the table of contents: it is actually a Data Mining Map. What a great idea to use a structured map.

The authors’ approach is practical and does not go too deep into explanations, which is good if you are not interested in the theoretical equations behind the stats. You will find a lot of pictures, tables, definitions and exercises but fewer formulas. This online book was created by Dr. Saed Sayad in collaboration with Professor Stephen T. Balke in the Department of Chemical Engineering and Applied Chemistry at the University of Toronto.

The Beginner’s Guide For Web Data Analysis

November 19, 2010

On his web analytics blog Occam’s Razor, Avinash Kaushik wrote a post titled “Beginner’s guide to web analytics: Ten steps to love & success”. Being an expert in web analytics, he gives a practical introduction to the field. The whole post gives a useful outline of how to get started with web analytics, so check it out. Here is an overview of the ten steps that you should follow according to him:

Step 1: Visit the website. Note objectives, customer experience, suckiness.

Step 2: How good is the acquisition strategy? Traffic Sources Report.

Step 3: How strongly do Visitors orbit the website? Visitor Loyalty & Recency.

Step 4: What can I find that is broken and quickly fixable? Top Landing Pages.

Step 5: What content makes us most money? $Index Value Metric.

Step 6: How Sophisticated Is Their Search Strategy? Keyword Tag Clouds.

Step 7: Are they making money or making noise? Goals & Goal Values.

Step 8: Can the Marketing Budget be optimized? Campaign Conversions/Outcomes.

Step 9: Are we helping the already convinced buyers? Funnel Visualization.

Step 10: What are the unknown unknowns I am blind to? Analytics Intelligence.

Click here for the full article.

KDD 2010 – Washington

October 17, 2010

It has been a while since the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining ended, but many keynote presentations, workshops, tutorials and case studies from the conference are available. In fact, they are so interesting that I do not know where to start watching. Some of the topics are “Computational Social Science”, “Geometric Tools for Graph Mining of Large Social and Information Networks” and “Discovery of Significant Emerging Trends”.

In numbers:

22 Research Sessions

2 Data Mining Case Studies Sessions

4 Industry / Government Sessions

10 Tutorials

Have fun watching, I definitely will 🙂

Predictive Analytics

September 26, 2010

In the white paper “Seven Reasons You Need Predictive Analytics Today” the author Eric Siegel (President, Prediction Impact, Inc. and Chair, Predictive Analytics World) states:

Predictive analytics has come of age as a core enterprise practice necessary to sustain competitive advantage. This technology enacts a wholly new phase of enterprise evolution by applying organizational learning, which empowers the business to grow by deploying a unique form of data-driven risk management across multiple fronts. This white paper reveals seven strategic objectives that can be attained to their full potential only by employing predictive analytics, namely Compete, Grow, Enforce, Improve, Satisfy, Learn, and Act.

1. Compete – Secure the Most Powerful and Unique Competitive Stronghold

A predictive model distinguishes the microsegments of customers who choose your company from those who defer or defect to a competitor. In this way, your organization identifies exactly where your competitor falls short, its weaknesses.

2. Grow – Increase Sales and Retain Customers Competitively

Each customer is predictively scored for sales-related behavior such as purchases, responses, churn and clicks. The scores then drive enterprise operations across marketing, sales, customer care and website behavior. In this way, predictive analytics delivers its unique competitive advantage to a range of customer-facing activity.

3. Enforce – Maintain Business Integrity by Managing Fraud

Scoring and ranking transactions with a predictive model leverages the organization’s recorded experience with fraud to dramatically boost fraud detection. […] more fraud is detected, and more losses are prevented or recouped.

4. Improve – Advance Your Core Business Capacity Competitively

Predictive analytics improves product manufacturing, testing and repair in many ways. For example, during production, faulty items are detected on the assembly line.

5. Satisfy – Meet Today’s Escalating Consumer Expectations

Predictive analytics is an explicit selling point to the end consumer, […] with predictive analytics, the consumer gets better stuff for less, more easily and more reliably.

6. Learn – Employ Today’s Most Advanced Analytics

The capacity for predictive analytics to learn from experience is what renders this technology predictive, distinguishing it from other business intelligence and analytics techniques.

7. Act – Render Business Intelligence and Analytics Truly Actionable

[…] predictive analytics is specifically designed to generate conclusive action imperatives. Each customer’s predictive score drives action to be taken with that customer. In this way, predictive analytics is by design the most actionable form of business intelligence.

To read the full white paper click here.
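
To make the idea of "each customer is predictively scored" a bit more concrete, here is a minimal sketch in Python. It is not taken from the white paper: the feature names, the weights and the bias are all made up for illustration, and in a real project they would be learned from historical data. The sketch only shows the general pattern of turning customer attributes into a score between 0 and 1 and then ranking customers by that score.

```python
import math

# Hypothetical weights and bias (assumptions for illustration only);
# in practice a model would learn these from historical customer data.
WEIGHTS = {
    "months_inactive": 0.4,
    "support_tickets": 0.3,
    "purchases_last_year": -0.2,
}
BIAS = -1.0


def churn_score(customer):
    """Return a churn score in (0, 1) using a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_by_score(customers):
    """Sort customers from highest to lowest churn score."""
    return sorted(customers, key=churn_score, reverse=True)


# Made-up example customers.
customers = [
    {"id": "A", "months_inactive": 0, "support_tickets": 0, "purchases_last_year": 10},
    {"id": "B", "months_inactive": 6, "support_tickets": 3, "purchases_last_year": 1},
    {"id": "C", "months_inactive": 2, "support_tickets": 1, "purchases_last_year": 4},
]

ranked = rank_by_score(customers)
for customer in ranked:
    print(customer["id"], round(churn_score(customer), 3))
```

With these made-up numbers, customer B ends up at the top of the list, so a retention campaign would contact B first. The same score-and-rank pattern applies to the fraud example, with transactions in place of customers.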

Data Mine Games

September 20, 2010

In this tutorial Christian Thurau (a post-doctoral researcher at Fraunhofer IAIS in St. Augustin, Germany) talks about the theory behind data mining techniques such as matrix factorization, soft and hard clustering and principal component analysis, and explains their application to games. The video covers World of Warcraft as well as some shooter games.
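
For readers who have not met the terms before, the difference between hard and soft clustering can be sketched in a few lines of Python. This is my own illustration, not code from the tutorial: the two cluster centres, the player features and the `stiffness` parameter are all assumptions. Hard clustering assigns a point to exactly one cluster, while soft clustering gives it a membership weight for every cluster.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def hard_assign(point, centroids):
    """Hard clustering: index of the single nearest centroid."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def soft_assign(point, centroids, stiffness=0.5):
    """Soft clustering: one membership weight per centroid, summing to 1.

    Weights decay exponentially with squared distance; `stiffness` is a
    hypothetical parameter controlling how sharply they decay.
    """
    weights = [math.exp(-stiffness * sq_dist(point, c)) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up cluster centres, e.g. (hours played per day, fights per hour).
centroids = [(1.0, 1.0), (5.0, 5.0)]
player = (2.5, 3.0)

hard = hard_assign(player, centroids)
soft = soft_assign(player, centroids)
print("hard assignment:", hard)
print("soft memberships:", [round(w, 3) for w in soft])
```

Here the player is closer to the first centre, so the hard assignment picks cluster 0, while the soft memberships still leave a small share for cluster 1.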

Microsoft SQL Server

September 19, 2010

In this video Scott Golightly shows how to use the Microsoft SQL Server data mining wizard. The video is about 20 minutes long and quite easy to follow.
In the “How Do I?” library you can find many more video tutorials on different topics, e.g. “How Do I: Optimize SQL Server Integration Services?” or “How Do I: Render reports to a wide-range of formats?”