

Retail Analytics Challenges

Crowdsourcing solutions for complex business problems has always interested me, and I have been following sites such as Microsoft Imagine Cup, TopCoder, CodeChef, and Kaggle to discover challenges relevant to the retail industry. So far, these are the challenges I have found interesting:

dunnhumby Shopper Challenge  Going grocery shopping: we all have to do it, and some even enjoy it, but can you predict it? dunnhumby is looking to build a model to better predict when supermarket shoppers will next visit the store and how much they will spend.
  The results of the challenge were impressive: 537 players participated in 287 teams, and the winning entry was 208% more accurate than the existing benchmark. The winner, D'yakonov Alexander, a 32-year-old associate professor of mathematics at Moscow State University, used a method that gave more weight to recent visits when predicting the next one. In the member forum, he has posted his MATLAB code as well as a description of his methodology.
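
To make the idea of weighting recent visits more concrete, here is a minimal sketch in Python. It is not the winner's MATLAB code or his actual method; it only illustrates one simple way recent behaviour can dominate a prediction. The function name `predict_next_visit`, the `decay` parameter, and the sample dates are all assumptions made up for this illustration.

```python
from datetime import date, timedelta


def predict_next_visit(visit_dates, decay=0.5):
    """Predict the next visit date from a recency-weighted mean gap.

    `decay` is a hypothetical tuning parameter: each step further back in
    time multiplies the weight of a gap by this factor, so the most recent
    gaps between visits count the most.
    """
    if len(visit_dates) < 2:
        raise ValueError("need at least two visits to measure a gap")
    visits = sorted(visit_dates)
    # Days between consecutive visits, oldest gap first.
    gaps = [(later - earlier).days for earlier, later in zip(visits, visits[1:])]
    # The most recent gap gets weight 1, the one before it `decay`, and so on.
    weights = [decay ** (len(gaps) - 1 - i) for i in range(len(gaps))]
    weighted_gap = sum(w * g for w, g in zip(weights, gaps)) / sum(weights)
    return visits[-1] + timedelta(days=round(weighted_gap))


# Made-up shopper history: one two-week gap followed by two weekly visits.
visits = [date(2011, 1, 3), date(2011, 1, 17), date(2011, 1, 24), date(2011, 1, 31)]
print(predict_next_visit(visits))
```

With `decay=1.0` every gap counts equally and the sketch reduces to a plain average; lowering `decay` makes the forecast follow the shopper's latest rhythm more closely.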
   
The Netflix Prize  To help customers find movies they love, Netflix developed a movie recommendation system: Cinematch. Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. Netflix uses those predictions to make personal movie recommendations based on each customer’s unique tastes. The goal of the Netflix Prize was to improve the recommendations from Cinematch.
  The results of the challenge were impressive as well: the contest spanned 3 years and drew 50,051 contestants, but in the end it all came down to two teams submitting the same score 10 minutes apart in the final 20 minutes of the contest. Here is a document with details of the methodology by Yehuda Koren. The Pragmatic Theory Solution is at LINK. One of the key data analysis tools that the BellKor team used to win the Netflix Prize was the Singular Value Decomposition (SVD) algorithm. In this video, Bryan Lewis shows how to use the sparse Matrix object in R to store the data efficiently (about 99 million actual movie ratings) and how to use the irlba package (which features a fast and efficient SVD algorithm for big data) to perform SVD analysis on the Netflix data in R.
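
For readers who want a feel for SVD on sparse ratings without R, here is a small Python sketch. It is an assumption-laden illustration, not the BellKor approach and not what irlba does internally: it uses plain power iteration to approximate only the largest singular value and its vectors, and it stores the ratings in a dictionary so that only observed entries take up memory. The function name `top_singular_triplet` and the toy ratings are invented for this example.

```python
import math


def top_singular_triplet(ratings, n_rows, n_cols, iterations=100):
    """Approximate the largest singular value and vectors of a sparse matrix.

    `ratings` maps (row, col) -> value; entries that are not stored are
    treated as zero, so each pass only touches observed ratings.
    """
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # arbitrary unit start vector
    u = [0.0] * n_rows
    sigma = 0.0
    for _ in range(iterations):
        # u = A v, normalised to unit length
        u = [0.0] * n_rows
        for (i, j), r in ratings.items():
            u[i] += r * v[j]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            break
        u = [x / norm_u for x in u]
        # w = A^T u; its length is the current singular value estimate
        w = [0.0] * n_cols
        for (i, j), r in ratings.items():
            w[j] += r * u[i]
        sigma = math.sqrt(sum(x * x for x in w))
        if sigma == 0.0:
            break
        v = [x / sigma for x in w]
    return sigma, u, v


# Toy data: rows are users, columns are movies, values are made-up ratings.
toy_ratings = {(0, 0): 2.0, (0, 1): 4.0, (1, 0): 1.0, (1, 1): 2.0}
sigma, user_factor, movie_factor = top_singular_triplet(toy_ratings, n_rows=2, n_cols=2)
print(round(sigma, 6), user_factor, movie_factor)
```

On data the size of the Netflix set, a dedicated sparse SVD routine such as the irlba package mentioned above is the practical choice; this sketch only shows why sparse storage helps, since each iteration costs time proportional to the number of stored ratings rather than users times movies.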