Posts

Can Data Mining Algorithms Extract Value from your Personal Data (and should you get a piece of the action?)

Go to http://www.datamilk.com/survey/ to have your say.
Technology has made it easier than ever for people to collect and store a valuable trove of personal information about themselves. However, there is no readily available means by which individuals can reap a financial benefit by selling their personally generated data. Companies such as Facebook, LinkedIn and Twitter are multi-billion dollar companies built almost entirely on user-generated data, so it’s clear that, when used correctly, your personal data is extremely valuable.
There is a growing unease about the disparity between the value that companies realize from personal data and the financial rewards individuals glean from this information. Prof. Tim Wu from Columbia Law School recently argued that Facebook should pay us for our posts. Individually your data may not be worth very much, but collectively it is a goldmine. The problem is that there is currently no way for individuals to collect and monetize their data. It’s as i…

Beyond the Hype - Data Science in the Real World

I will be presenting this talk at Phil Brieley’s Melbourne Data Science Meetup on June 23rd. See http://www.meetup.com/Data-Science-Melbourne/events/184731452/ for details. Hope you can join.
Here is the R code for the competition entry mentioned in my previous post. See http://www.datamilk.com/leaderboard_animation.gif for the animation.

library(lubridate)
library(plyr)
library(sqldf)
library(ggplot2)
library(animation)

# clear everything
rm(list=ls(all=TRUE))

# Ingest data
data <- read.csv("unimelb_public_leaderboard.csv", header=TRUE)

# calculate days and date time as numeric
data <- data.frame(data,
                   SubmissionDate_datetime = strptime(data$SubmissionDate, format="%m/%d/%Y %H:%M:%S %p"),
                   Submission_day = round(strptime(data$SubmissionDate, format="%m/%d/%Y %H:%M:%S %p"), "day"),
                   Submission_time_num = as.numeric(strptime(data$SubmissionDate, format="%m/%d/%Y %H:%M:%S %p")))

start_time <- min(na.omit(data$SubmissionDate_datetime))
end_time <- max(na.omit(data$SubmissionDate_datetime))
start_day <- min(na.omit(data$Submission_day))
end_day <- max(na.omit(data$Submission_day))
duration <- round(end_time - start_time, 0)
…

Please support me on Kaggle

Hi All,
I have just entered a Kaggle competition. Please vote for my entry here: https://www.kaggle.com/c/leapfrogging-leaderboards/visualization/886
Cheers,
Ross Farrelly

Timeless Classics - the Antidote to Time Poverty

If, like many white collar workers in today’s modern economy, you are “time poor” and constantly swamped by the ever-growing torrent of information coming at you every day, despair not. Help is at hand. But it comes in a somewhat unlikely guise. It’s not yet another sophisticated news-aggregation text-mining algorithm, nor is it the next-gen web 3.0 nanoblogging, retweeting, Facebook-posting multifunction one-stop web-accumulation app for your smart phone. No. It is those leather-bound volumes gathering dust on your bookshelf and that set of Penguin Classics you bought in a fit of self-improvement last year and have never read.

Let me explain.

The idea of being “time poor” really comes down to balancing our desires. If we want to do more than we have time for, we say we are time poor. There are two possible solutions: either want to do less, or find a way to do more. Focusing on the latter solution, for many professionals, a closely related problem is that of deciding what information t…

Review of The Innovator’s Dilemma by Clayton M. Christensen

Review of Strategic Vision by Zbigniew Brzezinski