BIG DATA TALKS & LECTURE SERIES - FALL 2014
MIT CSAIL, Stata Center, Bldg 32
Talks will feature distinguished speakers from academia, industry, and government: leading researchers from every subfield of computer science concerned with data, data processing, and analytics, as well as representatives of industry and government organizations that are consumers of Big Data.
Prior speakers: Erik Brynjolfsson (MIT Sloan School); Ion Stoica (UC Berkeley); Alyosha Efros (CMU); Murli Buluswar (AIG); Chris Olston (Google); Jeff Dean and Sanjay Ghemawat (Google).
Estimating Class Counts in Unlabeled Data
Wednesday, November 5, 2014 - 3:00pm to 4:00pm
Jerome H. Friedman, Professor of Statistics, Stanford University
This talk studies the problem of estimating the fraction of positive examples in a future unlabeled data set using a regression model trained on past labeled data. It is shown that if the fraction of positives in the training data differs substantially from that in the future sample, the straightforward estimate can be highly biased.
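The bias described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the method presented in the talk: it assumes the "straightforward estimate" means averaging the model's predicted probabilities over the future sample, it assumes two one-dimensional Gaussian classes, and it stands in for the trained regression model with the exact posterior under the training class fraction. All names and parameter values (posterior, sample, naive_fraction_estimate, the 0.5 and 0.1 fractions) are hypothetical.

```python
import math
import random


def posterior(x, prior, mu0=0.0, mu1=1.0, sigma=1.0):
    """P(y=1 | x) for two equal-variance Gaussian classes under a given class prior.

    Assumption: this stands in for an ideal model fitted to training data
    whose fraction of positives equals `prior`.
    """
    like1 = math.exp(-0.5 * ((x - mu1) / sigma) ** 2)
    like0 = math.exp(-0.5 * ((x - mu0) / sigma) ** 2)
    return prior * like1 / (prior * like1 + (1.0 - prior) * like0)


def sample(n, frac_pos, rng, mu0=0.0, mu1=1.0, sigma=1.0):
    """Draw n unlabeled feature values with the given true fraction of positives."""
    xs = []
    for _ in range(n):
        mu = mu1 if rng.random() < frac_pos else mu0
        xs.append(rng.gauss(mu, sigma))
    return xs


def naive_fraction_estimate(xs, train_prior):
    """Assumed 'straightforward' estimate: mean predicted probability over the sample."""
    return sum(posterior(x, train_prior) for x in xs) / len(xs)


rng = random.Random(0)
train_prior = 0.5            # hypothetical fraction of positives in the training data
true_future_fraction = 0.1   # hypothetical fraction of positives in the future sample

future = sample(20000, true_future_fraction, rng)
estimate = naive_fraction_estimate(future, train_prior)

print(f"true future fraction: {true_future_fraction:.3f}")
print(f"naive estimate:       {estimate:.3f}")
```

In this setup the naive estimate lands much closer to the training fraction than to the true future fraction, because the model's predicted probabilities carry the training class fraction with them. When the future sample has the same fraction of positives as the training data, the same estimate is approximately unbiased.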