MIT Big Data Initiative at CSAIL
Member Workshop
"Big Data Analytics: Challenges in Big Data for Data Mining, Machine Learning and Statistics"

Extracting useful information from very large data sets is challenging. In this workshop, we will focus on the challenges of applying machine learning, data mining, and statistics to massive-scale data sets.

Date: MARCH 26, 2014
Location:  MIT CSAIL, Stata Center, Bldg 32, Student Street 1st floor, 32-155
Cambridge, MA 02139



8:30AM Registration Check-in / Coffee and Continental Breakfast

9:00AM Introduction to Workshop Objectives - Elizabeth Bruce and Cynthia Rudin, MIT                                  

9:15-10:30AM  SESSION 1.  Moderator: Cynthia Rudin, MIT                               

"Making Information Intelligent: Opportunities to Unlock New Value"
Dr. Khalid Al-Kofahi, VP of Corporate R&D, Thomson Reuters

"Data Mining MOOC Forums"
Dr. Una-May O'Reilly, MIT CSAIL, ALFA Group

"Computational Advertising with Big Data"
Deepak Agarwal, Director of Engineering, LinkedIn                            

10:30-10:50AM   BREAK

10:50-12:00PM   SESSION 2.  Moderator: Elizabeth Bruce, MIT

"The Unreasonable Power of Clinical Data"
Prof. Peter Szolovits, MIT CSAIL, Clinical Decision Making Group

"Visualizing and Understanding Deep Convolutional Networks for Object Recognition"
Prof. Rob Fergus, New York University and Research Scientist at Facebook AI Lab

LIGHTNING ROUND 1. [short talks (2-5 min each) giving input on challenges/approaches from workshop participants]

1. Erez Shmueli (MIT Media Lab)
2. Evelyne Viegas (Microsoft Research)
3. Kalyan Veeramachaneni (ALFA Group, MIT CSAIL)

12:00-1:00PM  LUNCH (outside 32-155 on the Stata Center Student Street)

1:00-2:35PM SESSION 3.  Moderator: Sam Madden, MIT

"Predictive Models from Massive Fine-Grained Data Work Great, But Do You Understand What They're Doing?"
Prof. Foster Provost, New York University

"Learning to Interact"
Dr. John Langford, Director of Learning, Microsoft Research

"Scaling up Event and Pattern Detection to Big Data"
Prof. Daniel Bertrand Neill, Carnegie Mellon University

LIGHTNING ROUND 2. [short talks (2-5 min each) giving input on challenges/approaches from workshop participants]

1. Damien Bayart (BT)
2. Hyungdong Lee (Samsung)
3. Aditya Parameswaran (Univ. of Illinois)

2:35-3:00PM  BREAK

3:00-4:00PM  SESSION 4.  Moderator:  Elizabeth Bruce, MIT

"Querying the probable implications of your tabular data with BayesDB"
Dr. Vikash Mansinghka, MIT CSAIL, Probabilistic Computing Project

"Gaining Insight from Interpretable Predictive Modeling"
Prof. Cynthia Rudin, MIT Sloan School, CSAIL, Prediction Analysis Lab


4:00PM  END WORKSHOP                                 

Examples of topics for discussion include:

• Developing approximations, using sampling, sketching, and other techniques, for cases where algorithms don’t scale;
• Developing summarization and explanation techniques that show patterns in data in an intuitive and useful way; 
• Evaluating the quality of these methods on new data sets; and
• Integrating massive amounts of unstructured text from social media sites, scientific documents, industrial reports, websites, and other sources. 

Given the current speed at which these large data sets are generated, there is a pressing need for solutions to these challenges; we will discuss applications and related challenges in data mining and machine learning on Big Data and explore potential solutions.

This workshop is part of a series on major Big Data challenges organized by the MIT Big Data Initiative at CSAIL. These workshops bring together a select group of thought leaders from industry, academia, and government to focus on the future of Big Data.
Host:  MIT Big Data Initiative at CSAIL
Workshop Organizers:  Prof. Cynthia Rudin (MIT Sloan), Prof. Sam Madden (MIT CSAIL), Elizabeth Bruce (MIT CSAIL)
REGISTRATION:  [members please contact Susana Kevorkova for registration details]