Advanced Statistical Learning Seminar (11-745)

Fall 2015 Special Topics: Best Papers on Temporal Dynamics and Scalable Optimization Algorithms (6 Units)

Description: This seminar aims to deepen participants' understanding of the theoretical foundations of statistical learning and its applications, to broaden their knowledge of new techniques, directions and challenges in the field, and to inspire research ideas through classroom discussions.  In past years, this seminar was structured as a 12-unit or 6-unit course built around group reading, presenting and discussing, chapter by chapter, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (by Trevor Hastie et al.) and Foundations of Machine Learning (by Mehryar Mohri et al.).  In the fall of 2015, we will have a 6-unit seminar reading a set of selected best papers from NIPS, ICML, KDD, ACM and IJCAI.  We will meet once a week, covering one main paper (sub-topic) per week, with presentations rotating among participants. Each week, the assigned presenter starts with the questions about the current topic/sub-topic collected by email from all participants, then gives a presentation on that topic and leads the discussion. All students are required to read the selected paper for the week before class, while the presenter should additionally read 2 or 3 related papers (chosen by the presenter) as needed; students should email their questions to the presenter and CC everybody. The slides of each presentation should be shared by email with all class members on the same day as the presentation. By the end of the semester, each student will individually write a short 3-4-page white paper, either outlining a research proposal for new work extending one of the research areas covered in class, or analyzing more than one area with respect to open challenges, state-of-the-art solutions and future research opportunities. There will be no exams or homework.  Grading is based on class participation (30%), questions submitted for each class and classroom discussions (20%), quality of the seminar presentations (30%), and the final paper (20%).

Prerequisites: 10-701 (PhD-level Machine Learning) or an equivalent course is required.  Other relevant courses, such as 10-702 (Statistical Machine Learning), 10-705/36-705 (Intermediate Statistics) and 10-725 (Convex Optimization), are helpful but not required.  If you are not sure about the expectations, please discuss with the instructor.

Time & Location: GHC 6708 on Tuesdays from 9:00am to 10:20am (exception: GHC 6404 on 9/15)

Tentative Schedule (papers and assigned presenters):

9/1: Course Organization, Paper Assignments

Part I. Modeling Temporal Dynamics

9/8, presenter: Wanli (slides)
KDD 2002: Pattern discovery in sequences under a Markov assumption
Darya Chudova & Padhraic Smyth, University of California, Irvine

9/15, presenter: Fatima (slides)
ICML 2015 (no award): HawkesTopic: A Joint Model for Network Inference and Topic Modeling from Text-Based Cascades
Xinran He … Yan Liu, USC

9/22, presenter: Guoqing (slides)
NIPS 2013: Scalable Influence Estimation in Continuous-Time Diffusion Networks
Nan Du, Le Song, Manuel Gomez-Rodriguez & Hongyuan Zha

9/29, presenter: Mannal (slides)
KDD 2005: Graphs over time: densification laws, shrinking diameters and possible explanations
Jure Leskovec, Carnegie Mellon University

10/6, presenter: Andrew (slides)
KDD 2010: Connecting the dots between news articles
Dafna Shahaf & Carlos Guestrin, Carnegie Mellon University

Part II. Optimization Algorithms (6)

10/13, presenter: Ruochen (slides)
ICML 2004 (ICML 2014 10-year best paper award): Multiple kernel learning, conic duality, and the SMO algorithm
Francis Bach, Gert Lanckriet & Michael Jordan

10/20, presenter: Yiming (slides)
KDD 2006: Training linear SVMs in linear time
Thorsten Joachims

10/27, presenter: Yuantian (slides)
ICML 2008: SVM Optimization: Inverse Dependence on Training Set Size
Shai Shalev-Shwartz & Nathan Srebro, Toyota Technological Institute at Chicago

11/3, presenter: Hanxiao (slides)
ICML 2015 (no award): Mind the duality gap: safer rules for the Lasso
Olivier Fercoq, Alexandre Gramfort & Joseph Salmon
(Reference paper: Safe Screening for Multi-Task Feature Learning with Multiple Data Matrices, by Jie Wang & Jieping Ye, University of Michigan)

11/10, presenter: Adams (slides)
ICML 2015: A Nearly-Linear Time Framework for Graph-Structured Sparsity
Chinmay Hegde, Piotr Indyk & Ludwig Schmidt, MIT

11/17, presenter: Dean (slides)
ICML 2015: Optimal and Adaptive Algorithms for Online Boosting
Alina Beygelzimer, Satyen Kale & Haipeng Luo, Yahoo Labs and Princeton

11/24, presenter: Chenyuan (slides)
ICML 2014 (best application paper): Compositional Morphology for Word Representations and Language Modelling
Jan Botha & Phil Blunsom

12/1, presenter: Zhiting (slides)
IJCAI 2015: Bayesian Active Learning for Posterior Estimation
Kirthevasan Kandasamy, Jeff Schneider & Barnabás Póczos, CMU

12/8, presenter: Yuexin (slides)
ICML 2011 (no award): Hashing with graphs
Wei Liu, Jun Wang, Sanjiv Kumar & Shih-Fu Chang

12/15, presenter: Wei-Chang (Peter) (slides)
ICML 2008 (no award): Efficient projections onto the L1-ball for learning in high dimensions
John Duchi, Shai Shalev-Shwartz, Yoram Singer, et al.

12/15: White Paper Due