**Instructor:** Yiming Yang

**TA:** Guoqing Zheng, Pengfei Wang

**Time and Location:** TR 12:00 - 1:20pm, GHC 4215

**Prerequisites:**

· CS courses on data structures and algorithms, strong programming capabilities, linear algebra and intro probability;

· Intro Machine Learning is not required but helpful

**Syllabus and Detailed Course Materials** [Here]: You need a CMU account to access them, either on campus or via VPN.

This is a full-semester, lecture-oriented course (12 units) for PhD-level and MS-level students, and for undergraduate students who meet the prerequisites. It offers a blend of core theory, algorithms, evaluation methodologies and applications of scalable data analytic techniques. Specifically, it focuses on the following topics:

· Clustering

· Link analysis

· Collaborative Filtering (Recommender systems)

· Matrix Factorization

· Social media analysis

· Web-scale text classification

· Learning to rank for document retrieval

· Statistical significance tests

Note that 11-741 and 11-641 are 12-unit courses for graduate students, while 11-441 is a 9-unit course for undergraduate students. Although the lectures and exams are the same in all three courses, the required workload differs by course: the required coursework in 11-441 is a subset of that in 11-641, which in turn is a subset of that in 11-741. See the detailed distinctions in the Grading section. 11-741 is among the required courses for PhD candidates in the Language Technologies Institute, while 11-641 counts only as a master's-level course. Graduate students can choose either 11-741 or 11-641, depending on their career goals and backgrounds. Undergraduate students should take 11-441; exceptions are possible if approved by the instructor.

· You're a CS student interested in machine learning techniques for large-scale text mining

· You like AI, machine learning, and/or theoretical CS, and want to apply them to hard real-world problems

· You're a non-CS student who can program well, has mathematical ability, and is interested in machine learning and its applications to text and social media

· You're a language technology minor (this course is an elective option)

· You're interested in broad applications of machine learning such as web-scale classification, structure discovery from massive data, learning to recommend, social-community discovery, sentiment analysis, trend detection over time sequences, etc.

· You're curious about statistical significance tests for machine learning and have the background for them.