**Instructor:** Yiming Yang

**TA:** Hanxiao Liu, Jiachen Li

**Time and Location:** TR, 12:00 - 1:20pm, HH B131

**Prerequisites:**

· CS courses on data structures and algorithms, strong programming skills, linear algebra, and introductory probability

· Intro Machine Learning is not required but helpful

**Syllabus and Detailed Course Materials** [Here]: You need a CMU account to access them, either on campus or via VPN.

This is a full-semester, lecture-oriented course (12 units) for PhD, MS, and undergraduate students who meet the prerequisites. It offers a blend of core theory, algorithms, evaluation methodologies, and applications of scalable data analytic techniques. Specifically, it focuses on the following topics:

· Clustering

· Link analysis

· Collaborative filtering (recommender systems)

· Matrix factorization

· Social media analysis

· Web-scale text classification

· Learning to rank for document retrieval

· Statistical significance tests

Note that 11-741 and 11-641 are 12-unit courses for graduate students, while 11-441 is a 9-unit course for undergraduate students. Although the lectures and exams are the same in all three courses, the required workload differs: the coursework in 11-441 is a subset of that in 11-641, which in turn is a subset of that in 11-741. See the detailed distinctions in the Grading section. 11-741 is among the required courses for PhD candidates in the Language Technologies Institute, while 11-641 counts only as a master's-level course. Graduate students can choose either 11-741 or 11-641, depending on their career goals and backgrounds. Undergraduate students should take 11-441; exceptions are possible if approved by the instructor.

· You're a CS student interested in machine learning techniques for large-scale text mining

· You like AI, machine learning, and/or theoretical CS, and want to apply them to hard real-world problems

· You're a non-CS student who can program well, has mathematical ability, and is interested in machine learning and its applications to text and social media

· You're a language technology minor (this course is an elective option)

· You're interested in broad applications of machine learning such as web-scale classification, structure discovery from massive data, learning to recommend, social-community discovery, sentiment analysis, trend detection over time sequences, etc.

· You’re curious about statistical significance tests for machine learning and have the background.