TEACHER Dataset Collection


This webpage contains a set of education datasets (course prerequisite networks) collected from MIT OpenCourseWare, Caltech, CMU and Princeton during the TEACHER project.

For detailed usage please refer to our papers:

This project was supported by NSF Grant 1350364.


University   # Courses   # Prerequisites   # Words   Download
MIT               2322              1173     15396   mit.tar.gz
Caltech           1048               761      5617   caltech.tar.gz
CMU                 83               150      1955   cmu.tar.gz
Princeton           56                90       454   princeton.tar.gz


The *.tar.gz file for each institution contains two files:
  1. Each row of *.lsvm is the bag-of-words feature for a course

    <course-id> <word>:<count> <word>:<count> ... <word>:<count>

  2. Each row of *.link is a pair of courses with a prerequisite relationship

    <prerequisite-course-id> <postrequisite-course-id>
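
The two row formats above could be read with a few lines of Python. This is only a minimal sketch, not part of the dataset distribution: the function names are hypothetical, and it assumes that fields are separated by whitespace, that course ids and words can be kept as plain strings, and that counts are integers.

```python
from collections import defaultdict


def parse_lsvm_line(line):
    """Parse one '<course-id> <word>:<count> ...' row into (course_id, {word: count})."""
    course_id, *pairs = line.split()
    features = {}
    for pair in pairs:
        # Split on the last colon so a word that itself contains ':' stays intact.
        word, count = pair.rsplit(":", 1)
        features[word] = int(count)  # assumption: counts are integers
    return course_id, features


def parse_link_line(line):
    """Parse one '<prerequisite-course-id> <postrequisite-course-id>' row."""
    prerequisite, postrequisite = line.split()
    return prerequisite, postrequisite


def load_lsvm(lines):
    """Map each course id to its bag-of-words dictionary, skipping blank lines."""
    return dict(parse_lsvm_line(line) for line in lines if line.strip())


def load_links(lines):
    """Map each course id to the list of its direct prerequisites."""
    prerequisites = defaultdict(list)
    for line in lines:
        if line.strip():
            pre, post = parse_link_line(line)
            prerequisites[post].append(pre)
    return dict(prerequisites)
```

Both loaders accept any iterable of lines, so an open file handle for an extracted *.lsvm or *.link file can be passed directly. Keying the link table by the postrequisite course is just one convenient choice; it makes looking up what a course requires a single dictionary access.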


Hanxiao Liu, Carnegie Mellon University.