Course Sequence Analysis

The Fordham EDM Labs Course Sequence Analysis Tool (CSAT) runs our Python implementation of the Generalized Sequential Pattern (GSP) algorithm on student course data to identify course sequences that students commonly take. This is a form of association analysis, but unlike conventional association analysis, order matters: we mine sequences rather than itemsets.
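
To make the sequence-versus-itemset distinction concrete, here is a minimal sketch of how the support of an ordered course pattern could be counted. It is not the CSAT code or its API: the function names (contains, support), the course codes, and the three student histories are all hypothetical, and it covers only the support-counting step, not GSP's candidate generation or time constraints. Each student history is a list of semesters, and each semester is a set of courses.

```python
from typing import FrozenSet, Sequence

Semester = FrozenSet[str]  # courses taken together in one semester


def contains(history: Sequence[Semester], pattern: Sequence[Semester]) -> bool:
    """Return True if every pattern element is a subset of some semester,
    with the matched semesters appearing in the same order as the pattern."""
    pos = 0
    for wanted in pattern:
        # Advance to the earliest remaining semester that includes `wanted`.
        while pos < len(history) and not wanted <= history[pos]:
            pos += 1
        if pos == len(history):
            return False
        pos += 1  # the next pattern element must match a strictly later semester
    return True


def support(histories: Sequence[Sequence[Semester]],
            pattern: Sequence[Semester]) -> float:
    """Fraction of student histories that contain the pattern."""
    if not histories:
        return 0.0
    return sum(contains(h, pattern) for h in histories) / len(histories)


# Hypothetical data: three students, one list of semesters each.
students = [
    [frozenset({"CS1", "MATH1"}), frozenset({"CS2"})],
    [frozenset({"CS1"}), frozenset({"MATH1"}), frozenset({"CS2"})],
    [frozenset({"CS2"}), frozenset({"CS1"})],
]

cs1_then_cs2 = [frozenset({"CS1"}), frozenset({"CS2"})]
cs2_then_cs1 = [frozenset({"CS2"}), frozenset({"CS1"})]
cs1_with_math1 = [frozenset({"CS1", "MATH1"})]  # same-semester pattern

print(support(students, cs1_then_cs2))
print(support(students, cs2_then_cs1))
print(support(students, cs1_with_math1))
```

An itemset miner would treat "CS1 then CS2" and "CS2 then CS1" as the same pair of courses; here they are different patterns with different support. The last pattern shows how a single element holding several courses represents courses taken in the same semester.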
 
The code is open source and available on GitHub (link to be provided shortly). We will also post a manual for the tool, including a tutorial, shortly.
 
An earlier version of the tool, which could not learn patterns involving courses taken in the same semester, was used in the following published paper:

Daniel D. Leeds, Cody Chen, Yijun Zhao, Fiza Metla, James Guest, and Gary M. Weiss. Generalized Sequential Pattern Mining of Undergraduate Courses. Proceedings of the 15th International Conference on Educational Data Mining (EDM22), International Educational Data Mining Society, Durham, UK, July 24-27, 2022.