• Download Course-Correlation-Matrix-v1.csv (4362 records, 296KB)
  • Data Format (6 fields): ID, Course 1, Course 2, Pearson Correlation, Pval, #Students
  • The important fields are Course 1, Course 2, and Pearson Correlation. Pval measures the statistical significance of the Pearson correlation, and #Students is the number of students common to both courses. The Pearson correlation is computed between the two courses' normalized grade vectors, where each position in a vector corresponds to a student.
  • To cite the dataset, use:
    Gary M. Weiss and Daniel D. Leeds (2021). Fordham University Course Correlation Matrix Data Set, Version 1 [data file],
  • Usage: This matrix was used in the paper “Mining Course Groupings using Academic Performance,” by Daniel D. Leeds, Tianyi Zhang, and Gary M. Weiss, in Proceedings of the 2021 Educational Data Mining Conference, Paris, France.
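
To illustrate what one row of the matrix represents, the sketch below computes a Pearson correlation over the students common to two courses and also returns the count of shared students. It is a minimal illustration, not the authors' code: the function name, the student IDs, and the grade values are hypothetical, and the normalization applied to the real grades is not described here. (Pearson correlation is unchanged by shifting or rescaling either vector, so a z-score style normalization would not alter the result.)

```python
import math


def pearson_on_shared_students(grades_a, grades_b):
    """Pearson correlation over the students present in both courses.

    grades_a, grades_b: dicts mapping a student ID to a numeric grade
    (hypothetical input format, chosen for illustration).
    Returns (r, n), where n is the number of shared students.
    """
    # Keep only students who took both courses, in a fixed order so that
    # position i in each vector refers to the same student.
    shared = sorted(set(grades_a) & set(grades_b))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared students")

    xs = [grades_a[s] for s in shared]
    ys = [grades_b[s] for s in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products of deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all grades are equal")

    return cov / math.sqrt(var_x * var_y), n


# Made-up grades on a 4.0 scale; only s2, s3, and s4 took both courses.
course_1 = {"s1": 3.0, "s2": 3.7, "s3": 2.3, "s4": 4.0}
course_2 = {"s2": 3.3, "s3": 2.0, "s4": 3.7, "s5": 3.0}
r, n = pearson_on_shared_students(course_1, course_2)
print(f"r = {r:.3f} over {n} shared students")
```

The returned pair corresponds to the Pearson Correlation and #Students fields; the Pval field would require a significance test, which this sketch omits.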