This repository contains the code used in the PhD thesis "Exploring Statistical Biases in Educational Data: A Simulation-Based Approach". It consists of the following four main parts:
- Custom implementation of the Additive Factors Model (AFM).
- Implementation of the simulation framework.
- Description of scenarios in JSON format.
- Analysis scripts that produce various figures.
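For orientation, the script descriptions further down refer to the α, β and γ parameters of AFM. The sketch below shows how these parameters combine in the standard AFM formulation, where the log-odds of a correct answer are the student proficiency α plus, for every knowledge component (KC) the item uses, the KC easiness β and the learning rate γ multiplied by the number of prior practice opportunities. It is a minimal illustration, not the implementation from this repository; the function name `afm_probability`, its arguments, and all numeric values are hypothetical.

```python
import math


def afm_probability(alpha, beta, gamma, q_row, opportunities):
    """Probability of a correct answer under the standard AFM.

    alpha: proficiency of the student (scalar)
    beta: easiness of each KC
    gamma: learning rate of each KC
    q_row: Q-matrix row of the item (1 if the item uses the KC, else 0)
    opportunities: prior practice opportunities of the student on each KC
    """
    logit = alpha + sum(
        q * (b + g * t)
        for q, b, g, t in zip(q_row, beta, gamma, opportunities)
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical values, chosen only for illustration.
beta = [0.0, -1.0]
gamma = [0.2, 0.1]
q_row = [1, 0]  # the item uses only the first KC

p_first = afm_probability(0.0, beta, gamma, q_row, [0, 0])
p_later = afm_probability(0.0, beta, gamma, q_row, [5, 0])
print(p_first, p_later)
```

With a positive γ, the predicted probability grows with practice, which is the pattern learning curves visualise.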
How to use
This repository is also a Python package managed by Poetry. The easiest way to run the code is to install Poetry, install this package with it, and then run one of the available scripts, which reproduce figures from the thesis.
Available scripts
Parameter stability
- Scripts `figure-3-2` and `figure-3-3` reproduce heatmaps from the section on the stability of estimated AFM parameters. Some of these heatmaps are shown in Figure 3.2 and Figure 3.3.
Effects of cheating
- Script `figure-4-1` reproduces the scatter plots of estimated β and γ parameters for an increasing number of cheating students, shown in Figure 4.1.
- Script `figure-4-2` reproduces the histograms of estimated student α parameters, shown in Figure 4.2.
- Script `figure-4-3` reproduces the learning curves shown in Figure 4.3.
Effects of item ordering
- Script `figure-5-3` reproduces the scatter plots of estimated β and γ parameters under fixed and random item ordering, shown in Figure 5.3.
- Script `figure-5-4` reproduces the bar plots of performance metrics comparing AFM with a correct and a misspecified Q-matrix, shown in Figure 5.4.
- Script `figure-5-5` reproduces the scatter plots showing detailed estimated values of β and γ parameters for AFM with a correct and a misspecified Q-matrix, shown in Figure 5.5.