Figure 2.
95% confidence interval (CI) for each of the performance metrics, sensitivity (A), false positive rate (B) and calibration error (C), as a function of the number of questions answered. The black vertical dashed lines mark the minimum number of questions required to bring the 95% CI below 0.1: 549 (A), 514 (B) and 250 (C).
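The relationship between sample size and CI shown in the figure can be illustrated with a minimal sketch. It assumes a Wald (normal-approximation) interval for a single proportion and reads "below 0.1" as the full interval width. Both are assumptions made here for illustration, not the method behind the figure. The names `wald_ci_width` and `min_n_for_width` are hypothetical helpers.

```python
import math

# Two-sided 95% quantile of the standard normal distribution.
Z_95 = 1.959963984540054


def wald_ci_width(p: float, n: int) -> float:
    """Full width of the Wald 95% CI for a proportion p estimated from n items.

    Assumption: normal approximation to the binomial; the figure may rely on
    a different interval (e.g. bootstrap or an exact method).
    """
    return 2.0 * Z_95 * math.sqrt(p * (1.0 - p) / n)


def min_n_for_width(p: float, target: float = 0.1, n_max: int = 100_000):
    """Smallest n whose CI width falls below `target`, or None if not reached."""
    for n in range(1, n_max + 1):
        if wald_ci_width(p, n) < target:
            return n
    return None


if __name__ == "__main__":
    # Worst case for a proportion is p = 0.5, where the variance is largest.
    print(min_n_for_width(0.5))
```

Under these assumptions the worst case p = 0.5 needs 385 items. This is not expected to reproduce the thresholds in the figure, which depend on the observed metric values, on how many questions contribute to each metric, and on the CI procedure actually used.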