SC3 Math competences SEM

Dear all,

I have some questions regarding the raw competence data in NEPS SC3. The file I am working with is Stata\SC3_xTargetCompetencies_D_10-0-0.dta.

As a first step, I would like to fit a simple SEM growth model in Stata to see how competences change from grade 5 to grade 7 to grade 9. My code looks like this:

clear all
version 16.1
use "Stata\SC3_xTargetCompetencies_D_10-0-0.dta"
keep ID_t mag*

global math5 mag5d041_c mag5q291_c mag5q292_c mag5v271_c mag5r171_c ///
	mag5q231_c mag5q301_c mag5q221_c mag5d051_c mag5d052_c mag5q14s_c ///
	mag5q121_c mag5r101_c mag5r201_c mag5q131_c mag5d02s_c mag5d023_c mag5v024_c ///
	mag5r251_c mag5v01s_c mag5v321_c mag5v071_c mag5r191_c mag5v091_c
global math7 mag9q071_sc3g7_c mag7v071_c mag7r081_c mag7q051_c mag5q301_sc3g7_c ///
	mag9d151_sc3g7_c mag5d051_sc3g7_c mag5d052_sc3g7_c mag9v011_sc3g7_c ///
	mag9v012_sc3g7_c mag7q041_c mag7d042_c mag7r091_c mag9q181_sc3g7_c mag7d011_c ///
	mag7v012_c mag7v031_c mag5r251_sc3g7_c mag7d061_c mag5v321_sc3g7_c ///
	mag9v091_sc3g7_c mag5r191_sc3g7_c mag7r02s_c

global math9 mag9d151_sc3g9_c mag9d201_sc3g9_c mag9d05s_c mag9d061_c mag9d111_c ///
	mag9d09s_c mag9d131_c mag9q021_sc3g9_c mag9q071_sc3g9_c mag9q081_sc3g9_c ///
	mag9q101_sc3g9_c mag9q181_sc3g9_c mag9q211_sc3g9_c mag9q121_c mag9q151_c ///
	mag9q191_c mag9q021_c mag9q041_c mag9q011_c mag9q031_c mag9r051_sc3g9_c ///
	mag9r061_sc3g9_c mag9r111_sc3g9_c mag9r191_sc3g9_c mag9r261_sc3g9_c ///
	mag9r10s_c mag9r14s_c mag9v011_sc3g9_c mag9v012_sc3g9_c mag9v091_sc3g9_c ///
	mag9v121_sc3g9_c mag9v131_sc3g9_c mag9v13s_sc3g9_c mag9v081_c
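* Descriptive statistics for the items of each grade
sum $math5
sum $math7
sum $math9

* Number of missing items per pupil and grade
* (fre is a user-written command: ssc install fre)
egen nmissmath5 = rowmiss($math5)
fre nmissmath5

egen nmissmath7 = rowmiss($math7)
fre nmissmath7

egen nmissmath9 = rowmiss($math9)
fre nmissmath9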

* Latent model: one factor per grade, each regressed on the preceding grade
gsem (Math5 -> $math5, ologit) (Math7 -> $math7, ologit) (Math9 -> $math9, ologit) ///
	(Math5 -> Math7) (Math7 -> Math9) ///
	, nocapslatent latent(Math5 Math7 Math9)

My first question is whether I have included all raw items. Judging from the variable names, I assume the lists are complete, but some names are rather confusing, such as mag9q181_sc3g7_c. What I took from the documentation is that this item was administered to pupils in grade 7 and thus contributes to the Math7 construct.
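One way to cross-check the hand-typed lists is to let Stata expand the variable names by pattern and compare the result with the global. The following is only a sketch: it assumes that every grade 7 item is named either mag7*_c or mag*_sc3g7_c, which matches the list above but is not confirmed by the documentation, and the local macro names are made up for illustration.

```stata
* Sketch: compare the typed grade 7 item list with a pattern-based list.
* Assumption: grade 7 items are named either mag7*_c or mag*_sc3g7_c.
unab auto7 : mag7*_c mag*_sc3g7_c
local auto7 : list uniq auto7

* Copy the global into a local so the macro list functions can be used
local typed7 $math7

* Variables matched by the patterns but not typed, and the reverse
local missed7 : list auto7 - typed7
local extra7  : list typed7 - auto7

display "Matched by the patterns but not typed: `missed7'"
display "Typed but not matched by the patterns: `extra7'"
```

If both lines print nothing after the colon, the typed list and the pattern-based list agree. The same check could be repeated for grades 5 and 9 with the corresponding patterns.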

My next question concerns the missing values. The output for nmissmath9 shows that every pupil has at least 11 missing values, so no pupil worked on all 34 items. How does this affect my estimation, and how should I handle it? My hunch is that I could set a cutoff and restrict the sample to pupils who answered at least 20 of the 34 items. Is there any guideline for doing so?

Dear Felix Bittmann,

the psychometric properties of the mathematical tests administered in Starting Cohort 3 are summarized in several technical reports:

These documents also describe the measurement models that guided the test development. Simple raw scores (i.e., the sum of correctly solved items) cannot be used to compare mathematical competencies across grades because they do not account for missing values and, more importantly, are not linked (i.e., they are not on the same scale because different items were administered in each grade). Therefore, the NEPS provides properly scaled and linked proficiency scores (i.e., weighted likelihood estimates) in the scientific use files that can be used for longitudinal comparisons (e.g., variables mag5_sc1u, mag7_sc1u, and mag9_sc1u). You do not have to calculate the scores yourself.
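As a rough illustration of how the linked scores could replace the raw items in the growth model asked about above, here is a minimal sketch of a linear latent growth model in Stata. It rests on several assumptions that should be checked against the data: the WLE variables are stored in the same file as the raw items, missing values are coded as negative numbers of -20 or below, and growth is linear across the three equally spaced grades. The latent variable names Intercept and Slope are chosen for illustration only.

```stata
* Sketch only: linear latent growth model on the linked WLE scores.
* Assumption: the WLE variables are stored in the same file as the raw items.
use "Stata\SC3_xTargetCompetencies_D_10-0-0.dta", clear
keep ID_t mag5_sc1u mag7_sc1u mag9_sc1u

* Assumption: missing values are stored as negative codes of -20 or below.
* WLEs can legitimately be negative, so not all negative values are recoded.
foreach v of varlist mag5_sc1u mag7_sc1u mag9_sc1u {
	replace `v' = . if `v' <= -20
}

* Intercept loadings are fixed to 1 and slope loadings to 0, 1, 2
* (grades 5, 7, 9 are equally spaced). The observed intercepts are fixed
* to 0 so that the latent means carry the average level and average change.
* method(mlmv) keeps pupils with missing scores in some waves.
sem (mag5_sc1u <- Intercept@1 Slope@0 _cons@0) ///
	(mag7_sc1u <- Intercept@1 Slope@1 _cons@0) ///
	(mag9_sc1u <- Intercept@1 Slope@2 _cons@0), ///
	latent(Intercept Slope) means(Intercept Slope) method(mlmv)
```

Note that this treats the WLEs as if they were measured without error; it does not replace an analysis that accounts for measurement error.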

Details on how the competences are linked in the NEPS are given in the following survey paper:

If you want to acknowledge measurement error in the estimated competencies, we recommend estimating plausible values using our R package NEPSscaling:
More information on plausible values estimation is also given in the respective survey paper:

Best regards,
Timo Gnambs

Thank you Timo, that was very helpful! We will go with plausible values and use the R script to generate them.