I have some questions regarding the raw competence data in NEPS SC3. The file I am working with is Stata\SC3_xTargetCompetencies_D_10-0-0.dta.
For starters, I would like to fit a simple SEM growth model in Stata to see how competencies change from grade 5 to 7 to 9. My code looks like this:
clear all
version 16.1
use "Stata\SC3_xTargetCompetencies_D_10-0-0.dta"
keep ID_t mag*
nepsmiss

global math5 mag5d041_c mag5q291_c mag5q292_c mag5v271_c mag5r171_c ///
    mag5q231_c mag5q301_c mag5q221_c mag5d051_c mag5d052_c mag5q14s_c ///
    mag5q121_c mag5r101_c mag5r201_c mag5q131_c mag5d02s_c mag5d023_c mag5v024_c ///
    mag5r251_c mag5v01s_c mag5v321_c mag5v071_c mag5r191_c mag5v091_c

global math7 mag9q071_sc3g7_c mag7v071_c mag7r081_c mag7q051_c mag5q301_sc3g7_c ///
    mag9d151_sc3g7_c mag5d051_sc3g7_c mag5d052_sc3g7_c mag9v011_sc3g7_c ///
    mag9v012_sc3g7_c mag7q041_c mag7d042_c mag7r091_c mag9q181_sc3g7_c mag7d011_c ///
    mag7v012_c mag7v031_c mag5r251_sc3g7_c mag7d061_c mag5v321_sc3g7_c ///
    mag9v091_sc3g7_c mag5r191_sc3g7_c mag7r02s_c

global math9 mag9d151_sc3g9_c mag9d201_sc3g9_c mag9d05s_c mag9d061_c mag9d111_c ///
    mag9d09s_c mag9d131_c mag9q021_sc3g9_c mag9q071_sc3g9_c mag9q081_sc3g9_c ///
    mag9q101_sc3g9_c mag9q181_sc3g9_c mag9q211_sc3g9_c mag9q121_c mag9q151_c ///
    mag9q191_c mag9q021_c mag9q041_c mag9q011_c mag9q031_c mag9r051_sc3g9_c ///
    mag9r061_sc3g9_c mag9r111_sc3g9_c mag9r191_sc3g9_c mag9r261_sc3g9_c ///
    mag9r10s_c mag9r14s_c mag9v011_sc3g9_c mag9v012_sc3g9_c mag9v091_sc3g9_c ///
    mag9v121_sc3g9_c mag9v131_sc3g9_c mag9v13s_sc3g9_c mag9v081_c

sum $math5
sum $math7
sum $math9

egen nmissmath5 = rowmiss($math5)
fre nmissmath5
egen nmissmath7 = rowmiss($math7)
fre nmissmath7
egen nmissmath9 = rowmiss($math9)
fre nmissmath9

*Latent model*
gsem (Math5 -> $math5, ologit) (Math7 -> $math7, ologit) (Math9 -> $math9, ologit) ///
    (Math5 -> Math7) (Math7 -> Math9) ///
    , nocapslatent latent(Math5 Math7 Math9)
My first question is whether I have included all raw items. Judging from the variable names I assume the lists are complete, but some names are rather confusing, for example mag9q181_sc3g7_c. My reading of the documentation is that this item, despite the "mag9" prefix, was administered to pupils in grade 7 and thus contributes to the Math7 construct.
My next question concerns the missing values. The output for nmissmath9 shows that every pupil has at least 11 missing values, so no pupil worked on all 34 items. How does this affect my estimation, and how should I handle it? My hunch is that I could set a cutoff and restrict the sample to pupils who answered at least 20 of the 34 items. Is there any guideline for doing so?
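To make the restriction I have in mind concrete, here is a minimal sketch. It assumes that nmissmath9 from the code above already exists; the cutoff of 20 answered items is only a placeholder, and the variable names answered9 and insample9 are made up for illustration.

```stata
* Sketch only: 20 is an arbitrary cutoff, not a recommended value.
* Number of grade 9 items in the list (34 in my case)
local nitems9 : word count $math9

* Items actually answered = total items minus missing items
generate byte answered9 = `nitems9' - nmissmath9

* Flag pupils who answered at least 20 items (i.e. at most 14 missing)
generate byte insample9 = answered9 >= 20
tabulate insample9
```

The flag could then be used as an if condition on the gsem call (if insample9 == 1) rather than dropping observations, so the full sample stays available for comparison.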