To whom it may concern,
I am working on the variables t44630d, t44613a and t44630c in SC4. I noticed that all three variables are included in both pTarget and pTargetCATI, and that all of them were measured at wave 3 and wave 8.
I also noticed that participants' responses to each item differ considerably between pTarget and pTargetCATI. Taking t44630d at wave 3 as an example, the mean of t44630d in pTarget was 3.08, while the mean in pTargetCATI was 2.88, and the difference between the two means was statistically significant. I also found that, at wave 3, no participants appear in both pTarget and pTargetCATI. In other words, I guess that one group of participants' answers is documented in pTarget and another group's answers in pTargetCATI.
If so, how did you decide which group is documented in pTarget and which in pTargetCATI? Was the assignment random?
I apologize in advance if this is already covered in other NEPS documentation. I have some difficulty understanding the original German documentation for SC4. I translated all the German document titles and did not find anything relevant, but I might have missed it.
Hello Huizhang!
The questions behind these variables were administered in different modes because some people left the general school system after grade 9. As these school leavers are no longer within a class context, they were interviewed via CATI. The students who remained in school still received the paper-based questionnaires. The PAPI data end up in pTarget, whereas the CATI data end up in pTargetCATI.
In later waves, more and more people leave the general education system and end up in pTargetCATI.
I just took a quick look at the data and reshaped it to wide format, with separate variables for each wave and mode:
* PAPI data (pTarget): load the three items
use ID_t wave t44630d t44613a t44630c using "<PATH_TO_DATA>/SC4_pTarget_D_10-0-0.dta" , clear
* drop rows in which any of the three items is coded -54 (missing by design)
keep if !inlist(-54,t44630d,t44613a,t44630c)
bysort wave: sum t44630d t44613a t44630c
rename (t*) (t*_papi)
tempfile papi
save "`papi'", replace
* CATI data (pTargetCATI): same steps
use ID_t wave t44630d t44613a t44630c using "<PATH_TO_DATA>/SC4_pTargetCATI_D_10-0-0.dta" , clear
keep if !inlist(-54,t44630d,t44613a,t44630c)
bysort wave: sum t44630d t44613a t44630c
rename (t*) (t*_cati)
* combine both modes, then reshape to one row per person
merge 1:1 ID_t wave using "`papi'", nogen
reshape wide t44613a_cati t44613a_papi t44630c_cati t44630c_papi t44630d_cati t44630d_papi, i(ID_t) j(wave)
order _all, alphabetic
sort ID_t
* recode NEPS missing codes to Stata missing values
nepsmiss
sum t*, sep(0)
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
t44613a_ca~3 |      2,054    2.235638     1.000232         1          4
t44613a_ca~8 |      9,254    1.738059     .8078733         1          4
-------------+---------------------------------------------------------
t44613a_pa~3 |     11,465    1.890711     .9725857         1          4
t44613a_pa~8 |        591    1.516074     .7359961         1          4
-------------+---------------------------------------------------------
t44630c_ca~3 |      2,052    2.579922     .8898071         1          4
t44630c_ca~8 |      9,252    2.757782     .8695096         1          4
-------------+---------------------------------------------------------
t44630c_pa~3 |     11,236    3.083126     .8931505         1          4
t44630c_pa~8 |        590    2.876271     .9120657         1          4
-------------+---------------------------------------------------------
t44630d_ca~3 |      2,028    3.200197     .8371964         1          4
t44630d_ca~8 |      9,147    3.074451     .8129379         1          4
-------------+---------------------------------------------------------
t44630d_pa~3 |     10,173     3.27632     .8774867         1          4
t44630d_pa~8 |        560         3.2     .8725592         1          4
I can only assume that the different means by survey mode could reflect different socio-economic strata in connection with gender-role-specific stereotypes. Another reason for different means across waves could be an increasing ability to give socially desirable answers, or respondents becoming more mature and reflecting on things more thoroughly.
I could not find any odd recoding during the process of data preparation.
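If you want to check the non-overlap and the mean difference for a single wave yourself, something along these lines might work. This is only a rough sketch: it assumes the same file names as above, the variable mode is a hypothetical helper I create here (it is not a NEPS variable), and a t-test on a 4-point item is only a crude check. Keep in mind that any difference mixes the survey mode with the different composition of the two groups (school leavers vs. students still in school), so it should not be read as a pure mode effect.

```stata
* Sketch only: "mode" is a hypothetical helper variable, not part of the NEPS data.
* Wave 3 PAPI respondents (pTarget)
use ID_t wave t44630d using "<PATH_TO_DATA>/SC4_pTarget_D_10-0-0.dta", clear
keep if wave == 3 & t44630d != -54
generate byte mode = 0
tempfile papi3
save "`papi3'"

* Wave 3 CATI respondents (pTargetCATI), stacked below the PAPI rows
use ID_t wave t44630d using "<PATH_TO_DATA>/SC4_pTargetCATI_D_10-0-0.dta", clear
keep if wave == 3 & t44630d != -54
generate byte mode = 1
append using "`papi3'"

* If the two groups do not overlap, no ID_t should occur more than once
duplicates report ID_t

* Recode NEPS missing codes, then compare the two group means
nepsmiss
label define mode 0 "PAPI" 1 "CATI"
label values mode mode
ttest t44630d, by(mode) unequal
```

The unequal option is used because the two groups differ a lot in size and there is no reason to assume equal variances.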
I hope this helps you a little bit.
Kind regards
Dietmar