Hi Marco!
Since I am only interested in individuals who drop out of a Bachelor's degree, I will only keep those who declared in the first wave that they intended to get a Bachelor's degree (either in teaching or not). To do so, I use the variable "tg02001_ha".
These intentions were collected during recruitment, and some of the students changed their minds. If you keep those who intended to do a Bachelor's degree, you will see that some of them do not stick to this intention.
Afterwards, I used the variable "tx24101" from the "StudyStates.dta" file to identify individuals who have dropped out of their degree (tx24101==3). This variable seems to capture the same information as the CAWI variable "tg51000" for waves 2 and 4 and the CAWI variable "tg51004" for waves 6, 8, 11 and 14.
yes, tx24101 combines data of the pTargetCAWI-variables tg51000 and tg51004
Would that be the correct way to capture those who never return to university?
no, all information in StudyStates and in the pTarget* datasets follows a wave-wise logic. Judging by the set of variables you picked, you only want to use data that comes from pTargetCAWI and discard the data that comes from spVocTrain. In that case you could use pTargetCAWI instead of StudyStates. But you kept that in mind later on…
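A minimal sketch of what looking at pTargetCAWI directly could start with, just to see in which waves the two variables are filled. The relative path is an assumption (same folder layout as the other files in this post), and I make no assumption about the value codes:

```stata
* Sketch only: the path is assumed, adjust it to your own data folder
use ID_t wave tg51000 tg51004 using "SC5_D_19-0-0\Stata14\SC5_pTargetCAWI_D_19-0-0.dta", clear

* one row per person and wave: check which variable carries information in which wave
tabulate wave tg51000, missing
tabulate wave tg51004, missing
```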
I just made a quick overview of how many students aborted and how many were successful:
use ID_t wave tg02001_ha using "SC5_D_19-0-0\Stata14\SC5_pTargetCATI_D_19-0-0.dta", clear
keep if inlist(tg02001_ha,1,8)
isid ID_t
keep ID_t
tempfile w1ba
save "`w1ba'", replace
use "C:\Users\bainb201\Desktop\Data\SC5_D_19-0-0\Stata14\SC5_StudyStates_D_19-0-0.dta", clear
label drop `: value label wave'
lab lang en
keep if tx24000 == 1
keep ID_t wave tx24001 tx24100 tx24101 tx15318 tx15317 tx15316
merge m:1 ID_t using "`w1ba'", keep(matched) nogenerate
//merge 1:1 ID_t wave using "C:\Users\bainb201\Desktop\Data\SC5_D_19-0-0\Stata14\SC5_pTargetCAWI_D_19-0-0.dta", keep(master matched) keepusing(tg51000 tg51004)
generate abort = (tx24101== 3)
bysort ID_t: egen abort_any = max(abort) // any abort at all?
generate success_cawi = (tx24101== 4)
bysort ID_t: egen success_cawi_any = max(success_cawi) // any success at all?
// bysort ID_t (tx24001):   (by-prefix without a command, commented out so the do-file runs)
/*
tab abort_any success_cawi_any, mis

           |   success_cawi_any
 abort_any |         0          1 |     Total
-----------+----------------------+----------
         0 |    98,151     36,615 |   134,766
         1 |     6,451        430 |     6,881
-----------+----------------------+----------
     Total |   104,602     37,045 |   141,647

. distinct ID_t if abort_any == 1 & success_cawi_any == 1

       |        Observations
       |      total   distinct
-------+----------------------
  ID_t |        430         25

>> success and abort for 25 people
*/
bysort ID_t (tx24001) : egen success_cati_any = max(tx15318)
/*
. distinct ID_t if abort_any == 1 & success_cati_any == 1

       |        Observations
       |      total   distinct
-------+----------------------
  ID_t |       1570         92
*/
So there are contradictions within CATI and between CATI and CAWI, but surprisingly few. Do you want to try to harmonize the CATI and CAWI information?
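If you do, a first step could be to flag the contradictory cases and look at their histories before deciding on a rule. This is only a rough sketch: it assumes the variables generated above are still in memory, and the names conflict and person are made up for illustration.

```stata
* Sketch only: assumes abort_any, success_cawi_any and success_cati_any
* from the commands above are still in memory.

* person-level flag: an abort is reported, but also a success in CAWI or CATI
generate byte conflict = abort_any == 1 & (success_cawi_any == 1 | success_cati_any == 1)

* tag one row per person so that people are counted, not person-wave rows
egen byte person = tag(ID_t)
tabulate conflict if person

* inspect the contradictory histories wave by wave
sort ID_t wave tx24001
list ID_t wave tx24001 tx24101 tx15318 if conflict, sepby(ID_t)
```

Whether you then trust the CATI or the CAWI report in these cases is a substantive decision; the listing only shows where the two disagree.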
I have seen this thread: SC5: Studienabbruch - #2 von dietmar.angerer where Dietmar provides a great solution to this issue. Is there any additional (more straightforward) way to obtain this? I might be missing a new variable in any of these datasets.
sorry but there are no straightforward solutions
Alternatively, if I want to use a more general definition of dropout that also covers individuals who have changed major or institution, I plan to use the following variables from spVocTrain: tg24159 (filtering afterwards by tg24162) and tg24121. I guess that for the latter I should control for whether they have finished their previous studies (so that a change of institution due to the start of a master's program, for example, is not counted). This could be done using tx15318, right?
no, tg24121 just tells you whether a person changed university, no matter whether they continued the same Bachelor's program or started something completely new. tx15318 is more or less ts15218 from spVocTrain.
I hope this helps you a bit
Dietmar