I am currently conducting an analysis of the returns to adult learning with the NEPS Starting Cohort 6 dataset.
For the analysis, I need the education level that individuals had the first
time they participated in the NEPS (e.g. wave 1). However, the education
variable that I currently have in the Basics dataset is, for example,
tx28103 (Recent ISCED), which as far as I can tell shows only the most
recent education level. This does not allow me to identify whether the
education level has changed in the interim years.
Is there another variable I could use? Or would I need to check the value
of the education variable in a previous dataset version (e.g. 1.0.0)?
The Basics dataset’s tx28103 variable always contains the last known level of education.
There is no ready-to-use variable with the level of education at the time of each
participant’s first interview (bear in mind that not all participants joined Starting
Cohort 6 in its first wave, namely the ALWA study conducted at the Institute for Employment
Research, IAB).
However, the Basics variable is derived from a richer dataset, Education, which contains
the complete educational history of each participant. I think the way you want to go is:
(1) generate the interview date of each participant’s first participation from CohortProfile
(2) generate the exact date of each educational attainment in Education
(3) merge both
(4) keep only educational attainments achieved by the time of the first participation,
    and select the chronologically last one per person
In Stata, this could be done like so:
/*
this Stata code uses the neps: prefix command;
to install it, run the following Stata command:
. net install nepstools , from(http://nocrypt.neps-data.de/stata)
*/
/* (0) set up */
neps set study SC6 /* NEPS Starting Cohort 6 */
neps set version 11.0.0 /* version 11.0.0 */
neps set level D /* confidentiality level: Download */
/* (1) generate the interview date for each participant's first participation from
CohortProfile */
neps : use CohortProfile , clear
sort ID_t wave
keep if tx80220==1 /* keep only participants */
generate intdate=ym(inty,intm) /* generate interview date */
format intdate %tm
keep ID_t intdate
by ID_t : keep if _n==1 /* keep chronologically first observation per person */
tempfile intdates
save `"`intdates'"' /* temporarily save */
/* (2) generate the date of each educational attainment in Education */
neps : use Education , clear
generate att_date=ym(datey,datem) /* generate date of attainment */
format att_date %tm
/* (3) merge both */
merge m:1 ID_t using `"`intdates'"' , assert(match using) keep(match) nogenerate
/* (4) select only educational attainments that have been achieved before the first
participation;
and select the chronologically last per person */
keep if att_date<=intdate /* attainments dated up to and including the month of the first interview */
sort ID_t att_date
by ID_t : keep if _n==_N /* this keeps the _last_ known status up to the first interview date */
You can merge the resulting dataset with any other dataset by the key variable ID_t; the
ISCED and CASMIN variables in this cross-sectional file contain each person’s educational
status at the time of their first interview.
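To make the last step concrete, here is a minimal sketch of how the result could be merged back onto Basics and compared with the most recent ISCED. It continues from step (4) above and rests on assumptions that are not confirmed in this thread: that the ISCED variable in Education is also named tx28103, and that Basics holds one row per ID_t. The names isced_first, isced_changed and firstedu are made up for illustration.

```stata
/* continues from step (4): one row per person is in memory */

/* assumption: the ISCED variable in Education is also named tx28103;
   check with -describe- and adjust the name if it differs */
rename tx28103 isced_first /* hypothetical name; avoids a clash with Basics */
keep ID_t isced_first att_date intdate
isid ID_t /* confirm that ID_t uniquely identifies the rows */
tempfile firstedu
save `"`firstedu'"'

/* assumption: Basics contains exactly one row per ID_t */
neps : use Basics , clear
merge 1:1 ID_t using `"`firstedu'"' , keep(master match) nogenerate

/* persons without any attainment dated up to their first interview
   have a missing isced_first after the merge */
generate byte isced_changed = (tx28103 != isced_first) if !missing(isced_first, tx28103)
tabulate isced_changed , missing
```

The rename matters because merge keeps the master's values when a non-key variable exists in both files, so an unrenamed tx28103 from the using file would be silently ignored. Negative NEPS missing codes are treated as ordinary values in the comparison unless they are recoded first.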
Hi @skeyteam ,
I am still a beginner with NEPS, but maybe this helps by providing some ideas. For neat solutions, please stick to the experts’ posts.
I recently identified the latest ISCED in the Education file, and the procedure here should be similar. I would proceed like this in R:
# assumes intdate (year of each person's first interview) has already been merged into education by ID_t
education <- education[which(education$datey < education$intdate), ]  # after importing the data frame, keep only attainments dated before the first interview
fshrinkedu <- ave(education$datey, education$ID_t, FUN = function(x) seq_along(x) == which.max(x)) == 1  # per ID_t, flag the row with the latest datey; the latest attainment before the interview is the level held at that time
educationfirst <- education[fshrinkedu, ]  # keep only the flagged rows
# then merge educationfirst to your master file