Merging bioedu and youthl

I am trying to merge the bioedu and youthl datasets using the pid variable, but for some reason many observations remain unmatched in both the master and the using dataset. It is unclear why some individuals in bioedu have no corresponding data in youthl, and vice versa. Do you have any idea why this might be the case?
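For readers who want to reproduce this kind of diagnosis, here is a minimal sketch in Python with pandas. It is not taken from the thread: the toy rows, and the columns `gebjahr` and `syear`, are assumptions for illustration only, and the real files would have to be loaded from the SOEP delivery. The `_merge` indicator column plays the same role as the `_merge` variable that Stata creates after a merge.

```python
import pandas as pd

# Toy stand-ins for the two files (hypothetical rows and columns, not real SOEP data).
bioedu = pd.DataFrame({"pid": [101, 102, 103, 104],
                       "gebjahr": [1991, 1995, 2001, 2004]})
youthl = pd.DataFrame({"pid": [102, 103, 103, 105],
                       "syear": [2012, 2013, 2018, 2019]})

# A long-format file may hold several rows per person, so check the key first.
dup_pids = int(youthl["pid"].duplicated().sum())

# Outer merge with an indicator column: left_only, right_only, or both.
merged = bioedu.merge(youthl, on="pid", how="outer", indicator=True)
status_counts = merged["_merge"].value_counts()

# Persons found only in bioedu, counted by (assumed) birth-year column.
unmatched_by_cohort = (
    merged.loc[merged["_merge"] == "left_only"]
    .groupby("gebjahr")["pid"]
    .nunique()
)

print(status_counts)
print(unmatched_by_cohort)
```

Tabulating the unmatched cases by birth year or survey year is usually the quickest way to see whether they cluster in particular cohorts or waves.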

Dear Victor, unfortunately, bioedu has not been updated for several years, so more recent information is not included. You can find information on education for young people in, e.g., the variables ylg0005_h or ylg0057_h.

Kind regards

Jana Nebelin (part of the SOEP Team)

Dear Jana, thank you for your answer. However, some questions remain.

The issue is not only with the years after 2015. For many birth cohorts from 1991 to 2015, matches between youthl and bioedu were also not found. Do you know why this might be the case? My task is to enrich the bioedu data with sociodemographic characteristics, as well as information on school type and grades, and I planned to do this using youthl. Do you know why there are so many unmatched observations even before 2015, and whether another dataset can be used for these purposes instead of youthl?

Hi Victor,

I suspect that bioedu was created from the $jugend datasets and the bioagel dataset up to 2019. In addition, we have further questionnaires for children (early adolescence and school-age children) that were stored in biopupil for a long time. These don’t seem to be integrated into bioedu. Youthl, by contrast, contains information from four questionnaire types: the Youth Instrument, the Early Adolescence Questionnaire, the School-Age Questionnaire, and the new integrated Youth Instrument. I suspect that’s why the merge isn’t working. But I’m not an expert on bioedu; unfortunately, the person who worked on it is no longer involved with the SOEP.

You can find core demographic information, e.g., about children, in ppathl or in kidlong.

Kind regards Jana
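As a rough illustration of the suggestion to take demographics from ppathl, the following pandas sketch attaches person-level characteristics to bioedu via pid. The toy rows and the columns `syear`, `sex`, and `gebjahr` are assumptions made for this example, as is the choice to keep each person's most recent row; the actual variable names and file structure should be checked in the SOEP documentation.

```python
import pandas as pd

# Toy stand-ins (hypothetical rows and columns, not real SOEP data).
bioedu = pd.DataFrame({"pid": [101, 102, 103]})
ppathl = pd.DataFrame({
    "pid": [101, 101, 102, 104],
    "syear": [2010, 2011, 2011, 2011],
    "sex": [1, 1, 2, 2],
    "gebjahr": [1991, 1991, 1995, 2000],
})

# Reduce the person-year table to one row per person (here: the latest year).
person = (
    ppathl.sort_values(["pid", "syear"])
    .drop_duplicates("pid", keep="last")[["pid", "sex", "gebjahr"]]
)

# Left merge keeps every bioedu row; validate raises an error if pid is
# not unique in the person table.
enriched = bioedu.merge(person, on="pid", how="left", validate="m:1")

# Count bioedu rows that found no demographic information.
n_without_demographics = int(enriched["sex"].isna().sum())
print(enriched)
print(n_without_demographics)
```

A left merge is used here so that no bioedu observation is dropped; persons without a match simply carry missing values in the added columns.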