Missing data

I have three questions about missing data in NEPS.
I am using SC3, but I assume these questions apply to all cohorts.

  1. What kind of missing value is a "blank entry"? Missing by design?
  2. The "nepsmiss" command in Stata recodes all negative values (e.g., -97) into missing values with letters (e.g., .a). Where can I find information about what each letter means?
  3. I am a bit confused by the different ways missing values are coded. For instance, some are coded as numeric values (e.g., -54). Others are coded as numeric values plus text (e.g., "-54 (Missing by design)"). Still others are coded as text only ("Missing by design"). Do they all mean the same thing? Is there any documentation about these coding variations?

Thank you very much for your help!


Dear Ai,

Unfortunately, the SC3 data manual is not yet up to date. I would therefore recommend that you take a look at the SC6 manual, which you can find at https://www.neps-data.de/Portals/0/NEPS/Datenzentrum/Forschungsdaten/SC6/13-0-0/SC6_13-0-0_DataManual.pdf
It has a whole chapter on the conventions for missing values, starting on page 41.

When you recode the missing values with nepsmiss, the negative numeric codes are converted into so-called "Stata missings" (the extended missing values .a to .z) so that Stata handles them correctly, for example by excluding them when calculating means. The letters themselves have no meaning; they only serve as an internal distinction for Stata.
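To illustrate why this conversion matters, here is a minimal sketch that uses only Stata's built-in mvdecode command as a stand-in. It is not the nepsmiss implementation; the variable name "score", the toy values, and the assignment of codes to letters are all assumptions made for the example.

```stata
* Toy data; "score" is a hypothetical variable, not a NEPS variable
clear
input score
3
5
-97
4
-54
end

* With the negative codes still in place, they enter the mean as real values
summarize score

* Built-in stand-in for the recoding step: map each negative code to its own
* extended missing value (which letter goes with which code is arbitrary here)
mvdecode score, mv(-97=.a \ -54=.b)

* Extended missing values are excluded from calculations ...
summarize score

* ... but the two kinds of missing remain distinguishable from each other
tabulate score, missing
```

Comparing the two summarize outputs shows the effect: before recoding, -97 and -54 are treated as valid observations; afterwards only the three valid values are used.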

Numeric variables can store additional information in value labels, so their missing values appear only as numeric codes such as -54. String variables cannot carry value labels, which is why the missing value there is written out as "-54 (missing by design)". All of these mean the same thing.

I hope this helps.

Best regards,
Benno Schönberger

Dear Benno,

Thank you so much for your immediate and detailed response.
It was very helpful!

Best regards,