Merging several files in SC4

Dear FDZ team, I want to use variables from different files in SC4, as follows:

As my main file, I use CohortProfile. To it I want to add:

  • Competencies from xTargetCompetencies,
  • Personality and social competency variables from the pTarget file,
  • Final grade of students from the spSchoolExtExam file.

In the end I need a wide format (only one line per ID).
My own attempt was:
Step 1. Reshaping CohortProfile from panel format to wide format.
Step 2. Merging it 1:1 with xTargetCompetencies.
Step 3. Keeping only the ID and the final grade in the spSchoolExtExam file (one final grade per ID), so that this file is also in wide format.
Step 4. Merging it 1:1 with the file from Step 2.
Step 5. Merging the result with the pTarget file, which is in panel format.
I got stuck at Step 4: the final grade variable was not merged, and I could not find it in the resulting file.
Long story short, do you have a practical and efficient way to combine variables from these files?
I have the Excel file with the merging guidance, but I need to merge more than two files.
Thank you in advance for your support.

Dear Mahdi,

I would strongly recommend doing it a little differently.
If you keep the original data structures as long as possible and only convert the data to wide format at the very end, you can follow the recommendations of the merging matrix for longer.
I would do the following:
(1) Take CohortProfile as a starting point
(2) Merge the variables of interest from pTarget via "merge 1:1 ID_t wave".
(3) Keep only the final grades from spSchoolExtExam (one line per ID_t) and merge the final grade to your main dataset via "merge m:1 ID_t". You then have the final grade on all data lines; if necessary, you can delete it from the lines where you do not need it. Since this is a time-constant variable, I don't see any problem.
(4) Now you have two options. Either restrict your dataset to the variables of interest, convert it to wide, and then merge the variables from xTargetCompetencies. Or merge the data from xTargetCompetencies to ALL data lines via "merge m:1 ID_t" and convert the data afterwards. In principle, both work.
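
Steps (1) to (4) could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the four small datasets are made-up stand-ins for the real SC4 files, and the variable names grade, math_score and openness are hypothetical. With the real data you would load the files with "use" instead of typing them in with "input".

```stata
* Toy stand-ins for the SC4 files (all values and the variable names
* grade, math_score and openness are made up for illustration)
clear all

* spSchoolExtExam, already reduced to one final grade per ID_t
input ID_t grade
1 2.3
2 1.7
end
tempfile exam
save `exam'

* xTargetCompetencies: one line per ID_t
clear
input ID_t math_score
1 0.45
2 -0.12
end
tempfile comp
save `comp'

* pTarget: one line per ID_t and wave
clear
input ID_t wave openness
1 1 3
1 2 .
2 1 4
2 2 .
end
tempfile ptarget
save `ptarget'

* (1) CohortProfile as the starting point: one line per ID_t and wave
clear
input ID_t wave
1 1
1 2
2 1
2 2
end

* (2) panel file to panel file: match on both keys
merge 1:1 ID_t wave using `ptarget', nogenerate

* (3) the time-constant final grade is copied to every wave line
merge m:1 ID_t using `exam', nogenerate

* (4) same for the competencies, then convert to wide at the very end
merge m:1 ID_t using `comp', nogenerate
reshape wide openness, i(ID_t) j(wave)
list
```

In the real CohortProfile there are further variables that change across waves. Before the reshape, these either have to be listed in "reshape wide" as well or be dropped, otherwise Stata reports that they are not constant within ID_t.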

Good luck with your work!
Best regards,

Thank you, Benno, for your helpful feedback. I did as you said, but I ran into two issues:

  • In step (3), there is more than one final grade per ID_t! However, I just took the one from the latest wave and it was okay in the end.
  • As you recommended, I merged all files with my variables of interest, and as the last step I tried to reshape the data from long (panel) to wide. Here (i) is ID_t and (j) is wave, so the pTarget variables in the reshape are the Xij variables, which change over waves. We have 11 waves, so the reshape creates 11 new variables out of each variable! I know, for example, that persistence only exists in the 7th wave, and that openness was measured in the first and fifth waves. Is it possible to reshape based on the data that are actually available in each wave for a variable? For instance, for openness I would like to have just two new variables after the reshape, for the first and the fifth wave (not 11 new variables).
    I would really appreciate any recommendation you may have.
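
For the first point, picking the final grade from the latest wave could look like the sketch below. It uses made-up toy data, and the variable name grade is hypothetical; it also assumes that missing grades are stored as Stata missing values rather than as negative missing codes.

```stata
* Toy data: ID_t 1 has two final-grade entries (values are made up,
* the variable name grade is hypothetical)
clear
input ID_t wave grade
1 5 2.7
1 7 2.3
2 7 1.7
end

* ignore entries without a valid grade
drop if missing(grade)

* within each ID_t, sort by wave and keep only the last entry
bysort ID_t (wave): keep if _n == _N

* stop with an error if ID_t is still not unique
isid ID_t
list
```

After this step the file has one line per ID_t, so it can be merged with "merge m:1 ID_t" as described in step (3).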


Dear Mahdi,

As far as I know, the reshape command always generates as many variables as there are values of the variable you use as (j). But if you already know that you will only work with the information from certain waves, simply drop the variables you do not need and everything is fine.
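
Dropping the unneeded variables does not have to be done by hand. A minimal sketch, again with made-up toy data and assuming that unmeasured waves show up as Stata missing values (negative missing codes would have to be recoded first), is to loop over the reshaped variables and drop those without any valid value:

```stata
* Toy data in long format: openness measured in waves 1 and 5,
* persistence only in wave 7 (all values are made up)
clear
input ID_t wave openness persistence
1 1 3 .
1 5 4 .
1 7 . 2
2 1 5 .
2 5 3 .
2 7 . 4
end

reshape wide openness persistence, i(ID_t) j(wave)

* drop every reshaped variable that has no valid value at all
foreach v of varlist openness* persistence* {
    quietly count if !missing(`v')
    if r(N) == 0 drop `v'
}
describe
```

In this toy example only openness1, openness5 and persistence7 remain next to ID_t.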

Best regards,