Collect information on ongoing education for a given date

Data on ongoing education (studies) are stored with course as the unit level: each record is identified by a unique course identification number. A course is defined by the combination person x course type, so an individual can be registered with more than one course type at the same time.

Because data on ongoing education do not have person as the unit level, they cannot be imported directly into an individual level dataset. Instead, they must be brought in with the command merge.

First, import a variable that links course IDs to the corresponding person IDs into the course dataset (ongoing education). Next, aggregate the course data to individual level with the command collapse, so that there is one record per person. Finally, merge the aggregated data into the main individual level dataset.

In the example below, the main dataset is an individual level dataset of persons resident in Norway (regstatus == '1') as of 2021-01-01. Ongoing education events as of 2021-11-01 are then collected into a separate dataset. The command collapse (count) counts the number of observations/events for ongoing education per individual on that date, and the result is finally merged into the main individual level dataset for further analysis.

Note: After the collapse transformation, the values of the variable coursetype are replaced by numerical values for the statistical measure used, in this case count (number of observations/events).

//Connect to datastore
require no.ssb.fdb:23 as db

//Create individual level dataset containing residents in Norway as of 2021-01-01
create-dataset persondata
import db/BEFOLKNING_KJOENN as gender
import db/BEFOLKNING_STATUSKODE 2021-01-01 as regstatus
keep if regstatus == '1'

//Retrieve people who are studying as of 1st November 2021 and connect this to the individual level dataset.
//Since course data can have several observations per individual, the collapse command must be used to aggregate up to person level.
//We use count as the aggregation measure (number of records)
create-dataset coursedata
import db/NUDB_KURS_NUS 2021-11-01 as coursetype
import db/NUDB_KURS_FNR as idnr
collapse (count) coursetype, by(idnr)
rename coursetype courses
merge courses into persondata

//Tabulate student status as of 1st November 2021 by gender
use persondata
generate student = 0
replace student = 1 if courses >= 1
tabulate student gender
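
The aggregation and merge logic of the script can also be illustrated on toy data outside microdata.no. The sketch below is plain Python, not microdata.no code; all identifiers and values (course_rows, persons, the IDs and codes) are hypothetical and made up for illustration. It assumes that persons without any matching course record get no value for courses after the merge, and that course records for persons outside the main dataset are dropped.

```python
from collections import Counter

# Hypothetical course level data: one row per course (course_id, person_id, course_type).
course_rows = [
    ("c1", "p1", "601"),
    ("c2", "p1", "701"),  # p1 is registered with two course types at once
    ("c3", "p2", "601"),
    ("c4", "p4", "601"),  # p4 is not in the main (resident) dataset
]

# Hypothetical individual level dataset: person_id -> gender code.
persons = {"p1": "1", "p2": "2", "p3": "2"}

# Counterpart of "collapse (count) coursetype, by(idnr)":
# the course type values are replaced by the number of course rows per person.
courses_per_person = Counter(person_id for _, person_id, _ in course_rows)

# Counterpart of "merge courses into persondata":
# only persons in the main dataset are kept; those without courses get None (assumption).
merged = {
    pid: {"gender": gender, "courses": courses_per_person.get(pid)}
    for pid, gender in persons.items()
}

# Counterpart of "generate student = 0" followed by "replace student = 1 if courses >= 1".
for row in merged.values():
    has_course = row["courses"] is not None and row["courses"] >= 1
    row["student"] = 1 if has_course else 0

for pid, row in merged.items():
    print(pid, row)
```

In this toy data, the person with two simultaneous courses ends up with a single row and a count of 2, which is why the aggregation step is needed before the merge.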