Create couples datasets

Microdata.no offers a wide range of demographic variables that make it possible, among other things, to identify couples and to link information on both individuals in a relationship (married or partners) for use in various analyses.

The variable BEFOLKNING_REGSTAT_FAMNR contains the family identification number for each individual, and all members of a family share the same number. It can be used to create family information or to aggregate data to the family level. Because the family number is defined as the personal identification number of the oldest person in the family, it can also be combined with other family information to find persons living together as a couple, either married or partners:

  1. Start by creating a dataset consisting of the following variables:
    • Family number
    • Person code
    • Couple status (optional)
    • Various information, for instance occupation
  2. Keep only the oldest person in each family using the variable “person code” (person code = ‘1’).
  3. Create a new dataset with exactly the same variables, but this time keep only the youngest person in each family (person code = ‘2’). Children are not included, as they have a separate code (person code = ‘3’).
  4. Switch to the dataset containing the oldest persons, and merge its variables into the dataset of the youngest. For the youngest persons, the family number is the personal identification number of the oldest person in the family, so it can be used as the merge key. The result is a dataset covering all couples.
//Connect to database
require no.ssb.fdb:14 as ds

//Create a dataset of oldest persons in each family
create-dataset oldest
import ds/BEFOLKNING_REGSTAT_FAMNR 2021-01-01 as famnr
import ds/BEFOLKNING_REGSTAT_PERSONKODE 2021-01-01 as personcode_oldest
import ds/BEFOLKNING_PARSTATUS 2021-01-01 as couplestatus_oldest
import ds/REGSYS_ARB_YRKE_STYRK08 2020-11-16 as occ_oldest
keep if personcode_oldest == '1'

//Create a dataset of the youngest person in each family who is not a child
create-dataset youngest
import ds/BEFOLKNING_REGSTAT_FAMNR 2021-01-01 as famnr
import ds/BEFOLKNING_REGSTAT_PERSONKODE 2021-01-01 as personcode_youngest
import ds/BEFOLKNING_PARSTATUS 2021-01-01 as couplestatus_youngest
import ds/REGSYS_ARB_YRKE_STYRK08 2020-11-16 as occ_youngest
keep if personcode_youngest == '2'

//Use the oldest persons dataset and merge variables into the youngest persons dataset using the family number variable (family number = oldest persons id number)
use oldest
merge personcode_oldest couplestatus_oldest occ_oldest into youngest on famnr 

//The dataset youngest now contains data on both individuals in each couple, for every couple in the population. Run some control tabulations
use youngest
tabulate personcode_youngest, missing
tabulate personcode_oldest, missing
tabulate couplestatus_youngest, missing
tabulate couplestatus_oldest, missing
tabulate occ_oldest, missing
tabulate occ_youngest, missing

//Check whether both, one or neither of the individuals in each couple is currently working (has a registered occupation)
generate job_youngest = 1 if sysmiss(occ_youngest) == 0
generate job_oldest = 1 if sysmiss(occ_oldest) == 0
tabulate job_oldest job_youngest, missing
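
To see why the family number works as a merge key, the matching can be mimicked on a handful of made-up records outside microdata.no. The sketch below is written in Python purely as an illustration. The person ids, family numbers and occupation codes are invented, and the variable names are hypothetical. It only assumes what is stated above: the oldest person has person code '1' and a family number equal to their own id, the partner has person code '2', and children have person code '3'.

```python
# Toy illustration only: all ids, family numbers and occupation codes are invented.
# Each record: person id -> (family number, person code, occupation code or None)
persons = {
    "A": ("A", "1", "2310"),  # oldest in family A, so the family number equals own id
    "B": ("A", "2", None),    # partner in family A, no registered occupation
    "C": ("A", "3", None),    # child in family A, dropped by the filters below
    "D": ("D", "1", "5223"),  # oldest in family D
    "E": ("D", "2", "7115"),  # partner in family D
    "F": ("F", "1", None),    # person without a partner, forms no couple
}

# Steps 2 and 3: one dataset of oldest persons, one of youngest persons.
oldest = {pid: rec for pid, rec in persons.items() if rec[1] == "1"}
youngest = {pid: rec for pid, rec in persons.items() if rec[1] == "2"}

# Step 4: the youngest person's family number is the oldest person's id,
# so it can be used directly as the lookup key into the oldest dataset.
couples = []
for pid, (famnr, _, occ_youngest) in youngest.items():
    occ_oldest = oldest[famnr][2] if famnr in oldest else None
    # Count how many of the two have a registered occupation (0, 1 or 2).
    n_working = (occ_youngest is not None) + (occ_oldest is not None)
    couples.append((famnr, pid, n_working))

print(couples)
```

Each resulting tuple holds the family number, the id of the youngest person and the number of working individuals in the couple. Person F never appears, because no record with person code '2' points to that family number.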