The example below shows how to create aggregate data at the municipal level (contextual data) for use in a simple regression analysis.
You start by creating a regular person level data set with selected information, including a variable to be used as the merge key for the aggregated data, in this case the variable
municipality (actual municipality of residence). Next, you need to create two new datasets that are used to retrieve resp. wage/wealth and unemployment/resident figures at municipal level.
The method for aggregating is to use the
collapse() command in combination with the option
by() to specify the level of aggregation (
municipality). Then the
merge command is used to connect the aggregated data into the person level data set via the specified merge key (
NB! Remember to rename the aggregate variables before merging them into the person level dataset.
//Connect to database require no.ssb.fdb:15 as db //Create person level dataset create-dataset persons import db/BEFOLKNING_KOMMNR_FAKTISK 2019-01-01 as municipality import db/BEFOLKNING_KJOENN as gender import db/INNTEKT_WLONN 2019-01-01 as wage import db/INNTEKT_BER_BRFORM 2019-01-01 as wealth summarize wage wealth histogram wage, freq histogram wealth, freq //Create municipality level dataset containing mean wage and wealth, and merge into person level dataset create-dataset municipality_wage_wealth import db/BEFOLKNING_KOMMNR_FAKTISK 2019-01-01 as municipality import db/INNTEKT_WLONN 2019-01-01 as wage import db/INNTEKT_BER_BRFORM 2019-01-01 as wealth collapse (mean) wage wealth, by(municipality) rename wage meanwage_mun rename wealth meanwealth_mun summarize meanwage_mun meanwealth_mun merge meanwage_mun meanwealth_mun into persons on municipality //Create municipality level dataset containing number of unemployed and residents, and merge into person level dataset create-dataset municipality_unempl_residents import db/BEFOLKNING_KOMMNR_FAKTISK 2019-01-01 as municipality import db/ARBSOEK2001FDT_HOVED 2019-01-01 as ws_status generate num_unempl = 1 if ws_status == '1' generate num_residents = 1 collapse (sum) num_unempl num_residents, by(municipality) summarize num_unempl num_residents merge num_unempl num_residents into persons on municipality //Finally, use person level dataset to perform a simple regression analysis use persons generate unempl_pst = (num_unempl / num_residents) * 100 generate male = 0 replace male = 1 if gender == '1' generate oslo = 0 replace oslo = 1 if municipality == '0301' generate large_mun = 0 replace large_mun = 1 if num_residents > 80000 generate rich_mun = 0 replace rich_mun = 1 if meanwealth_mun > 2000000 regress wage male oslo wealth meanwealth_mun meanwage_mun unempl_pst num_residents regress wage male oslo wealth rich_mun meanwage_mun unempl_pst large_mun