Logistic regression analysis

The purpose of logistic regression analysis (logit/probit) is to estimate the effects of a set of explanatory variables on the probability of a certain outcome, represented by a dichotomous dependent variable (job/no job, measure/no measure, etc.). The output can be adjusted through options (exclude the constant term, adjust the significance level, etc.).

This example illustrates a basic logit analysis; the probit model can be used instead if preferred. Multinomial logistic analysis (more than two outcomes) can be performed with the mlogit command.

//Connect to datastore
require no.ssb.fdb:13 as db

//Start by importing relevant variables
create-dataset demography
import db/BEFOLKNING_KJOENN as gender
import db/BEFOLKNING_FOEDSELS_AAR_MND as birthdate
import db/BEFOLKNING_STATUSKODE 2018-01-01 as regstat
import db/SIVSTANDFDT_SIVSTAND 2018-01-01 as sivilstatus
import db/INNTEKT_BRUTTOFORM 2018-01-01 as wealth
import db/INNTEKT_WYRKINNT 2019-01-01 as workincome19

//Create population: persons with registration status '1', aged 16 to 66 in 2018
generate age = 2018 - int(birthdate / 100)
keep if regstat == '1' & age > 15 & age < 67

//Create a dichotomous dependent variable (dummy): high vs. low work income
generate income_high = 0
replace income_high = 1 if workincome19 > 800000

//Adapt the independent variables to suit the statistical model
generate male = 0
replace male = 1 if gender == '1'

generate married = 0
replace married = 1 if sivilstatus == '2'

generate wealth_high = 0
replace wealth_high = 1 if wealth > 1500000

//Perform logit analysis; the dependent variable must be listed first
logit income_high male married age wealth_high
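
//Alternative sketch: as noted in the introduction, a probit model can be used instead of logit
//Assumption: probit takes the same variable list as logit, with the dependent variable listed first
probit income_high male married age wealth_high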