ivregress estimates a regression with instrumental variables. It is applicable when you suspect that one of the independent variables is endogenous, for example because it is closely correlated with another variable that also affects the outcome. The endogenous variable and its instruments are given inside the parenthetical expression, in the form (endogenous = instruments). In the example below, wealth_high is the instrumented variable and age is the instrument. You can list as many instruments as you like: if you think that place of residence (an indicator variable oslo for living in Oslo) also affects wealth, use the parenthetical expression (wealth_high = age oslo). In principle, ivregress also treats all the other independent variables as instruments, that is, every independent variable except the instrumented one.
require no.ssb.fdb:23 as db
create-dataset ivtest

import db/INNTEKT_WLONN 2021-12-31 as wage
import db/BEFOLKNING_FOEDSELS_AAR_MND as birth_year_month
generate age = 2020 - int(birth_year_month / 100)
drop if age < 18 | age > 60

import db/BEFOLKNING_KJOENN as gender
generate male = 0
replace male = 1 if gender == '1'

import db/INNTEKT_BRUTTOFORM 2020-12-31 as wealth
generate wealth_high = 0
replace wealth_high = 1 if wealth > 1500000

//First run a regular linear regression
regress wage age male wealth_high

//Suspect a correlation between age and wealth. Use a model with an instrumented variable (wealth_high) and age as the instrument
ivregress wage male (wealth_high = age)

//In addition to comparing the two results, check for multicollinearity and normally distributed residuals
correlate wealth_high age
regress-predict wage age male wealth_high, residuals(res1)
ivregress-predict wage male (wealth_high = age), residuals(res2)
histogram res1
histogram res2
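To make the role of the instrument more concrete, here is a minimal sketch of two-stage least squares in Python with NumPy, run on synthetic data. It assumes that ivregress behaves like a standard two-stage least squares estimator; the source does not state which estimator is used. The function names, the simulated variables and the coefficient values are all hypothetical, and wealth is simulated as a continuous variable rather than the binary wealth_high to keep the illustration short.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X @ b."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, exog, endog, instruments):
    """Hypothetical helper: 2SLS with a constant term.

    Returns coefficients ordered as [constant, exog..., endog].
    """
    const = np.ones((len(y), 1))
    # First stage: regress the endogenous variable on the instruments
    # AND all other independent variables.
    Z = np.hstack([const, exog, instruments])
    endog_hat = Z @ ols(Z, endog)
    # Second stage: replace the endogenous variable by its fitted values.
    X_hat = np.hstack([const, exog, endog_hat])
    return ols(X_hat, y)


# Synthetic data (assumed values, not taken from any register).
rng = np.random.default_rng(0)
n = 20_000
age = rng.uniform(18, 60, size=(n, 1))
male = rng.integers(0, 2, size=(n, 1)).astype(float)
unobserved = rng.normal(size=(n, 1))  # affects both wealth and wage

wealth = 0.05 * age + unobserved + rng.normal(size=(n, 1))
wage = (1.0 + 2.0 * male + 3.0 * wealth + 4.0 * unobserved
        + rng.normal(size=(n, 1))).ravel()

# Ordinary regression: the wealth coefficient absorbs the unobserved factor.
X = np.hstack([np.ones((n, 1)), male, wealth])
beta_ols = ols(X, wage)

# Instrumental-variable regression, analogous to: wage male (wealth = age)
beta_iv = two_stage_least_squares(wage, exog=male, endog=wealth,
                                  instruments=age)

print("OLS wealth coefficient: ", beta_ols[2])
print("2SLS wealth coefficient:", beta_iv[2])
```

In this simulation the true wealth coefficient is 3.0. The ordinary regression overstates it because wealth is correlated with the unobserved factor, while the two-stage estimate should land close to the true value as long as age affects wage only through wealth.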