Panel data analysis is a more advanced form of linear regression analysis that takes variation over time in the included variables into account. It has many similarities with ordinary regression analysis (OLS). Among other things, the dependent variable (which is listed first in the regress-panel command) must be continuous/metric, e.g. income.
The main difference from ordinary regression analysis (cf. the command regress) is that the data must be organized in a panel format, where every variable is measured once at each of the measurement points specified through the command import-panel. A panel dataset then consists of T x N observations, where T is the number of measurement points and N is the number of units in the population.
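The T x N structure can be sketched outside the platform. The snippet below is a minimal Python illustration, not microdata script: the unit identifiers and the helper panel_size are invented for the example, the measurement dates are the ones used in the import-panel call of the example script, and it assumes a balanced panel in which every unit is observed at every measurement point.

```python
from itertools import product

# Measurement dates taken from the import-panel call in the example script.
measurement_dates = ["2016-01-01", "2017-01-01", "2018-01-01", "2019-01-01"]

# Hypothetical unit identifiers, invented purely for illustration.
unit_ids = ["A", "B", "C"]

# Long (panel) format: one row per combination of unit and measurement date.
panel_rows = list(product(unit_ids, measurement_dates))


def panel_size(n_units: int, n_points: int) -> int:
    """Number of observations in a balanced panel: T x N (assumed helper)."""
    return n_points * n_units


print(len(panel_rows))           # 12 rows: 3 units x 4 measurement dates
print(panel_size(5_000_000, 2))  # 10000000
```

The second call shows why full-population panels grow quickly: each additional measurement point adds another N rows.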
NB! Panel datasets can quickly become very large because each unit is measured more than once: if you analyse the entire population and use 2 measurement points, the dataset will typically consist of approx. 10 million observations (5 million x 2). So use populations that are as small as possible, preferably below 1 million units. Otherwise the system will be strained and executions will be very time-consuming.
//Connect to database
require no.ssb.fdb:23 as db

//First create a panel data population (should be as small as possible)
//Population: Persons who completed their master's studies in the autumn semester of 2015
create-dataset population
import db/NUDB_AAR_FORSTE_FULLF_HOV as compl_master
keep if compl_master > 201507 & compl_master < 201601

//Create a new and empty dataset that consists of the individuals from the dataset population
clone-units population paneldata

//Import a set of variables measured at given measurement dates into the empty dataset
use paneldata
import-panel db/INNTEKT_WLONN db/SIVSTANDFDT_SIVSTAND db/BEFOLKNING_KOMMNR_FAKTISK 2016-01-01 2017-01-01 2018-01-01 2019-01-01

//Recode and run descriptive statistics and regression analysis
rename INNTEKT_WLONN wage

generate married = 0
replace married = 1 if SIVSTANDFDT_SIVSTAND == '2'

generate oslo = 0
replace oslo = 1 if BEFOLKNING_KOMMNR_FAKTISK == '0301'

tabulate-panel married
tabulate-panel oslo
tabulate-panel married oslo
summarize-panel wage
transitions-panel oslo married

//Run panel data regression with fixed and random effects, respectively
regress-panel wage married oslo, fe
regress-panel wage married oslo, re

//Run Hausman test
hausman wage married oslo