Panel data regression analysis

It is now possible to perform panel data regression analysis. Panel data are datasets in which each unit (individual) has values for all variables at all measurement dates specified in the import-panel expression. This makes it possible to include time as a component in the regression analysis, and the data basis becomes much larger than in an ordinary linear regression, which can improve the quality of the analysis. Panel data analysis also makes it possible to take into account how the included variables vary over time.

Initially, only linear panel data regression analysis is available, through the command regress-panel. The main difference from regular regression analyses (OLS) using the command regress is that the data need to be organized in a panel data format where each unit (individual) is measured several times, not only once. Thus, a panel dataset will consist of T x N observations, where T is the number of measurements and N is the number of units in the population. Panel data are constructed through the command import-panel.

Linear panel data regression requires that the dependent variable (listed first in the regress-panel expression) contains continuous/metrical values, e.g. income. Depending on the assumptions made about how the variables vary over time, the variants “fixed effect” or “random effect”, among others, may be used.

In addition to regression analyses, it is possible to explore panel datasets through various descriptive tools:
  • tabulate-panel corresponds to the command tabulate used for regular datasets, but instead shows values for all measurement dates. As with tabulate, percentage options can be used. If multiple variables are specified, multi-dimensional cross tables are displayed for the relevant variables
  • summarize-panel corresponds to the command summarize used for regular datasets, but instead shows values for all measurement dates. Values are displayed vertically rather than horizontally, and the mouse cursor needs to be held over the respective values to show their meaning
  • transitions-panel shows a two-way matrix containing frequencies/probabilities of transitions between all combinations of categorical values over time (transition probabilities), for a given variable. The leading column represents the base values, while the table header represents the transition values. If multiple variables are specified, two-way transition tables are displayed for each variable. Transitions are by default represented by frequencies and percentages (row percentage). Transitions either from or to missing values (sysmiss) are kept out of the tabulation
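
The transition table described in the last bullet can be illustrated outside the platform. The following Python sketch is not the implementation behind transitions-panel; it assumes that transitions are counted between consecutive measurement dates for each unit, and the data, the function names and the use of None to stand in for sysmiss are all hypothetical.

```python
from collections import Counter, defaultdict

def transition_table(panel, missing=None):
    """Count transitions between consecutive measurements for each unit.

    `panel` maps a unit id to its values ordered by measurement date.
    Pairs where either value equals `missing` are left out of the counts.
    """
    counts = defaultdict(Counter)
    for values in panel.values():
        for base, target in zip(values, values[1:]):
            if base == missing or target == missing:
                continue
            counts[base][target] += 1
    return counts

def row_percentages(counts):
    """Convert each row of frequencies into row percentages."""
    table = {}
    for base, row in counts.items():
        total = sum(row.values())
        table[base] = {target: 100.0 * n / total for target, n in row.items()}
    return table

# Hypothetical data: marital status for three units on four dates.
panel = {
    "u1": ["single", "single", "married", "married"],
    "u2": ["single", "married", "married", "divorced"],
    "u3": ["married", None, "married", "married"],
}

counts = transition_table(panel)
percentages = row_percentages(counts)

for base, row in percentages.items():
    for target, pct in row.items():
        print(f"{base} -> {target}: {counts[base][target]} ({pct:.1f} %)")
```

Each outer key plays the role of the leading column (base value) and each inner key the role of the table header (transition value). The two pairs involving the missing value for unit u3 do not appear in the counts.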
Logistic panel data analyses will also be introduced in the near future. These are regression analyses for panel data where the dependent variable measures two possible outcomes (“success” vs. “non-success”), i.e. a dummy variable, just like logit and probit analyses.
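
For the linear case, the T x N layout and the idea behind the “fixed effect” variant can be made concrete with a small Python sketch. It uses the standard textbook within transformation (subtracting each unit's own mean over time) and is only an assumption about what a fixed-effects estimate involves, not a description of how regress-panel is implemented. The data and all names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format panel: one row per unit and measurement date.
# Columns: unit id, measurement year, x (explanatory), y (dependent, continuous).
rows = [
    ("a", 2019, 1, 110.0), ("a", 2020, 2, 120.0), ("a", 2021, 3, 130.0),
    ("b", 2019, 2, 520.0), ("b", 2020, 3, 530.0), ("b", 2021, 4, 540.0),
]

n_units = len({unit for unit, _, _, _ in rows})   # N
n_dates = len({year for _, year, _, _ in rows})   # T
assert len(rows) == n_dates * n_units             # balanced panel: T x N rows

def slope(xs, ys):
    """Slope of a least-squares fit with a single explanatory variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Pooled OLS ignores which unit each row belongs to.
pooled = slope([r[2] for r in rows], [r[3] for r in rows])

# Within transformation: subtract each unit's own mean over time.
by_unit = defaultdict(list)
for unit, _, x, y in rows:
    by_unit[unit].append((x, y))

xs_within, ys_within = [], []
for obs in by_unit.values():
    x_bar = mean(x for x, _ in obs)
    y_bar = mean(y for _, y in obs)
    xs_within += [x - x_bar for x, _ in obs]
    ys_within += [y - y_bar for _, y in obs]

fixed_effects = slope(xs_within, ys_within)

print(f"pooled OLS slope:    {pooled:.2f}")
print(f"fixed-effects slope: {fixed_effects:.2f}")
```

The example data are built so that y rises by 10 for each extra unit of x within every unit, while unit b sits at a much higher level than unit a. The pooled slope absorbs that level difference, whereas the within transformation removes it and recovers the slope of 10.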