For statistics and analyzes in microdata.no, datasets created through the command
import are normally used. These are data sets of the “wide” type, where information about all units in a population is structured horizontally at a variable level. The new
reshape-to-panel command now makes it possible to change the data structure to long-format (panel-format), where information about each unit (individual) is structured vertically at the observation / record level.
Variables that are measured over several times and that you want in long / panel format, must be named through
reshape-to-panel with specified prefixes that consist of the letters (prefix) from the original variable in the wide dataset. Other variables for which no prefix is specified, typically information that is only measured once (gender, country of birth, etc), are automatically defined as fixed information and the values for these are repeated for all sub-levels of each unit.
The script below shows how to use
reshape-to-panel in practice. First, a standard dataset of the wide type is created, consisting of a 1% random sample of all who were registered as resident in Norway as of 1/1 2018. In addition, the variables regstat (register status), sivstand (marital status) and lønn (annual salary) are imported for the years 2018-2020, as well as the fixed information kjønn (gender). Then the
reshape-to panel command is used to restructure the dataset to long-format (panel data). The data we want in long / panel format is specified by entering the prefix of the sets of variables (the letter part preceding the digit that is linked to the year or the like), in this case regstat, sivstand and lønn. Gender is a fixed information, and we therefore do not need to include a variable prefix for this in the command expression. The finished long dataset will include the variable date@panel which contains the value of the sublevel of each unit. In this case, all units will have sub-levels 18, 19 and 20 (i.e. three sub-levels / observations / records each).
require no.ssb.fdb:17 as db //Create a common type wide format dataset consisting of a 1% random sample of all residents per 1/1 2018 create-dataset wide import db/BEFOLKNING_STATUSKODE 2018-01-01 as regstat18 keep if regstat18 == '1' sample 0.01 333 import db/BEFOLKNING_STATUSKODE 2019-01-01 as regstat19 import db/BEFOLKNING_STATUSKODE 2020-01-01 as regstat20 import db/SIVSTANDFDT_SIVSTAND 2018-01-01 as sivstand18 import db/SIVSTANDFDT_SIVSTAND 2019-01-01 as sivstand19 import db/SIVSTANDFDT_SIVSTAND 2020-01-01 as sivstand20 import db/BEFOLKNING_KJOENN as kjønn import db/INNTEKT_WLONN 2018-01-01 as lønn18 import db/INNTEKT_WLONN 2019-01-01 as lønn19 import db/INNTEKT_WLONN 2020-01-01 as lønn20 //Run some testings of wide format data tabulate regstat18, missing tabulate regstat19, missing tabulate regstat20, missing tabulate sivstand18, missing tabulate sivstand19, missing tabulate sivstand20, missing tabulate kjønn, missing summarize lønn18 lønn19 lønn20 //Convert to long format (panel data) reshape-to-panel regstat sivstand lønn //Run some new tests of the long format data for comparing purposes tabulate date@panel, missing tabulate-panel regstat, missing tabulate-panel sivstand, missing tabulate-panel regstat sivstand, missing tabulate-panel kjønn, missing tabulate-panel regstat kjønn, missing tabulate-panel sivstand kjønn, missing summarize lønn summarize-panel lønn