Restructure datasets from long to wide format
The script below shows how to use reshape-from-panel
in practice.
First, a standard dataset of the wide type is created, consisting of a 1% random sample of everyone who was registered as resident in Norway as of 1/1 2019. The fully converted long dataset will include the variable date@panel which contains the value of the sublevel of each single unit (person), in this case the values 19 and 20. See separate analysis example for explanation of what happens when using reshape-to-panel
.
Then reshape-from-panel
is used to convert the dataset to wide format again. This will give us the same dataset that we started with. Note that you do not specify variables or prefixes using this command, and that all variables are converted to wide format with a suffix based on the corresponding sublevel. This also applies to gender, although this is a fixed information. As in the example below, you can remove the superfluous gender variables (kjønn) and rename into “kjønn”.
The last part of the example creates a classic panel dataset using the import-panel
command, and then uses the reshape-from-panel
to convert the data to wide-format. Note that here the suffixes are different. When using tabulate-panel
or summarize-panel
, it will appear that the sublevel has values of the type “YYYY-MM-DD”, but this only applies as a display format. In this case, the actual values for date@panel use reference dates as the value format (number of days measured from 1/1 1970). This is solved by renaming the variable names with the rename
command. Note that the converted dataset will be identical to the result of the conversion of the dataset created using reshape-to-panel
.
require no.ssb.fdb:17 as db
//Create a wide format dataset consisting of a 1% sample of all residents as of 1/1 2019
create-dataset wide
import db/BEFOLKNING_STATUSKODE 2019-01-01 as regstat19
keep if regstat19 == '1'
sample 0.01 333
import db/BEFOLKNING_STATUSKODE 2020-01-01 as regstat20
import db/SIVSTANDFDT_SIVSTAND 2019-01-01 as sivstand19
import db/SIVSTANDFDT_SIVSTAND 2020-01-01 as sivstand20
import db/BEFOLKNING_KJOENN as kjønn
import db/INNTEKT_WLONN 2019-01-01 as lønn19
import db/INNTEKT_WLONN 2020-01-01 as lønn20
tabulate regstat19, missing
tabulate regstat20, missing
tabulate sivstand19, missing
tabulate sivstand20, missing
tabulate kjønn, missing
summarize lønn19 lønn20
//Restructure into panel/long format
reshape-to-panel regstat sivstand lønn
tabulate-panel regstat, missing
tabulate-panel sivstand, missing
tabulate-panel kjønn, missing
summarize-panel lønn
//Restructure back into original wide format
reshape-from-panel
drop kjønn20
rename kjønn19 kjønn
tabulate regstat19, missing
tabulate regstat20, missing
tabulate sivstand19, missing
tabulate sivstand20, missing
tabulate kjønn, missing
summarize lønn19 lønn20
//Create a new panel-dataset for the same population using the import-panel command
clone-units wide paneltest
use paneltest
import-panel db/BEFOLKNING_STATUSKODE db/SIVSTANDFDT_SIVSTAND db/INNTEKT_WLONN db/BEFOLKNING_KJOENN 2019-01-01 2020-01-01
rename BEFOLKNING_STATUSKODE regstat
rename SIVSTANDFDT_SIVSTAND sivstand
rename INNTEKT_WLONN lønn
rename BEFOLKNING_KJOENN kjønn
tabulate-panel regstat, missing
tabulate-panel sivstand, missing
tabulate-panel kjønn, missing
summarize-panel lønn
//Restructure into wide format
reshape-from-panel
drop kjønn18262
rename kjønn17897 kjønn
rename regstat17897 regstat19
rename regstat18262 regstat20
rename sivstand17897 sivstand19
rename sivstand18262 sivstand20
rename lønn17897 lønn19
rename lønn18262 lønn20
summarize lønn19 lønn20