{"id":1418,"date":"2021-03-11T20:41:00","date_gmt":"2021-03-11T19:41:00","guid":{"rendered":"https:\/\/microdata.spirehost.no\/?p=1418"},"modified":"2022-03-28T13:13:10","modified_gmt":"2022-03-28T12:13:10","slug":"winsorization-change","status":"publish","type":"post","link":"https:\/\/www.microdata.no\/en\/winsorization-change\/","title":{"rendered":"Winsorization change"},"content":{"rendered":"\n<p><em><strong>The winsorization confidentiality measure in microdata.no has been adjusted; the underlying data in the user\u2019s workspace is no longer affected by this measure. Thus, regression analyses can now be executed without being influenced by winsorization.<\/strong><\/em><\/p>\n\n\n\n<!--more-->\n\n\n\n<p>Until now, numerical data has been winsorized when data is imported into the user\u2019s workspace, when the population is delimited (<code>drop if<\/code> \/ <code>keep if<\/code>), and when descriptive statistics are produced for sub-samples, for example:<\/p>\n\n\n<div id=\"rose-block_620d53d6b7b44\" class=\"rose-code codeblock-wrapper\">\n<pre tabindex=\"0\" class=\"codeblock\"><code>summarize income if gender == \"1\"<\/code><\/pre>\n<\/div>\n\n\n<p>Winsorization is one of the confidentiality measures in microdata.no, and is intended to prevent users from indirectly identifying individuals via extreme values. 
Income is one example of information where this can be a problem.<\/p>\n\n\n\n<p>Winsorization in this context means that the highest 1% of values are censored and set to the lower limit of the last percentile, and the lowest 1% of values are set to the upper limit of the first percentile.<\/p>\n\n\n\n<p><strong>Impact on means, standard deviations and regressions<\/strong><\/p>\n\n\n\n<p>An undesirable effect of the way winsorization has worked so far is that numerical variables imported into the user\u2019s workspace have been censored, affecting all subsequent analyses and data management steps.<\/p>\n\n\n\n<p>Statistical measures such as means and standard deviations will then typically report values that are somewhat lower than the actual ones.<\/p>\n\n\n\n<p>Until now, regression estimates have also been affected, because the estimation was based on censored values. The degree of influence depends on how long the \u201ctails\u201d of the value distribution are for the relevant variables (i.e. to what extent extreme values occur).<\/p>\n\n\n\n<p>To minimize the disadvantages of winsorization, only the visible and identifiable output of descriptive statistics now undergoes winsorization. The underlying data in the user\u2019s workspace is no longer censored. Regression estimates are therefore no longer distorted by winsorization, as they are based on the actual data.<\/p>\n\n\n\n<p>For descriptive statistics, the reported means and standard deviations will still be somewhat lower than the actual values for most numerical variables. 
This is intentional and is regarded as necessary to maintain the right balance between confidentiality and sufficient flexibility in defining the analysis population.<\/p>\n\n\n\n<p><strong>Dummy variables and numerical multi-category variables<\/strong><\/p>\n\n\n\n<p>A common problem has been that imported dummy variables (numerical variables with the values 0 and 1) were also winsorized if one of the categories accounted for less than 1% of the values in your population. Since winsorization uses the neighboring percentile as the censoring value, all dummy values were recoded to the same value, either 0 or 1, in such cases.<\/p>\n\n\n\n<p>In regression analyses, this could create a problem when winsorized dummy variables, either imported or derived from imported variables, were included, since variables with only one value are not accepted.<\/p>\n\n\n\n<p>For numerical multi-category variables, too, there was a risk that the highest and\/or lowest category was merged with the neighboring category, making it look as if that category had no observations in your dataset.<\/p>\n\n\n\n<p>After the change, dummy variables will still appear to be winsorized when you run descriptive statistics. However, this only applies to the visible statistics output. If the same variables are used in regression analyses,&nbsp;<em>non-winsorized<\/em>&nbsp;data is used as input.<\/p>\n\n\n\n<p><strong>Population delimitations<\/strong><\/p>\n\n\n\n<p>Until now, numerical data has been winsorized not only on import, but also each time a population delimitation is made. 
So if you have run many\u00a0<code>drop if<\/code>\u00a0or\u00a0<code>keep if<\/code> commands, your data has been winsorized a corresponding number of times.<\/p>\n\n\n\n<p>This problem is now eliminated, since winsorization is now carried out only when descriptive statistics are generated.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The winsorization confidentiality measure in microdata.no has been adjusted; the underlying data in the user\u2019s workspace is no longer affected by this measure. Thus, regression analyses can now be executed without being influenced by winsorization.<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"_kad_blocks_custom_css":"","_kad_blocks_head_custom_js":"","_kad_blocks_body_custom_js":"","_kad_blocks_footer_custom_js":"","_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[2],"tags":[],"class_list":["post-1418","post","type-post","status-publish","format-standard","hentry","category-news"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Winsorization change - microdata.no<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.microdata.no\/en\/winsorization-change\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Winsorization change - microdata.no\" \/>\n<meta property=\"og:description\" 
content=\"The winsorization confidentiality measure in microdata.no has been adjusted; the underlying data in the user\u2019s workspace is no longer affected by this measure. Thus, regression analyzes can now be executed without being influenced by winsorization.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.microdata.no\/en\/winsorization-change\/\" \/>\n<meta property=\"og:site_name\" content=\"microdata.no\" \/>\n<meta property=\"article:published_time\" content=\"2021-03-11T19:41:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-03-28T12:13:10+00:00\" \/>\n<meta name=\"author\" content=\"Trond Pedersen\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Trond Pedersen\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/\"},\"author\":{\"name\":\"Trond Pedersen\",\"@id\":\"https:\\\/\\\/www.microdata.no\\\/#\\\/schema\\\/person\\\/76761ddfe0d06e3f08f5491a9faeab92\"},\"headline\":\"Winsorization change\",\"datePublished\":\"2021-03-11T19:41:00+00:00\",\"dateModified\":\"2022-03-28T12:13:10+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/\"},\"wordCount\":561,\"articleSection\":[\"News\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/\",\"url\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/\",\"name\":\"Winsorization change - 
microdata.no\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.microdata.no\\\/#website\"},\"datePublished\":\"2021-03-11T19:41:00+00:00\",\"dateModified\":\"2022-03-28T12:13:10+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.microdata.no\\\/#\\\/schema\\\/person\\\/76761ddfe0d06e3f08f5491a9faeab92\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/winsorization-change\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Hjem\",\"item\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Winsorization change\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.microdata.no\\\/#website\",\"url\":\"https:\\\/\\\/www.microdata.no\\\/\",\"name\":\"microdata.no\",\"description\":\"Gj\u00f8r det enklere \u00e5 analysere registerdata\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.microdata.no\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.microdata.no\\\/#\\\/schema\\\/person\\\/76761ddfe0d06e3f08f5491a9faeab92\",\"name\":\"Trond 
Pedersen\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b90e3f42c839e825d86949fc2f9a318f2a81da5f9e6b1431ff4d872333d4e717?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b90e3f42c839e825d86949fc2f9a318f2a81da5f9e6b1431ff4d872333d4e717?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/b90e3f42c839e825d86949fc2f9a318f2a81da5f9e6b1431ff4d872333d4e717?s=96&d=mm&r=g\",\"caption\":\"Trond Pedersen\"},\"url\":\"https:\\\/\\\/www.microdata.no\\\/en\\\/author\\\/trond\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Winsorization change - microdata.no","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.microdata.no\/en\/winsorization-change\/","og_locale":"en_US","og_type":"article","og_title":"Winsorization change - microdata.no","og_description":"The winsorization confidentiality measure in microdata.no has been adjusted; the underlying data in the user\u2019s workspace is no longer affected by this measure. Thus, regression analyzes can now be executed without being influenced by winsorization.","og_url":"https:\/\/www.microdata.no\/en\/winsorization-change\/","og_site_name":"microdata.no","article_published_time":"2021-03-11T19:41:00+00:00","article_modified_time":"2022-03-28T12:13:10+00:00","author":"Trond Pedersen","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Trond Pedersen","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.microdata.no\/en\/winsorization-change\/#article","isPartOf":{"@id":"https:\/\/www.microdata.no\/en\/winsorization-change\/"},"author":{"name":"Trond Pedersen","@id":"https:\/\/www.microdata.no\/#\/schema\/person\/76761ddfe0d06e3f08f5491a9faeab92"},"headline":"Winsorization change","datePublished":"2021-03-11T19:41:00+00:00","dateModified":"2022-03-28T12:13:10+00:00","mainEntityOfPage":{"@id":"https:\/\/www.microdata.no\/en\/winsorization-change\/"},"wordCount":561,"articleSection":["News"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.microdata.no\/en\/winsorization-change\/","url":"https:\/\/www.microdata.no\/en\/winsorization-change\/","name":"Winsorization change - microdata.no","isPartOf":{"@id":"https:\/\/www.microdata.no\/#website"},"datePublished":"2021-03-11T19:41:00+00:00","dateModified":"2022-03-28T12:13:10+00:00","author":{"@id":"https:\/\/www.microdata.no\/#\/schema\/person\/76761ddfe0d06e3f08f5491a9faeab92"},"breadcrumb":{"@id":"https:\/\/www.microdata.no\/en\/winsorization-change\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.microdata.no\/en\/winsorization-change\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.microdata.no\/en\/winsorization-change\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Hjem","item":"https:\/\/www.microdata.no\/en\/"},{"@type":"ListItem","position":2,"name":"Winsorization change"}]},{"@type":"WebSite","@id":"https:\/\/www.microdata.no\/#website","url":"https:\/\/www.microdata.no\/","name":"microdata.no","description":"Gj\u00f8r det enklere \u00e5 analysere 
registerdata","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.microdata.no\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.microdata.no\/#\/schema\/person\/76761ddfe0d06e3f08f5491a9faeab92","name":"Trond Pedersen","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/b90e3f42c839e825d86949fc2f9a318f2a81da5f9e6b1431ff4d872333d4e717?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/b90e3f42c839e825d86949fc2f9a318f2a81da5f9e6b1431ff4d872333d4e717?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/b90e3f42c839e825d86949fc2f9a318f2a81da5f9e6b1431ff4d872333d4e717?s=96&d=mm&r=g","caption":"Trond Pedersen"},"url":"https:\/\/www.microdata.no\/en\/author\/trond\/"}]}},"taxonomy_info":{"category":[{"value":2,"label":"News"}]},"featured_image_src_large":false,"author_info":{"display_name":"Trond 
Pedersen","author_link":"https:\/\/www.microdata.no\/en\/author\/trond\/"},"comment_info":0,"category_info":[{"term_id":2,"name":"News","slug":"news","term_group":0,"term_taxonomy_id":2,"taxonomy":"category","description":"","parent":0,"count":79,"filter":"raw","cat_ID":2,"category_count":79,"category_description":"","cat_name":"News","category_nicename":"news","category_parent":0}],"tag_info":false,"_links":{"self":[{"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/posts\/1418","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/comments?post=1418"}],"version-history":[{"count":5,"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/posts\/1418\/revisions"}],"predecessor-version":[{"id":2018,"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/posts\/1418\/revisions\/2018"}],"wp:attachment":[{"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/media?parent=1418"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/categories?post=1418"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microdata.no\/en\/wp-json\/wp\/v2\/tags?post=1418"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}