Development of disclosure protection

Below we describe Statistics Sweden's development work with disclosure control of individual-based statistics. The work is adapted to external conditions to provide stronger protection.

Statistics Sweden regularly works with disclosure control of individual-based statistics to prevent confidential information from being disclosed. Under the Public Access to Information and Secrecy Act, it must not be possible to use statistics to identify a specific individual, household or business, or to disclose anything about their personal or financial circumstances.

Several methods can be used to publish statistics without disclosing confidential data. The disclosure control method is chosen so that it provides adequate protection without unnecessarily reducing the value of the information.

Background

Statistics Sweden currently makes use of several different methods for disclosure control. If the same statistical values are published in different products, they may have undergone disclosure control in different ways, which in practice increases the risk of information about someone being disclosed. A uniform method has therefore long been sought after.

One common method, used mainly in business statistics but also partly in individual-based statistics, is suppression. Suppression means that cells with a high risk of disclosure are hidden (primary suppression). Other cells are subsequently also hidden (secondary suppression) to avoid the possibility of deducing the primary suppressed cells via the marginal cells. If the statistics are presented with many different classifications, extensive secondary suppression may be needed to protect a small number of primary suppressed cells, which can result in a relatively large loss of information. This is one of the reasons why suppression is now being replaced by CKM in individual-based statistics.
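The interplay between primary suppression and the marginal cells can be illustrated with a minimal Python sketch. The table, the counts and the threshold of 3 are all made up for illustration and are not Statistics Sweden's actual rules.

```python
# Illustrative frequency table (made-up counts): region by sex.
counts = {
    ("North", "Women"): 2,   # small count: high disclosure risk
    ("North", "Men"): 40,
    ("South", "Women"): 35,
    ("South", "Men"): 50,
}
THRESHOLD = 3  # hypothetical rule: hide non-zero counts below this value


def primary_suppress(table, threshold):
    """Return a copy of the table with risky cells replaced by None (hidden)."""
    return {k: (None if 0 < v < threshold else v) for k, v in table.items()}


def row_total(table, region):
    """Marginal total for one region, computed from the true counts."""
    return sum(v for (r, _), v in table.items() if r == region)


published = primary_suppress(counts, THRESHOLD)
north_total = row_total(counts, "North")  # the marginal is still published

# Without secondary suppression, the hidden cell follows by subtraction.
recovered = north_total - published[("North", "Men")]
```

Here `recovered` equals the hidden count, which is why at least one more cell in the same row (and, when column totals are also published, in the same column) has to be hidden as well. With many classifications this chain of secondary suppression grows quickly.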

In today’s digital society, where an increasing amount of information about private individuals is available digitally, there is an increased risk that the identity of private individuals may be disclosed. Anyone can easily obtain information from different authorities and other public sector services and combine it. This is why Statistics Sweden is now strengthening the protection of confidential information by changing the disclosure control method for the individual-based statistical products.

Choice of protection method

Which protection method for disclosure control is used depends on the type of table involved. Read more about some of the most common methods of disclosure control under ‘Statistical methods’.

Protection methods for disclosure control

Implementing the Cell Key Method (CKM) in individual-based statistics

Statistics Sweden has decided that all register-based, totally enumerated individual-based statistics will use CKM as the disclosure control method. CKM is a disclosure control method for frequency tables based on totally enumerated data. With this method, a small and controlled amount of random uncertainty is automatically added to the statistical values when the tables are prepared. This makes it possible to present detailed statistics without having to hide statistical values or make parts of the report less detailed. Read more about CKM under ‘Statistical methods’.
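As a rough illustration of the general idea behind a cell key approach, the Python sketch below gives every record a fixed random key, derives a key for each table cell from the keys of the records in it, and uses that cell key to look up a small adjustment. The function names and the noise distribution are hypothetical and chosen for illustration only. They are not Statistics Sweden's actual parameters or implementation.

```python
import math
import random


def assign_record_keys(record_ids, seed=0):
    """Give each record a fixed random key in [0, 1). Done once per register."""
    rng = random.Random(seed)
    return {rid: rng.random() for rid in record_ids}


def cell_key(members, record_keys):
    """Cell key: fractional part of the sum of the members' record keys."""
    return math.fsum(record_keys[r] for r in members) % 1.0


# Hypothetical zero-mean noise distribution: (adjustment, probability).
NOISE_TABLE = [(-1, 0.25), (0, 0.50), (+1, 0.25)]


def perturb(count, key):
    """Adjust a count using its cell key. Zero counts are left untouched."""
    if count == 0:
        return 0
    cumulative = 0.0
    for noise, prob in NOISE_TABLE:
        cumulative += prob
        if key < cumulative:
            return count + noise
    return count


# Usage with made-up data: 1000 records, one cell holding every 7th record.
record_keys = assign_record_keys(range(1000), seed=42)
cell = [rid for rid in range(1000) if rid % 7 == 0]
key = cell_key(cell, record_keys)
reported = perturb(len(cell), key)
```

Because the cell key depends only on which records are in the cell, the same group of records receives the same adjustment in every table where it appears, under the assumptions of this sketch.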

Statistics Sweden already uses CKM in the statistical product Population by Labour Market Status (BAS), which was first presented in May 2022. The method is also used for the 2021 register-based population and housing census, as recommended by Eurostat. For Statistics Sweden’s other products in the area of individual-based statistics, the method will be introduced gradually, according to the preliminary timetable below.

Timetable

The following appropriation-funded products will incorporate CKM in 2025 and 2026:

  • Population statistics (March 2025)
  • Households' Housing (April 2026)
  • Demographic Analysis (May 2026)

After that, the following appropriation-funded products will start looking at the possibility of incorporating a disclosure control method:

  • Those entitled to vote (2027)
  • Elections participation survey (2027)
  • Nominated and elected candidates (preliminarily 2027)
  • Elected representatives (preliminarily 2027)

Income and tax statistics began using CKM in 2024 for the reporting of aggregated earned income and net income. Those statistics refer to 2022. The first publication of new statistics following the introduction of CKM will be in January 2025 and will refer to 2023.

CKM will also be introduced in the parts of commissioned activities that relate to individual-based statistics.

How the introduction of CKM affects the statistics

When CKM is applied, a small amount of random uncertainty is added to the statistics in a controlled manner, without introducing any bias. Statistical values that are greater than zero are either adjusted by a small negative or positive integer or left unchanged. Totals are adjusted in the same way. One consequence of the method is that the reported totals are not always equal to the sum of their reported components. For example, the reported total for women and men combined does not necessarily equal the sum of the separately reported values for women and for men. Statistics Sweden considers that in most cases, the uncertainty generated by the method is negligible in relation to other sources of uncertainty that affect the statistics. Read more about how the statistics are affected under ‘Statistical methods’.
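The non-additivity and the absence of bias can be illustrated with a small numerical sketch in Python. The counts, the cell keys and the noise distribution (adjustments of -1, 0 or +1 with probabilities 1/4, 1/2 and 1/4) are assumptions made for this example and do not reflect the parameters Statistics Sweden actually uses.

```python
import random


def noise_from_key(key):
    """Hypothetical zero-mean adjustment chosen by a key in [0, 1)."""
    if key < 0.25:
        return -1
    if key < 0.75:
        return 0
    return 1


# Made-up true counts; the total is a cell of its own and is adjusted separately.
true_counts = {"Women": 120, "Men": 130, "Total": 250}
# Made-up cell keys, one per cell.
cell_keys = {"Women": 0.10, "Men": 0.20, "Total": 0.30}

reported = {
    name: true_counts[name] + noise_from_key(cell_keys[name])
    for name in true_counts
}
# reported is {"Women": 119, "Men": 129, "Total": 250}

component_sum = reported["Women"] + reported["Men"]  # 248, while the total is 250

# Averaged over many random keys, the adjustment is close to zero (no bias).
rng = random.Random(1)
mean_noise = sum(noise_from_key(rng.random()) for _ in range(100_000)) / 100_000
```

In this example the reported components sum to 248 while the reported total is 250. Each individual value is still within one unit of its true count.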

How time series, APIs and saved queries are affected

In the future, many tables in the Statistical Database will be based on a new table structure. This is because there will be more underlying subtables when CKM is introduced. As a result, it will not be possible to continue updating existing Statistical Database tables. There will therefore be links to both the older and the new version of the same Statistical Database table on the website. Accordingly, retrieving long time series will require retrievals from two different tables in the Statistical Database. Statistics that have already been published will not be affected by the introduction of CKM.

All old API links in the Statistical Database will continue to work in the future. However, to access the new statistics that are inserted into new tables, users must send an additional API call to the new table. The same applies to saved queries in the Statistical Database.
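For users who need a long time series, the results of the two retrievals have to be joined afterwards. The Python sketch below assumes that each retrieval has already been parsed into a mapping from year to value. The values are made up, the function name is hypothetical, and letting the new table take precedence where years overlap is an assumption of this sketch rather than a documented rule.

```python
def combine_series(old_table, new_table):
    """Merge two {year: value} mappings into one series sorted by year.

    Assumption: where both tables contain the same year, the value from
    the new table is kept.
    """
    combined = dict(old_table)
    combined.update(new_table)
    return dict(sorted(combined.items()))


# Made-up values standing in for the results of two separate API calls.
old_series = {2021: 1040, 2022: 1055, 2023: 1061}
new_series = {2024: 1070, 2025: 1082}

long_series = combine_series(old_series, new_series)
```

The same pattern applies to saved queries: one query per table, with the results combined on the user's side.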

Contact

E-mail
ckm@scb.se