Differentially private synthetic linked 2011 Census and mortality
Details
Summary
Synthetic 2011 Census data linked with death data, both created through statistical modelling to make them non-sensitive.
Description
This is synthetic data, created by the Office for National Statistics Data Science Campus by statistically modelling the original data (Census 2011 linked with mortality data) and then using those models to generate new data values that reproduce the original data’s statistical properties. Any biases already in the real data may propagate through to this synthetic version. The data were created using a differentially private algorithm for categorical data synthesis. This algorithm seeks to preserve the most important statistical properties within the data while protecting the confidentiality of the data contributors according to a mathematical definition of privacy. The privacy budget is epsilon = 1.0. The dataset will be used in the IDS to facilitate analysis and innovation whilst maintaining the principle of data minimisation. It will also facilitate data access within the IDS, while preventing disclosure of confidential respondent information. This dataset is also known as Synthetic Linked Census 2011 Mortality. Data available only to provider-approved projects.
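To give a feel for what a privacy budget of epsilon = 1.0 means in practice, the following is a minimal sketch of one simple way to synthesise a single categorical variable under differential privacy: add Laplace noise to the category counts, then sample new values from the noisy counts. This is an illustration only and is not the algorithm used to produce this dataset, which is not specified here. The function names, the example categories and the counts are all hypothetical, and the sketch assumes that neighbouring datasets differ by adding or removing one record (so each count has sensitivity 1).

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_synthesise(records, categories, epsilon, n_synthetic, seed=None):
    """Hypothetical helper: sample synthetic values from a noisy histogram.

    `categories` must be a fixed, publicly known list of possible values;
    deriving it from the data would itself leak information.
    """
    rng = random.Random(seed)
    counts = Counter(records)
    # Assumption: adding or removing one record changes one count by 1,
    # so Laplace noise with scale 1/epsilon gives epsilon-DP counts.
    scale = 1.0 / epsilon
    noisy = [
        max(counts.get(c, 0) + laplace_noise(scale, rng), 0.0)
        for c in categories
    ]
    # Clamping and sampling are post-processing and do not consume
    # any further privacy budget.
    weights = noisy if sum(noisy) > 0 else [1.0] * len(categories)
    return rng.choices(categories, weights=weights, k=n_synthetic)


# Made-up example data: three illustrative categories, not real Census values.
categories = ["A", "B", "C"]
records = ["A"] * 600 + ["B"] * 300 + ["C"] * 100
synthetic = dp_synthesise(records, categories, epsilon=1.0,
                          n_synthetic=1000, seed=0)
print(Counter(synthetic))
```

A smaller epsilon means more noise and stronger privacy, at the cost of synthetic data that tracks the original counts less closely. Real multi-variable synthesis also has to preserve relationships between variables, which this single-variable sketch does not attempt.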
Documentation:
Details of any additional information regarding this dataset
About this data
- Data creator: Office for National Statistics
- Temporal coverage: 01 January 2011 to 31 December 2020
- Frequency: Historical
- Dataset theme: Health
- Restrictions for access: Access for all accredited researchers
- Project approval: Projects must be accredited and have approval from the data owner
- Search keywords: Synthetic data Census 2011 Mortality Linked MSOA
Metadata
- Dataset theme (main category for the topic of the resource): Health
- Dataset resource type (the type of the dataset resource): Statistical Output - Experimental
- Geographic coverage (the geographic area covered by the dataset): England and Wales
- Temporal coverage (the timeframes covered by these data): 01 January 2011 to 31 December 2020
- Frequency (the frequency at which the dataset resource is published or updated): Historical
- Geographic level (the lowest level of geography covered by the dataset): Middle layer Super Output Area
- Data creator (the name of the organisation that produced or published this resource): Office for National Statistics
- Data contributors (the name(s) of any organisation(s), other than the data supplier or provider, that have data which are included in this dataset resource):
- Licensing status (the license used for making this resource available and defining how it can be used): Restricted
- Disclosure control (the standard or bespoke disclosure control rules that apply to this dataset): Standard disclosure rules apply
- Restrictions for access (restrictions for users and researchers accessing data in Google Cloud Platform): Access for all accredited researchers
- Research outputs (the approval route researchers must take for their outputs to be published): Research outputs must be approved by the data owner
- Project approval (the project approvals needed for this resource): Projects must be accredited and have approval from the data owner
- Research disclaimer (the disclaimer required to be published with the outputs for this dataset): A disclaimer must be published with research outputs
- Acronym (any other names or acronyms used to refer to the data):
- Provenance (details of how this dataset resource came to be generated):