Technical reports

Public Confidence in Official Statistics

This supplementary technical report outlines the research methodology used in the Public Confidence in Official Statistics survey.
  • Publishing date: 26 March 2024

1. Introduction

1.1 Background

The Public Confidence in Official Statistics (PCOS) survey provides an insight into the knowledge and opinions of the British public on official statistics, including their knowledge of, use of and trust in these statistics.

The research was commissioned by the UK Statistics Authority (the Authority), an independent body at arm’s length from Government. Its executive office, the Office for National Statistics (ONS), is the UK’s National Statistical Institute and largest producer of official statistics. The Authority also has an independent regulatory function (Office for Statistics Regulation, OSR), which ensures that statistics are produced and disseminated in the public interest and acts as a watchdog against misuse of statistics.

PCOS has been run intermittently over the last two decades, most recently in 2014, 2016, 2018 and 2021. From 2009 to 2018, PCOS was fielded as a module on NatCen’s British Social Attitudes survey (BSA). However, when BSA 2020 was delayed as a result of the COVID-19 pandemic and switched from its traditional face-to-face methodology to an online survey, it was decided to field PCOS as an independent survey.

1.2 Summary of methodology

The 2023 survey was designed to allow comparisons to be made with previous waves of the survey including that fielded on BSA 2018.

PCOS 2023 was run as a stand-alone push-to-web survey. It was designed to encourage participants to complete the survey online but offered paper self-completion surveys to all non-responding households to maximise response and sample quality. Fieldwork took place between 4th October and 17th December 2023. Interviews were achieved with a representative sample of 2,634 adults aged 18 and over in Britain.

As the survey was moved to online fieldwork in 2021 there is a possibility that any differences identified between the results of the 2021 and 2023 surveys and those of previous waves of PCOS might, at least in part, be caused by the change of method rather than reflecting real change in public attitudes. Changing the way a survey is conducted can also introduce the risk of both selection and measurement effects.

Selection effects occur because different ways of collecting data have different coverage and response rates, meaning that the profile of people who complete a survey in one mode may differ from the profile of people who complete the survey in another mode. Measurement effects occur because people may answer the same question in different ways depending on how the question was administered. In addition, the use of two modes of data collection (online and paper self-completion) in the 2023 and 2021 surveys, although considered necessary to improve the representativeness of the survey by giving those without internet access an opportunity to take part, may also have introduced some additional measurement differences.

However, as in 2021, PCOS 2023 has been designed to minimise as far as possible the impact of the change in mode on the comparability of the data over time. The target population and sampling frame are the same (see Section 2). 

The majority of the Public Confidence in Official Statistics (PCOS) 2023 questionnaire was the same as the 2021 questionnaire and similar to the questionnaire fielded on British Social Attitudes (BSA) 2018, allowing comparisons to be made across survey years. Relatively small changes were made to the 2023 questionnaire such as the inclusion of two Census questions on respondent disability and the removal of questions asking about COVID-19 statistics (see Section 3).

Weighting has been used to correct for differences in sample composition and to ensure that the PCOS 2023 survey is representative of the underlying population of adults aged 18 and over in Great Britain, as in previous surveys (see Section 5.3 and Section 7). Measurement differences between online and paper interviews are expected to be relatively small. As in 2021, both modes of the 2023 survey are similar in that they are self-completion and present information visually, whilst the paper questionnaire has been designed as far as possible to mimic the layout of the online questionnaire (see Section 3.3).

Although we cannot be certain of the extent to which any changes (or lack thereof) in attitudes observed across time are down to real-world change or methodological change, and do not seek to quantify formally the extent of any mode effects, we expect the impact of the methodological change on findings to be relatively small. We are confident that comparisons can be made across survey years and have reported on differences between 2021 and 2023. When a difference between the 2021 and 2023 data has been found to be statistically significant (at the 95% level), this has been stated explicitly in the report.

This report provides further details on the survey methodology. Weighted tables of findings are presented separately. 

2. Sampling

2.1 Sample design

The sample design for the Public Confidence in Official Statistics (PCOS) 2023 survey follows the same methodology as PCOS 2021. In 2021, PCOS ran as a stand-alone push-to-web survey for the first time. Prior to this, it ran as a module on the British Social Attitudes (BSA) survey. The target population and the sampling frame used are comparable with previous BSA surveys, allowing comparisons to be made between years.

PCOS used a sample of addresses drawn from the Postcode Address File (PAF), a list of addresses (or postal delivery points) compiled by the Post Office. For practical reasons, the sample is confined to those living in private households. People living in institutions (though not those in private households at such institutions) are excluded, as are households whose addresses were not on the PAF.

2.1.1 Selection of addresses 

An unclustered sample of addresses was drawn from the PAF. Addresses located north of the Caledonian Canal and on the Isles of Scilly were excluded in order to be consistent with previous years. Prior to selection, all PAF addresses within England, Scotland and Wales were sorted by: (a) region; (b) population density; and (c) tenure profile (% owner occupation). A systematic (1 in N) random sample of addresses was then drawn. The list of sampled addresses was then split into a main sample and a reserve sample, the latter of which was to be issued if considered necessary to meet the target number of around 2,000 completed interviews. 8,300 addresses were selected for the main sample and 4,980 for the reserve sample. 
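The stratified systematic draw described above can be sketched as follows. This is an illustrative sketch rather than the production sampling code: the frame size, address names, random seed and the method used to split the main sample from the reserve are all assumptions not stated in the report.

```python
import random

random.seed(42)  # fixed seed so this sketch is reproducible

def systematic_sample(frame, n_required):
    """Draw a 1-in-N systematic random sample from a sorted address frame.

    The frame is assumed to be pre-sorted by the stratifiers described
    above (region, population density, tenure profile), so a fixed-interval
    draw spreads the sample proportionately across all strata.
    """
    interval = len(frame) / n_required      # the sampling interval "N"
    start = random.uniform(0, interval)     # random start within the first interval
    return [frame[int(start + i * interval)] for i in range(n_required)]

# Hypothetical PAF-style frame of one million addresses (names are illustrative)
frame = [f"address_{i:07d}" for i in range(1_000_000)]

# Draw main and reserve together: 8,300 + 4,980 = 13,280 addresses
drawn = systematic_sample(frame, 8_300 + 4_980)

# The report does not state how the draw was split into main and reserve;
# a random split, used here, is one way to keep both parts representative.
random.shuffle(drawn)
main_sample, reserve_sample = drawn[:8_300], drawn[8_300:]
```

Because the interval between selections exceeds one address, every pick is distinct, and the pre-sorting guarantees proportional coverage of each stratum.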

A survey invitation was sent to each sampled address inviting completion of the survey online. There may be more than one dwelling unit and/or household at an address. Without an interviewer to administer the survey, a random selection of dwelling unit/household is not possible; however, given that the overall proportion of such addresses is around 1%, this is generally considered a minor issue that is unlikely to lead to any systematic bias in the responding sample. Whichever household opened the invitation letter was effectively the selected household for the survey.

2.1.2 Selection of individuals 

Up to two adults aged 18 or over at each sampled address were invited to take part in the survey. The survey invitation posted to each address contained two unique logins for the web survey.

Although it is possible to provide instructions to randomly select one person per household in a push-to-web survey – as was previously done when PCOS was administered face-to-face - studies have shown that respondent compliance with the instructions is poor.1 Inviting up to two people to complete the survey reduces the number of households in which selection is necessary (those with three or more adults) and reduces the associated risk of self-selection bias. 

2.2 Reserve sample

To ensure that a minimum of 2,000 interviews would be achieved, it was decided to draw a reserve sample to be issued in the event that response rates were lower than anticipated. 

The reserve sample was drawn in the same way, and at the same time, as the main sample. A total of 4,980 addresses was allocated between two equal batches of 2,490 addresses each. A subset of 500 addresses from this reserve was issued part way through fieldwork (see Section 4). Because the reserve followed the same design as the main sample, and the subsample was sorted by the same stratification variables prior to selection, the issued subsample remained representative; this was also verified after selection.

3. Questionnaire 

3.1 Overview

The majority of the Public Confidence in Official Statistics (PCOS) 2023 questionnaire was the same as the 2021 questionnaire and similar to the questionnaire fielded on British Social Attitudes (BSA) 2018, allowing comparisons to be made across survey years. 

Relatively small changes were made to the 2023 questionnaire such as the inclusion of two Census questions on respondent disability and the removal of questions asking about COVID-19 statistics.

Full versions of the online and paper questionnaires can be found in Appendix A and Appendix B respectively. The questionnaire followed the broad structure below: 

  1. Opening individual questions 
  2. Experience of statistics generally
  3. Awareness of and trust in organisations 
  4. Awareness of and trust in the Office for National Statistics (ONS) 
  5. Questions about specific statistics produced by ONS 
  6. Attitudes to statistics generally 
  7. Awareness of the UK Statistics Authority (the Authority) and Office for Statistics Regulation (OSR) 
  8. Closing demographic questions 

Opening individual questions – respondents were asked to provide some basic information about themselves including: their age, sex and whether their gender identity is the same as it was at birth. Respondents were also asked about the number of adults (18+) and the number of children (below 18) in the household. 

Experience of statistics generally – respondents were asked about how often they saw statistics in the news and on social media and how often they used statistics in their lives. 

Awareness of and trust in organisations – respondents were asked about their awareness of a range of different institutions including Greenpeace, the Bank of England and the Department for Work and Pensions and their trust in institutions including the Civil Service, the UK Parliament, the Government, the media. These questions provide some context for questions asking about awareness of and trust in ONS. 

Awareness of and trust in ONS – respondents were asked about their awareness of ONS, use of statistics produced by ONS and participation in ONS surveys. Respondents were also asked for their level of trust in ONS statistics and the reasons for trusting/not trusting these statistics. 

Questions about specific statistics produced by ONS – respondents were asked about the following statistics produced by ONS:

  • Census 
  • Consumer Prices Index (CPI) 
  • Employment and unemployment statistics 
  • Gross Domestic Product (GDP) 
  • Crime statistics 

For each of these statistics the survey asked respondents: 

  • Whether they had ever used the statistics 
  • Whether the change in the statistic over time reflected the changes in the UK 
  • Whether the statistics were free from political interference 
  • Whether the statistics provide useful information 
  • Whether the statistics were released quickly 

Attitudes to statistics generally – respondents were asked whether they consider statistics in general important, whether statistics are free from political interference, whether statistics are accurate, and whether the government/newspapers present statistics honestly.

Awareness of the Authority and OSR – respondents were asked about their knowledge of the UK Statistics Authority and Office for Statistics Regulation and for their views on the role the Authority should have in regulating official statistics. 

Closing demographic questions – Information was collected on respondents’ religion, ethnicity, economic status, housing tenure and internet usage. 

3.2 Summary of changes since 2021

The questionnaires from 2023 and 2021 are very similar in content, with only two changes to note in 2023:

  • The inclusion of two standard Census questions on disability. The first question asks respondents whether they have any physical or mental health conditions or illnesses lasting or expected to last 12 months or more. The second question asks whether these conditions or illnesses affect the respondent’s ability to carry out day-to-day activities.
  • The removal of questions asking about COVID-19 statistics, as ONS no longer measures or publishes weekly positive testing results or number of deaths from COVID-19 due to the low risk of infection and serious illness.

3.2.1 Adaptations for self-completion mode for PCOS 2021 and 2023

Whereas the 2023 and 2021 PCOS surveys are very similar in content, some more significant questionnaire changes were introduced in 2021 owing to the change in mode required by transitioning PCOS from a BSA module with face-to-face data collection to an independent self-completion survey. These adaptations were retained in the 2023 survey.

In order to allow respondents to complete the survey themselves without the aid of an interviewer, some minor edits were made to the questionnaire in 2021. The wording of question stems was amended to remove references to the interviewer reading out questions, while question instructions were updated, for example to remove references to show cards used in the face-to-face interview.

Some additional questions around respondent’s attitudes towards statistics generally and their awareness of OSR were also added in 2021 and were kept in 2023.

Some questions were amended to make them more appropriate for fielding in 2021, or in response to feedback obtained during usability testing or from the Authority:

  • ONSpa (whether respondent participated in ONS surveys): The list of surveys offered was updated in 2021, for example to include mention of the COVID-19 Infection Study. The reference to COVID-19 statistics was removed in 2023.
  • TrONSY, TrONSN (Reasons for (not) trusting ONS): The list of response options was updated in consultation with the Authority. 
  • CenUse, CPIUse, GDPUse (whether used census, CPI, GDP statistics): Following feedback from usability testing, definitions of these statistics were added to the relevant questions.
  • FULong, FUOft (how long/how recently used ONS statistics): The routing for these questions was updated so that they were asked of everyone who used ONS statistics frequently or occasionally, rather than (as in 2018) only those who used them frequently.

3.3 Differences between online and paper questionnaire

The online and paper questionnaire had the same content and, wherever possible, the layout of questions was maintained across modes. However, some differences in question format between the two modes were necessary to ensure the usability of the paper questionnaire and keep it to a reasonable length. The main differences in question layout between the web and paper versions of the questionnaire were: 

  • Trust in institutions: On the web these questions were presented with one institution per page whereas in the paper questionnaire a grid format was used to save space. 
  • Reasons for trusting/not trusting ONS: On the web respondents were first asked to pick up to 3 reasons and then, on a separate screen, to select the most important reason from among those options chosen at the previous question. On paper, the two questions were presented side by side with respondents asked to select up to three reasons in the left-hand column and then select the most important one of these reasons in the right-hand column. 
  • Freedom from political interference: The web version of the questionnaire included a definition of ‘political interference’ at every question where the term was used. In the paper questionnaire this definition appeared just the first time the term was used. 
  • Economic activity: Respondents were asked what their main activity had been in the past week. In the online survey they were asked about their activity since a particular date (calculated by the interview programme based on the date of the interview). In the paper survey they were asked about ‘in the seven days ending last Sunday’.
  • Internet use: The paper version of the question on ‘frequency of internet use’ had a ‘do not have access to the internet’ option which was not displayed to people completing the survey online. 

3.4 Don't know responses

Neither the online nor paper questionnaires displayed ‘don’t know’ or ‘prefer not to say’ options at individual questions. However, it was made clear at the start of both questionnaires that respondents could skip any question. On skipping a question, online respondents could select either ‘don’t know’ or ‘prefer not to say’. If a respondent skipped a question in the paper questionnaire this was recorded as ‘not answered’ and subsequently treated as ‘don’t know’ in all analyses. 

The approach to recording ‘don’t know’ responses was the same in 2023 as in 2021 and 2018, that is ‘don’t know’ responses were hidden from respondents initially. Respondents were not prompted to choose ‘don’t know’ responses by being shown them on showcards/in the questionnaire but were allowed to skip questions (online/paper) or spontaneously give a ‘don’t know’ response (face-to-face) to move on. We would therefore have anticipated similar levels of ‘don’t know’ responses across modes. 

4. Fieldwork Procedures

4.1 Fieldwork period and modes

Fieldwork took place between October and December 2023. Both the main and reserve samples were issued. The reserve sample was issued during fieldwork as it was uncertain whether the target response rate would be met without a small number of additional responses (see Section 5.1).

Sampled addresses were initially directed to an online survey. However, as not all households have access to the internet, reminder mailings were sent out which gave people the option to complete a paper version of the questionnaire.

Invitation letters for the main sample to complete the web survey were sent out to respondents on 4th October, with a reminder sent on 11th October.1 Paper questionnaires were issued on 30th October to people who had not yet responded online. Invitation letters for the reserve sample were sent out to respondents on 15th November, with paper questionnaires sent out on 22nd November. Fieldwork ended on 17th December.

4.2 Contact strategy

The main sample were contacted up to three times. All sample members received letters 1 and 2. Letter 3 was sent to households where fewer than two online responses had been received at the time of mailing. 

The reserve sample received Letter 1 and Letter 3 only. The number of reminders was reduced for the reserve sample in order to limit the amount of time fieldwork needed to be extended to accommodate the reserve sample. 

  • Letter 1 – Invitation letter and survey leaflet. 
  • Letter 2 – Reminder letter. 
  • Letter 3 - Reminder letter 2 and paper questionnaire. 

All communications had NatCen branding and were signed by a NatCen research director, rather than someone from the UK Statistics Authority (the Authority), in order to promote the survey’s independence. 

4.2.1 Invitation letter

Initial contact with respondents was via a letter inviting them to participate in the Public Confidence in Official Statistics (PCOS) survey 2023. This letter, which can be seen in Appendix C, contained a summary of what the survey entailed, the role of the Authority in the study, and answered some frequently asked questions that the respondents may have had about why they were selected and what would happen to their data.

The letter contained details of the survey website and access codes as well as a link to a participant web page on the NatCen website and a link to the project privacy notice. 

4.2.2 Survey leaflet

To help respondents better understand the survey and encourage participation an information leaflet was included alongside the invitation letter. This leaflet included more details about the survey as well as some of the findings from previous waves of PCOS. This leaflet is available in Appendix D. 

4.2.3 Reminder 1

The first reminder was sent seven days after the invitation and contained similar information to the invitation letter. However, the messaging at the start of the letter varied across the invitation and reminder letters, stressing different reasons for the importance of the survey in order to encourage participation from as many people as possible (see Appendix E).

Reminder 1 was sent to all addresses from the main sample and again contained two logins for the web survey. 

4.2.4 Reminder 2

After a further 19 days2, and 26 days after the invitation letter was sent, a second reminder was sent out to the main sample (see Appendix F). This letter was sent only to addresses that had not returned two completed questionnaires by the time the reminder 2 sample was created. For the reserve sample, the second reminder was sent to all addresses 7 days after the invitation letter.

If the address had not yet returned any completed interviews the second reminder contained both of the original web logins. If one person at the address had already completed the survey only the other, unused, login was included in the reminder letter. 

The second reminder letter also included a paper version of the questionnaire (Appendix B) along with a pre-paid envelope for respondents to complete the survey on paper rather than online. As with the web logins, addresses were sent either one or two paper questionnaires depending on whether anyone in the household had already returned a completed questionnaire. 

4.3 Incentives and thank you letters

In order to encourage respondents to take part in the study, respondents were offered a £10 Love2Shop voucher for completing the survey. 

Where respondents provided an email address, they were sent a ‘Thank You’ email (Appendix I). The email expressed the appreciation of the Authority and issued each respondent a unique online voucher code to be redeemed.

For respondents who completed online but did not offer an email address, a ‘Thank You’ letter was sent out via the post. This letter again directed respondents to an online voucher code, but offered respondents the opportunity to call the NatCen Survey Inquiry team if they required help redeeming their voucher. This thank you letter can be found in Appendix G. Similarly, respondents who completed on paper but did not offer an email address were sent a ‘Thank You’ letter via post. This letter, however, provided respondents with a physical gift voucher as opposed to an online voucher code. This was to ensure that those who wished to complete offline were accommodated throughout the entire survey process. This thank you letter can be found in Appendix H.

4.4 Survey length

For the online survey the median completion time was 8.30 minutes and the mean completion time was 8.47 minutes.3

Table 4.1: Survey length for web completes in minutes

               Median   Mean   Minimum   Maximum
Web completes  8.30     8.47   2.20      11.00

5. Response Rates and Sample Composition

This section of the report discusses the response rate achieved on the Public Confidence in Official Statistics (PCOS) 2023 survey and the quality of the achieved sample, that is, how well the final sample represents the underlying population. The response rate achieved in 2023 was lower than in the 2021 PCOS survey. This was as expected given the smaller reserve sample issued during fieldwork. However, as detailed below, the quality of the final weighted samples is broadly comparable, suggesting we can be confident in making comparisons across survey years.

5.1 Response rates 

When discussing fieldwork figures in this section, response rates are referred to in two different ways:

Household response rate – This is the percentage of households contacted as part of the survey in which at least one interview was completed.

Person level response rate – This is the estimated response rate among all adults that were eligible to complete the survey.

In total, across the main and reserve samples, 8,800 addresses were sampled, from which 2,364 interviews were achieved after 56 cases were removed following validation checks (see Section 6). Of these, 2,248 interviews (95%) came from the main sample and 116 (5%) from the reserve sample.

Overall, at least one interview was completed by 1,695 households, which represents an unadjusted household level response rate of 19.3%. In an online survey of this nature no information is known about the reason for non-response in each individual household. However, it can be assumed that around 9% of addresses in the sample were not residential and were therefore ineligible to complete the survey. Once ineligible addresses are removed, the adjusted household level response rate is 21.2%.

In total, 2,364 individuals completed the survey. Assuming an average household size of 1.9 adults, this represents an unadjusted individual response rate of 14.1%. Once ineligible addresses are removed, the adjusted person level response rate is 15.5%.
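The adjusted rates quoted above follow directly from the issued-address counts and the report's own stated assumptions (around 9% ineligible addresses; 1.9 adults per household). A quick sketch of the arithmetic:

```python
# Reproducing the response-rate arithmetic above. All counts come from the
# report; the 9% ineligible-address share and the 1.9 adults-per-household
# average are the report's own assumptions.
issued_addresses  = 8_800
responding_hholds = 1_695
responding_adults = 2_364
ineligible_share  = 0.09   # assumed non-residential (ineligible) addresses
adults_per_hhold  = 1.9    # assumed average number of adults per household

eligible_hholds = issued_addresses * (1 - ineligible_share)   # ≈ 8,008
eligible_adults = eligible_hholds * adults_per_hhold          # ≈ 15,215

unadjusted_hh_rate  = responding_hholds / issued_addresses    # ≈ 19.3%
adjusted_hh_rate    = responding_hholds / eligible_hholds     # ≈ 21.2%
unadjusted_ind_rate = responding_adults / (issued_addresses * adults_per_hhold)  # ≈ 14.1%
adjusted_ind_rate   = responding_adults / eligible_adults     # ≈ 15.5%
```

Note that the eligible-household and eligible-adult figures produced here (8,008 and roughly 15,215) match the "assumed eligible" rows in Table 5.1.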

The response rate among the reserve sample was slightly lower than for the main sample, possibly because the fieldwork period was slightly shorter and one fewer reminder letter was sent (see Section 4.2). However, the composition of the achieved reserve sample is in line with that of the main sample (see Section 5.3).

Table 5.1 Response rates for main, reserve, and overall sample

                                     Main Sample   Reserve Sample   Overall Sample
Issued addresses                     8,300         500              8,800
Assumed eligible households          7,553         455              8,008
Assumed eligible adults              14,350        864              15,215
Responding households                1,613         82               1,695
Responding adults                    2,248         116              2,364
Online responses                     1,643         75               1,718
Paper responses                      605           41               646
Adjusted household response rate     21.4%         18.0%            21.2%
Adjusted individual response rate    15.7%         13.4%            15.5%

5.2 Break-offs

A break-off occurs when a participant enters the online questionnaire but does not complete it. The survey software captures this abandoned data, which can be analysed to identify problems with the survey, such as formatting issues on particular devices (which can arise on an ad-hoc basis due to device updates), questions that respondents find difficult to answer, or questions with technical issues. An overall break-off rate can be quantified by dividing the number who abandoned the survey by the number who started the questionnaire.

A total of 83 respondents entered the online questionnaire but did not complete it. This is a break-off rate of 5% for people who entered the online survey. 

This is lower than many other online surveys4 and probably reflects the fact that the PCOS questionnaire was relatively short.

The most common points at which respondents broke off were the introduction page and the question asking for an email address for incentive purposes.
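The 5% break-off rate reported above can be reproduced from the published figures, under the assumption (not stated explicitly in the report) that the number who "started" the online questionnaire is the 1,718 web completes plus the 83 break-offs:

```python
# Break-off rate sketch: abandoned questionnaires divided by the number who
# started. Figures are from the report; treating "started" as web completes
# plus break-offs is an assumption, since the report does not define it.
breakoffs     = 83
web_completes = 1_718

started = web_completes + breakoffs   # assumed definition of "started"
breakoff_rate = breakoffs / started   # ≈ 0.046, i.e. about 5%
```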

5.3 Sample composition

The composition of the final sample – and the extent to which it is representative of the underlying population – is an important component of survey quality. This section shows the breakdown of the 2023 sample by mode (online vs paper) and whether the participant was part of the main or reserve sample. 

Table 5.2 Unweighted web/paper and main/reserve sample distribution in PCOS 2023 data
Base: All respondents

                                    Number of completed   Percentage of total
                                    interviews            completed interviews
Web or paper completes
  Web complete                      1,718                 73%
  Paper complete                    646                   27%
Main or reserve sample completes
  Main sample                       2,248                 95%
  Reserve sample                    116                   5%

Table 5.2 shows that 27% of respondents completed the survey on paper. As expected, the composition of the paper sample differs from the composition of the online sample. Offering paper as an alternative mode will have helped to address some potential biases in the online-only sample (e.g. the overrepresentation of people with a degree) although it will have exacerbated others (e.g. the overrepresentation of older people). Compared with people who completed online, people completing the survey on paper were more likely to be from older age groups, White, and Christian and were less likely to be degree educated or in professional/managerial jobs (Table 5.3). 

Table 5.3 Profile of achieved sample (unweighted): Online vs paper
Base: All respondents, excluding don’t knows and refusals

                                               Percentage of      Percentage of
                                               online completes   paper completes
Unweighted base                                1,718              646
Ethnicity
  Other ethnicity                              13%                8%
Education level
  Degree                                       49%                28%
  Higher education below degree                14%                17%
  A level or equivalent                        15%                11%
  Below A level                                15%                23%
  Other qualification                          0%                 1%
  No qualification                             6%                 17%
Religion
  No religion                                  46%                26%
  Christian, all denominations                 47%                69%
  Other religion                               6%                 4%
Economic activity
  Managerial & professional occupations        58%                42%
  Intermediate occupations                     11%                11%
  Employers in small org; own account workers  2%                 3%
  Lower supervisory & technical occupations    6%                 9%
  Semi-routine & routine occupations           9%                 13%
  Not classifiable                             14%                23%

Table 5.4 shows that, although the response rate to the reserve sample was lower than that for the main sample, this does not appear to have had a negative impact on the representativeness of the sample. The characteristics of the main and reserve samples were similar. 

Table 5.4 Profile of achieved sample (unweighted): Main vs reserve samples
Base: All respondents, excluding don’t knows and refusals

                                               Percentage of   Percentage of
                                               main sample     reserve sample
Unweighted base                                2,248           116
Ethnicity
  Other ethnicity                              12%             4%
Education level
  Degree                                       44%             35%
  Higher education below degree                14%             21%
  A level or equivalent                        14%             13%
  Below A level                                17%             24%
  Other qualification                          0%              0%
  No qualification                             9%              6%
Religion
  No religion                                  40%             47%
  Christian, all denominations                 53%             50%
  Other religion                               5%              3%
Economic activity
  Managerial & professional occupations        54%             46%
  Intermediate occupations                     11%             11%
  Employers in small org; own account workers  2%              4%
  Lower supervisory & technical occupations    7%              8%
  Semi-routine & routine occupations           10%             10%
  Not classifiable                             16%             21%

While the 2021 and 2018 PCOS surveys differed substantially in their sample design (with PCOS 2018 conducting face-to-face interviews with one adult per household, versus PCOS 2021 shifting to online interviewing with up to two adults per household), this was not as great a concern for PCOS 2023. Both the 2023 and 2021 PCOS surveys adhered to the same sample design and, as with previous iterations of the survey, any potential underrepresentation of certain groups can be rectified during the weighting process.

Table 5.5 shows the profile of the weighted sample from the 2021 and 2023 surveys compared against the population of all adults in Great Britain for a series of key demographic variables: sex, age, number of adults per household, ethnicity, region, tenure, education, and economic activity.

Table 5.5: Weighted distribution of key demographic data
Base: All respondents excluding don’t knows and refusals

                                        PCOS 2023   PCOS 2021   National estimates
Unweighted base                         2,361       3,391       -
Unweighted base                         2,350       3,366       -
Mean number of adults per household**
Other ethnicity                         17.3%       12.4%       14.0%
Unweighted base                         2,316       3,272       -
Region*
  North East                            4.1%        4.2%        4.1%
  North West                            11.4%       11.1%       11.4%
  Yorkshire and the Humber              8.2%        8.5%        8.4%
  East Midlands                         7.6%        7.5%        7.5%
  West Midlands                         9.1%        9.1%        9.0%
  South East                            14.2%       14.1%       14.2%
  South West                            9.0%        8.9%        8.9%
Unweighted base                         2,364       3,398       -
Tenure**
  Owned outright                        32.4%       33.9%       34.3%
  Mortgage owned/shared ownership       32.1%       33.8%       33.9%
Unweighted base                         2,347       3,353       -
Economic activity**
  In employment                         59.4%       61.0%       61.8%
  ILO unemployed                        8.9%        7.2%        2.5%
Unweighted base                         2,320       3,268       -

*Source of national figures: ONS mid-year population estimates 2022 (England and Wales) and ONS mid-year population estimates 2021 (Scotland); includes those aged 18 and over
**Source of national figures: Labour Force Survey (April–June 2023)

Looking at the data in Table 5.5 we can see that the weighted PCOS 2023 sample broadly matches the composition of both the previous PCOS survey as well as the national population. Given this, we can remain relatively confident about making comparisons across survey years. There may, however, still remain unobserved differences between the PCOS 2021 and PCOS 2023 samples, for example in political engagement or statistical knowledge, which it has not been possible to control for in the weighting. These unobserved differences may in turn still have a role to play in explaining differences over time. 

Weighting efficiency, another potential indicator of sample quality, is discussed further in Section 7.2. 

5.4 Inclusivity and sub-group analysis

There is interest in being able to break down the results of the survey by a variety of demographic characteristics. However, sample sizes limit the extent to which it is possible to draw robust conclusions about some sub-groups of interest. This is particularly the case with respect to breakdowns by ethnicity and/or religion. 

The achieved sample sizes for different ethnic and religious groups are given in Table 5.6. 

Table 5.6 Ethnic and religious profile of achieved sample (weighted)
Base: All respondents that stated their ethnicity or religion

                                                          Completed    Percentage of total
                                                          interviews   completed interviews
Ethnicity, full distribution
  English / Welsh / Scottish / Northern Irish / British   1,828        79%
  Gypsy or Irish Traveller                                1            0%
  Any other White background                              74           3%
  White and Black Caribbean                               11           1%
  White and Black African                                 6            0%
  White and Asian                                         26           1%
  Any other Mixed / Multiple ethnic background            26           1%
  Any other Asian background                              34           2%
  Any other Black / African / Caribbean background        0            0%
  Any other ethnic group                                  102          4%
Ethnicity, grouped
  White                                                   1,916        81%
  Other ethnicity                                         401          17%
Religion, full distribution
  No religion                                             1,001        43%
  Christian (including Church of England, Catholic,
  Protestant and all other Christian denominations)       1,142        49%
  Any other religion                                      13           1%
Religion, grouped
  No religion                                             1,001        42%
  Christian, all denominations                            1,142        48%
  Other religion                                          198          8%

It can be seen that the sample sizes for non-white and non-Christian respondents are relatively small. This in turn means that the confidence intervals around estimates for these groups will be large. Tables 5.7 and 5.8 show the confidence intervals for the highest point on the scale for four key questions. 
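The report does not state how these intervals were computed. As a rough sketch, a 95% interval for a subgroup proportion might be calculated with a normal approximation inflated by a design effect; the function, the design-effect default, and the example figures below are illustrative assumptions, not the survey's documented procedure.

```python
import math

def proportion_ci(p_hat: float, n: int, deff: float = 1.0, z: float = 1.96):
    """Approximate 95% CI for a proportion, inflated by a design effect."""
    se = math.sqrt(p_hat * (1 - p_hat) / n) * math.sqrt(deff)
    lower = max(0.0, p_hat - z * se)
    upper = min(1.0, p_hat + z * se)
    return lower, upper

# Illustrative only: a subgroup of 74 respondents with 13% at the top of the scale
lo, hi = proportion_ci(0.13, 74)
```

With a base this small the interval spans many percentage points, which is the pattern visible for the smaller groups in Tables 5.7 and 5.8.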

Table 5.7 Confidence intervals for key questions for ethnicity, full distribution and grouped
Columns: (1) Awareness of ONS – 'Knew it well'; (2) ONS usage – 'Yes, frequently'; (3) Trust in ONS – 'Trust it a great deal'; (4) Trust in ONS statistics – 'Trust them greatly'

                                                          (1)           (2)          (3)           (4)
Ethnicity, full distribution
  English / Welsh / Scottish / Northern Irish / British   (12.8-17.1)   (2.7-4.9)    (13.4-17.5)   (14.6-18.6)
  Gypsy or Irish Traveller                                -             -            -             -
  Any other White background                              (6.2-22.9)    (1.1-14.1)   (7.2-25.8)    (10.4-32.7)
  White and Black Caribbean                               (9.3-53.0)    -            -             -
  White and Black African                                 (2.1-63.2)    -            (10.0-81.2)   (10.0-81.2)
  White and Asian                                         (3.8-37.2)    (0.6-26.1)   (4.8-42.8)    (10.0-61.4)
  Any other Mixed / Multiple ethnic background            (3.3-27.0)    (0.4-17.7)   (8.8-60.7)    (3.1-26.1)
  Any other Asian background                              (1.9-34.3)    (0.5-25.6)   (1.8-31.9)    (1.8-31.9)
  Any other Black / African / Caribbean background        -             -            -             -
  Any other ethnic group                                  (5.3-36.6)    (0.1-5.6)    (8.8-36.8)    (9.3-33.4)
Ethnicity, grouped
  White                                                   (12.8-16.9)   (2.8-5.0)    (13.5-17.5)   (14.8-18.7)
  Other ethnicity                                         (9.6-21.2)    (3.4-11.8)   (13.7-27.6)   (15.5-29.7)
Table 5.8 Confidence intervals for key questions for religion, full distribution and grouped
Columns: (1) Awareness of ONS – 'Knew it well'; (2) ONS usage – 'Yes, frequently'; (3) Trust in ONS – 'Trust it a great deal'; (4) Trust in ONS statistics – 'Trust them greatly'

                                                          (1)           (2)          (3)           (4)
Religion, full distribution
  No religion                                             (13.0-18.6)   (3.7-8.1)    (14.2-20.9)   (16.7-23.6)
  Christian (including Church of England, Catholic,
  Protestant and all other Christian denominations)       (11.8-17.8)   (2.3-4.6)    (12.9-18.4)   (12.8-17.5)
  Any other religion                                      (14.0-61.8)   (2.5-31.5)   (4.5-51.0)    (3.2-52.9)
Religion, grouped
  No religion                                             (13.0-18.6)   (3.7-8.1)    (14.2-20.9)   (16.7-23.6)
  Christian, all denominations                            (11.8-17.8)   (2.3-4.6)    (12.9-18.4)   (12.8-17.5)
  Other religion                                          (6.7-18.0)    (1.2-7.5)    (9.5-24.2)    (12.9-31.9)

Tables 5.7 and 5.8 show that, especially for groups with small sample sizes, the confidence intervals for individual ethnic groups or religions are quite wide. This means that the precision of the survey estimates for these groups is low. The grouped options at the bottom of each table have smaller confidence intervals and greater precision. However, using the combined groups severely restricts the analytical scope of the data and the conclusions that can be drawn about different ethnic or religious groups. For this reason, and to be in line with recommendations from the UK Statistics Authority's Inclusive Data Taskforce, data on ethnicity and religion have not been included in the analytical report.

6. Data management

As in previous years of the survey, the Public Confidence in Official Statistics (PCOS) 2023 data underwent coding and editing, and rigorous quality assurance, to ensure that the final data were as accurate as possible. The exact nature of the coding and editing in 2023 and 2021 reflected the switch to an online survey mode. 

6.1 Editing

6.1.1 Edits applied to paper questionnaires 

Unlike the web questionnaire, the paper questionnaires do not have computer-assisted routing. As a result, it was possible for some respondents to ignore the routing instructions and answer questions they should not have. Equally, some respondents may have missed questions that they should have answered. Most of these errors were dealt with through standard edit rules. Any question that the paper respondent should have answered but did not received a 'not answered' code (code 9 or 99, depending on the scale of the question). Where the respondent provided an answer when they should not have, the data was edited to be off route (code -1 or 97, 'off route'). If a single-code question had more than one category ticked, it was set to 'don't know' (code 8 or 88).
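The edit rules described above can be sketched for one single-code question as follows. The codes mirror those in the text (short-scale variants shown), but the function and data model are illustrative, not NatCen's editing system; in particular, coding every off-route question as off route is a simplification.

```python
# Codes as described in the text (short-scale variants).
NOT_ANSWERED = 9   # or 99 on longer scales
OFF_ROUTE = -1     # or 97
DONT_KNOW = 8      # or 88

def edit_response(on_route: bool, ticked: list):
    """Apply the standard edit rules to one single-code paper question."""
    if not on_route:
        return OFF_ROUTE          # simplification: any off-route question coded off route
    if not ticked:
        return NOT_ANSWERED       # on route but left blank
    if len(ticked) > 1:
        return DONT_KNOW          # multiple boxes ticked on a single-code question
    return ticked[0]              # valid single answer kept as-is
```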

6.1.2 Household harmonisation

The PCOS interview was completed at an individual level with most of the questions relating to the respondent's behaviour or attitudes. However, some information was collected about the individual's household, for example the number of adults living in the household. With two respondents in a household, it was possible for the household-level information provided to vary between individuals. For weighting purposes, it was necessary to harmonise this information across individuals. 

In order to complete the harmonisation, a priority order was established to determine which answer within a household should be taken and applied to all interviews in a household.

  1. Take the most common answer within a household 
  2. If needed, take the answer of the older respondent and apply it to all members of the household 
  3. If both respondents are the same age, take the answer supplied by the respondent using the first household log in and apply to all members of the household 
  4. If information on household size was not given or did not match the number of valid questionnaires returned, the household size was forced to equal the number of returned questionnaires. 
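The priority order for rules 1 to 3 can be sketched as below. The record layout ('value', 'age', 'login_order') is hypothetical, and rule 4 (forcing household size to match the number of returned questionnaires) is handled outside this function.

```python
from collections import Counter

def harmonise(answers):
    """Pick one household-level answer from per-respondent records (rules 1-3)."""
    answered = [a for a in answers if a["value"] is not None]
    if not answered:
        return None  # missing for all respondents; dealt with separately (rule 4)
    top, count = Counter(a["value"] for a in answered).most_common(1)[0]
    if count > 1:
        return top   # rule 1: most common answer within the household
    # rule 2: older respondent's answer wins; rule 3: first log-in breaks an age tie
    answered.sort(key=lambda a: (-a["age"], a["login_order"]))
    return answered[0]["value"]
```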

The number of cases for which harmonisation was required was: 

  • HHlAd: Number of adults aged 18+. 
    • 70 cases harmonised because of inconsistent responses within household. 
    • 32 harmonised because of missing information. 
    • 42 harmonised because response did not match number of returned questionnaires. 
  • HHlChl: Number of children under 18. 
    • 9 cases harmonised because of inconsistent responses within household.
    • 16 cases harmonised because of missing information.
  • Tenure: 
    • 76 cases harmonised because of inconsistent responses within household. 
    • 5 cases harmonised because of missing information.
  • Hedqual: Highest qualification of respondent. 
    • 340 cases harmonised based on whether anyone in the household was educated to degree level or above.
    • 10 cases harmonised because of missing information.

6.2 Coding

6.2.1 Back coding 

Throughout the questionnaire there were several occasions where respondents could answer 'other'. Often when respondents answered these questions, they gave an answer that could fit into one of the response options given in the questionnaire. To ensure that this data was captured correctly, NatCen's Data Unit reviewed the open-ended 'other' responses and back coded them into one of the given response options where appropriate. The responses were also reviewed to see if additional response options were needed to capture common 'other' responses. This was not found to be the case for this survey. 

The questions where back coding applied are outlined below: 

  • RspGender - What is the gender you identify as? 
  • TrONSYO – Other reasons for trust in ONS’ statistics 
  • TrONSNO – Other reasons for not trusting in ONS’ statistics 
  • RelOther - Other religion 
  • EthOther - Other ethnicity 
  • EconFwOther - Other economic activity 
  • HEdQualOther - Other educational qualification 
  • TenureOther - Other tenure type

6.2.2 NS-SEC coding 

Questions on the respondent’s employment status were used to derive the five-category version of the National Statistics Socio-economic Classification (NS-SEC). The derivation used information from the following variables in line with standard guidance from the Office for National Statistics 5 .

  • EconFW - Economic activity in last 7 days 
  • EmpStat - Employment status 
  • Employ - Number of people working at the place where the respondent works
  • Superv - Responsibility for supervising the work of other employees 
  • EmpOCC - Type of work being completed

Using the data from the variables listed above, a new variable was derived that coded the data into the standard five-category NS-SEC variable which was then used in the analysis of the final data. 

6.3 Data quality checks

Without an interviewer to oversee the data collection process, self-completion surveys are susceptible to poor, duplicate, or falsified data. Because households were issued with two web logins and up to two paper questionnaires it was possible for the same person within the household to have completed the survey twice or for a household to have returned more than two completed surveys. This could be done in error (for example someone completing the paper questionnaire after forgetting that they had already completed the survey online or thinking their data had not been received) or by individuals wishing to claim multiple incentives for completed questionnaires. 

The following data quality checks were carried out to ensure that all of the data included in the final dataset was collected in a standardised way and from the right individuals. 

6.3.1 Identification of speeders

One way to identify poor quality, or potentially falsified, data is by looking at the length of time taken to complete the questionnaire. An expected interview length for each respondent who completed the survey online was calculated based on the median interview length for someone following a given route through the questionnaire. 

Any case where the actual interview length was identified as an outlier (that is, fell significantly below the lower quartile of responses in the sample) was excluded from the dataset. Nine cases were excluded for speeding.
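A lower-fence outlier rule of this kind might look like the sketch below; the exact threshold NatCen used is not published, so the conventional 1.5 × IQR fence is an assumption.

```python
def flag_speeders(durations, k=1.5):
    """Flag interview lengths falling below a lower-fence outlier threshold."""
    s = sorted(durations)

    def quantile(q):
        # simple linear-interpolation quantile on the sorted data
        pos = q * (len(s) - 1)
        lo_i, frac = int(pos), pos - int(pos)
        hi_i = min(lo_i + 1, len(s) - 1)
        return s[lo_i] + frac * (s[hi_i] - s[lo_i])

    q1, q3 = quantile(0.25), quantile(0.75)
    fence = q1 - k * (q3 - q1)      # anything this far below Q1 is a speeder
    return [d < fence for d in durations]
```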

6.3.2 De-duplications

Following removal of interviews identified as too short, the final data was cleaned to: 

  • Remove any duplicate questionnaires where it appeared from the data that the same individual had returned two questionnaires. 
  • Ensure that, after removal of duplicates, no more than two completed questionnaires per household were included in the final dataset. 

Cases were treated as duplicates if there were two or more completed surveys where 
the same respondent name and/or email address was given. 

Web completes were prioritised over paper completes and, within mode, the first completed survey was kept. Forty-seven duplicate cases were removed, along with six cases where there were more than two completed interviews per household. 
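The de-duplication priorities can be sketched as follows; the field names ('person_key', 'household', 'mode', 'completed') are illustrative stand-ins for the name/email matching keys described above.

```python
def deduplicate(returns):
    """Keep one questionnaire per person and at most two per household."""
    # Web before paper; within mode, earliest completion first.
    ordered = sorted(returns, key=lambda r: (r["mode"] != "web", r["completed"]))
    kept, seen_people, per_household = [], set(), {}
    for r in ordered:
        if r["person_key"] in seen_people:
            continue  # same individual returned more than one questionnaire
        if per_household.get(r["household"], 0) >= 2:
            continue  # household already has two completed interviews
        seen_people.add(r["person_key"])
        per_household[r["household"]] = per_household.get(r["household"], 0) + 1
        kept.append(r)
    return kept
```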

6.4 Data outputs

6.4.1 Data files 

A cleaned, weighted data file was sent to the UK Statistics Authority. This file contained all survey and derived variables, as well as information on which respondents had agreed to be recontacted. Respondents’ personal information was removed from this data file. 

6.4.2 Derived variables 

In addition to the data captured by the questions in the survey, some derived variables were created. These variables combine data from single or multiple questions to create measures required for analysis. 

A full list of these variables is in Table 6.1. 

Table 6.1: Derived variables produced for PCOS 2023

Variable name   Variable label
TrONSWY_All     Most important reason for trusting ONS statistics
TrONSWN_All     Most important reason for not trusting ONS statistics
EmployStat      Employment status – employed, self-employed, not in employment, other (not known)
RAgeCat         Age of respondent – 18-24, 25-34, 35-44, 45-54, 55-59, 60-64, 65+
HhTypeDv        Household type DV – single adult; 2 adults, no children; 3+ adults, no children; 1 adult + children; 2 adults + children; 3+ adults + children
TenureDv        Housing tenure summary DV – buying/own outright, shared ownership, renting, living rent free, other/no information
CountryGrp      Country – England; Wales/Scotland combined
ReligionGrp     Religion – no religion; Christian, all denominations; other religion
EthnicityGrp    Ethnicity – White; other ethnicity
NSSEC           Economic activity – managerial and professional occupations; intermediate occupations; employers in small organisations, own account workers; lower supervisory and technical occupations; semi-routine and routine occupations; not classifiable

7 Weighting

7.1 Weighting process

This section outlines the weighting process employed to ensure the final data are representative of the population. 

Selection weights were not required for PCOS as there was no disproportionate probability of address selection in the sampling procedure. The weights instead focused on correcting for differences in: (i) the probability that a sampled household would respond (between-household non-response); and (ii) the probability that, in a household with two or more adults, two adults would respond rather than just one (within-household non-response). After correcting for non-response, the final weights were calibrated to population totals. 

7.1.1 Between-household non-response weights

Household non-response weights were calculated using a logistic regression model. Variables tested for association with household-level non-response included: census indicators of area age profile, tenure, education profile, employment profile, ethnicity, car ownership, socio-economic classification (NS-SEC), population density, and indices of multiple deprivation. The final model included variables that significantly predicted household response: region, quintiles of socio-economic classification (NS-SEC), percentage of owner-occupied properties in the Output Area (quintiles), percentage of residents with a degree in the postcode sector (quintiles), percentage of families with children in Lower Layer Super Output Areas (quintiles), urban-rural classification and output area classification. The full model is shown below. 

The predicted probabilities from the model were used to create household non-response weights. These were checked and trimmed at the 99th percentile to remove outliers and improve efficiency. 
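Those two steps, inverting the model's predicted response probabilities and capping at the 99th percentile, can be sketched as below; the percentile convention and function name are illustrative simplifications, not NatCen's implementation.

```python
def nonresponse_weights(probs, trim_pct=0.99):
    """Inverse-probability non-response weights, trimmed at the given percentile."""
    weights = [1.0 / p for p in probs]  # low response probability -> high weight
    s = sorted(weights)
    cap = s[min(len(s) - 1, int(trim_pct * (len(s) - 1)))]  # percentile cap
    return [min(w, cap) for w in weights]
```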

Table 7.1: Between-household non-response model

GOR                                       Odds ratio   p-value   Confidence interval
  North East (reference)                  1            -         -
  North West                              0.697        0.013     (0.52, 0.93)
  Yorkshire and the Humber                0.630        0.003     (0.46, 0.86)
  East Midlands                           0.824        0.217     (0.61, 1.12)
  West Midlands                           0.620        0.002     (0.46, 0.84)
  East of England                         0.624        0.003     (0.46, 0.85)
  London                                  0.595        0.002     (0.43, 0.83)
  South East                              0.722        0.025     (0.54, 0.96)
  South West                              0.861        0.317     (0.64, 1.16)
  Scotland                                0.833        0.300     (0.59, 1.18)
  Wales                                   0.621        0.006     (0.44, 0.87)
NS-SEC quintiles
  1 (reference)                           1            -         -
  2                                       1.003        0.982     (0.80, 1.26)
  3                                       1.053        0.708     (0.80, 1.38)
  4                                       0.863        0.364     (0.63, 1.19)
  5                                       1.058        0.766     (0.73, 1.53)
Quintiles of owner occupation rate (overall p = 0.020)
  1 (reference)                           1            -         -
  2                                       1.352        0.004     (1.10, 1.66)
  3                                       1.408        0.004     (1.12, 1.78)
  4                                       1.307        0.040     (1.01, 1.69)
  5                                       1.678        <0.001    (1.26, 2.24)
Quintiles of education to degree level (overall p = 0.025)
  1 (reference)                           1            -         -
  2                                       1.126        0.293     (0.90, 1.41)
  3                                       1.337        0.030     (1.03, 1.74)
  4                                       1.610        0.002     (1.19, 2.17)
  5                                       1.639        0.008     (1.14, 2.36)
Quintiles of the percentage of families with children (overall p = 0.020)
  Scotland contrast (no children measure) -            0.074     -
  2                                       1.270        0.019     (1.04, 1.56)
  3                                       1.299        0.009     (1.07, 1.58)
  4                                       1.206        0.058     (0.99, 1.47)
Urban/rural classification
  Urban                                   1.028        0.752     (0.87, 1.22)
Output Area classification (overall p = 0.101)
  Rural residents (reference)             1            -         -
  Cosmopolitans                           0.888        0.526     (0.62, 1.28)
  Ethnicity central                       0.682        0.083     (0.44, 1.05)
  Multicultural metropolitans             0.692        0.024     (0.50, 0.95)
  Urbanites                               1.048        0.689     (0.83, 1.32)
  Suburbanites                            0.975        0.830     (0.77, 1.23)
  Constrained city dwellers               0.923        0.632     (0.67, 1.28)
  Hard pressed living                     0.863        0.267     (0.67, 1.12)
Intercept                                 0.173        <0.001    -

7.1.2 Within-household non-response weights 

Within-household response weights were calculated using a logistic regression model weighted by the household non-response weights. The model estimated differences in the probability of more than one adult within a household responding to the survey. It was run for households that provided at least one response and had more than one eligible adult and so, alongside the variables from the first non-response model, included additional variables harmonised at household level, such as household tenure and number of adults in the household. The final model included variables that significantly predicted more than one response from responding households: region, number of eligible adults in the household (trimmed at 3 to avoid extreme weights), interview mode, ethnicity, and quintiles of NS-SEC. The full model is shown below.

The predicted probabilities from the model were used to create within-household non-response weights. These were checked for outliers and left untrimmed, then scaled and combined with the household non-response weights. The combined non-response weights were also checked for outliers and left untrimmed.

Table 7.2: Within-household non-response model

GOR                                       Odds ratio   p-value   Confidence interval
  North East (reference)                  1            -         -
  North West                              0.724        0.385     (0.35, 1.50)
  Yorkshire and the Humber                0.578        0.152     (0.27, 1.22)
  East Midlands                           0.413        0.023     (0.19, 0.88)
  West Midlands                           0.608        0.199     (0.28, 1.30)
  East of England                         0.728        0.407     (0.34, 1.54)
  London                                  0.805        0.596     (0.36, 1.80)
  South East                              0.668        0.274     (0.32, 1.38)
  South West                              0.744        0.437     (0.35, 1.57)
  Scotland                                0.817        0.598     (0.39, 1.73)
  Wales                                   0.670        0.356     (0.29, 1.57)
NS-SEC quintiles (overall p = 0.008)
  1 (reference)                           1            -         -
  2                                       0.939        0.759     (0.63, 1.40)
  3                                       0.971        0.882     (0.65, 1.44)
  4                                       0.768        0.197     (0.52, 1.15)
  5                                       0.539        0.003     (0.36, 0.81)
Quintiles of ethnicity (overall p = 0.072)
  1 (reference)                           1            -         -
  2                                       0.762        0.168     (0.52, 1.12)
  3                                       1.190        0.384     (0.80, 1.76)
  4                                       0.797        0.277     (0.53, 1.20)
  5                                       0.684        0.136     (0.42, 1.13)
Number of eligible adults in household
  2 (reference)                           1            -         -
  3                                       0.764        0.077     (0.57, 1.03)
Interview mode
  Web                                     0.387        <0.001    (0.29, 0.53)
Intercept                                 5.781        <0.001    -

7.1.3 Calibration weights

The final step in the weighting process was to calibrate the combined non-response weights to population estimates. These were taken from the Labour Force Survey and ONS mid-year population estimates for those aged 18 and over. Calibration weighting adjusts the weights so that the characteristics of the weighted achieved sample match population estimates, thus reducing residual bias. For PCOS 2023, five calibration variables were used: sex by age bands, region, education level by age, household tenure, and ethnicity. After calibration, the top six weights were trimmed to improve efficiency and the final weights were scaled to the responding sample size (n=2,364).
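Calibration to population margins is typically implemented by raking (iterative proportional fitting). The sketch below shows the general idea under assumed data structures; PCOS will have used dedicated calibration software, so this is not the production implementation.

```python
def rake(weights, categories, targets, iters=50):
    """Rake weights so each variable's category totals match population targets.

    categories: dict mapping variable name -> each respondent's category.
    targets: dict mapping variable name -> {category: population total}.
    """
    w = list(weights)
    for _ in range(iters):
        for var, cats in categories.items():
            totals = {}
            for wi, c in zip(w, cats):          # current weighted total per category
                totals[c] = totals.get(c, 0.0) + wi
            # scale each respondent's weight so category totals hit the targets
            w = [wi * targets[var][c] / totals[c] for wi, c in zip(w, cats)]
    return w
```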

Table 7.3: Unweighted and weighted sample composition

                           Unweighted respondents   Weighted respondents   Population estimates
Prefer not to answer       3       0.1%             2       0.1%           -            -
North East                 136     5.8%             98      4.1%           2,151,789    4.1%
North West                 279     11.8%            270     11.4%          5,933,884    11.4%
Yorks. & Humber            181     7.7%             195     8.2%           4,384,209    8.4%
East Midlands              188     8.0%             179     7.6%           3,930,857    7.5%
West Midlands              180     7.6%             215     9.1%           4,712,668    9.0%
East of England            209     8.8%             227     9.6%           5,050,128    9.7%
South East                 368     15.6%            336     14.2%          7,416,948    14.2%
South West                 272     11.5%            213     9.0%           4,669,691    8.9%
Age missing                14      0.6%             19      0.8%           -            -

7.2 Estimated effective sample size and design effect

The effect of the sample design on the precision of the survey estimates is indicated by the effective sample size (NEFF). The effective sample size measures the size of an (unweighted) simple random sample that would achieve the same precision (standard error) as the design that has been implemented. The efficiency of a sample is given by the ratio of the effective sample size to the actual sample size. 

Weighting efficiency provides one measure of the representativeness of a survey sample. A perfectly representative sample will have a weighting efficiency of 100%. In contrast, a weighting efficiency of 50% indicates that the likelihood of responding differed considerably between groups and that extensive compensatory weighting was required. Although extensive weighting of this type will usually reduce non-response bias, it will also usually reduce the stability of the survey estimates (i.e. the standard errors will be larger because the effective sample size is reduced), making it harder to draw robust conclusions about the underlying population.

The final PCOS weights have a design factor (DEFT) of 1.26, design effect (DEFF) of 1.58, and produce an estimated effective sample size (NEFF) of 1,496. Their efficiency is 63%. For comparison, the PCOS 2021 survey weights had a DEFT of 1.21, a DEFF of 1.45, an effective sample size of 2,337, and an efficiency of 69%.
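The report does not spell out the NEFF formula, but the standard Kish approximation, NEFF = (Σw)² / Σw², is consistent with the quoted relationships: efficiency = NEFF / n, DEFF = n / NEFF, and DEFT = √DEFF. A minimal sketch, assuming this approximation:

```python
def kish_neff(weights):
    """Kish's effective sample size: (sum of weights)^2 / sum of squared weights."""
    sw = sum(weights)
    return sw * sw / sum(wi * wi for wi in weights)

# efficiency = kish_neff(w) / len(w); deff = len(w) / kish_neff(w); deft = deff ** 0.5
```

Equal weights give NEFF equal to the actual sample size (100% efficiency); the more variable the weights, the smaller NEFF becomes.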

A weighting efficiency of 63% is considered reasonable for a general population survey, producing an effective sample size that allows us to be confident in the precision of the estimates the push-to-web survey provides and make comparisons across survey years.

  1. The first interview was achieved on 5th October.
  2. The timings between mailings for the main sample were extended due to unforeseen mailing issues.
  3. This figure for the mean survey length is calculated using 1,718 interviews, all completed web cases.
  4. For example, the Survey of Londoners 2019 had a break-off rate of 8.2%, similar to the Active Lives survey 2019 (8.8%).