Selection bias in epidemiological studies

What is selection bias?

Selection bias is a systematic error arising from how subjects are selected into a study or from loss to follow-up during the study.

Selection bias occurs when there is a discrepancy between the target population and the source population. It usually arises during recruitment and data collection.

In a case-control study, selection bias arises if the control group does not appropriately represent the source population from which the cases arose.

In a cohort study, selection bias arises if the participants who drop out differ from those who remain with respect to both exposure and disease status.

In an experimental trial, selection bias can arise if loss to follow-up differs from one treatment group to another.

Selection bias is generally a greater concern in a case-control study than in a cohort study because participants are selected after both the exposure and the disease have already occurred, so selection can depend on both.

Examples:

1. Participants included in an influenza vaccine trial may be healthy young adults, whereas those who are most likely to receive the intervention in practice may be elderly and have many comorbidities; the trial participants are therefore not representative. (2)

2. Of the approximately 24,000 residents on prevalence day, 1 January 1978, PD was diagnosed in 31 participants. Thirteen of those 31 had never been seen for medical care. In this survey, if another approach to the ascertainment of cases had used only the medical care system, all of those who had not received care (over 40%) would not have been identified. Furthermore, there would have been no definitive way of characterizing the bias introduced if only those identified via health records were used. (2)

3. Suppose that an investigator wishes to estimate the prevalence of heavy alcohol consumption (more than 21 units a week) in adult residents of a city. He might try to do this by selecting a random sample from all the adults registered with local general practitioners, and sending them a postal questionnaire about their drinking habits. With this design, one source of error would be the exclusion from the study sample of those residents not registered with a doctor. These excluded subjects might have different patterns of drinking from those included in the study. Also, not all of the subjects selected for study will necessarily complete and return questionnaires, and non-responders may have different drinking habits from those who take the trouble to reply. Both of these deficiencies are potential sources of selection bias. (5)

How can we detect selection bias? Sources of selection bias:

a. Volunteer and non-response bias

Volunteers may possess different characteristics from the general population. For example, in a study of physical activity, people who are more health-conscious are more likely to participate. Conversely, individuals who do not participate may have different baseline characteristics from those who do.

b. Hospital patient bias (Berkson's bias)

In a hospital-based case-control study, this bias arises if the controls are patients who were hospitalized for a condition that is itself related to the exposure under study. The exposure is then over-represented among the controls, so the effect measure is attenuated and biased toward the null (no association).

c. Healthy worker bias

In general, working people are healthier than the general population. In an observational study of an occupational exposure, cases and controls should therefore both come from workers; if either group comes from a non-working population, the effect estimate will tend to be biased toward the null (no association).

Quantitative Assessment of Selection Bias:

Selection bias is indicated when the odds ratio (OR) estimated from the study sample differs substantially from the OR in the source population.

Case-control study:
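
The comparison between the estimated OR and the source population OR can be illustrated with a small numerical sketch. The Python code below uses made-up counts and made-up selection probabilities (every number and name here is an assumption for illustration, not data from any study). It relies on the standard relationship that, when each cell of the 2x2 table is sampled with its own probability, the observed OR equals the source OR multiplied by (S_exposed_cases x S_unexposed_controls) / (S_unexposed_cases x S_exposed_controls).

```python
def odds_ratio(table):
    """Odds ratio from a 2x2 table stored as a dict of cell counts."""
    return (table["exposed_cases"] * table["unexposed_controls"]) / (
        table["unexposed_cases"] * table["exposed_controls"]
    )

# Hypothetical source population (illustrative counts only).
source = {
    "exposed_cases": 200,
    "unexposed_cases": 100,
    "exposed_controls": 4000,
    "unexposed_controls": 6000,
}

# Hypothetical selection probabilities: exposed controls are twice as
# likely to be sampled as unexposed controls (for example, hospital
# controls admitted for an exposure-related condition).
selection = {
    "exposed_cases": 0.9,
    "unexposed_cases": 0.9,
    "exposed_controls": 0.02,
    "unexposed_controls": 0.01,
}

# Expected study sample: each cell scaled by its selection probability.
sample = {cell: source[cell] * selection[cell] for cell in source}

true_or = odds_ratio(source)
observed_or = odds_ratio(sample)

# Selection bias factor: ratio of observed OR to source OR.
bias_factor = (selection["exposed_cases"] * selection["unexposed_controls"]) / (
    selection["unexposed_cases"] * selection["exposed_controls"]
)

print(f"Source population OR: {true_or:.2f}")
print(f"Study sample OR:      {observed_or:.2f}")
print(f"Bias factor:          {bias_factor:.2f}")
```

With these assumed numbers the source OR is 3.0 and the sample OR is 1.5, so the estimate is pulled toward the null. If the four selection probabilities were equal, or if selection depended on case status alone, the bias factor would be 1 and the OR would be unaffected.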

Cohort study:
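
In a cohort study the same idea applies to loss to follow-up. The Python sketch below starts from a hypothetical full cohort and applies three assumed retention patterns (all counts, probabilities, and names are invented for illustration). It compares the risk ratio (RR) among those who remain with the RR in the full cohort.

```python
def risk_ratio(cohort):
    """Risk ratio (exposed vs unexposed) from counts keyed by (exposure, outcome)."""
    risk_exposed = cohort[("exposed", "case")] / (
        cohort[("exposed", "case")] + cohort[("exposed", "noncase")]
    )
    risk_unexposed = cohort[("unexposed", "case")] / (
        cohort[("unexposed", "case")] + cohort[("unexposed", "noncase")]
    )
    return risk_exposed / risk_unexposed

def apply_retention(cohort, retention):
    """Expected counts remaining after follow-up, given per-cell retention probabilities."""
    return {cell: n * retention[cell] for cell, n in cohort.items()}

# Hypothetical full cohort: risk 0.20 in the exposed, 0.10 in the unexposed.
full = {
    ("exposed", "case"): 200,
    ("exposed", "noncase"): 800,
    ("unexposed", "case"): 100,
    ("unexposed", "noncase"): 900,
}

# Scenario 1: the same retention in every cell.
uniform = {cell: 0.8 for cell in full}

# Scenario 2: retention differs by exposure only.
by_exposure = {
    ("exposed", "case"): 0.6,
    ("exposed", "noncase"): 0.6,
    ("unexposed", "case"): 0.9,
    ("unexposed", "noncase"): 0.9,
}

# Scenario 3: retention depends on exposure and outcome together
# (exposed participants who develop the disease drop out more often).
by_both = {
    ("exposed", "case"): 0.5,
    ("exposed", "noncase"): 0.9,
    ("unexposed", "case"): 0.9,
    ("unexposed", "noncase"): 0.9,
}

rr_full = risk_ratio(full)
rr_uniform = risk_ratio(apply_retention(full, uniform))
rr_by_exposure = risk_ratio(apply_retention(full, by_exposure))
rr_by_both = risk_ratio(apply_retention(full, by_both))

print(f"Full cohort RR:                   {rr_full:.2f}")
print(f"Uniform loss RR:                  {rr_uniform:.2f}")
print(f"Loss by exposure only RR:         {rr_by_exposure:.2f}")
print(f"Loss by exposure and outcome RR:  {rr_by_both:.2f}")
```

Under these assumptions the RR stays at 2.0 in the first two scenarios and drops to about 1.22 in the third, where dropout is related to exposure and outcome jointly.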

What is the impact of selection bias?

How could you minimize it?

In observational studies, selection bias is difficult to address with analytical methods, but methods for dealing with missing data are available, including last observation (or baseline value) carried forward, mixed models, imputation, and sensitivity analysis using ‘worst case’ scenarios (assuming that those with no information all got worse) and ‘best case’ scenarios (assuming that all got better). Analysing data only from participants remaining in the study is called complete case analysis.
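
The best case and worst case sensitivity analysis described above can be sketched in a few lines of Python. The two study arms and all counts below are hypothetical, and the helper name `worsening_risk_bounds` is invented for this illustration. For each arm the code reports the risk of worsening under complete case analysis, under the best case (no one with missing data got worse), and under the worst case (everyone with missing data got worse).

```python
def worsening_risk_bounds(enrolled, followed_up, worsened_observed):
    """Risk of worsening under complete case, best case, and worst case assumptions."""
    missing = enrolled - followed_up
    return {
        # Only participants who remained in the study.
        "complete_case": worsened_observed / followed_up,
        # Best case: nobody with missing data got worse.
        "best_case": worsened_observed / enrolled,
        # Worst case: everybody with missing data got worse.
        "worst_case": (worsened_observed + missing) / enrolled,
    }

# Hypothetical arms (illustrative counts only).
treatment = worsening_risk_bounds(enrolled=200, followed_up=160, worsened_observed=32)
control = worsening_risk_bounds(enrolled=200, followed_up=180, worsened_observed=54)

for name, arm in (("Treatment", treatment), ("Control", control)):
    print(
        f"{name}: complete case {arm['complete_case']:.2f}, "
        f"best case {arm['best_case']:.2f}, worst case {arm['worst_case']:.2f}"
    )
```

If the study conclusion holds across the whole range between the best and worst cases, loss to follow-up is unlikely to explain the finding. If the conclusion changes within that range, the result is sensitive to the missing data.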

Certain external measures can sometimes be used to calibrate the data from a study, an example being standardised mortality rates. Moreover, inverse probability weighting can be used under certain assumptions.
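
Inverse probability weighting can be illustrated with a minimal Python sketch. The strata, counts, and function names below are hypothetical. The key assumption, stated loudly, is that participation depends only on a measured characteristic (here an invented age group) and not on the outcome within each group. Each responder is weighted by one over the estimated probability of responding in their stratum.

```python
# Hypothetical survey: invited people, responders, and cases among
# responders, by age group (illustrative counts only).
strata = {
    "younger": {"invited": 1000, "responded": 800, "cases_among_responders": 80},
    "older": {"invited": 1000, "responded": 400, "cases_among_responders": 120},
}

def naive_prevalence(strata):
    """Prevalence among responders, ignoring differential response."""
    cases = sum(s["cases_among_responders"] for s in strata.values())
    responders = sum(s["responded"] for s in strata.values())
    return cases / responders

def ipw_prevalence(strata):
    """Prevalence with each responder weighted by 1 / P(response | stratum)."""
    weighted_cases = 0.0
    weighted_total = 0.0
    for s in strata.values():
        p_response = s["responded"] / s["invited"]  # estimated response probability
        weight = 1.0 / p_response
        weighted_cases += weight * s["cases_among_responders"]
        weighted_total += weight * s["responded"]
    return weighted_cases / weighted_total

naive = naive_prevalence(strata)
weighted = ipw_prevalence(strata)

print(f"Naive prevalence among responders: {naive:.3f}")
print(f"IPW prevalence:                    {weighted:.3f}")
```

In this made-up example the older group has a higher prevalence (0.30 versus 0.10 among responders) but a lower response rate, so the naive estimate of about 0.167 is too low. Weighting gives 0.20, which is what would be seen in the full invited sample if responders were representative within each age group. If response also depended on the outcome itself, this weighting would not remove the bias.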

To improve the generalisability of study findings, the selection of the population should be broad and reported in the recruitment/inclusion criteria.

Prevent or minimize selection bias when designing the study.

Case-control studies
  • Selecting controls that are representative of the source population that produced the cases is an important issue.
  • Some investigators select 2 or more control groups.

To minimize selection bias in case-control studies:
  • Use incident cases (rather than prevalent)
  • Use a population-based design rather than a hospital-based one

To minimize selection bias in cohort studies:
  • Achieve a high response rate
  • Minimize loss to follow-up

Practice questions (answers are at the end of the article): (1)

1) Researchers are planning to conduct a case-control study of the association between an occupational exposure and a health outcome. The researchers plan to study exposed workers from one factory and compare them with unexposed retirees who have never worked in a factory. A reviewer of the research proposal is worried about selection bias and in particular about the possibility of the healthy worker effect. Which of the following best represents the reviewer’s concern?

a) Retirees should not be compared to factory workers because factory workers are under more stress than retirees 

b) Retirees should not be compared to factory workers because factory workers’ incomes differ from those of retirees 

c) Retirees should not be compared to factory workers because factory workers are likely to need to maintain a certain level of health in order to work in a factory while retirees would not necessarily be as healthy 

d) Retirees should not be compared to factory workers because factory workers likely live in a different city than the retirees

2) Researchers conducted a prospective cohort study of the association between air pollution exposure and asthma. Some study participants were lost to follow-up (dropped out of the study) over time. The researchers were able to obtain data on the exposure and the health outcome for participants who remained in the study as well as for participants who dropped out of the study. The researchers discovered that the rate of loss to follow-up did not differ when comparing exposed and unexposed groups. The researchers also found that the rate of loss to follow-up did not differ when comparing people who developed asthma and people who did not develop asthma. Based on this information, which one of the following statements is most likely to be true? 

a) Selection bias likely occurred in this study because both exposure groups experienced loss to follow-up 

b) Selection bias likely did not occur in this study because exposure status and health outcome status did not influence whether or not people dropped out of the study 

c) Selection bias likely occurred in this study because both of the outcome groups (people with asthma and people without asthma) experienced loss to follow-up 

d) Selection bias likely did not occur in this study because people cannot choose if they are exposed to air pollution or not exposed to air pollution

References:

1. ERIC notebook

2. https://catalogofbias.org/biases/selection-bias/

3. SPH BUMC

4. SPH Emory

5. BMJ: Measurement error and bias

 

Answers:

1. C. The healthy worker effect is a type of selection bias that may occur in occupational exposure studies when the exposed cases are workers but the non-exposed study participants (controls) are not workers. In general, working individuals are healthier than non-working individuals. Health problems may actually be a reason for not working. In addition, retirees are typically older than the working population and may have more age-related health problems.

2. B. Selection bias likely did not occur in this study because exposure status did not influence whether or not people dropped out of the study. Furthermore, the health outcome status did not influence whether or not people dropped out of the study. Remember that selection bias may occur in a cohort study if the rate of participation or the rate of loss to follow-up differs by both exposure and health outcome status. Selection bias is not affected by whether the exposure is avoidable or unavoidable.



Disclaimer: The content presented here includes material from various websites, some of which has been rewritten while other parts are used as-is, with references provided. This content is intended solely for educational and learning purposes.



