Estimating uncertainty in observational studies of associations between continuous variables: example of methylmercury and neuropsychological testing in children
Michael Goodman^{1},
Leila M Barraj^{2},
Pamela J Mink^{2},
Nicole L Britton^{2},
Janice W Yager^{3},
W Dana Flanders^{1} and
Michael A Kelsh^{4}
DOI: 10.1186/1742-5573-4-9
© Goodman et al. 2007
Received: 23 December 2006
Accepted: 26 September 2007
Published: 26 September 2007
Abstract
Background:
We suggest that the need to account for systematic error may explain the apparent lack of agreement among studies of maternal dietary methylmercury exposure and neuropsychological testing outcomes in children, a topic of ongoing debate.
Methods:
These sensitivity analyses address the possible role of systematic error in reported associations between low-level prenatal exposure to methylmercury and neuropsychological test results in two well-known but apparently conflicting cohort studies: the Faroe Islands Study (FIS) and the Seychelles Child Development Study (SCDS). We estimated the potential impact of confounding, selection bias, and information bias on reported results in these studies using the Boston Naming Test (BNT) score as the outcome variable.
Results:
Our findings indicate that, assuming various degrees of bias (in either direction), the corrected regression coefficients largely overlap. Thus, the reported effects in the two studies are not necessarily different from each other.
Conclusion:
Based on our sensitivity analysis results, it is not possible to draw definitive conclusions about the presence or absence of neurodevelopmental effects due to in utero methylmercury exposure at levels reported in the FIS and SCDS.
Introduction
The potential effect of children's low-level exposure to methylmercury in the environment is a complex research issue that continues to receive considerable attention from researchers, government agencies, and the public [1]. The US Environmental Protection Agency (EPA) derived a reference dose for methylmercury in 2001, based on an analysis by the National Research Council (NRC) of the National Academy of Sciences [2]. The NRC performed benchmark dose analysis on a number of endpoints from three longitudinal prospective studies: the Seychelles Islands, the Faroe Islands, and the New Zealand studies [2]. Adverse effects were reported in the latter two studies [3–5], but not in the Seychelles study [6, 7].
This lack of consistency among studies, and particularly the discrepancy between the Seychelles Child Development Study (SCDS) and the Faroe Islands Study (FIS), was noted in several previous publications [8, 9]. However, most of these publications either focused on qualitative differences in the types of exposures, population characteristics and choice of endpoints between the two studies [2, 10], or examined the impact of nondifferential measurement error in exposure assessment [11, 12]. By contrast, the quantitative evaluation of systematic error in these studies does not appear to have received sufficient attention.
Current methodological literature emphasizes the importance of estimating, as opposed to merely acknowledging (or dismissing), the potential role of unaccounted systematic error in observational epidemiology [13–31] and in other fields of science [32–34]. Following these recommendations, we decided to build upon our previously published work on quantitative evaluation of potential bias in environmental epidemiologic studies [35, 36] and conduct a series of sensitivity analyses to evaluate the potential impact of systematic error on the reported associations between low-level maternal dietary exposure to methylmercury and children's neuropsychological testing results in the SCDS and FIS.
We used the score of the Boston Naming Test (BNT) as the outcome variable because it seems to have received substantial attention as an endpoint of interest (NRC 2000) and because both the SCDS and the FIS have used it in their analyses. The other cohort study, conducted in New Zealand [3, 5, 37], did not administer the BNT.
Methods
Our evaluation of the FIS and SCDS included two components: a qualitative review and comparison of the methods and results, and a quantitative analysis of selected sources of systematic error. The qualitative review evaluated the FIS and SCDS study methods with respect to their target population, selection of participants, exposure assessment, outcome ascertainment and data analyses. Particular attention was paid to identification of potential sources of systematic error, which were then evaluated in quantitative analyses.
The quantitative analyses presented in this article are conceptually similar to those described in our earlier publication [36] and involved calculating the impact of systematic error from three potential sources (confounding, selection bias, and information bias) on the observed relation between methylmercury exposure and a continuous neuropsychological outcome of interest.
For a systematic error of a given magnitude, it is possible to estimate the corrected linear regression coefficient (b) by accounting for this error. The impact of systematic error can also be expressed as the difference between the observed and the corrected regression coefficients (b_{obs} − b). It is important to keep in mind that the sensitivity analyses presented here do not address the impact of systematic error on the epidemiologic measure of association between methylmercury exposure and neuropsychological testing, but rather its impact on a regression coefficient in a given study. The actual measure of association can be further affected by the model assumptions, which are beyond the scope of this paper.
As mentioned previously, the BNT score was used as the outcome variable (Y) because both the SCDS and the FIS used it in their analyses. The BNT is a 60-item test that asks the examinee to provide the name of an object depicted in black-and-white line drawings. The response that is judged to be correct and the amount of time to respond are recorded. The test can be administered with or without cues. Semantic cues, if used, are provided if no response is made within 20 seconds. If the examinee is still unable to produce the name, a phonemic cue may be provided. The total score is then the number of items correctly named spontaneously or after cues. For the Seychelles study, a score of 43 was considered normal (standard deviation of 5) [7]. Scores on the BNT are a measure of word knowledge/vocabulary, verbal learning, word retrieval, and semantic language and have been associated with reading comprehension and written comprehension [38].
The possible effect of unadjusted confounding on FIS and SCDS results was assessed by measuring the impact of potentially important covariates not considered in these studies. To estimate the impact of selection bias, we calculated the difference in BNT results that would be observed in the FIS and SCDS assuming that the distributions of exposure and BNT scores among persons omitted from these studies were different than the analogous distributions among study participants. Finally, the potential role of information bias was quantified for a given range of outcome misclassification (in either direction) differentially affecting the low exposure and the high exposure groups in each study. The derivation of the corrected linear regression estimate (b) for each specific type of systematic error was conducted as follows.
Confounder Adjustment
where:
\bar{Y}_{Exp} is the mean value of the outcome measure (e.g., BNT test score) among the exposed;
\bar{Y}_{Nonexp} is the mean value of the outcome measure among the nonexposed;
s_{Y} is the standard deviation of the outcome measure;
\bar{Z}_{Exp} is the mean value of the potential confounder among the exposed;
\bar{Z}_{Nonexp} is the mean value of the potential confounder among the nonexposed;
s_{Z} is the standard deviation of the potential confounder;
and
r(Z, Y) is the Pearson correlation coefficient for variables Z and Y.
where s_{X} and s_{Y} are estimates of the standard deviations of X and Y, and r(XY), r(XZ) and r(ZY) represent estimates of the correlation coefficients between X and Y, X and Z, and Z and Y, respectively. If we use formula (1) to express b_{obs}, that is, the estimate of the regression parameter unadjusted for the effect of confounding, then the difference (b_{obs} − b_{conf}) in this case represents the impact of confounding by Z on the observed linear regression coefficient.
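As a rough numerical illustration of the confounder adjustment, the sketch below assumes the standard expression for a slope adjusted for one additional covariate, b_{conf} = (s_{Y}/s_{X}) [r(XY) − r(XZ) r(ZY)] / [1 − r(XZ)^{2}], with r(XY) recovered from the observed slope as b_{obs} s_{X}/s_{Y}. The function name is hypothetical, and the inputs are the FIS summary values quoted in this paper (b_{obs} = −0.019, s_{X} = 25.53, s_{Y} = 5.3) combined with assumed confounder correlations.

```python
def confounder_adjusted_slope(b_obs, s_x, s_y, r_xz, r_zy):
    """Slope of Y on X after adjusting for one covariate Z (assumed formula)."""
    # Correlation between X and Y implied by the unadjusted slope
    r_xy = b_obs * s_x / s_y
    # Standard two-predictor partial regression coefficient
    return (s_y / s_x) * (r_xy - r_xz * r_zy) / (1.0 - r_xz ** 2)


# FIS summary values as quoted in this paper
fis = dict(b_obs=-0.019, s_x=25.53, s_y=5.3)

# Hypothetical confounder strongly related to the outcome (r = 0.8)
slope_pos = confounder_adjusted_slope(**fis, r_xz=0.5, r_zy=0.8)
slope_neg = confounder_adjusted_slope(**fis, r_xz=-0.5, r_zy=0.8)
print(round(slope_pos, 3), round(slope_neg, 3))
```

Under these assumptions, the corrected slope is roughly −0.136 when the confounder is positively correlated with exposure (r = 0.5) and roughly +0.085 when the correlation is negative (r = −0.5).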
Selection bias
Selection bias may occur if the participants are systematically different from persons not included in the study with respect to their exposure and outcome levels. Thus, the regression slope derived from the data collected among the participants would differ from the estimate based on all eligible subjects. Let:

n represent the total number of all eligible subjects;

n_{s} (p_{s}) represent the number (proportion) of sampled subjects among the n eligible subjects;

n_{n} (p_{n}) represent the number (proportion) of nonsampled subjects among the n eligible subjects;

\bar{X}_{s} and \bar{Y}_{s} represent the estimates of the mean exposure and outcome among the sampled subjects;

\bar{X}_{n} and \bar{Y}_{n} represent the estimates of the mean exposure and outcome among the nonsampled subjects;

s_{Xs} and s_{Xn} represent the estimates of the standard deviation of the exposure levels among the sampled and nonsampled subjects, respectively (we assumed, for simplicity, that s_{Xn} = s_{Xs});

b_{s} represent the estimate of the regression parameter derived using the data from the n_{s} sampled subjects;

b_{n} represent the estimate of the regression parameter for the n_{n} nonsampled subjects, assumed here to be a multiple of b_{s}, that is b_{n} = νb_{s};

b_{sel} represent the estimate of the corrected regression parameter based on all eligible subjects.
Then b_{sel} can be estimated by substituting the hypothetical (assumed) estimates for the nonsampled subjects.
Thus (b_{obs} − b_{sel}) in this case represents the impact of selection bias on the observed linear regression slope.
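The pooling step can be sketched numerically as follows. This is only one plausible way to combine the two groups, not necessarily the exact expression used in the analysis: it assumes the pooled slope is the ratio of the total sum of cross-products to the total sum of squares, each split into within-group and between-group parts. The function name and argument names are hypothetical; the numerical inputs are the FIS and SCDS summary values quoted in this paper.

```python
def selection_corrected_slope(b_s, mean_x, mean_y, s_x, n_s, n_n,
                              shift_x, shift_y, nu):
    """Pooled slope over sampled and nonsampled subjects (assumed formula).

    shift_x, shift_y: relative difference in mean exposure / mean outcome of
    nonsampled versus sampled subjects (0.05 means +5%).
    nu: slope multiplier, so that b_n = nu * b_s.
    """
    d_x = shift_x * mean_x                # difference in mean exposure
    d_y = shift_y * mean_y                # difference in mean outcome
    var_x = s_x ** 2                      # assumes s_Xn = s_Xs, as in the text
    weight = n_s * n_n / (n_s + n_n)      # weight of the between-group terms
    # Within-group plus between-group sums of cross-products and of squares
    s_xy = ((n_s - 1) * b_s + (n_n - 1) * nu * b_s) * var_x + weight * d_x * d_y
    s_xx = (n_s + n_n - 2) * var_x + weight * d_x ** 2
    return s_xy / s_xx


# FIS: nonsampled group with higher exposure, lower outcome, and twice the slope
fis_slope = selection_corrected_slope(-0.019, 31.99, 25.0, 25.53, 866, 496,
                                      shift_x=0.05, shift_y=-0.10, nu=2.0)
# SCDS: both means shifted by +10% and no association among the nonsampled
scds_slope = selection_corrected_slope(-0.012, 6.9, 26.5, 4.5, 643, 837,
                                       shift_x=0.10, shift_y=0.10, nu=0.0)
print(round(fis_slope, 3), round(scds_slope, 3))
```

With these hypothetical shifts the FIS slope strengthens to about −0.027, while the SCDS slope changes sign to about +0.017, which shows how sensitive the weaker SCDS estimate is to the between-group term.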
Information bias
In this study we assessed the impact of one type of information bias (differential outcome misclassification), which may occur when the data about the outcome are obtained differently for subjects in different exposure categories. Thus, the reported (or "observed") outcome (Y_{obs}) for a proportion of the subjects is different from the "true" outcome (Y). We assume that the absolute amount of over- or underestimation in the observed outcome for a subject with exposure X is proportional to the difference between X and \bar{X} (the estimate of mean exposure).
Let:

p_{1} represent the proportion of subjects whose observed outcome is Y_{obs} = Y + (X − \bar{X})a_{1}, where a_{1} > 0. Then, p_{1} is the proportion of subjects whose bias in their observed outcome results in a positive bias in the observed slope;

p_{2} represent the proportion of subjects whose observed outcome is Y_{obs} = Y − (X − \bar{X})a_{2}, where a_{2} > 0. Then, p_{2} is the proportion of subjects whose bias in their observed outcome results in a negative bias in the observed slope;

b_{obs} represent the estimate of β_{1} in the regression model defined in equation (1) above, derived using Y_{obs}.
Thus, Y_{true} = Y_{obs} − a_{1}(X − \bar{X}) for a subset (p_{1}) of all subjects, and Y_{true} = Y_{obs} + a_{2}(X − \bar{X}) for a subset (p_{2}) of all subjects, while Y_{true} = Y_{obs} for the remaining subjects.
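Substituting these expressions into the least-squares estimator of the slope, and using \sum (X − \bar{X}) = 0, gives:
b_{inf} = b_{obs} − a_{1} \frac{\sum_{(1)} (X − \bar{X})^{2}}{\sum (X − \bar{X})^{2}} + a_{2} \frac{\sum_{(2)} (X − \bar{X})^{2}}{\sum (X − \bar{X})^{2}}
where:
\sum_{(1)} and \sum_{(2)} denote sums over the subjects in the fractions p_{1} and p_{2}, respectively, and \sum denotes the sum over all subjects.
If we assume that the exposure values (X) corresponding to the fractions p_{1} and p_{2} of subjects defined above are random subsamples of all X's, then the ratios in the second and third terms above become approximately p_{1} and p_{2}, respectively, and the expression reduces to:
b_{inf} = b_{obs} − p_{1}a_{1} + p_{2}a_{2} or b_{obs} = b_{inf} + (p_{1})(a_{1}) − (p_{2})(a_{2}),
thus, (p_{1})(a_{1}) − (p_{2})(a_{2}) represents the magnitude of information bias (b_{obs} − b_{inf}).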
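The closed-form correction can be checked against simulated data. The sketch below is illustrative only: the data are synthetic (loosely patterned on the FIS summary values, not study data), the helper names are hypothetical, and the misclassified subsets are drawn at random, which is the assumption the derivation relies on.

```python
import random


def information_corrected_slope(b_obs, p1, a1, p2, a2):
    """Closed-form correction: b_inf = b_obs - p1*a1 + p2*a2."""
    return b_obs - p1 * a1 + p2 * a2


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic cohort (hypothetical values, not study data)
rng = random.Random(0)
n, p1, a1, p2, a2 = 20000, 0.3, 0.3, 0.1, 0.4
xs = [rng.gauss(32.0, 25.0) for _ in range(n)]
y_true = [25.0 - 0.019 * x + rng.gauss(0.0, 5.0) for x in xs]
mean_x = sum(xs) / n

y_obs = []
for x, y in zip(xs, y_true):
    u = rng.random()
    if u < p1:                      # fraction p1: positive bias in the slope
        y_obs.append(y + a1 * (x - mean_x))
    elif u < p1 + p2:               # fraction p2: negative bias in the slope
        y_obs.append(y - a2 * (x - mean_x))
    else:                           # remaining subjects: no misclassification
        y_obs.append(y)

b_obs = ols_slope(xs, y_obs)
b_inf = information_corrected_slope(b_obs, p1, a1, p2, a2)
b_true = ols_slope(xs, y_true)
print(round(b_obs, 3), round(b_inf, 3), round(b_true, 3))
```

The corrected slope should land close to the slope fitted on the true outcomes; the residual gap reflects only how far the random subsets depart from exact fractions p_{1} and p_{2} of the total sum of squares.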
Monte Carlo simulations
Table 1. Summary of input parameters and assumptions in the Monte Carlo simulation of the FIS results adjusted for outcome misclassification, selection bias and confounding

| Input parameter | Distribution | Source (reference) |
|---|---|---|
| Outcome misclassification (information bias) | | |
| Observed exposure: mercury concentration in cord blood (μg/L) | Mean_{x} = 31.99, SD_{x} = 25.53 | Budtz-Jorgensen et al. 2005; based on median and 99^{th} percentile in a lognormal distribution (39) |
| Observed outcome: score on Boston Naming Test | Mean_{y} = 25, SD_{y} = 5.3 | Mean: Grandjean et al. 1997 (4); SD from Budtz-Jorgensen et al. 2004 |
| Observed b_{1} | N (−0.019, 0.0063) | Budtz-Jorgensen et al. 2005 (39) |
| Observed b_{0} | = 25 − 31.99 × Observed b_{1} | Derived using standard linear regression formula (b_{0} = \bar{Y} − b_{1}\bar{X}) |
| P1: proportion of exposed with a1 (negative) adjustment | U (0.1, 0.3) | Hypothetical (no data available) |
| P2: proportion of exposed with a2 (positive) adjustment | U (0.1, 0.3) | |
| a1: relative adjustment in outcome for proportion p1 of subjects | U (0.0, 0.31) | Hypothetical (no data available); limits chosen to allow the BNT score to vary between 0 and 60 |
| a2: relative adjustment in outcome for proportion p2 of subjects | U (0.0, 0.48) | |
| Selection bias | | |
| Observed exposure: mercury concentration in cord blood (μg/L) | See above | |
| 10,000 vectors (Mean_{y}, SD_{y}, b_{0}, b_{1}) adjusted for information bias | Output of Information Bias module | |
| Number of subjects included in the analysis | 866 | Grandjean et al. 1997; N in Boston Naming Test "no cues" (4) |
| Number of eligible subjects | 1362 | Calculated as 1022/0.75 (1022 are ~75% of all births) (44) |
| Number of subjects excluded from the analysis | 496 | Derived as 1362 − 866 |
| Relative difference between mean exposure of subjects not included and mean exposure of included subjects | U (−5%, 5%) | Hypothetical (no data available) |
| Relative difference between mean outcome of subjects not included and mean outcome of included subjects | U (−10%, 10%) | Hypothetical (no data available) |
| Slope multiplier (to get to slope of nonincluded subjects) | U (0, 2) | Hypothetical (no data available) |
| Confounding | | |
| 10,000 vectors (Mean_{x}, SD_{x}, Mean_{y}, SD_{y}, b_{0}, b_{1}) adjusted for information and selection bias | Output of Selection Bias module | |
| Pearson correlation between confounder (WAIS) and exposure | U (−0.5, 0.5) | Hypothetical (no data available) |
| Pearson correlation between confounder (WAIS) and outcome | U (0.2, 0.8) | Hypothetical (no data available) |
Table 2. Summary of input parameters and assumptions in the Monte Carlo simulation of the SCDS results adjusted for outcome misclassification, selection bias and confounding

| Input parameter | Distribution | Source (reference) |
|---|---|---|
| Outcome misclassification (information bias) | | |
| Observed exposure: mercury concentration in maternal hair (μg/g) | Mean_{x} = 6.9, SD_{x} = 4.5 | Myers et al. 2003 (7) |
| Observed outcome: score on Boston Naming Test | Mean_{y} = 26.5, SD_{y} = 4.8 | Myers et al. 2003 (7) |
| Observed b_{1} | N (−0.012, 0.046) | Myers et al. 2003 (7) |
| Observed b_{0} | = 26.5 − 6.9 × Observed b_{1} | Derived using standard linear regression formula (b_{0} = \bar{Y} − b_{1}\bar{X}) |
| P1: proportion of exposed with a1 (negative) adjustment | U (0.1, 0.3) | Hypothetical (no data available) |
| P2: proportion of exposed with a2 (positive) adjustment | U (0.1, 0.3) | |
| a1: relative adjustment in outcome for proportion p1 of subjects | U (0.0, 1.95) | Hypothetical (no data available); limits chosen to allow the BNT score to vary between 0 and 60 |
| a2: relative adjustment in outcome for proportion p2 of subjects | U (0.0, 1.95) | |
| Selection bias | | |
| Observed exposure: mercury concentration in maternal hair (μg/g) | See above | |
| 10,000 vectors (Mean_{y}, SD_{y}, b_{0}, b_{1}) adjusted for information bias | Output of Information Bias module | |
| Number of subjects included in the analysis | 643 | Myers et al. 2003 (7) |
| Number of eligible subjects | 1480 | Calculated as 740 × 2 (740 are ~50% of eligible population) (7) |
| Number of subjects excluded from the analysis | 837 | Calculated as 1480 − 643 |
| Relative difference between mean exposure of subjects not included and mean exposure of included subjects | U (−5%, 5%) | Hypothetical (no data available) |
| Relative difference between mean outcome of subjects not included and mean outcome of included subjects | U (−10%, 10%) | Hypothetical (no data available) |
| Slope multiplier (to get to slope of nonincluded subjects) | U (0, 2) | Hypothetical (no data available) |
| Confounding | | |
| 10,000 vectors (Mean_{x}, SD_{x}, Mean_{y}, SD_{y}, b_{0}, b_{1}) adjusted for information and selection bias | Output of Selection Bias module | |
| Pearson correlation between confounder (WAIS) and exposure | U (−0.5, 0.5) | Hypothetical (no data available) |
| Pearson correlation between confounder (WAIS) and outcome | U (0.2, 0.8) | Hypothetical (no data available) |
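The chaining of the three modules described in these input tables (information bias, then selection bias, then confounding) might be sketched as below for the FIS inputs. This is a simplified illustration under several assumptions: the three adjustment functions use the closed-form and pooled expressions assumed elsewhere in this paper's worked illustrations, the means and standard deviations are held fixed across modules instead of being propagated as full vectors, and all function names are hypothetical.

```python
import random


def info_adjust(b, p1, a1, p2, a2):
    # Outcome misclassification: b_inf = b_obs - p1*a1 + p2*a2
    return b - p1 * a1 + p2 * a2


def selection_adjust(b, mean_x, mean_y, s_x, n_s, n_n, shift_x, shift_y, nu):
    # Pooled slope over sampled and nonsampled subjects (assumed formula)
    d_x, d_y = shift_x * mean_x, shift_y * mean_y
    w = n_s * n_n / (n_s + n_n)
    s_xy = ((n_s - 1) + (n_n - 1) * nu) * b * s_x ** 2 + w * d_x * d_y
    s_xx = (n_s + n_n - 2) * s_x ** 2 + w * d_x ** 2
    return s_xy / s_xx


def confounding_adjust(b, s_x, s_y, r_xz, r_zy):
    # Slope adjusted for one additional covariate Z (assumed formula)
    r_xy = b * s_x / s_y
    return (s_y / s_x) * (r_xy - r_xz * r_zy) / (1.0 - r_xz ** 2)


def simulate_fis(n_draws=10_000, seed=1):
    """Return sorted corrected slopes; means and SDs are held fixed (simplification)."""
    rng = random.Random(seed)
    mean_x, s_x, mean_y, s_y = 31.99, 25.53, 25.0, 5.3
    slopes = []
    for _ in range(n_draws):
        b = rng.gauss(-0.019, 0.0063)                      # observed b1
        b = info_adjust(b, rng.uniform(0.1, 0.3), rng.uniform(0.0, 0.31),
                        rng.uniform(0.1, 0.3), rng.uniform(0.0, 0.48))
        b = selection_adjust(b, mean_x, mean_y, s_x, 866, 496,
                             rng.uniform(-0.05, 0.05),      # exposure shift
                             rng.uniform(-0.10, 0.10),      # outcome shift
                             rng.uniform(0.0, 2.0))         # slope multiplier
        b = confounding_adjust(b, s_x, s_y,
                               rng.uniform(-0.5, 0.5), rng.uniform(0.2, 0.8))
        slopes.append(b)
    slopes.sort()
    return slopes


slopes = simulate_fis()
lower = slopes[int(0.025 * len(slopes))]
upper = slopes[int(0.975 * len(slopes))]
print(f"2.5th to 97.5th percentile of corrected slope: {lower:.3f} to {upper:.3f}")
```

Replacing the FIS constants with the SCDS inputs gives the corresponding simulation for the second cohort; comparing the two percentile ranges is one way to see how far the corrected slopes overlap.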
Results
Qualitative review of confounding
Despite rather lengthy lists of covariates that were considered in each study, the possibility remains of confounding due to unmeasured covariates or due to residual confounding. For example, no data were collected on nutritional factors (e.g., selenium, polyunsaturated fatty acids) in either study [7]. Although the authors of the FIS considered confounding to have had minimal impact due to the homogeneity of the community under study and the limited potential for other neurotoxic exposures [4], it is possible that the results of this study were affected by lack of information on home environment, such as that measured by the Caldwell-Bradley Home Observation for Measurement of the Environment (HOME) [40, 41]. HOME was administered to the Seychellois participants and was found to be associated with many neuropsychological tests including the Boston Naming Test [6, 7]. Other variables that were either not measured, or measured but not considered consistently in the analyses, include factors related to the test-taking environment (e.g., the child's anxiety level), which have been associated with performance on the WISC-III Digit Span subtest [41]; educational factors (e.g., quality of school/teachers); paternal intelligence; parental education; exposure to other chemicals that have been associated with neurobehavioral effects (e.g., lead, PCBs); as well as dietary components, such as selenium and omega-3 fatty acids, which are expected to have a beneficial effect on neurodevelopment [42].
Both studies assessed caregiver (SCDS) or maternal (FIS) intelligence by the Raven's Progressive Matrices test rather than using a comprehensive test, such as the Wechsler Adult Intelligence Scale (WAIS). Raven's Progressive Matrices measures nonverbal reasoning ability and is a useful test for those who do not speak English. Its correlation with other intelligence tests ranges from 0.5 to 0.8 [41].
Qualitative review of selection bias
Participants in the Faroe Islands study were recruited among 1,386 children from three hospitals in Torshavn, Klaksvik, and Suderoy between March 1, 1986 and December 31, 1987 [43]. Blood samples and questionnaire data were obtained from 1,022 infant-mother pairs, representing 75% of the eligible singleton births [4]. Reasons for nonparticipation were not described; however, it appears that patients born in two smaller hospitals were less likely to participate. It is also important to point out that the hospital with the lowest percent participation (33%) had the highest median blood mercury concentration [45].
Nine hundred seventeen of the 1,022 children returned for neuropsychological testing at approximately age seven [4]. Scores for the Boston Naming Test (no cues) were reported for 866 children, or 63% of the overall target population.
The 740 infant-mother pairs who remained in the cohort-for-analysis in the SCDS after exclusions represent approximately 50% of the target population [46]. The authors did not record specific reasons for nonparticipation, but indicate that some mothers were probably not informed of the study by the nurses in the hospital, some may have declined due to lack of sufficient information about the study or lack of interest, and some may have been afraid to participate in the study. Shamlaye et al. (1995) reported birth characteristics for SCDS participants and the target population and found small, nonsignificant differences in birth weight, gestational age, male:female ratio, and maternal age between the two groups [47]. Six hundred forty-three children completed the Boston Naming Test at age 108 months (9 years) in this study, which represents approximately 43% of the estimated target population.
Qualitative review of information bias
Approximately half of all FIS participants underwent testing in the morning and half underwent testing in the afternoon. Most (but not all) children were examined in Torshavn. If the time of testing or the need to travel before testing were related to exposure, this could have introduced additional bias due to diurnal variation and/or fatigue. According to the Faroese transportation guide, long-distance bus service, combined with the ferry services, links virtually every corner of the country. However, it appears that a trip to Torshavn may take up to several hours [48]. Some of the FIS participants were examined in local hospitals close to their homes. Although this may have alleviated the potential bias associated with travel, it may have introduced additional bias due to differences in testing environment.
The methods description does not indicate whether or not investigators administering the test were blinded with respect to the participants' exposure status. According to the study authors, the participation rate in the capital was lower and the participants' geometric mean mercury concentration was about 28% higher (~23 μg/L vs. ~18 μg/L) than that of nonparticipants. This may indicate that residence was related to both exposure level and the need to travel, as well as to the AM/PM testing status.
A reanalysis of the FIS data showed that, after controlling for residence (town vs. country), the linear regression slope for BNT without cues changed from −1.77 (p < 0.001) to −1.51 (p = 0.003), whereas the slope for BNT with cues changed from −1.91 (p < 0.001) to −1.60 (p = 0.001) [2]. However, this adjustment would only partially address the above problems. There may still be substantial room for residual misclassification because the analysis did not take into consideration distance from Torshavn or duration of travel.
Similar concerns, although to a lesser extent, apply to the SCDS results. The testing was performed "mostly in the morning." This does not exclude the potential impact of diurnal variation on the results; however, this impact would probably have been lower than that in the FIS, where the AM/PM testing ratio was 1:1.
All testing for SCDS was performed on Mahe. Some families apparently had to travel to the testing site. Similarly to the FIS, it is possible that children who had to travel were more tired prior to testing. However, one of the criteria for inclusion into the main study was Mahe residence and prolonged travel does not appear likely as Mahe extends 27 km north to south and 11 km east to west [49]. The SCDS authors state that none of the families and none of the investigators administering the test were aware of the participants' methylmercury exposure status.
Quantitative analysis results
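Table 3. Illustrative examples of FIS and SCDS BNT results corrected for unaccounted confounding

| Study | Scenario | Confounder SD | Confounder correlation with exposure | Confounder correlation with outcome | Observed regression slope | Corrected regression slope |
|---|---|---|---|---|---|---|
| FIS | Scenario 1 | 15.0 | −0.10 | 0.20 | −0.019 | −0.015 |
| FIS | Scenario 2 | 15.0 | 0.10 | 0.20 | −0.019 | −0.023 |
| FIS | Scenario 3 | 15.0 | −0.50 | 0.20 | −0.019 | 0.002 |
| FIS | Scenario 4 | 15.0 | 0.50 | 0.20 | −0.019 | −0.053 |
| FIS | Scenario 5 | 15.0 | −0.10 | 0.80 | −0.019 | −0.002 |
| FIS | Scenario 6 | 15.0 | 0.10 | 0.80 | −0.019 | −0.036 |
| FIS | Scenario 7 | 15.0 | −0.50 | 0.80 | −0.019 | 0.085 |
| FIS | Scenario 8 | 15.0 | 0.50 | 0.80 | −0.019 | −0.136 |
| SCDS | Scenario 1 | 15.0 | −0.10 | 0.20 | −0.012 | 0.01 |
| SCDS | Scenario 2 | 15.0 | 0.10 | 0.20 | −0.012 | −0.03 |
| SCDS | Scenario 3 | 15.0 | −0.50 | 0.20 | −0.012 | 0.13 |
| SCDS | Scenario 4 | 15.0 | 0.50 | 0.20 | −0.012 | −0.16 |
| SCDS | Scenario 5 | 15.0 | −0.10 | 0.80 | −0.012 | 0.07 |
| SCDS | Scenario 6 | 15.0 | 0.10 | 0.80 | −0.012 | −0.10 |
| SCDS | Scenario 7 | 15.0 | −0.50 | 0.80 | −0.012 | 0.55 |
| SCDS | Scenario 8 | 15.0 | 0.50 | 0.80 | −0.012 | −0.58 |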
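Table 4. Illustrative examples of FIS and SCDS BNT results corrected for selection bias

| Study | Scenario | Shift in exposure^{a} | Shift in outcome^{b} | Slope multiplier^{c} | Observed regression slope | Corrected regression slope |
|---|---|---|---|---|---|---|
| FIS | Scenario 1 | 5% | 10% | 2.0 | −0.019 | −0.024 |
| FIS | Scenario 2 | 5% | 10% | 0.0 | −0.019 | −0.013 |
| FIS | Scenario 3 | 5% | 10% | 1.5 | −0.019 | −0.021 |
| FIS | Scenario 4 | 5% | 10% | 2.0 | −0.019 | −0.027 |
| FIS | Scenario 5 | 10% | 10% | 0.5 | −0.019 | −0.013 |
| FIS | Scenario 6 | 10% | 10% | 1.5 | −0.019 | −0.025 |
| FIS | Scenario 7 | 10% | 10% | 0.0 | −0.019 | −0.009 |
| FIS | Scenario 8 | 10% | 10% | 0.5 | −0.019 | −0.018 |
| SCDS | Scenario 1 | 5% | 10% | 2.0 | −0.012 | −0.008 |
| SCDS | Scenario 2 | 5% | 10% | 0.0 | −0.012 | −0.016 |
| SCDS | Scenario 3 | 5% | 10% | 1.5 | −0.012 | −0.004 |
| SCDS | Scenario 4 | 5% | 10% | 2.0 | −0.012 | −0.030 |
| SCDS | Scenario 5 | 10% | 10% | 0.5 | −0.012 | 0.014 |
| SCDS | Scenario 6 | 10% | 10% | 1.5 | −0.012 | −0.037 |
| SCDS | Scenario 7 | 10% | 10% | 0.0 | −0.012 | 0.017 |
| SCDS | Scenario 8 | 10% | 10% | 0.5 | −0.012 | −0.031 |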
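Table 5. Illustrative examples of FIS and SCDS BNT results corrected for information bias

| Study | Scenario | Proportion misclassified, p_{1} | Proportion misclassified, p_{2} | Magnitude of misclassification, a_{1} | Magnitude of misclassification, a_{2} | Observed regression slope | Corrected regression slope |
|---|---|---|---|---|---|---|---|
| FIS | Scenario 1 | 30% | 10% | 0.30 | 0.40 | −0.019 | −0.069 |
| FIS | Scenario 2 | 10% | 30% | 0.30 | 0.40 | −0.019 | 0.071 |
| FIS | Scenario 3 | 10% | 10% | 0.30 | 0.40 | −0.019 | −0.009 |
| FIS | Scenario 4 | 30% | 30% | 0.30 | 0.40 | −0.019 | 0.011 |
| FIS | Scenario 5 | 30% | 10% | 0.10 | 0.20 | −0.019 | −0.029 |
| FIS | Scenario 6 | 10% | 30% | 0.10 | 0.20 | −0.019 | 0.031 |
| FIS | Scenario 7 | 10% | 10% | 0.10 | 0.20 | −0.019 | −0.009 |
| FIS | Scenario 8 | 30% | 30% | 0.10 | 0.20 | −0.019 | 0.011 |
| SCDS | Scenario 1 | 30% | 10% | 0.30 | 0.40 | −0.012 | −0.062 |
| SCDS | Scenario 2 | 10% | 30% | 0.30 | 0.40 | −0.012 | 0.078 |
| SCDS | Scenario 3 | 10% | 10% | 0.30 | 0.40 | −0.012 | −0.002 |
| SCDS | Scenario 4 | 30% | 30% | 0.30 | 0.40 | −0.012 | 0.018 |
| SCDS | Scenario 5 | 30% | 10% | 0.10 | 0.20 | −0.012 | −0.022 |
| SCDS | Scenario 6 | 10% | 30% | 0.10 | 0.20 | −0.012 | 0.038 |
| SCDS | Scenario 7 | 10% | 10% | 0.10 | 0.20 | −0.012 | −0.002 |
| SCDS | Scenario 8 | 30% | 30% | 0.10 | 0.20 | −0.012 | 0.018 |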
When evaluating the possible role of unmeasured confounders in the FIS and SCDS analyses, we assumed that the correlation coefficient between confounder and exposure ranged from −0.5 to +0.5 and the correlation coefficient between confounder and outcome (BNT score) ranged from 0.2 to 0.8. The results are presented in Table 3. Based on these assumptions, the corrected regression coefficient for the FIS would become as extreme as −0.136 (Scenario 8), assuming a moderately positive correlation (r = 0.5) between the confounder and exposure and a strong correlation (r = 0.8) between the same confounder and the BNT results. On the other hand, a moderate negative correlation with exposure (r = −0.5) and a strong correlation (r = 0.8) with the outcome would reverse the direction of the association from b_{obs} = −0.019 to b_{conf} = +0.085 (Scenario 7). In the SCDS analyses, the same range of correlation coefficients would produce a corresponding range of corrected linear regression slopes between −0.58 (Scenario 8) and +0.55 (Scenario 7).
Table 4 illustrates the potential impact of selection bias on study results. Assuming that the differences between the mean exposures and outcomes of eligible persons who were excluded from the study and the mean exposures and outcomes of those who were included ranged between −10% and +10%, and that the regression slope among persons excluded from the study ranged between 0 and −0.038 (b_{obs} × 2), the corrected slope for FIS may range between −0.027 (Scenario 4) and −0.009 (Scenario 7). The same selection bias scenarios in the SCDS would result in a change of direction from −0.012 to +0.017 (Scenario 7) or in a stronger than observed association, with a regression slope of −0.037 (Scenario 6).
The analyses of information bias demonstrated an effect on study results even with a relatively small proportion of misclassified participants (e.g., 10%) and a relatively modest magnitude of misclassification (a_{1} and a_{2} between 0.1 and 0.4). For the eight scenarios presented in Table 5, the corrected regression slopes ranged from −0.069 (Scenario 1) to +0.071 (Scenario 2) for FIS, and from −0.062 (Scenario 1) to +0.078 (Scenario 2) for SCDS.
Discussion
A comparison of the two studies included in our analysis revealed a number of similarities. Both were prospective evaluations of neuropsychological endpoints in children whose prenatal methylmercury exposure status was ascertained at birth. Both used objective biomarker-based measures of exposure. Both conducted multivariate analyses in an attempt to separate the effects of methylmercury from other factors that influence neuropsychological function.
Yet, despite similarities, the results and conclusions of these two studies were inconsistent. For example, testing of the language function showed a statistically significant improvement with increasing methylmercury exposure among Seychellois children at about 5½ years of age when measured by the Preschool Language Scale and no significant association at nine years of age when measured by BNT. In contrast, the Faroese study group displayed a statistically significant decline in BNT scores with increasing methylmercury exposure at the age of seven. Other discrepancies between the two sets of results were present in the domains of the visual-spatial function, memory, learning achievement, and sustained attention. Only in one domain (motor function) did both studies report statistically significant inverse associations between test scores and methylmercury exposure, but those associations were not consistent. In the SCDS, the association was for the "nondominant" hand grooved pegboard test among males only, whereas the FIS reported the association for the "preferred" hand finger tapping.
The proposed interpretations of the observed disagreement between the two studies have been based primarily on the assumption that the differences in results have an underlying biological explanation. Recent reviews paid substantial attention to the fact that the two studies reported their main findings using different measures of methylmercury exposure: cord blood versus maternal hair [2, 10]. As cord blood concentrations measure recent exposures, the National Academy of Sciences review on methylmercury toxicity suggested that the FIS results may reflect a more recent (and presumably more relevant) period of exposure [2]. Another proposed explanation is the difference in the source and rate of methylmercury exposure: daily consumption of fish in the Seychelles as opposed to episodic consumption of whale in the Faroes.
Prior to the publication of the most recent SCDS update, it appeared plausible that the differences between the two study results could also be explained by the lack of comparability in the neuropsychological test batteries. However, the last testing of the SCDS participants included many of the same tests previously used by the FIS investigators – specifically, those with significant findings – and the above explanation no longer appears likely.
Our analyses indicate that each of the potential sources of systematic error under certain conditions is capable of changing the results from significant to nonsignificant and vice versa. Moreover, under some scenarios even the direction of the observed associations can be reversed. Although the scenarios in our sensitivity analyses cover a wide range of assumptions, they are not entirely hypothetical. The differences in exposure levels between participants and nonparticipants in the FIS have been reported [4, 45] and, in fact, exceed the differences assumed in our selection bias simulation. The low (just over 40%) participation rate in the SCDS also falls within the proposed scenarios. We demonstrated the potential effect of confounding by home environment and the need for a comprehensive parental IQ evaluation in our earlier publication [36]. The correlation coefficients between potential confounders and exposure are similar to those reported in the FIS. The potential misclassification due to fatigue, timing and sequencing of testing and lack of adequate blinding also finds support in the literature [38, 41].
For all of the above reasons, the uncertainty around the FIS and SCDS regression slope estimates is probably larger than the reported 95% confidence intervals suggest. The discrepant results of the two studies may, in fact, fall within an expected range, and departures from the null in either direction can be explained by a combination of random and systematic error.
The interpretation of the sensitivity analyses presented here, like the interpretation of any epidemiological analysis, requires careful consideration of caveats and underlying assumptions. Many sensitivity analyses, including ours, are limited by insufficient information (e.g., lack of data on the correlation between confounder and exposure) and have to rely on hypothetical distributions of the parameters of interest. When no data were available, we assumed a uniform distribution in the Monte Carlo analyses. We recognize that the uniform distribution may not accurately reflect the uncertainty because it assigns equal probability to all values within the range. In the future, alternative approaches such as triangular or beta distributions, which give more weight to the more "probable" values, may need to be explored. The assumptions of normality and of independence among the various sources of bias also need to be considered, and alternative analytical methods may need to be developed for circumstances that do not fit these assumptions. For example, our adjustment for unmeasured confounders does not condition on the variables for which adjustment was already made, even though adjusting for the measured covariates may reduce the residual confounding attributable to the unmeasured confounder. All of the above considerations may affect the results of sensitivity analyses; however, in the absence of sensitivity analyses, one implicitly assumes that systematic error had no effect on study results, an assumption that may be even more difficult to defend.
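The point about distributional choice can be illustrated with a minimal sketch. It is not the procedure used in our analyses: it assumes a single additive bias term, and the slope, standard error, bias range, and beta shape parameters below are hypothetical values chosen only for illustration, not estimates from the FIS or the SCDS. Each draw subtracts a sampled bias from the observed slope and adds random error, and the resulting 95% simulation interval is compared across a uniform, a triangular, and a beta bias distribution defined over the same range.

```python
import random

# Hypothetical inputs for illustration only (NOT taken from the FIS or SCDS).
OBSERVED_SLOPE = -1.0             # reported regression coefficient
STANDARD_ERROR = 0.5              # its conventional standard error
BIAS_LOW, BIAS_HIGH = -1.5, 1.5   # assumed plausible range of the bias


def uniform_bias(rng):
    """Every value in the range is equally likely."""
    return rng.uniform(BIAS_LOW, BIAS_HIGH)


def triangular_bias(rng):
    """Values near the mode (here 0, i.e. no bias) are more likely."""
    return rng.triangular(BIAS_LOW, BIAS_HIGH, 0.0)


def beta_bias(rng):
    """Symmetric beta(4, 4) rescaled from [0, 1] to the bias range."""
    return BIAS_LOW + (BIAS_HIGH - BIAS_LOW) * rng.betavariate(4, 4)


def simulate_corrected_slopes(draw_bias, n_draws=20000, seed=1):
    """Monte Carlo draws of the slope after removing bias and adding random error."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_draws):
        bias = draw_bias(rng)                    # systematic error
        noise = rng.gauss(0.0, STANDARD_ERROR)   # random error
        slopes.append(OBSERVED_SLOPE - bias + noise)
    return slopes


def percentile_interval(values, lower=2.5, upper=97.5):
    """Simple 95% simulation interval based on sorted draws."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return (ordered[int(round(lower / 100 * last))],
            ordered[int(round(upper / 100 * last))])


# Conventional interval reflects random error only.
conventional = (OBSERVED_SLOPE - 1.96 * STANDARD_ERROR,
                OBSERVED_SLOPE + 1.96 * STANDARD_ERROR)

intervals = {
    name: percentile_interval(simulate_corrected_slopes(draw))
    for name, draw in [("uniform", uniform_bias),
                       ("triangular", triangular_bias),
                       ("beta", beta_bias)]
}

print("conventional 95% CI:", conventional)
for name, (low, high) in intervals.items():
    print(f"{name:>10} bias: {low:.2f} to {high:.2f}")
```

Under these assumptions all three simulation intervals are wider than the conventional confidence interval, and the uniform distribution yields the widest interval because it places as much weight on the extremes of the range as on its center.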
In summary, despite these caveats, we feel that our analyses served their purpose of illustrating the proposed methodology. It is important to recognize that disagreement across studies is one of the unavoidable features of observational epidemiology. We conclude that sensitivity analyses are an important tool for understanding the sources of such disagreement, as long as the underlying assumptions are clearly stated.
Declarations
Acknowledgements
This research was funded by the Electric Power Research Institute (EPRI), a private, independent, nonprofit center for public interest energy and environmental research.
References
1. Stern AH, Gochfeld M: Effects of methylmercury exposure on neurodevelopment. JAMA 1999, 281(10):896–897.
2. NRC: Toxicological Effects of Methylmercury. Washington, DC, National Academies Press 2000.
3. Crump KS, Kjellstrom T, Shipp AM, Silvers A, Stewart A: Influence of prenatal mercury exposure upon scholastic and psychological test performance: Benchmark analysis of a New Zealand cohort. Risk Analysis 1998, 18(6):701–713.
4. Grandjean P, Weihe P, White RF, Debes F, Araki S, Yokoyama K, Murata K, Sorensen N, Dahl R, Jorgensen PJ: Cognitive deficit in 7-year-old children with prenatal exposure to methylmercury. Neurotoxicol Teratol 1997, 19(6):417–428.
5. Kjellstrom T, Kennedy P, Wallis S, Stewart A, Friberg L, Lind B, Wutherspoon, Mantell C: Physical and Mental Development of Children with Prenatal Exposure to Mercury from Fish. Stage 2: Interviews and Psychological Tests at Age 6. Solna, National Swedish Environmental Protection Board 1989.
6. Davidson PW, Myers GJ, Cox C, Axtell C, Shamlaye C, Sloane-Reeves J, Cernichiari E, Needham L, Choi A, Wang Y, Berlin M, Clarkson TW: Effects of prenatal and postnatal methylmercury exposure from fish consumption on neurodevelopment: Outcomes at 66 months of age in the Seychelles Child Development Study. JAMA 1998, 280(8):701–707.
7. Myers GJ, Davidson PW, Cox C, Shamlaye CF, Palumbo D, Cernichiari E, Sloane-Reeves J, Wilding GE, Kost J, Huang LS, Clarkson TW: Prenatal methylmercury exposure from ocean fish consumption in the Seychelles Child Development Study. Lancet 2003, 361(9370):1686–1692.
8. Dourson ML, Wullenweber AE, Poirier KA: Uncertainties in the reference dose for methylmercury. Neurotoxicology 2001, 22(5):677–689.
9. Jacobson JL: Contending with contradictory data in a risk assessment context: The case of methylmercury. Neurotoxicology 2001, 22(5):667–675.
10. Myers GJ, Davidson PW, Cox C, Shamlaye C, Cernichiari E, Clarkson TW: Twenty-seven years studying the human neurotoxicity of methylmercury exposure. Environ Res 2000, 83(3):275–285.
11. Budtz-Jorgensen E, Keiding N, Grandjean P, Weihe P, White RF: Consequences of exposure measurement error for confounder identification in environmental epidemiology. Stat Med 2003, 22(19):3089–3100.
12. Keiding N, Budtz-Jorgensen E, Grandjean P: Prenatal methylmercury exposure in the Seychelles. Lancet 2003, 362(9384):664–665.
13. Greenland S: Basic methods for sensitivity analysis of biases. Int J Epidemiol 1996, 25(6):1107–1116.
14. Greenland S: Basic methods for sensitivity analysis and external adjustment. Modern Epidemiology (Edited by: Rothman KJ, Greenland S). Philadelphia, PA 1998, 343–357.
15. Greenland S: Sensitivity analysis, Monte Carlo risk analysis, and Bayesian uncertainty assessment. Risk Anal 2001, 21(4):579–583.
16. Greenland S: The impact of prior distributions for uncontrolled confounding and response bias: a case study of the relation of wire codes and magnetic fields to childhood leukemia. Journal of the American Statistical Association 2003, 98:47–54.
17. Greenland S: Multiple-bias modeling for analysis of observational data. J R Statist Soc A 2005, 168(2):267–306.
18. Gustafson P: Measurement Error and Misclassification in Statistics and Epidemiology. New York, Chapman and Hall 2003.
19. Lash TL, Fink AK: Semi-automated sensitivity analysis to assess systematic errors in observational data. Epidemiology 2003, 14(4):451–458.
20. Lash TL, Silliman RA: A sensitivity analysis to separate bias due to confounding from bias due to predicting misclassification by a variable that does both. Epidemiology 2000, 11(5):544–549.
21. Maclure M, Schneeweiss S: Causation of bias: the episcope. Epidemiology 2001, 12(1):114–122.
22. Maldonado G: Informal evaluation of bias may be inadequate (abstract). American Journal of Epidemiology 1998, 147:S82.
23. Maldonado G, Delzell E, Tyl RW, Sever LE: Occupational exposure to glycol ethers and human congenital malformations. Int Arch Occup Environ Health 2003, 76(6):405–423.
24. Maldonado G: Quantifying the impact of study imperfections on study results (abstract). American Journal of Epidemiology 2005, 161:S100.
25. Maldonado G, Delzell E, Poole C: A unified approach to conducting and interpreting occupational studies of congenital malformations (abstract). American Journal of Epidemiology 1999, 149:S59.
26. Maldonado G, Greenland S: Estimating causal effects. Int J Epidemiol 2002, 31(2):422–429.
27. Marais ML, Wecker WE: Correcting for omitted-variables and measurement-error bias in regression with an application to the effect of lead on IQ. J Am Stat Assoc 1998, 93(442):494–517.
28. Phillips CV: Quantifying and reporting uncertainty from systematic errors. Epidemiology 2003, 14(4):459–466.
29. Phillips CV, Maldonado G: Using Monte Carlo methods to quantify the multiple sources of error in studies (abstract). American Journal of Epidemiology 1999, 149:S17.
30. Phillips CV, LaPole LM: Quantifying errors without random sampling. BMC Med Res Methodol 2003, 3:9.
31. Steenland K, Greenland S: Monte Carlo sensitivity analysis and Bayesian analysis of smoking as an unmeasured confounder in a study of silica and lung cancer. Am J Epidemiol 2004, 160(4):384–392.
32. Leamer EE: Sensitivity analyses would help. Am Econ Rev 1985, 75:308–313.
33. Morgan MG, Henrion M: Uncertainty. A Guide to Dealing With Uncertainty in Quantitative Risk and Policy Analysis. New York, Cambridge University Press 1990.
34. Vose D: Risk Analysis. A Quantitative Guide. 2nd Edition. New York, John Wiley & Sons 2000.
35. Goodman M, Kelsh M, Ebi K, Iannuzzi J, Langholz B: Evaluation of potential confounders in planning a study of occupational magnetic field exposure and female breast cancer. Epidemiology 2002, 13(1):50–58.
36. Mink PJ, Goodman M, Barraj LM, Imrey H, Kelsh MA, Yager J: Evaluation of uncontrolled confounding in studies of environmental exposures and neurobehavioral testing in children. Epidemiology 2004, 15(4):385–393.
37. Kjellstrom T, Kennedy P, Wallis S, Mantell C: Physical and Mental Development of Children with Prenatal Exposure to Mercury from Fish. Stage 1: Preliminary Tests at Age 4. Solna, National Swedish Environmental Protection Board 1986.
38. Baron IS: Neuropsychological Evaluation of the Child. New York, Oxford University Press 2004.
39. Budtz-Jorgensen E, Debes F, Weihe P, Grandjean P: Adverse Mercury Effects in 7 Year-Old Children as Expressed as Loss in “IQ”. Final report to the EPA. Odense, University of Southern Denmark 2005.
40. Bradley RH, Caldwell BM: The relation of infants' home environments to achievement test performance in first grade: A follow-up study. Child Dev 1984, 55(3):803–809.
41. Sattler JM: Assessment of Children: Cognitive Applications. 4th Edition. San Diego, Jerome M. Sattler, Publisher, Inc. 2001.
42. Steuerwald U, Weihe P, Jorgensen PJ, Bjerve K, Brock J, Heinzow B, Budtz-Jorgensen E, Grandjean P: Maternal seafood diet, methylmercury exposure, and neonatal neurologic function. J Pediatr 2000, 136(5):599–605.
43. Grandjean P, Weihe P, Jorgensen PJ, Clarkson T, Cernichiari E, Videro T: Impact of maternal seafood diet on fetal exposure to mercury, selenium, and lead. Arch Environ Health 1992, 47(3):185–195.
44. Dahl R, White RF, Weihe P, Sorensen N, Letz R, Hudnell HK, Otto DA, Grandjean P: Feasibility and validity of three computer-assisted neurobehavioral tests in 7-year-old children. Neurotoxicol Teratol 1996, 18(4):413–419.
45. Grandjean P, Weihe P: Neurobehavioral effects of intrauterine mercury exposure: potential sources of bias. Environ Res 1993, 61(1):176–183.
46. Marsh DO, Clarkson TW, Myers GJ, Davidson PW, Cox C, Cernichiari E, Tanner MA, Lednar W, Shamlaye C, Choisy O, Hoareau C, Berlin M: The Seychelles study of fetal methylmercury exposure and child development: Introduction. Neurotoxicology 1995, 16(4):583–596.
47. Shamlaye CF, Marsh DO, Myers GJ, Cox C, Davidson PW, Choisy O, Cernichiari E, Choi A, Tanner MA, Clarkson TW: The Seychelles child development study on neurodevelopmental outcomes in children following in utero exposure to methylmercury from a maternal fish diet: background and demographics. Neurotoxicology 1995, 16(4):597–612.
48. Strandfaraskip Landsins: Ferdaælanin. [http://www.ssl.fo]
49. Africa Guide: Seychelles. [http://www.africaguide.com/country/seychel]
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.