A. Consideration of Biological Information
The Panel's discussion began with the two non-epidemiologists with reproductive toxicology experience commenting on the biological basis for judging whether a causal relationship was at work.
an estrogen-mediated event. In reality, the endocrine biology is far more complex, involving many other hormones, such as progesterone, cortisol, thyroid, and growth hormones. Also, proliferative effects on the breast, including swelling, occur during menstruation. Further complicating the situation, there are psychopharmacologic changes that accompany childbearing, particularly in the hypothalamic-pituitary-adrenal axis, and women modify their behavior in connection with pregnancy -- abstinence from alcohol and tobacco being examples. Still a further consideration is what happens with the immune system during pregnancy. Since the immune system is suppressed during pregnancy, one might suppose that a full-term pregnancy could enhance the opportunity for carcinogenesis. Thus, it seemed naive to regard the biology as supporting a conclusion that induced abortion causes breast cancer based on a focus on estrogen. The changes that occur in early pregnancy, and therefore the effects that could occur from interference with those changes, are not minimal; but pregnancy involves a complex cascade of biological events, and those changes, while significant, are not very different from many other hormonal excursions that occur during other aspects of life.
B. Comments on Individual Assigned Epidemiologic Studies
1. Nishiyama et al. (1982, RR of approximately 2.5)
should be expanded, for example to say that the investigators should explain how they adjusted.
2. Howe et al. (1989, overall RR of 1.9)
· The study had some positive aspects, but also substantial weaknesses: It lacked information on most potential confounding factors, and the method of exposure ascertainment was confusing and possibly seriously flawed in relying on fetal death certificates, which would not give good information on spontaneous abortions. As a result, the reviewers would give the study little weight.
· Another possible weakness was that the study did not seem to have uniform criteria for inclusion/exclusion of cases and controls.
· Although the Principles were helpful on the better studies, on this one, because the reviewers felt the significant flaws were so obvious, the Principles did not add much. For someone relatively new to the field, the Principles might bring to attention points such as those relevant to this study; but for an experienced epidemiologist, such significant flaws are conspicuous.
· There were comments that the Principles might be trying to do too much in a condensed form. The Principles themselves seemed fine; but it seemed that many of the subquestions did not fit depending on the type of study (cohort or case-control) or the type of exposure, and perhaps what was needed was different layers of principles -- a general set, and then more targeted sets for different types of studies and types of exposures. For example, with regard to the study being reviewed, the subquestion that seemed most pertinent to the significant issues was A-3(g) (on accuracy of exposure measurement or estimation), but it did not capture the issues very well. There seemed to be too much in the subquestions about whether something or other was reported in the study, when the real question was whether there was likely to be bias, such as false negative or false positive attribution of exposure. Another question should be whether the exposure measurement was reflective of the relevant time period. The bottom-line question should be whether the exposure measurements were accurate and relevant.
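The false-negative/false-positive concern raised in the bullet above can be made concrete with a small arithmetic sketch. The numbers here are purely hypothetical, not drawn from any study under review: they simply show that nondifferential exposure misclassification (the same error rates for cases and non-cases) pulls an observed risk ratio toward the null.

```python
# Hypothetical-numbers sketch: nondifferential exposure misclassification
# biases an observed risk ratio toward 1.0. All counts and rates are
# illustrative assumptions, not data from the reviewed studies.

def observed_rr(n_exp, n_unexp, risk_exp, risk_unexp, sens, spec):
    """Observed RR after exposure is measured with given sensitivity/specificity."""
    cases_exp, cases_unexp = n_exp * risk_exp, n_unexp * risk_unexp
    # Subjects (and their cases) are reclassified by the imperfect measurement:
    # a fraction (1 - sens) of the truly exposed are called unexposed, and a
    # fraction (1 - spec) of the truly unexposed are called exposed.
    obs_exp_n = sens * n_exp + (1 - spec) * n_unexp
    obs_exp_cases = sens * cases_exp + (1 - spec) * cases_unexp
    obs_unexp_n = (1 - sens) * n_exp + spec * n_unexp
    obs_unexp_cases = (1 - sens) * cases_exp + spec * cases_unexp
    return (obs_exp_cases / obs_exp_n) / (obs_unexp_cases / obs_unexp_n)

# True RR = 0.02 / 0.01 = 2.0; with 80% sensitivity and 90% specificity
# the observed RR shrinks toward the null.
print(round(observed_rr(1000, 1000, 0.02, 0.01, 0.8, 0.9), 2))  # prints 1.6
```

This is why the reviewers' bottom-line question -- whether the exposure measurements were accurate and relevant -- matters more than whether a given item was merely reported.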
3. Adami et al. (1990, overall RR of 0.9)
· That it was a multi-center study was a strength, and it was basically sound methodologically. However, there appeared to be heterogeneity in the case-control sampling and response rates; data collection methods could have influenced results; and there was no stratified analysis by center.
· The relative homogeneity of the populations in terms of racial makeup and cultural factors seemed to be a distinct advantage for this study, as opposed to one conducted in the United States, where one would expect large differences among the study populations with
regard to such factors. The Principles did not seem to capture this point well.
· The main Principles seemed to be generally sound, but, again, the reviewer had difficulty with the subquestions, in large part because they often contained compound questions, so that it was difficult to give a single Yes or No answer.
· For a number of Principles subquestions, such as blinding and quality control, one could not answer the question, but one would assume that aspect of the study was done right if one knew the study team was reputable. The quality control aspects are important, but no journal is going to publish that much detail. There is a question of what is realistic to expect, particularly if it is a routine matter, as opposed to, say, a formal validation study. One would like to know there is a more detailed write-up somewhere that documents everything, but one cannot expect such detail in the journal.
discussion in an individual study would raise questions regarding the motives and biases of the study author(s).
4. La Vecchia et al. (1993, RR of 0.9)
but it would be very dangerous to give non-epidemiologists the idea that they could apply the Principles and make accurate appraisals. On the other hand, the Principles as they stand, while they should be applied by expert epidemiologists, seem largely to reflect basic precepts with which all expert epidemiologists should be familiar. If the Principles are to advance the state of the art, or to bring more consistency to individual epidemiologic studies, to the evaluation of studies, and to the evaluation of bodies of studies, they should be refined. It was emphasized that the Principles should not be employed by non-epidemiologists in an official capacity; this would be like having epidemiologists evaluate toxicology studies. Assuming the Principles are to be used by epidemiologists to advise non-epidemiologists, the aim should be to make evaluations more systematic and consistent; but the Principles should be advisory, not dogmatic.
5. Laing et al. (1993, overall RR of 3.1)
6. Daling et al. (1994, overall RR of 1.36)
factors in a small subgroup can cause the risk ratio to inflate. This artifact needs to be better understood by investigators.
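The subgroup artifact described above can be illustrated with a short arithmetic sketch (hypothetical numbers, not taken from the Daling study): if a risk-raising confounder happens, by chance, to be over-represented among the exposed in a small subgroup, the crude risk ratio is inflated even though exposure has no effect within either stratum.

```python
# Hedged illustration (hypothetical numbers): a chance imbalance in a
# confounder within a small subgroup inflates the crude risk ratio
# even when the within-stratum RR is exactly 1.0.

def rr(risk_exposed, risk_unexposed):
    return risk_exposed / risk_unexposed

# Assumed stratum-specific baseline risks (confounder present / absent);
# exposure itself has no effect, so the RR within each stratum is 1.0.
risk_c1, risk_c0 = 0.20, 0.05

# By chance, the confounder is over-represented among the exposed
# in this small subgroup (70 of 100 exposed vs 30 of 100 unexposed).
exposed_c1, exposed_c0 = 70, 30
unexposed_c1, unexposed_c0 = 30, 70

crude_risk_exposed = (exposed_c1 * risk_c1 + exposed_c0 * risk_c0) / (exposed_c1 + exposed_c0)
crude_risk_unexposed = (unexposed_c1 * risk_c1 + unexposed_c0 * risk_c0) / (unexposed_c1 + unexposed_c0)

print(round(rr(crude_risk_exposed, crude_risk_unexposed), 2))  # prints 1.63
```

With only 100 subjects per arm, an imbalance of this size is well within the reach of chance, which is the sense in which small-subgroup risk ratios can "inflate" without any real effect.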
7. Lipworth et al. (1995, overall RR of 1.5)
comparable; it is whether they were selected from the same population as the cases in some identifiable fashion, or by a random method.
8. Daling et al (1996, RR in range of 1.2 to 2.0)
9. Lindefors-Harris et al. (1991, response bias study comparing two studies with range of overall RR's of 1.1 to 2.0)
was also a comment that it had been heard that the reporting to this registry was poor.
10. Rookus et al. (1996 study in which authors concluded that regional reporting bias could account for an RR of 1.9)
contraceptive use, one would expect that they would deny any use, rather than distort the duration of use.
11. Melbye et al. (1997 registry study with overall RR of 1.0, and RR of 1.39 at >12 wks.)
12. Brind et al. (1996 meta-analysis with synthetic RR of 1.3)
C. Discussion of Overall Views and Desired Work Product
contending that dose-response and specificity should be considered very important, if not essential, factors in considering causality.10
decision principles, they found there was much more uncertainty than they had previously realized.
10There were further comments exchanged on this subject following the Denver meeting; however, it is not possible to say whether the issues were resolved completely. Some of the epidemiologists clearly attach less significance to some of Hill's factors, such as dose-response and specificity, than this toxicologist does. Several comments by the moderator seem in order here: First, some of the epidemiologists seemed to view "Hill's factors" (they dislike the term "criteria" as connoting hard and fast rules) as referring to dose-response as a simple monotonic upward gradient; whereas the toxicologist, and probably most or all toxicologists, take the view that this factor pertains to some recognizable pattern of dose-response, which might in different cases be, as examples, a "J", "hockey-stick", threshold, or saturation pattern -- but in any event not a random or zig-zag response. Second, some of the epidemiologists seemed to view Hill's factor of specificity as pertaining to a single organ or disease endpoint; whereas the toxicologist is again looking for a recognizable and plausible consistency, which might involve multiple sites with one or more sites predominating -- but again with a pattern rather than a random response. (It seemed to be agreed that exposures to heterogeneous mixtures are a different matter.) There appears to be considerable overlap among "Hill's factors" of dose-response, specificity, consistency, coherence, and biological plausibility, which can lead to difficulties in discussing individual factors. It also appears that there is a need for epidemiologists and toxicologists/pharmacologists to communicate better on these issues, and that the London panel's recommendation to convene multi-disciplinary panels to evaluate the evidence was well-advised. A somewhat detailed discussion of Hill's factors appears in Appendix B of the London report; and there is less specific reference to some of those factors in Appendices D, E, G, and H.