Federal Focus, Inc.

Briefing on Data Access

February 26, 1999

An Industry Perspective on the Proposed Revision

Roger O. McClellan
Chemical Industry Institute of Technology

I am pleased to participate in this briefing on public access to data. Contrary to the published title in the program, I will not provide an industry perspective on the proposed revisions. To do so would be presumptuous for a number of reasons, the most significant of which is that it is a misconception to think there is a monolithic entity called "industry." In my experience, industry is quite a heterogeneous body, whether it is defined by major product, by sales ranging from a few tens of thousands of dollars annually to tens of billions of dollars, or by ownership ranging from sole proprietorships to multinational public stock corporations. In any event, this is not a matter of industry, academics, or government. It is rather a matter of the scientific community and its relationship to the broader public. I offer here my own personal views, based on my experience as a research scientist, as a senior executive of a multidisciplinary team that had been funded by both public and private monies, and, most significantly, on my service on a number of advisory committees reviewing the scientific basis for major federal policy decisions.

My comments are based on five personal beliefs: One, institutions and scientists receiving federal funds receive them not as one-way transactions, but rather as compacts with the public to use the funds to conduct research and develop information that will serve the public good. Two, science is best conducted in an open and transparent atmosphere that includes rigorous peer review and publication of information in scientific journals that are widely disseminated. Three, after publication, investigators should be willing to share primary data sets with appropriate attention given to protecting the rights of individual subjects and potential proprietary interests of the original investigators and their institutions. Four, the scientific process, as we have just heard, is an iterative process, with various significant checks and balances that are applied at the level of individual investigators and their immediate associates, but that also involve interactions between those individuals and other individuals and teams. This process places a premium on the sharing of data to facilitate the validation of key analyses and interpretations, including the development of alternative analyses. Five, federal policies and regulations, and especially those concerning public health and the environment, should be based on the best available scientific information and interpretations.

My personal involvement began nearly a decade ago when I served as Chair of EPA's Clean Air Scientific Advisory Committee, the committee charged by Congress with advising the EPA (Environmental Protection Agency) Administrator on the scientific basis for the National Ambient Air Quality Standards. In the early 1990s, a number of papers appeared in the literature dealing with the association between airborne particulate matter and health effects. Many of the papers used new analytical approaches to analyze very complex data sets on air quality and multiple health parameters. Controversy soon arose when different investigators analyzing similar or very closely related data sets reached very different conclusions. Recognizing that these papers would have a critical role in EPA's criteria document, the position paper that would ultimately be used to revise the National Ambient Air Quality Standard for particulate matter, I, as the past Chairman of the Clean Air Scientific Advisory Committee, and its then Chairman, George Wolf, wrote to Administrator Browner on May 16, 1994, asking the agency to take a lead role in making key data sets on air quality and health responses available for analysis by multiple analytic teams. Certainly there were many, many studies that would appear in the criteria document, but there were certain studies that loomed as very large and central to our deliberations. One of those data sets had been collected by investigators at Harvard and their collaborators. It was those studies that we thought would be useful in 1994. Unfortunately, the agency did not take the leadership role that Dr. Wolf and I had envisioned. Fortunately, the Health Effects Institute (HEI), jointly funded by the EPA and the automotive industry, did step forward to provide leadership for the conduct of analyses by a single, excellent team of investigators from Johns Hopkins University led by Professor Jon Samet. Dr. Wolf and I would have preferred that several teams had been involved in the exercise. The analyses were accepted for publication, and I think played a key role in final decisions made and ultimately the promulgation of a revised particulate matter standard. That action was strengthened by the re-analyses that were done. Those re-analyses are still in progress, nearly five years after our request to Administrator Browner. I am confident that the work will be well done. It will be published, I am confident, in peer-reviewed journals, and I am also confident it will have a critical impact in the next round of review of the National Ambient Air Quality Standards for particulate matter, standards that have wide-ranging potential impacts on public health and on the economy at large.

Why was the process so protracted? I submit that the major problem is that the scientific community - and I repeat - the scientific community has not developed adequate procedures to deal with the issue of critical data sets and their sharing among others in the scientific community, especially when those data bear on important public health decisions. In some sense, we, the scientific community, have abdicated our responsibility. To whom? To the U.S. Congress and to the Office of Management and Budget.

Consider this: Who is best qualified to develop the guidance for sharing data bearing on important public health policy decisions when the data are acquired with public funds, or, for that matter, private funds? The U.S. Congress? What about OMB? If the Congress and the OMB are not the institutions to take the lead role, which is? I submit it is the scientific community. One might ask exactly what is the position of the Congress and the OMB. They have moved to a familiar vehicle - the Freedom of Information Act. From there, they have moved to Circular A-110. I personally do not think that those are the right vehicles for this. However, the Congress and the OMB have stumbled upon familiar tools. Why? In part, because we, the scientific community, did not give them a better field of vision. So, I urge the scientific community to ask the Congress and the OMB to call a "time-out." Give us the opportunity in the scientific community to examine the appropriate processes, to do our jobs before the Congress and the OMB become involved. I believe that we in the scientific community should be willing to accept the responsibility of dealing with this very complex issue. Why? Because it is critically important to individual scientists and investigators, to graduate students, and to research institutions; it is central to the total scientific enterprise. Most significantly, it is important to the American public, and the relationship of the scientific community to the American public.

How can we proceed? First, I think the scientific community must engage in a positive dialogue. There have been some positive suggestions made; but, quite frankly, this is a reactive approach, not a positive, proactive approach to the critical issue of how best to share data. We now are talking about data that will be increasingly important and commonplace in our community as we engage in larger and larger multidisciplinary studies. We will take advantage of the advances in modern molecular biology, informatics, epidemiology, and all the scientific fields to create larger data sets and data sets that have important impacts on our understanding of public health. We must encourage that kind of active dialogue. We need to go further; we need to encourage our various professional organizations to hold meetings to discuss and indeed debate the issues and solutions. And I emphasize the solutions.

Second, provision must be made for the National Research Council and the Institute of Medicine to form a joint committee to review the issue in its broadest context. This committee must develop guidance for access to data that are developed with public funding, for the sharing and re-evaluation of data, and for the ultimate publication of the resulting analyses so they can be used in public policy setting and rulemaking.

This NRC/Institute of Medicine effort could build on previous activities, such as the 1985 NRC committee on sharing research data. Since 1985, many advances have been made in the field of informatics and in the individual scientific fields that produce and introduce new dimensions to the issue of access for data, data sharing, and data re-analysis. In developing these guidelines, I urge the committee to examine the issue in its broadest context. This must include exploring the guidance for access to data related to public health matters, irrespective of whether it was developed with public or private funds. Some industries have indicated a willingness for open sharing of data concerned with public health matters. A good example is the recent announcement by the Chemical Manufacturers Association to make publicly available on the Internet the results of tests on some 3,000 high-production-volume chemicals that will be tested over the next five years. I think they have made a bold statement in promising that kind of public access. It will be an enormous challenge to develop the means for making that available, and most importantly, for the public to be able to understand what has been placed in its hands.

In my opinion, this is an especially critical time for the scientific community to demonstrate leadership in these matters. Why? In part, because those of us in the health community are asking the public to double, over a short period, the investment of public funds in health research.

Q [John Gardenier, National Center for Health Statistics]: Would it be a concern to you to find out that there are extremely adverse ethical consequences that were certainly not intended? There is some potential that the law could force people to violate the contracts under which the data was collected and force the American public to give data to the federal government that they gave to researchers only on the explicit condition that they would not be given to the federal government. Worse than that, once within the federal government, to the extent that it may fall outside the protections of FOIA, that data might be available to the general public, which would be a gross violation of the public's civil rights.

A [Kathy Casey]: I think what you refer to, in part, is existing confidentiality agreements that the agencies have with particular grantees, or joint agreements between grantees who are, in part, privately funded. There are legitimate concerns as to whether there would be some sort of retroactive effect on existing understandings that have now produced data and/or publications as a result. The retroactivity or prospectivity of this provision, and whatever OMB may do in terms of making research data subject to FOIA and/or beyond FOIA, should be addressed.
