April 2, 1999

Mr. F. James Charney
Policy Analyst
Office of Management and Budget, Room 6025
New Executive Office Building
Washington, DC 20503

Dear Mr. Charney:

We are writing to communicate the views of the American Association for the Advancement of Science (AAAS) on the proposed revision to OMB Circular A-110 published in the Federal Register on February 4, 1999. AAAS is the world's largest multidisciplinary science association, with more than 250 affiliated scientific, engineering, and medical societies representing all disciplines of science. We have long supported data access and sharing in science. For example, just this past January, in response to the law that precipitated OMB's proposal, the AAAS Council adopted a resolution stating that "it supports the public disclosure of scientific data that form the evidentiary basis for scientific findings and regulatory decisions, at the appropriate time and with appropriate safeguards…." Despite our long-standing commitment to access and sharing of data, we have deep concerns about the proposed changes to Circular A-110, especially about the use of the Freedom of Information Act (FOIA) as the mechanism for implementing the new requirement.

We acknowledge OMB's efforts to limit the scope of the proposed rule. Nevertheless, the proposed revisions to Circular A-110 represent a fundamental shift in federal policy in a direction that will create serious unintended consequences for scientists, their institutions, federal funding agencies, and the wider public. While the objective of improving the rule-making process to make it more transparent and intelligible to the public is laudable, we believe that the proposed revision is poorly constructed and too vague to achieve that goal.

The revision proposes that "data relating to published research findings produced under an award that were used by the Federal Government in developing policy or rules" be made available to the public under FOIA. This represents an expansion of FOIA to include materials that have not traditionally been considered under the control of the government, whereas a 1980 ruling of the U.S. Supreme Court (Forsham v. Harris, 445 U.S. 169) rejected such expansions of FOIA's mandate. As such, it places new burdens on researchers as well as their institutions with respect to the interpretation of the rule. This has enormous implications for the scientific community and the public's well-being. For these reasons, we would like to express our specific reservations about the proposed revision and, where appropriate and feasible, offer recommendations for addressing our concerns.

How will "Data" be Defined?

As a professional society that includes scientists from all disciplines, we are acutely aware of the differences across scientific fields in the types of data collected. A definition that fails to take into account, for example, the difference between data generated by a survey instrument and perishable data, such as blood and tissue samples or rare fossil remains, will inevitably prove to be disruptive to the course of research and adversely affect the production of valuable knowledge for society. Both NASA and NSF have aggressively promoted openness and sharing in the research they support. Nevertheless, these organizations decided to restrict access to pieces of a Martian meteorite in order to reduce the risk of contamination. This common sense recognition of important differences undermines the notion of a one-size-fits-all definition of data for regulatory purposes. At this point, however, we do not even know what would be included within the meaning of "data."

Given the realities of conducting scientific research across a wide range of disciplines, we recommend that the proposal state explicitly that the definition of data shall be determined as part of the grant negotiating process between federal agencies and grantee institutions. Researchers are entitled to know what is expected of them and the agencies should be authorized to specify what obligations researchers assume for archiving and releasing data when they accept federal funding. This negotiating process will identify appropriately different levels of access based on the sensitivity of the data.

At What Point Must Data be Released?

In the proposed revision the timing of release of data is linked to the publication of research findings. This suggested timing raises a number of ambiguities. What does OMB consider "publication"? In some scientific fields, abstracts presented at scientific meetings are published as part of the conference proceedings, even though the findings may be preliminary or incomplete. Would such an activity trigger the data release requirement? Would posting research findings on a scientist's home page on the World Wide Web be considered a publication? What about longitudinal studies that produce a series of publications over time? Would the first publication based on early data require the release of all data as the study progressed? If not, would grantees be expected to make new releases with each publication?

Our concerns about this matter are sparked by the possibility of the premature release of undocumented and unverified data. In 1985, the National Research Council issued a report that reflects our own views on data release. It declared that "Scientists have a special responsibility to share data as quickly and as widely as possible where the data are or will become relevant to public policy" (Committee on National Statistics, Sharing Research Data, p. 27). But the report then stated: "This recommendation is not intended to support the public release of analyses prior to appropriate review." Good reasons for this caution should be readily apparent.

The premature release of research data before careful analysis and without independent scientific review could increase the risk of disclosing unreliable or misleading findings, perhaps leading to public confusion and bad policy. In longitudinal studies conducted over several years, disclosure of data collected in the early stages may discourage people from participating in the study, or alter their behavior in a way that confounds the study. Moreover, raw data are virtually useless without documentation and interpretation, thus leading us to question how much would be gained by the infusion of massive amounts of raw data into the public arena. We strongly recommend that any reference to published research findings state that "publication" refers to "appearance in a scientific journal after formal peer review."

When are Data Regarded as "Used" in Developing Federal Policy or Rules?

The proposed revision does not define how it will be determined that the government has "used" a federally-funded study to develop a policy or rule. There is a difference between data collected for the express purpose of developing a policy or regulation and data that simply provide background. Yet the proposal offers no guidance on the degree of direct linkage required to prompt release of data. What is the threshold that would trigger the requirement? We recommend that the threshold be any new regulation or policy submitted for public comment; that any such regulation or policy include explicit reference to specific studies used to develop it and that the agency sponsoring the research be so notified; and that the sponsoring agency, in consultation with the regulatory agency, shall determine which data produced under federally-funded grants are relevant and therefore subject to release. These specific proposals would reduce the potential for nuisance requests to agencies and harassment of researchers who may never have intended that their studies would impact policy.

Since policy and science are often inextricably linked, we must also question how scientists can be expected to know whether research done today will be used to develop future policies. How long will researchers have to retain their data? This issue is further clouded because the proposed revision is unclear about whether it will be applied retroactively. Will the release requirement be made applicable to all existing research data used to establish past rules or policies? If not, how will data covered by the revision be distinguished from earlier data in ongoing research projects that include both types? This is not a trivial matter, since many federally-funded laboratories are conducting research initiated decades ago. Either way the law is interpreted, a costly administrative burden will be incurred.

How will Reasonable Fees for Covering Costs be Determined and Allocated?

Both the legislation and OMB proposal allow for a "reasonable fee" to cover the "cost of obtaining the data." But both are conspicuously silent on how the fees will be determined and apportioned among the agencies, researchers, and their institutions. Indeed, it is the agencies that may charge the fees without any requirement that these fees be shared with those who bear the burden of archiving and preparing the data for release. If the requirement for the public release of data is implemented, then AAAS strongly recommends the inclusion of a cost-recovery system ensuring that grantees are appropriately reimbursed for the costs associated with archiving and releasing research data to the public.

What will be the Effects of the OMB Proposal on Collaboration and Participation in Research?

One of the strengths of the research enterprise in the United States is its partnerships: among scientists, between scientists in academe and those in industry, between U.S. scientists and foreign partners, and between researchers and volunteer subjects. We worry about the effects of the OMB proposal on those relationships and the consequences for science. Most of the foreseeable problems in this regard stem from the limitations of FOIA as a mechanism for making grantee research data available to the public.

Data and funding associated with a particular study can originate from many sources in addition to a federal grant. For example, they may originate with industry or foreign partners, or with collaborating researchers who have their own institutional funding. Once commingled, it may be difficult to distinguish data produced with federal funds from those produced with other funds. We are concerned that such partners may grow reluctant to enter or to continue a collaboration that could lead to the public release of data they would prefer to disclose at their discretion. Despite the exemption in FOIA that protects "commercial or financial information," the ambiguity associated with determining which data in a university-industry partnership would be subject to release is likely to make industry nervous about pursuing such collaborations. This result would, of course, be contrary to the objectives of the Bayh-Dole Act (P.L. 96-517), enacted in 1980 to spur the commercialization of research results by granting patent rights to universities for inventions developed with federal funds.

Our nation owes a great deal to those who have voluntarily participated as subjects in research done to increase our knowledge of human biology and behavior. Although FOIA exempts "the disclosure of [information] that would constitute a clearly unwarranted invasion of personal privacy," there is considerable fear among scientists and funders that this exemption may not be sufficient to offer adequate assurances of protection to research subjects. There are several reasons for these concerns.

Under FOIA, it is the agency--not the scientists who interact with research subjects--that would determine what data to mask in order to protect personal privacy. Subjects might be less than forthcoming in the details they reveal, or even reluctant to participate at all, if they thought that the federal government, let alone members of the public, might have access to their data. It is likely, for example, that it would become far more difficult to recruit participants for clinical studies on self-destructive or dangerous behaviors such as the use of illegal substances or violence.

FOIA's exemption is limited in another way. In much research, the focus of study is not individuals but institutions or communities, which are not protected under FOIA. Furthermore, once one identifies the institution or community in which research subjects are patients or in which they reside, only a short additional step leads to a personal identification. If research subjects lose faith in the ability of scientists to protect their privacy, we risk losing an indispensable source of knowledge.

How will the OMB Proposal Protect Sensitive Research Data?

In addition to proprietary and human subjects' data, other types of research data, if disclosed, could also adversely affect the conduct of science. Data released to the public that could lead to the identification of historically and scientifically valuable archeological sites could invite looting and destruction. Similarly, data enabling one to identify the location of rare botanical species outside the United States could lead to unwanted bioprospecting and could damage the relationship between researchers and the host community. There appears to be no mechanism through FOIA to protect such identification from those determined to use released data to serve their private interests.

Next Steps

In light of the complexity and uncertainty associated with the new law and OMB's proposal, we recommend that a study be conducted of the issues raised by the scientific community and of alternative approaches to achieving public access to research data in a way that balances the public's right to know with safeguards for the conduct of science. AAAS is prepared to offer its services in conducting such a study; the National Research Council or the General Accounting Office may also be well positioned to undertake this effort.

We agree with the recent statement by the National Science Board ("On the Sharing of Research Data," February 23, 1999) that "Current sharing practices promote free and open exchange of research data in a context that supports the rapid creation of knowledge,…." Both NSF and NIH are well known for their aggressive data sharing policies affecting grantees. Indeed, NIH announced late last year that it plans to spend $100 million over the next five years to secure public access to genetic data (Science, December 11, 1998, p. 1967). Scientific societies have incorporated expanded data sharing provisions into their professional standards. For example, the Code of Ethics of the American Sociological Association states that "Sociologists share data and pertinent documentation as a regular practice" and that data sharing should be anticipated "as an integral part of a research plan…." Scientific journals also play an important role in fostering data sharing. In 1998, for example, both Science and Nature, declaring that "the interests of the scientific community and the public are best served by making freely available the data on which the ideas in our papers are based," began to require "unrestricted release" of certain types of data "on or before the date of publication for all new manuscripts…."

Those are only a few examples of current practices. The study we recommend should examine these and related practices in order to determine whether models exist among them that, if modified or expanded, could serve the objectives of the new law without unleashing the problems that we identified above. AAAS recommends that the comment period be deferred until such a study is completed. At a minimum, OMB should extend the period of comment in view of the complex issues involved and the absence of any public discussion at the time the law was enacted.

We hope that these comments are helpful and we appreciate your giving them careful attention. AAAS is prepared to work with you and others to achieve a policy acceptable both to lawmakers and to the scientific community.


M.R.C. Greenwood
Chair, Board of Directors, AAAS
Chancellor, University of California, Santa Cruz

Stephen Jay Gould
President, AAAS
Alexander Agassiz Professor of Zoology, Harvard University

Mary L. Good
President-elect, AAAS
Managing Member, Venture Capital Investors, LLC



