Briefing on Data Access
February 26, 1999
The University Community and the A-110 Proposed Revision
Mary Ellen Sheridan
University of Chicago
I feel that first I should give a disclaimer that as we move
down the panel, there is less and less original content.
I have been asked to comment on the potential impact within
the academic research community of Senator Shelby's amendment and OMB's proposed
implementation of this statute. First, let me firmly state that the university
scientific community is the strongest voice for data access, for the right
to use data in an unfettered way, and for the public dissemination of new
knowledge. From the standpoint of research administrators, we spend most of
our time negotiating so that we ensure that privilege, and that we uphold
that academic principle when we interact with industry, the federal government,
or any private sponsor. Hence, the remarks of caution that I will give should
not be equated with constructing barriers to effective public access to scientific
research. However, the sweeping language of the statute raises grave concerns
among our academic researchers about the need to balance the integrity of
the scientific process and inquiry with public accountability. The
generation of new knowledge is a delicate but rigorous process. What we remember
about the essence of science and how we learned the scientific method underlie
the issues that we will raise about the integrity of the process. Because
this is so critical to us, both as scientists and in our duty as educators
to foster the public understanding of science, including measures that safeguard
the integrity of the process, it may be useful to restate those fundamentals.
The traditional steps of the scientific method are these: formulation of a
hypothesis, development of rigorous experiments with controls and variables,
collection of data, analysis of that data against the hypothesis, and a process
of refinement and reformulation of the hypothesis. Scientific research is
hard work. Collection of data is not a singular event; it is a process with
checks and counterchecks, scrupulous attention to detail, and exacting measurements.
It involves replication; it involves reproducibility. The interpretation of
results is an evolutionary process. It does not happen just once and then
disappear.
Let us not be confused. The storage of data, access to stored
data, and understanding of that data are not fungible. Data alone are not
science. Data in isolation, in fact, are fairly fragile. Premature data are
even more fragile. Hence, the benefit of scientific inquiry is dependent on
robust data. The dissemination of new knowledge is the vigorous defense of
conclusions. Others have commented on the process by which scientists begin
to discuss their data, the informal sharing of results among colleagues, posters
and presentations, and professional meetings. Gordon conferences are an excellent
example in which attendees sign a statement that they will not provide any
public disclosure of the information. This is probably the best of opportunities
for scientists to speculate on what is happening.
And science is a speculative process. I am concerned about the
informal, but regimented, annual reports that are required by federal agencies
of their grantees. Our scientists are more often than not using words such
as "we have preliminary indications," "we have preliminary data that support
our hypothesis," or "we have evidence that." If "publication" were interpreted
to mean those windows to program officers and agencies of the excitement indicated
by the process of science, we would have devastating outcomes. Probably the
most devastating of them would be silence, because investigators would feel
that they could not share with the agencies that had funded them what was
happening in the laboratory; that it would be premature. And so, we are extremely
concerned about how "publication" could or would be interpreted.
There is art in science. There are subjective judgments. Sometimes
those become that leap of faith that garners medals, and sometimes they are
based on sloppy or misguided thinking that leads to cold fusion fiascoes.
The submission for publication is the reality check. That is the bedrock of
academic science and biomedical research: the submission of information when
scientists feel confident enough that they have the valid data, the appropriate
conclusions, or when their preliminary tests have been satisfied in conversations
with colleagues in Gordon conferences. The test of judgment of science is
manifested in the peer vetting and publication process. This validates the
methodology. It examines and interprets the data, gives verifications of reasonableness,
and provides significance to the results. We are extremely concerned that
the danger of early intervention into this scientific process is entirely
possible under both this statute and the implementation, as well intentioned
as they may be. Incomplete data clearly will be subject to error. Data point
17 may be of little consequence in the long run, and in longitudinal data,
early indicators may not be borne out as valid evidence of conclusions that
could be drawn. We are concerned that lack of expertise in properly interpreting
data may lead to very damaging misinterpretations of science. Incomplete outcomes
have led, and may continue to lead in the future, to public harm and public
loss of confidence in the credibility of scientific research. We are also
concerned that early intervention and interception of data could lead to mischief
and allegations of scientific misconduct. I think the potential for the interruption
of the research process and the dissemination of peer-reviewed science is
the leading concern of the academic community.
We do have other concerns. There are collegial competitors within
the academic community, within hospitals and biomedical research. Most are
well behaved, and the chase is for the best quality and productivity of our
laboratories. We fear, however, the unscrupulous people who would take advantage
of another's insights, creativity in concept, methodology and hard work to
leapfrog to the head of the pack. It could happen among our academic communities.
We are probably more gravely concerned about the opportunity for commercial
organizations to both use FOIA, and to be taken advantage of by FOIA.
First of all, the private sector could use the FOIA tool for
raiding basic research and scientific data that are being federally funded.
We are concerned that the FOIA tool could sanction a new type of corporate
welfare, franchising the university community for industrial outsourcing of
basic research through federal dollars. Second, we are concerned that scientists
would have no recourse but to confine their federal research to the conservative
and the mundane, and to take their most inventive and cutting-edge research
outside federal reach. Not to their basement laboratories; that is probably
unlikely to happen. Scientists cannot typically afford to do that. However,
they may look for sources of support for the most leading-edge, creative research
that will assure that they do not open their research to these FOIA interventions.
The other commercial-sector concern we have is the disruption
of the partnerships that we have built with industry. Federal agencies, which
advocate public/private/university partnerships, may find these irreversibly
damaged if the private sector participants believe that corporate information
that they have shared with and brought to these partnerships would be accessible
through FOIA.
We believe there is potential harm to students. Graduate students
would be working for years on dissertations, only to discover that the area
of research on which they are working has become of interest. A public interest
group or the private sector could reach through into that data, undermine
the students' dissertations and cause them problems with respect to getting
their dissertations published in peer-reviewed journals, which may claim that
there has already been a public exposure of the information and that they
would not be able to publish the dissertation.
I respect Senator Shelby's opinion in selecting FOIA as the
mechanism for data sharing. Note, however, that the FOIA legislation and its
exemptions were not designed to apply to the data held by grantees. The National
Institutes of Health's implementation of FOIA was that grantee organizations
are the intended possessors of the data generated with federal support, and
in fact, their FOIA instructions exempt data held by grantees from FOIA.
FOIA was intended to access records and other information in the hands of
the federal agencies. Our concerns about the selection of FOIA as an instrument
are, first, that FOIA is intended to govern the actions of the agency. There is
no opportunity for a grantee organization to ensure that the data are transferred
to the public appropriately, or that promises of confidentiality given in
good faith to human subjects, the private sector, or international collaborators
would be honored by the agency. Under this statute, all of the decision-making
authority for determining how and which data are shared is transferred from
the university or the grantee organization to the federal agency, along with
the raw data. Even a well-intentioned and honorable agency could find the
exemption for medical records or privacy being challenged in its applicability
to research data with dreadful consequences. The loser would not only be the
biomedical researchers, but the public, as volunteers step away from participating
in clinical studies and other informing and enriching research that is dependent
upon human subjects. Just imagine yourself reading an informed consent form
that would now be required to read, "This research is funded by the federal
government. Therefore, information about your participation in this project
may be available to requestors through the Freedom of Information Act."
What about data already in existence where confidentiality commitments
would surely be breached in the mere transfer of the data to the federal agency?
We have grave concerns about the reach-through to already existing data that
were collected under assurances of confidentiality or assurances of respect
for proprietary information that were given to our collaborators, whether
they are human subjects or whether they are actual participants in the research
project with us. The exemption for the protection of business and commercial
information may be weak in its applicability to universities as nonprofit
organizations. Yet, the intellectual property that is developed in the course
of federally sponsored research does have commercial potential. Public dissemination
through FOIA of preliminary data may jeopardize the protection of that intellectual
property through patenting and, moreover, may damage the ultimate transfer
of that technology for commercial development and the obvious subsequent public
benefit.
We are concerned about the definition of "data." We must expand
beyond the laundry list of types of data that NIH has to include all types
of geophysical science data, astronomical data, and engineering data. The
type of data that is available in scientific research is unbounded in the
way in which information can be gathered and recorded. This is the way it
exists in our laboratories, in our university offices, and in our researchers'
offices. There are already policies and guidelines from the federal agencies
that are the chief sponsors of basic research that give direction for accessibility
of tools, research material, and technologies that reflect our sponsors' expectations
of grantees. These guidelines are published, and they influence the transfer
of materials and the sharing of data within those research communities. These
are good examples of the way in which data sharing policies can and should
be developed. The selection of FOIA is not the way to do that.
It is very common that journals have expectations of data access
and data sharing. More and more journals are doing that. It is one way to
ensure that there is an opportunity to access data and to vet it in different
ways than perhaps the researchers themselves have interpreted it. We are concerned
about the articulation of what is in the public's interest. We talk about
public accessibility, the public good of data sharing; we are concerned that
really it is the benefits of science that are in the public's best interest.
The academic community is very strong and very consistent in its defense of
free and open publication. We want data to be free and open. We want people
to be able to see the data, but to see the data that are fit to be seen. Publication
and peer-reviewed journals are the benchmark in most instances, but very long-term
research may need other markers. Continuously evolving longitudinal databases
pose different questions that will need different answers about the appropriate
way to share data.
Further, when scientific data are used for the underpinning
of regulatory rulemaking that has great economic impact, other protections
may be in the public interest, and they should apply. These are probably best
governed by the governmental agencies and those who funded the research in
the first place. The validation of results and the defense of interpretation
are the sponsor's responsibilities, as it bases its rulemaking and new policies
and studies on such scientific outcome.
Remember that, prior to the Bayh-Dole Act, results were more commonly
in the public domain. However, the societal benefits were substantially retarded
because of that, and in recognizing that the grantee organization owned the
data and could protect the intellectual property that is associated with that
data and commercialize it, we were actually achieving a public good, and not
necessarily becoming more secretive with data. Academic institutions are analyzing
the commercial potential of data. Bayh-Dole alone did not bring that out.
I think the biotechnical revolution raised the consciousness of many scientists
as to the commercial potential of basic research. This raises interesting
questions. The way we answer these questions must take into account the integrity
of the scientific process and the public good.
The cost is not merely financial, for it is not merely a matter
of photocopying laboratory notebooks or databases. The cost comes in making
the data understandable in the documentation and in diverting scientists from
productive science to the transport of raw data. Who will be responsible for
transmitting data and educating the requestor when the data are undecipherable
in their raw form? How would sponsors - and I do not mean the federal government;
many of these are partnerships - feel about how we had identified and appropriately
protected the data accessible through FOIA?
We are very concerned about the diversion of the grantee resources
away from the actual conduct of science to the presentation of data suitable
for public review. This intrusion of a regulation into an orderly process
of scientific productivity is extraordinarily costly. The effort required
by our senior faculty and key investigators to appropriately respond to these
FOIA requests will be enormous. I think it will be beyond the cost.
There may be some recourse. We appreciate that OMB, in reading
the statute, listened to our concerns and that it attempted to formulate an
implementation that addressed publication, rulemaking, and policy use. However,
we are convinced, as time passes, that the best-intentioned implementation
will not adequately protect us when we begin with a bad statute. We believe
that OMB should take the opportunity now to use its full powers of deliberation
and turn to agencies and external bodies for additional guidance. This is
not something that will be resolved in the short term. We know that there
are opportunities and needs for data sharing policies to be developed. OMB
could make adequate provision for grantee organizations to have a key participatory
role in the determination of which data are turned over to federal agencies.
It cannot be raw data that we have committed to keep confidential. OMB must
find a way to link the grantee organizations and the federal sponsoring agencies
if there is any mechanism for this. The link is not through FOIA. Thank you.