Towards a Science Commons

"…but there are also huge amounts of data that do not need to be kept behind walls. And few organizations seem to be aware that by making their data available under a Creative Commons license … they can stipulate both rights and credits for the reuse of data, while allowing its uninterrupted access by machines."

— Nature, vol. 438, December 2005

A brief history of why Creative Commons launched the Science Commons project.

The sciences depend on access to and use of factual data. Powered by developments in electronic storage and computational capability, scientific inquiry is becoming more data-intensive in almost every discipline. Whether the field is meteorology, genomics, medicine or high-energy physics, research depends on the availability of multiple databases, from multiple public and private sources, and their openness to easy recombination, search and processing.

Traditions in intellectual property

In the United States, this process has traditionally been supported by a series of policies, laws and practices that were largely invisible even to those who worked in the sciences themselves.

First, US intellectual property law (and, until recently, the law of most developed countries) did not allow for intellectual property protection of “raw facts.” One could patent the mousetrap, not the data on the behavior of mice, or the tensile strength of steel. A scientific article could be copyrighted; the data on which it rested could not. Commercial proprietary ownership was to be limited to a stage close to the point where a finished product entered the marketplace. The data upstream remained free for all the world to use.

Second, US law mandated that federal government works, even those that might otherwise qualify for copyright, fall immediately into the public domain — a provision of great importance given the massive governmental involvement in scientific research. More broadly, the practice in federally funded scientific research was to encourage the widespread dissemination of data at or below cost, in the belief that, like the interstate highway system, this provision of a public good would yield incalculable economic benefits.

Third, in the sciences themselves, and particularly at universities, a strong sociological tradition — sometimes called the Mertonian tradition of open science — discouraged the proprietary exploitation of data (as opposed to inventions derived from data) and required as a condition of publication the availability of the datasets on which the work was based.

Innovation in technology and legal friction

Each of these three central tenets evolved from concepts that existed even before the industrial revolution — at the innately slow rate of change of the legal system. Similarly, scientific publication has a long-standing tradition. Modern technologies, especially the evolving use of the World Wide Web as a library, have forever changed the mechanisms for delivery and replication of documents. In many fields, results are published nearly as quickly as they are discovered. But copyright law has evolved at a different rate. Progress in modern technology, combined with a legal system that was crafted for the analog era, is now having unintended consequences. One of these is a kind of legal “friction” that hinders the reuse of knowledge and slows innovation.

To counterbalance these trends, a large and vibrant global community has formed to support open access for scientific literature — that is, making it “digital, online, free of charge, and free of most copyright and licensing restrictions.” Major research foundations, such as the Wellcome Trust and the Howard Hughes Medical Institute, have adopted groundbreaking policies requiring open access to research results. The US National Institutes of Health now requires open access to the research it funds. Faculty at educational institutions are passing their own resolutions to ensure that their work is published openly. And most major journals have granted authors the right to self-publish versions of their peer-reviewed papers.

Yet in many cases the legal questions remain unsettled. How can we facilitate reuse of research while ensuring that authors and publishers retain attribution? What’s the best way to enable the integration of data collected under different jurisdictions? What kind of legal and policy infrastructure do we need to ease the transfer of materials necessary for verifying results and extending research?

The different rates of change between modern technology and the law create friction in other places as well. For example, in the genetic realm, patent law has moved perilously close to granting an intellectual property right over raw facts — the Cs, Gs, As and Ts of a particular gene sequence. In other areas, complex contracts of adhesion create de facto intellectual property rights over databases, complete with “reach-through agreements” and multiple limitations on use. Legislatively, the EU has adopted a “database right” that does, in fact, accord intellectual property protection to facts. This upends one of the most fundamental premises of intellectual property: that one could never own facts or ideas, only the inventions or expressions yielded by their intersection.

The federal government’s role is also changing. Under the important, and in many ways admirable, Bayh-Dole statute, researchers using federal funds are encouraged to explore commercial use of their research. Universities have become partners in developing and reaping the fruits of research. This process has yielded remarkable results, converting raw, basic science into useful products in many industries. But in some cases the quest to commercialize has moved upstream, to the fundamental levels of research and data, and that has created complex legal requirements. The legal issues may be complicated when the intellectual property is a novel “method” for assaying biological activity, but they are even thornier for patents covering genes, proteins and their inferred functions.

The sheer cost of legal work can take research “out of play” — simply because the lawyering can cost more than the product might earn on the open market. This stifles scientific innovation: the value of scientific information increases exponentially when it is connected with other scientific information, and is of the least possible value when segregated by law.

The search for a solution

These facts have not gone unnoticed. Numerous scientists have pointed out the irony that, at a time when we have the technologies to permit global access and distributed processing of scientific data, legal restrictions are making it harder to connect the dots. Learned societies such as the National Academy of Sciences, federal granting agencies like the National Science Foundation and other groups have all expressed concern about these trends. Any solution will need to be as complex as the problem it seeks to solve, which is to say it will be interdisciplinary, multinational and involve both public and private initiatives.

Science Commons

Science Commons applies the Creative Commons philosophy and methodology to the world of scientific research. We leverage the existing CC infrastructure: open licenses for copyrighted works, commons deeds and metadata, international affiliations and more. We also extend that infrastructure into science where appropriate — creating CC-inspired contract suites for materials like cell lines or mice, or building open source platforms for knowledge management and data visualization. Our role is part social engineering group for scientists and part attorney arguing on behalf of the public domain in science. Our aim is to clear the legal and technical pathway for accelerating discovery worldwide.

Learn more about our projects to open and mark research and data for reuse; to streamline and automate the materials transfer process; and to integrate data from disparate sources.