Blog archive for April, 2008

A Wellcome future for science

April 28th, 2008 by dwentworth

When he gives talks for research foundations about ways to spur innovation, John Wilbanks often shares the story of John Snow, the anesthesiologist who in the mid-1800s used maps to figure out how a series of cholera epidemics was spreading. By marking the precise locations where the outbreaks occurred, Snow was able to demonstrate that they clustered around water sources, showing that the “morbid poison” was spreading through tainted water.

What does this have to do with modern-day research foundations? If you envision research outcomes as pieces of a map — in biomedical research, a map of the human body — you can easily see the advantages of ensuring that when they are published, they are published openly. A single research paper may not hold the answer to stopping an epidemic or curing a disease; placed in context, however, it could make finding the solution trivial.

On that note, below is the first profile in our series on people and organizations working at the frontiers of open science: a look at the pioneering work of the UK-based Wellcome Trust.

The Wellcome Trust, a global leader among charity organizations, is working to keep the results of the research it funds “widely and freely available to all.” Importantly, it defines this freedom explicitly in terms that embrace the advantages that computers and network technology give us. The most recent update to its position statement on open access encourages, and in cases where it has paid an open access fee requires, that funded research be licensed so that it can be “freely copied and re-used (for example for text and data-mining purposes), provided that such uses are fully attributed” (emphasis mine). This might read as a minor parenthetical; in fact, the explicit freedom to use computer technology to derive value from the literature can help make the difference between having a map and continually, painstakingly redrawing it.

This is just one example of the smart choices the Wellcome Trust has been making to cultivate what it calls a “richer research culture.” Below is a brief overview of the Trust’s trail-blazing work over the past five years (with thanks to Peter Suber for his meticulous documentation of the work in the Timeline of the Open Access Movement):

  • 2003: The Wellcome Trust commissions a report asking how the economics of scientific publishing impact the long-term interests of the research community. The findings are released in tandem with a landmark position statement supporting open and unrestricted access to the published output of research. Writes Suber: “When a foundation awards a research grant, it is showing its belief that the results of that research will be useful to the wider world. With its commitment to open access, WT is showing its belief that open access to those research results will make them even more useful.”
  • 2004: The Trust announces its intention to establish a European PubMed Central, and to require that its grantees deposit an electronic copy of research publications in PubMed Central no later than six months after publication.
  • 2005: The Trust makes history by becoming the world’s first research funding agency to implement an open access mandate.
  • 2007: In January, the UK PubMed Central (UK PMC) is launched, a collaboration among the Trust and nine other leading UK organizations. Wellcome Trust Director Mark Walport promises that the launch is “only the start” of an effort to develop the site as the resource of choice for the international biomedical research community. In the spring, the Trust signals its support for sharing preliminary research and findings, joining the British Library, the European Bioinformatics Institute (EBI) and Science Commons as a partner in Nature Precedings.
  • 2008: The UK PMC holds a workshop on further developing the site. Its workshop report [PDF] hints at future developments to enhance the usefulness of the literature; in a summary of a discussion about text mining, Dr. Sophia Ananiadou explains that semantic markup helps text mining tools work “optimally,” and would allow researchers to “use a simple natural language query that will retrieve specific facts matching that query, rather than just a set of whole documents to be read.”

The Trust shows no signs of slowing down: it continues to push the envelope, making strategy and policy decisions that reflect its ongoing commitment to maximizing the “downstream” impact of research.

We can’t wait to see what’s next.

Science Commons and SPARC release guide for creating open access policies at institutions

April 28th, 2008 by dwentworth

Science Commons and SPARC today released a new guide for faculty who want to ensure open access to their work through their institution.

The how-to guide, Open Doors and Open Minds, is aimed at helping institutions adopt policies, such as the one adopted by the Harvard Faculty of Arts and Sciences in February, that increase the practical exposure of the scholarly works their faculty produce. It provides information on copyright law, offers specific suggestions for licensing options and provides a ten-point list of actions people can take to craft and implement a policy that maximizes the impact of research.

From the SPARC media release:

“The Harvard policy is a recognition that the Internet creates opportunities to radically accelerate distribution and impact for scholarly works,” said John Wilbanks, Vice President of Science at Creative Commons. “As more universities move to increase the reach of their faculty’s work, it’s important that faculty members have a clear understanding of the key issues involved and the steps along the path that Harvard has trail-blazed. This paper is a foundational document for universities and faculty to use as they move into the new world of Open Access scholarly works.”

“Everyone – faculty, librarians, administrators, and other advocates – has the power to initiate change at their institution,” said Heather Joseph, Executive Director of SPARC. “By championing an open access policy, helping to inform your colleagues about the benefits of a policy change, and identifying the best license and most effective path to adoption, it can be done.”

The guide is available both at the SPARC site and in the Science Commons Reading Room.

SPARC Europe and DOAJ launch the SPARC Europe Seal for Open Access Journals

April 25th, 2008 by dwentworth

SPARC Europe and the Directory of Open Access Journals (DOAJ) have announced the launch of the SPARC Europe Seal for Open Access Journals.

The seal is aimed at increasing the usefulness and “discoverability” of open access (OA) journals, clarifying the kinds of reuses that are allowed and using metadata to make the content easier to find. To qualify for the seal, a journal must use the Creative Commons By (CC-BY) license and provide metadata for all of its articles to the DOAJ, which will then make the metadata OAI-compliant. From the media release:

“Legal certainty is essential to the emergence of an internet that supports research. The proliferation of license terms forces researchers to act like lawyers, and slows innovative educational and scientific uses of the scholarly canon,” said John Wilbanks, Executive Director of Science Commons. “Using a seal to reward the journals who choose to adopt policies that ensure users’ rights to innovate is a great idea. It builds on a culture of trust rather than a culture of control, and it will make it easy to find the open access journals with the best policies.”

Bravo to SPARC Europe and DOAJ for setting a standard that can help spur innovation by expanding the zones of legal certainty for research.

You can find additional notes and commentary by Peter Suber at Open Access News.

Nguyen on keeping data open and free

April 23rd, 2008 by dwentworth

In the wake of Creative Commons’ announcement last week that the beta CC0 waiver/discussion draft 2 has now been released, Science Commons Counsel Thinh Nguyen has written a short paper to help explain why we need legal tools like the waiver to facilitate scientific research. Writes Nguyen:

Any researcher who needs to draw from many databases to conduct research is painfully aware of the difficulty of dealing with a myriad of differing and overlapping data sharing policies, agreements, and laws, as well as parsing incomprehensible fine print that often carries conflicting obligations, limitations, and restrictions. These licenses and agreements can not only impede research, they can also enable data providers to exercise “remote control” over downstream users of data, dictating not only what research can be done, and by whom, but also what data can be published or disclosed, what data can be combined and how, and what data can be re-used and for what purposes.

Imposing that kind of control, Nguyen asserts, “threatens the very foundations of science, which is grounded in freedom of inquiry and freedom to publish.” The situation is further complicated by the fact that different countries have different laws for protecting data and databases, making it difficult to legally integrate data created or gathered under multiple jurisdictions. Using a “copyleft” license doesn’t mitigate the difficulty, since any license is premised on underlying rights, and those rights can be highly variable and unpredictable.

Finding a solution to these problems was the impetus behind the Science Commons Open Data Protocol, which Nguyen describes as “a set of principles designed to ensure that scientific data remains open, accessible, and interoperable.” In a nutshell, the idea is to return data to the public domain, “relinquishing all rights, of whatever origin or scope, that would otherwise restrict the ability to do research (i.e., the ability to extract, re-use, and distribute data).” The CC0 waiver and the Open Data Commons Public Domain Dedication and License (PDDL) are tools to help people and organizations do that, implemented under the terms of the Protocol.

Of course, there are many existing initiatives to return data to the public domain. What the Protocol aims to do, however, is bring all of these initiatives together. Explains Nguyen:

What we seek is to map out and enlarge this commons of data by seeking out, certifying, and promoting existing data initiatives as well as new ones that embrace and implement these common principles, so that within this clearly marked domain, scientists everywhere can know that it is safe to conduct research.

You can read the entire paper, Freedom to Research: Keeping Scientific Data Open, Accessible, and Interoperable [PDF], in the Science Commons Reading Room.

Workshop report: strategies for open, permanent access to scientific information

April 21st, 2008 by dwentworth

Last spring, Science Commons participated in a workshop in Brazil aimed at identifying strategies for ensuring open, permanent access to scientific information in Latin America, with a particular focus on access to health and environmental information for sustainable development. Organized by the international Committee on Data for Science and Technology (CODATA) and Brazil’s Centro de Referência em Informação Ambiental (CRIA), the workshop featured sessions on topics ranging from ways to overcome barriers to open access in countries around the world to the challenges of successfully integrating environmental, geospatial and biodiversity data.

The workshop report is now online and available at the conference website. Our thanks to CRIA Director Dora Ann Lange Canhos for passing it along.

Are you part of open science?

April 16th, 2008 by dwentworth

A few weeks ago, I asked you for your ideas for people and organizations to profile here at Science Commons, with the goal of highlighting efforts to open new frontiers for innovation and discovery in science. I got some great responses, including a marvelously detailed, thoughtful email from Valentin Zacharias, a doctoral student and researcher at FZI who identified groups in five broad areas of open science:

  • broad efforts to bring more scientific knowledge online, such as E.O. Wilson’s Encyclopedia of Life project
  • initiatives to define open access (OA) and develop resource sites, such as the Directory of OA Journals (DOAJ) and the Registry of OA Repositories (ROAR)
  • efforts to share pre-publication research — science as it happens — including everything from preprint servers to open notebook projects
  • initiatives to create evaluation mechanisms and bibliometrics
  • efforts to integrate and make scientific content understandable by computers, such as the Semantic Web approaches we use at Science Commons

This is, of course, only the tip of the proverbial iceberg — or to use a more apropos metaphor, the stack. There is an incredibly diverse range of projects that use “open” approaches to building knowledge and accelerating discovery. In the profiles I’ll be publishing here, a connecting thread will be the question of whether and how we can enable independent contributions to feed into one another. Or to put it another way, how do we build an integrated commons of research and tools that’s truly useful for scientists?

I hope you’ll stay tuned. And if you’re part of an open science project and you haven’t yet sent me a pointer, please do. I’d love to hear from you.

One small step for open access…

April 13th, 2008 by dwentworth

NPR’s Science Friday program has now posted its interview with Harold Varmus on the landmark NIH open access mandate, which went into effect this past Monday. Varmus, the former NIH director who co-founded the Public Library of Science (PLoS), talks about what the mandate means for the future of biomedical research, fielding questions about everything from freeing dark data to expanding access to orphan disease research to reclaiming our scientific heritage in the literature.

Among the many other excellent points he makes, Varmus argues that more research funders should adopt open access policies — not only to magnify the impact of the research they fund, but also to open it to innovative uses:

You can imagine that if you were a funder of science anywhere in the world, you would want the results that you paid for to be out there for everyone — not just to see, but to work with. Indeed, the way in which one works with the information is extraordinarily important in this day in which we use the computer to mine research data for new ways to think about things.


In case you missed it earlier this week, here’s Varmus’s PLoS editorial on the mandate: Progress Toward Public Access to Science. For more information about the NIH mandate going forward, including university-sponsored resources for authors, check out SPARC’s NIH implementation page.

caBIG: sharing data to save lives

April 9th, 2008 by dwentworth

The Scientist has a not-to-be-missed piece this month on the National Cancer Institute’s caBIG, the Cancer Biomedical Informatics Grid. The article, Heading for the BIG Time (free registration required), was written by caBIG founder Kenneth Buetow, and serves as an excellent introduction to the reasons why we need a collaborative infrastructure for knowledge sharing in science — one that works, to use Buetow’s phrase, as a “smart World Wide Web” for research.

“From my position as a senior cancer researcher at the NCI, groundbreaking observations and insights in biomedicine are accumulating at a dizzying rate. However, from the perspective of the approximately 1.4 million US patients who will hear their physicians say, ‘You have cancer,’ progress is unacceptably slow,” explains Buetow. “Something needed to be done to expedite the transformation of scientific findings into clinical solutions.”

That’s a daunting challenge, says Buetow, especially given the nature of the disease and the way research is currently done. “Cancer is an immensely complex disease, and in order to get a sense of the big picture, scientists need to combine observations from genomics, proteomics, pathology, imaging and clinical trials,” he says. “There was, however, no systematic way to do this.”

The caBIG solution: figuring out what kinds of data could be shared, and then connecting more than 60 NCI centers of cancer research using a strategy of “standards-based interoperability,” where information is shared and accessed using common standards and tools. The results so far are promising: the caBIG community is growing, and people are already adding new tools that increase the utility of the shared data.

We’re also proud to have recently taken part in the NCI’s ongoing discussions about intellectual property and data sharing. John Wilbanks was a closing panelist at the most recent Data Sharing and Intellectual Capital meeting of the Grid. We look forward to continuing – and deepening – our relationship with the Grid over the coming months.

Design a book cover, protect the public domain

April 7th, 2008 by dwentworth

James Boyle, the new Chairman of the Board at Creative Commons and a founder of Science Commons, is holding a contest to design a cover for his new book, The Public Domain: Enclosing the Commons of the Mind. In the book, Boyle argues that more and more of the material that used to be free to use, without paying a fee or asking permission, is becoming private property, at the expense of innovation, science, culture and politics.

Details, including specs and a link to some great source material for imagery, are available at the Worth1000 website. Both the book and the cover will be distributed under a CC Attribution-NonCommercial license.

Hal Abelson on commons-based problem solving

April 4th, 2008 by dwentworth

If you’re curious about the current state of play in efforts to make knowledge sharing easier so we can solve problems faster, look no further. Science Commons Advisory Board member Hal Abelson — a founding director of Creative Commons, the Free Software Foundation and Public Knowledge — provides a big-picture perspective on these efforts in the latest podcast interview from MIT Libraries in its terrific series on Scholarly Publishing & Copyright.

“The way I first got into this is with software,” explains Abelson in the interview. “Prior to [the early ’80s], if there was a program around, you could contact the author and get a copy, and you could make it better — that was the key thing, to make it better. Around the mid-’80s, that stopped…Then it became kind of clear that this attitude — not sharing your software, and thus jeopardizing any kind of collective enterprise that could exist — was also going to be true for other kinds of copyrighted works, as they increasingly got online.”

You can download the full podcast here [MP3]. If you’d like to subscribe to the series, you can paste the following link into your iTunes or another podcast reader: