Weblog

Nguyen on keeping data open and free

April 23rd, 2008

In the wake of Creative Commons’ announcement last week that the beta CC0 waiver/discussion draft 2 has now been released, Science Commons Counsel Thinh Nguyen has written a short paper to help explain why we need legal tools like the waiver to facilitate scientific research. Writes Nguyen:

Any researcher who needs to draw from many databases to conduct research is painfully aware of the difficulty of dealing with a myriad of differing and overlapping data sharing policies, agreements, and laws, as well as parsing incomprehensible fine print that often carries conflicting obligations, limitations, and restrictions. These licenses and agreements can not only impede research, they can also enable data providers to exercise “remote control” over downstream users of data, dictating not only what research can be done, and by whom, but also what data can be published or disclosed, what data can be combined and how, and what data can be re-used and for what purposes.

Imposing that kind of control, Nguyen asserts, “threatens the very foundations of science, which is grounded in freedom of inquiry and freedom to publish.” The situation is further complicated by the fact that different countries have different laws for protecting data and databases, making it difficult to legally integrate data created or gathered under multiple jurisdictions. Using a “copyleft” license doesn’t mitigate the difficulty, since any license is premised on underlying rights, and those rights can be highly variable and unpredictable.

Finding a solution to these problems was the impetus behind the Science Commons Open Data Protocol, which Nguyen describes as “a set of principles designed to ensure that scientific data remains open, accessible, and interoperable.” In a nutshell, the idea is to return data to the public domain, “relinquishing all rights, of whatever origin or scope, that would otherwise restrict the ability to do research (i.e., the ability to extract, re-use, and distribute data).” The CC0 waiver and the Open Data Commons Public Domain Dedication and License (PDDL) are tools to help people and organizations do that, implemented under the terms of the Protocol.

Of course, there are many existing initiatives to return data to the public domain. What the Protocol aims to do, however, is bring all of these initiatives together. Explains Nguyen:

What we seek is to map out and enlarge this commons of data by seeking out, certifying, and promoting existing data initiatives as well as new ones that embrace and implement these common principles, so that within this clearly marked domain, scientists everywhere can know that it is safe to conduct research.

You can read the entire paper, Freedom to Research: Keeping Scientific Data Open, Accessible, and Interoperable [PDF], in the Science Commons Reading Room.

Workshop report: strategies for open, permanent access to scientific information

April 21st, 2008

Last spring, Science Commons participated in a workshop in Brazil aimed at identifying strategies for ensuring open, permanent access to scientific information in Latin America, with a particular focus on access to health and environmental information for sustainable development. Organized by the international Committee on Data for Science and Technology (CODATA) and Brazil’s Centro de Referência em Informação Ambiental (CRIA), the workshop featured sessions on topics ranging from ways to overcome barriers to open access in countries around the world to the challenges of successfully integrating environmental, geospatial and biodiversity data.

The workshop report is now online and available at the conference website. Our thanks to CRIA Director Dora Ann Lange Canhos for passing it along.

Are you part of open science?

April 16th, 2008

A few weeks ago, I asked you for your ideas for people and organizations to profile here at Science Commons, with the goal of highlighting efforts to open new frontiers for innovation and discovery in science. I got some great responses, including a marvelously detailed, thoughtful email from Valentin Zacharias, a doctoral student and researcher at FZI who identified groups in five broad areas of open science:

  • broad efforts to bring more scientific knowledge online, such as E.O. Wilson’s Encyclopedia of Life project
  • initiatives to define open access (OA) and develop resource sites, such as the Directory of OA Journals (DOAJ) and the Registry of OA Repositories (ROAR)
  • efforts to share pre-publication research — science as it happens — including everything from preprint servers to open notebook projects
  • initiatives to create evaluation mechanisms and bibliometrics
  • efforts to integrate and make scientific content understandable by computers, such as the Semantic Web approaches we use at Science Commons

This is, of course, only the tip of the proverbial iceberg — or to use a more apropos metaphor, the stack. There is an incredibly diverse range of projects that use “open” approaches to building knowledge and accelerating discovery. In the profiles I’ll be publishing here, a connecting thread will be the question of whether and how we can enable independent contributions to feed into one another. Or to put it another way, how do we build an integrated commons of research and tools that’s truly useful for scientists?

I hope you’ll stay tuned. And if you’re part of an open science project and you haven’t yet sent me a pointer, please do. I’d love to hear from you.

One small step for open access…

April 13th, 2008

NPR’s Science Friday program has now posted its interview with Harold Varmus on the landmark NIH open access mandate, which went into effect this past Monday. Varmus, the former NIH director who co-founded the Public Library of Science (PLoS), talks about what the mandate means for the future of biomedical research, fielding questions about everything from freeing dark data to expanding access to orphan disease research to reclaiming our scientific heritage in the literature.

Among the many other excellent points he makes, Varmus argues that more research funders should adopt open access policies — not only to magnify the impact of the research they fund, but also to open it to innovative uses:

You can imagine that if you were a funder of science anywhere in the world, you would want the results that you paid for to be out there for everyone — not just to see, but to work with. Indeed, the way in which one works with the information is extraordinarily important in this day in which we use the computer to mine research data for new ways to think about things.

Amen.

In case you missed it earlier this week, here’s Varmus’s PLoS editorial on the mandate: Progress Toward Public Access to Science. For more information about the NIH mandate going forward, including university-sponsored resources for authors, check out SPARC’s NIH implementation page.

caBIG: sharing data to save lives

April 9th, 2008

The Scientist has a not-to-be-missed piece this month on the National Cancer Institute’s caBIG, the Cancer Biomedical Informatics Grid. The article, Heading for the BIG Time (free registration required), was written by caBIG founder Kenneth Buetow, and serves as an excellent introduction to the reasons why we need a collaborative infrastructure for knowledge sharing in science — one that works, to use Buetow’s phrase, as a “smart World Wide Web” for research.

“From my position as a senior cancer researcher at the NCI, groundbreaking observations and insights in biomedicine are accumulating at a dizzying rate. However, from the perspective of the approximately 1.4 million US patients who will hear their physicians say, ‘You have cancer,’ progress is unacceptably slow,” explains Buetow. “Something needed to be done to expedite the transformation of scientific findings into clinical solutions.”

That’s a daunting challenge, says Buetow, especially given the nature of the disease and the way research is currently done. “Cancer is an immensely complex disease, and in order to get a sense of the big picture, scientists need to combine observations from genomics, proteomics, pathology, imaging and clinical trials,” he says. “There was, however, no systematic way to do this.”

The caBIG solution: figuring out what kinds of data could be shared, and then connecting more than 60 NCI centers of cancer research using a strategy of “standards-based interoperability,” where information is shared and accessed using common standards and tools. The results so far are promising: the caBIG community is growing, and people are already adding new tools that increase the utility of the shared data.

We’re also proud to have recently taken part in the NCI’s ongoing discussions about intellectual property and data sharing. John Wilbanks was a closing panelist at the most recent Data Sharing and Intellectual Capital meeting of the Grid. We look forward to continuing and deepening our relationship with the Grid over the coming months.

Design a book cover, protect the public domain

April 7th, 2008

James Boyle, the new Chairman of the Board at Creative Commons and a founder of Science Commons, is holding a contest to design a cover for his new book, The Public Domain: Enclosing the Commons of the Mind. In the book, Boyle argues that more and more of the material that used to be free to use, without paying a fee or asking permission, is becoming private property, at the expense of innovation, science, culture and politics.

Details, including specs and a link to some great source material for imagery, are available at the Worth1000 website. Both the book and the cover will be distributed under a CC Attribution-NonCommercial license.

Hal Abelson on commons-based problem solving

April 4th, 2008

If you’re curious about the current state of play in efforts to make knowledge sharing easier so we can solve problems faster, look no further. Science Commons Advisory Board member Hal Abelson — a founding director of Creative Commons, the Free Software Foundation and Public Knowledge — provides a big-picture perspective on these efforts in the latest podcast interview from MIT Libraries in its terrific series on Scholarly Publishing & Copyright.

“The way I first got into this is with software,” explains Abelson in the interview. “Prior to [the early ’80s], if there was a program around, you could contact the author and get a copy, and you could make it better — that was the key thing, to make it better. Around the mid-’80s, that stopped…Then it became kind of clear that this attitude — not sharing your software, and thus jeopardizing any kind of collective enterprise that could exist — was also going to be true for other kinds of copyrighted works, as they increasingly got online.”

You can download the full podcast here [MP3]. If you’d like to subscribe to the series, you can paste the following link into iTunes or another podcast reader: http://feeds.rapidfeeds.com/6772/

Voices from the future of science

April 2nd, 2008

Over the past few months, you may have noticed that some of the posts here have been attributed to a mysterious “dwentworth.” That’s me — Donna Wentworth — and I’m here to start bringing more of your voices to the Science Commons blog.

The introduction may seem a little late, but it’s for good reason: I’ve had a lot to learn. I’ve been writing about innovation and the net for ten years now, first at Harvard’s Berkman Center for Internet & Society and Corante, and then at the Electronic Frontier Foundation and Google. I’ve been a supporter of Creative Commons from the very beginning, as well as a fan of the eloquent James Boyle. Back in the fall, when I stumbled on a video of Jamie’s presentation at Google on Science Commons, I was riveted. Here, I thought, is where Creative Commons can make a difference on a whole new level, where innovative ways of licensing and sharing knowledge could actually end up saving lives.

A talk with my old friend from the Berkman Center, John Wilbanks, who now leads Science Commons, got me even more excited. Always full of infectious enthusiasm, John made it easy for me to see the possibilities for the “open science” movement, where a series of small but important changes could set in motion a profound transformation in the way research is carried out. I decided to join the SC staff to see what I could do to help, including bringing more people into a common discussion about what the next steps should be. But before jumping in, I needed to take a look around, see where the conversations were already happening and figure out how they were being translated into action.

Here’s where you come in. If you’re reading this blog, chances are you’re passionate about the future of science. You may even be a part of the vast, incredibly diverse community of people who actually make science happen: scientists, publishers, research company representatives, research foundation officers, computer scientists, entrepreneurs, librarians and more. Some of you may be bloggers yourselves, who track developments in your area of science and have ended up at Science Commons once or twice.

My hope is that you’ll join me in turning up the volume on the conversation surrounding open science. As part of this effort, I’m going to start profiling individuals and organizations working to open new frontiers for innovation and discovery in science. I am also building a community blog roll — or a public aggregator, if that works better — for open science. The goal isn’t to endorse particular viewpoints or blogs, but instead to showcase the work that’s already being done to midwife a new way of sharing and building scientific knowledge, as well as to start identifying ways we can all work together.

With more open research coming online, the freedom to integrate information from disparate sources, and the ability to use computers to sort through and make sense of it, scientists will be more empowered than ever to find the answers they’re looking for. Let’s figure out how we can get there faster.

Please take a few minutes to send me an email or add a comment to this post with your choices for people and organizations to profile here at Science Commons, as well as your favorite blogs and other resources on open science. I look forward to hearing from you.

Before the boom: 13 percent of cancer literature is free

March 31st, 2008

Heather Morrison, who’s been tracking the growth of open access to medical literature, has posted baseline figures for the percentage of literature on cancer that’s freely available online in full text format, pre-NIH mandate:

Cancer:
13% of the literature in PubMed on cancer links to Free Fulltext.
By publication date range:
7% – within last 30 days
10% – within the last year
17% – within the last two years
21% – within the last 10 years

Opening access is the foundation for making the medical literature useful in the digital era, facilitating machine-assisted research using Semantic Web technologies — something that will become even more critical once the mandate goes into effect and the percentage of open literature starts rising.

(Hat tip to Gavin Baker @ Open Access News)

What can universities do to promote open access?

March 28th, 2008

Open access leader Peter Suber answers that question in the characteristically thorough and engaging lecture he gave on March 17th at Harvard’s Berkman Center for Internet & Society. The talk, co-sponsored by the Berkman Center, Science Commons and Harvard’s Center for Research on Computation and Society, gives a tour of five ways that universities can promote open access to research:

  • launching and filling their own OA repositories
  • supporting peer-reviewed OA journals
  • supporting OA monographs from their university presses
  • fine-tuning their promotion and tenure criteria to support excellent research even in unconventional places
  • educating faculty about copyright and OA itself

The Berkman Center has now posted video and audio of the entire lecture and ensuing discussion, and the slides are available here. And if you’re interested in responses, check out Stevan Harnad’s detailed commentary, as well as Suber’s reply at Open Access News.