Blog archive for December, 2007

Ensuring the freedom to integrate — why we need an “open data” protocol

December 20th, 2007 by dwentworth

It’s only been a few days since we announced the release of the Protocol for Implementing Open Access Data, and we’re thrilled to see so many people posting responses. The protocol is a guide for people and organizations that want to “mark” their scientific data as open — free to use without restrictions. It was developed collaboratively — an inspiring sign of the momentum that’s been building behind developing common ways to open and integrate scientific databases around the world.

One of the great things about being part of the open science community is that we can work together to help more people understand why the “freedom to integrate” matters. Here are a few excerpts from excellent posts that do just that:

Richard Wallis @ Talis, on the problem the protocol aims to solve: “There is plenty of data out there, but it is often trapped in silos or hidden behind logins, subscriptions, or just plain difficult to get hold of. There again you can also find some data ‘just out-there’ — *can I use it, whose is it, will it stay there, will I be sued for using it, if I start depending on it, will I suddenly have to start paying* — all questions that come to mind.”

Eric Kansa, Society for American Archaeology, Digital Data Interest Group (DDIG), on the legal issues: “Data sources are…highly global, and therefore subject to all sorts of legal jurisdictions and rules…The solution that Science Commons advocates is to essentially move all open science data repositories to a common legal baseline, which is basically the public domain. [Following the protocol, scientific] data repositories that want to be ‘open’ [could] shape their terms of use, copyright, and other policies so their content is essentially public domain and freely remixable with other resources…Comment: Wow!”

Glyn Moody, UK journalist and author, on the real-world implications: “The solution is at once obvious and radical…It is this pragmatism, rooted in how science actually works, that makes the current protocol particularly important: It might actually be useful.”

Deepak Singh, co-founder of Bioscreencast, on what the protocol could mean for science: “I consider just the announcement to be a monumental moment. Will it change how scientific information is shared and disseminated? I don’t know. But my hope has always been that Science Commons would lead the way.”

Thanks to everyone who has taken the time to read the protocol, ask questions and help spread the word. For those of you who would like to learn more about the origins of the protocol and the two new licenses that represent its first implementations — the ODC PDDL and CC Zero — here are a few additional links you might want to check out:

Announcing the Protocol for Implementing Open Access Data

December 16th, 2007 by John Wilbanks

Today, in conjunction with the Creative Commons 5th Birthday celebration, Science Commons announces the Protocol for Implementing Open Access Data (“the Protocol”).

The Protocol is a method for ensuring that scientific databases can be legally integrated with one another. The Protocol is built on the public domain status of data in many countries (including the United States) and provides legal certainty to both data deposit and data use. The protocol is not a license or legal tool in itself, but instead a methodology for a) creating such legal tools and b) marking data already in the public domain for machine-assisted discovery.

You can read the Protocol here.

We built the Protocol after a year-long process of meetings and consultations with a broad set of stakeholders, including representatives of the geospatial and biodiversity science communities. We solicited input from international representatives from China, Uganda, Brazil, Japan, France, the Netherlands, Germany, Italy, the United Kingdom, Colombia, Peru, Belgium, Catalonia and Spain.

We expect to convert this work into a working group with founding members from our existing communities of practice. However, the world is moving very quickly in terms of data production, and as such we created the Protocol as a guide and as a tool to bring together the existing data licensing regimes into a single space.

As part of that decision, Science Commons has worked with data licensing thought leaders and is pleased to announce partnerships with Jordan Hatcher, the lawyer behind the Open Database License; Talis, the company behind the Open Database License process; and the Open Knowledge Foundation, creators of the Open Knowledge Definition.

Jordan has drafted the Open Data Commons Public Domain Dedication and License – the first legal tool to fully implement the Protocol. It is available at his Web site. This draft is remarkable not just for the Public Domain Dedication but for the encoding of scholarly and scientific norms into a standalone, non-legal document. This is a key element of the Protocol and a major milestone in the fight for Open Access data. Talis, a company with a strong history in the open science data movement, played a key role in birthing Jordan’s work, and we’re pleased to work with them as well.

We are also pleased to announce that the Open Knowledge Foundation has certified the Protocol as conforming to the Open Knowledge Definition. We think it’s important to avoid legal fragmentation at the early stages, and that one way to avoid that fragmentation is to work with the existing thought leaders like the OKF.

We will be launching a wiki for comments on the Protocol soon, and will announce a strategy for versioning the Protocol in 2008.

Video from Berlin 5 now online

December 7th, 2007 by Kaitlin Thaney

Photos and video from the Berlin 5 Open Access conference are now online. The event was held in Padova, Italy this past September.

If you watch nothing else, I’d recommend watching the closing session of the conference.

Alma Swan takes on the task of the closing session, weaving together beautifully the main points touched on over the course of the event, ending on a positive note. She begins with a brief refresher on OA before delving into the meat of the session: strategies and tactics for Open Access. In this, she asks the ever-important question, “Can we make them work?” (Part 1, Part 2)

There are so many clips to recommend, covering a wide spectrum in terms of topic, but others to note: Peter Murray-Rust on Open Data, Ilaria Capua’s tale of her own “access” issues and Salvatore Mele on CERN’s new project, SCOAP3.

Looking for more information about the sessions? I blogged a short rundown of the event, as did Peter Murray-Rust (in much more detail). Abstracts can also be found on the conference Web site.

Introducing BMC Proceedings

December 7th, 2007 by Kaitlin Thaney

Announced yesterday on BioMed Central’s blog, the publishing group will soon be launching a new open access journal for conference proceedings – BMC Proceedings. According to the post, the journal will go live later this month, and will accept an array of conference material, ranging from abstracts to review articles and other related content from events in biology or medicine.

Proceedings material will also be indexed and submitted to PubMed and PubMed Central. All content will be peer-reviewed as well.

This is wonderful news. BioMed Central’s journals are among the more than 250 peer-reviewed scholarly journals that license their content under a CC license. Proceedings will follow suit, adding another useful OA resource to the arsenal. Stay tuned for the launch!

Update: Soon after writing about BMC Proceedings, BioMed Central announced (via their blog) another open access journal they have in the queue. The new OA journal, BMC Research Notes, will “provide a home for short publications, case series, incremental updates to previous work, results of individual experiments and similar material that currently lack a suitable outlet. The intention is to reduce the loss suffered by the research community when such results remain unpublished.” The launch is said to take place in early 2008. You can read more about it here.

NPG introduces a CC license for genome research

December 6th, 2007 by Kaitlin Thaney

In a move to make genome research more accessible, Nature Publishing Group (NPG) has introduced a new editorial policy that will put genome research published by Nature under a CC-BY-NC-SA license. The license grants readers the ability to share and remix the material under the following conditions: the work must be attributed in the manner specified by the author or licensor, it cannot be used for commercial purposes, and any derivative works must be licensed under the same or a similar license. NPG’s editorial policy can be read in full here.

An editorial posted today discusses some of the reasoning behind enacting this new author license policy.

From the Nature editorial, “Shared genomes” (December 6, 2007):

“In the continuing drive to make papers as accessible as possible, NPG is now introducing a ‘creative commons’ licence for the reuse of such genome papers. The licence allows non-commercial publishers, however they might be defined, to reuse the pdf and html versions of the paper. In particular, users are free to copy, distribute, transmit and adapt the contribution, provided this is for non-commercial purposes, subject to the same or similar licence conditions and due attribution.

In 1996, as human genome sequencing was getting under way, leading players stated: “It was agreed that all human genomic sequence information, generated by centres funded for large-scale human sequencing, should be freely available and in the public domain in order to encourage research and development and to maximise its benefit to society” (see [the Bermuda principles]). These principles have continued to guide the field, and NPG has consistently made genome papers freely available in keeping with them. This new licence allows us to formalize the arrangement.”

This is definitely a step in the right direction for Open Access, and we always cheer use of CC licenses, although I wish they’d chosen the Attribution license. The Non-Commercial and ShareAlike provisions of CC-BY-NC-SA seem to be in conflict with some of the terms in the Budapest Declaration. But anything that gets us closer to OA and supports the open licensing of foundational research papers is good medicine, indeed.

What’s “open source knowledge management”?

December 5th, 2007 by dwentworth

One of the biggest challenges we face at Science Commons is explaining what we do — and, much more important, why it matters.

In many cases, the core problem is translation; what makes sense to a computer scientist is gibberish to a life scientist, and vice versa. We want to connect with as many people as possible, both within the scientific community and beyond. To that end, we’ve decided to publish a series of posts to bring more clarity to the terms and phrases we use. To make sure these posts are truly useful, we’ll be asking for your feedback. Got questions? Criticism? A better definition of a term than the one we’re proposing? We hope you’ll send us an email or add your comments to the post.

First up is a term we’ve been using to describe our Neurocommons project: “open source knowledge management.” This is a hybrid term that splices together concepts from the worlds of business and software development.

The first part, “open source,” is derived from “open source software.” Open source software is software that’s published with licensing to allow anyone to look under the hood at the underlying “source” code to see how it works. Any developer can copy the code and modify it — either to improve the original software, or build on it to create something brand new.

“Knowledge management,” or KM, is a term often used by businesses to describe the systems they have for organizing, accessing and using information — everything from the data in personnel files to the number of products on store shelves. One reason that it’s “knowledge” management rather than “information” management is that the word knowledge connotes use of information, not just its availability. Having the ability to use information is what makes it valuable. One classic example is Wal-Mart, which used real-time data about its inventory to realize tremendous, game-changing efficiency gains and cost savings.

So how is our Neurocommons project an “open source knowledge management” project? In a nutshell, Science Commons is developing all of the key elements for a free, web-enabled KM system for biological research that anyone can use, and anyone can build on. Right now, scientists don’t know “what’s on the shelf” — either in terms of research data or materials. They don’t have an easy way to sort through or make sense of the terabytes of data being produced in laboratories around the world. They certainly don’t have “one-click” access to materials like cell lines. We want to change that. Our goal isn’t (simply) to increase efficiency in the research cycle and magnify the impact of investments in research. Ultimately, we hope to speed the pace of discovery — unlocking the value of research so more people can benefit from the work scientists are doing.

If you’d like to learn more about the Neurocommons project, check out our project page. If you have questions, send us an email. We’d love to hear from you.