Joshua Lederberg (1925-2008)

February 14th, 2008

Joshua Lederberg, a legendary scientist and one of the members of our Advisory Board here at Science Commons, passed away on February 2. He was 82. (See Rockefeller University’s write-up.)

Dr. Lederberg was a legend for lots of reasons. He won his Nobel at the tender age of 33 – two years younger than I am now – for his work on the organization of genetic material in bacteria.

Though it’s easy to forget these days, back in the late 1950s it was not yet understood that bacteria possess recombinant mechanisms like humans do. His work laid the foundation for a generation of discovery. He also was deeply involved in early artificial intelligence for science (the Dendral project) and the space exploration program, in addition to serving as president of Rockefeller University.

Dr. Lederberg was in many ways a paradigmatic great scientist. He was restless and curious about his work, and his interdisciplinary bent was a bracing reflection of the world of biology before the revolutions wrought in the 1970s and later in biotechnology, which led us to the contemporary world of hyperspecialization. His papers are online at the National Library of Medicine, in an early take on open access.

We didn’t get a lot of his time here at Science Commons. He was a very busy man, even in his last years. But the time that he did graciously share with us formed a huge part of our early thinking, helping us to focus on the things that let scientists be scientists – the infrastructure that invisibly lifts a researcher out of the muck of finding content and into the air, where a researcher can make discoveries, and the systems that facilitate the kind of cross-disciplinary friendships he built throughout his career.

Thank you, Dr. Lederberg, and farewell.

Finding the “sweet spot” for openness in healthcare

February 14th, 2008

The Committee for Economic Development has released a report [PDF] that looks at ways to harness “openness” to transform healthcare in the US. It gives a broad overview and analysis of the healthcare production chain, identifying areas where increased knowledge-sharing could yield enormous benefits — not least of which is the development of evidence-based medicine.

As Daniel Griffin at the Information World Review blog points out, the report appropriately defines openness in this context as a spectrum rather than a binary, and helpfully distinguishes between information that’s accessible and information that’s “responsive” and “malleable” (remixable). Writes Griffin:

Ultimately [the authors] say [openness] boils down to two things; the first is that information must be accessible, this means that data should be both available and free from restrictions while secondly responsiveness of that information refers to how malleable or redistributable the information is and therefore the more it can be considered ‘open’.

These are important distinctions to make, and the report serves as an excellent introduction to the opportunities and challenges of opening access to biomedical research. Our thanks to Elliot Maxwell for passing it along.

Science Commons gets its 15 minutes

February 7th, 2008

…or 14 minutes and 35 seconds, to be exact.

Over at the MIT Libraries News site, Ellen Duranceau has posted a new podcast interview [MP3] with our own John Wilbanks. The topic: how to overcome barriers to knowledge-sharing in science — legal, technical and cultural.

The podcast is part of a terrific series of talks with thinkers on open access publishing and innovation. You can check out the Web site or subscribe to the podcasts by pasting the following link into your iTunes or another podcast reader: http://feeds.rapidfeeds.com/6772/

Speeding drug discovery through the Semantic Web

January 28th, 2008

One of the most important reasons Science Commons exists is to help people find cures for disease faster. So we are delighted and honored to have made the FasterCures list of Ten to Watch in 2008 — recognizing “the top ten organizations, people, and ideas that are changing the face of biomedical research in 2008.”

FasterCures is a nonprofit organization dedicated to “saving lives by saving time.” FasterCures President Greg Simon, who oversaw a wide range of national scientific research initiatives as Chief Domestic Policy Advisor to former Vice President Al Gore, writes:

Collaborative science is the name of the game these days, as science gets bigger and more multi-disciplinary and the data available for research grows explosively. The technological opportunity presented by the Semantic Web for networking data and researchers will be transformative. Watch the Neurocommons Project, a demonstration of the power of the Semantic Web approach based on open access information.

What is a “Semantic Web” approach, and how can it help us meet the challenge of so-called network science?

In the simplest terms, it’s a way to mark research data so computers can help us make sense of it. The driving concept is that collaboration in science needs to make a shift from human-mediated to computer-mediated, from single-database access to data integration, from reading papers by people to reading papers with machines, and so on.

As part of the Neurocommons Project, we mark research that’s free to use — open access information — using the Semantic Web RDF language. This means that computers — not people — can sort through the data, giving researchers the ability to swiftly process much larger data sets. And that means that the research won’t simply be more accessible, it will also be easier to use — leading, we hope, to more (and faster) breakthroughs that benefit everyone.
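To make the idea of machine-readable marking concrete, here is a minimal sketch in plain Python. It is not Neurocommons code and does not use a real RDF library. It only borrows the subject-predicate-object shape of an RDF statement, and every identifier in it (`paper:1`, `gene:A`, `disease:X`, `mentions`, `associated_with`) is hypothetical.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple,
# mirroring the shape of an RDF statement. All identifiers are hypothetical.
TRIPLES = [
    ("paper:1", "mentions", "gene:A"),
    ("paper:2", "mentions", "gene:A"),
    ("paper:2", "mentions", "gene:B"),
    ("gene:A", "associated_with", "disease:X"),
    ("gene:B", "associated_with", "disease:Y"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def papers_about_disease(triples, disease):
    """Join two patterns: genes tied to the disease, then papers mentioning them."""
    genes = {s for (s, _, _) in match(triples, predicate="associated_with", obj=disease)}
    papers = {
        s
        for gene in genes
        for (s, _, _) in match(triples, predicate="mentions", obj=gene)
    }
    return sorted(papers)


if __name__ == "__main__":
    print(papers_about_disease(TRIPLES, "disease:X"))
```

Because each fact is stored as a structured statement rather than a sentence in a paper, the program can chain facts from different sources (disease to gene, gene to paper) without a person reading either source.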

Could the key to feeding the world be locked up in a company fridge somewhere?

January 10th, 2008

That’s the question Australia’s Science Show asks to introduce a newly available podcast discussion featuring Science Commons’ own John Wilbanks and Brian Fitzgerald, who heads up Creative Commons Australia. The question isn’t nearly as tongue-in-cheek as it sounds; the discussion is about how to unlock the value of scientific research when so much of it is routinely balkanized — hidden away behind walls of secrecy, cost and technical obscurity.

At Science Commons, we work to make scientific research easier to find, share and use. This includes providing tools to “mark” research with usage rights, so scientists can work and collaborate within zones of legal certainty. But as Wilbanks explains in the lecture, “we need freedom to innovate, not simply freedom to operate” — something that requires more than developing licenses and contracts:

I think it’s clear that we face an exponential set of problems but we don’t have an exponential innovation capacity…we need to think about what we can do to enable that innovation to emerge.

If you’d like to learn more about what Science Commons is doing to spur innovation and discovery in science, follow the link above to the podcast (and podcast transcript) and browse our slides for the lecture.

Why we need to figure out what we already know

January 4th, 2008

Over at Corante’s In the Pipeline, organic chemist Derek Lowe has a post that vividly demonstrates one of the unfortunate realities of the research environment: researchers can spend years “discovering” what’s already been discovered. In a recent case that Lowe cites, a group of researchers somehow managed to publish two papers documenting a chemical reaction that was discovered no less than a century ago:

Professor Manfred Cristl of Wurzburg, who apparently knows his pyridinium chemistry pretty well, recognized [the reaction] as an old way to make further pyridinium salts…. He recounts how over the last couple of months he exchanged awkward e-mails with the two sets of authors, pointing out that they seem to have rediscovered a 100-year-old reaction, and have they really looked at their spectral data closely, eh?

This kind of mistake is incredibly embarrassing; you can easily imagine a chemistry professor using it as a cautionary tale to help students understand the importance of a thorough literature search. But it also has deeper implications. One of the most common myths about the Web is that it makes mistakes like this impossible. If Google knows all, why not ask it to tell us everything there is to know about pyridinium chemistry?

Of course, it doesn’t work like that, for a number of reasons. One is that Google uses text-based matching to find web pages, and its method for determining whether a particular page is relevant relies in large part on the number of people who have linked to it. This is useful for finding lots of things — especially consumer products. It doesn’t work nearly as well for scientific research.
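A toy sketch shows why this kind of ranking can bury the result a chemist actually needs. It is a deliberate simplification and not Google's actual algorithm: the pages, their text and their link counts are all made up for illustration, and the only ranking signal is the number of inbound links.

```python
# Hypothetical mini-corpus: page name -> (page text, number of inbound links).
PAGES = {
    "shop": ("buy pyridinium salts online", 120),
    "report": ("a century-old report on pyridinium salt formation", 3),
    "blog": ("notes on organic chemistry", 40),
}


def keyword_search(pages, term):
    """Return names of pages whose text contains the term, most-linked first."""
    hits = [
        (links, name)
        for name, (text, links) in pages.items()
        if term in text.lower().split()
    ]
    return [name for links, name in sorted(hits, reverse=True)]


if __name__ == "__main__":
    print(keyword_search(PAGES, "pyridinium"))
```

Both matching pages contain the search term, but the heavily linked shop page comes first and the old report comes last, even though the report is the one that would have saved the researchers their trouble.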

At Science Commons, we want to enable scientists to use the Web to get precise answers to complex research questions — not 388,000+ search results that contain the word pyridinium. We hope to do much more than help researchers avoid embarrassing mistakes. We want to empower them to build on existing knowledge, so they can take the next steps toward discovery without having to repeatedly double back.

John Wilbanks, who leads Science Commons, has a personal blog over on Nature Network, where he’s been sharing his ideas for realizing this vision. He writes: “[We] have to date published our knowledge in formats designed for a different world. One idea, one lab, one gene, one protein, one paper, one database. [If] we can mark up the knowledge better, we can do a lot better without a gee-whiz theoretical breakthrough, just by better using what we do indeed already know.”

If you’d like to learn about how Science Commons is “marking” knowledge, check out our Neurocommons project — an initiative aimed at demonstrating how a Semantic Web approach to making information useful can help us figure out what we already know.

Ensuring the freedom to integrate — why we need an “open data” protocol

December 20th, 2007

It’s only been a few days since we announced the release of the Protocol for Implementing Open Access Data, and we’re thrilled to see so many people posting responses. The protocol is a guide for people and organizations that want to “mark” their scientific data as open — free to use without restrictions. It was developed collaboratively — an inspiring sign of the momentum that’s been building behind developing common ways to open and integrate scientific databases around the world.

One of the great things about being part of the open science community is that we can work together to help more people understand why the “freedom to integrate” matters. Here are a few excerpts from excellent posts that do just that:

Richard Wallis @ Talis, on the problem the protocol aims to solve: “There is plenty of data out there, but it is often trapped in silos or hidden behind logins, subscriptions, or just plain difficult to get hold of. There again you can also find some data ‘just out-there’ — *can I use it, whose is it, will it stay there, will I be sued for using it, if I start depending on it, will I suddenly have to start paying* — all questions that come to mind.”

Eric Kansa, Society for American Archaeology, Digital Data Interest Group (DDIG), on the legal issues: “Data sources are…highly global, and therefore subject to all sorts of legal jurisdictions and rules…The solution that Science Commons advocates is to essentially move all open science data repositories to a common legal baseline, which is basically the public domain. [Following the protocol, scientific] data repositories that want to be ‘open’ [could] shape their terms of use, copyright, and other policies so their content is essentially public domain and freely remixable with other resources…Comment: Wow!”

Glyn Moody, UK journalist and author, on the real-world implications: “The solution is at once obvious and radical…It is this pragmatism, rooted in how science actually works, that makes the current protocol particularly important: It might actually be useful.”

Deepak Singh, co-founder of Bioscreencast, on what the protocol could mean for science: “I consider just the announcement to be a monumental moment. Will it change how scientific information is shared and disseminated? I don’t know. But my hope has always been that Science Commons would lead the way.”

Thanks to everyone who has taken the time to read the protocol, ask questions and help spread the word. For those of you who would like to learn more about the origins of the protocol and the two new licenses that represent its first implementations — the ODC PDDL and CC Zero — here are a few additional links you might want to check out:

Announcing the Protocol for Implementing Open Access Data

December 16th, 2007

Today, in conjunction with the Creative Commons 5th Birthday celebration, Science Commons announces the Protocol for Implementing Open Access Data (“the Protocol”).

The Protocol is a method for ensuring that scientific databases can be legally integrated with one another. The Protocol is built on the public domain status of data in many countries (including the United States) and provides legal certainty to both data deposit and data use. The Protocol is not a license or legal tool in itself, but instead a methodology for a) creating such legal tools and b) marking data already in the public domain for machine-assisted discovery.

You can read the Protocol here.

We built the Protocol after a year-long process of meetings and consultations with a broad set of stakeholders, including representatives of the geospatial and biodiversity science communities. We solicited input from international representatives from China, Uganda, Brazil, Japan, France, the Netherlands, Germany, Italy, the United Kingdom, Colombia, Peru, Belgium, Catalonia and Spain.

We expect to convert this work into a working group with founding members from our existing communities of practice. However, the world is moving very quickly in terms of data production, and as such we created the Protocol as a guide and as a tool to bring together the existing data licensing regimes into a single space.

As part of that decision, Science Commons has worked with data licensing thought leaders and is pleased to announce partnerships with Jordan Hatcher, the lawyer behind the Open Database License; Talis, the company behind the Open Database License process; and the Open Knowledge Foundation, creators of the Open Knowledge Definition.

Jordan has drafted the Open Data Commons Public Domain Dedication and License – the first legal tool to fully implement the Protocol. It is available at his Web site. This draft is remarkable not just for the Public Domain Dedication but for the encoding of scholarly and scientific norms into a standalone, non-legal document. This is a key element of the Protocol and a major milestone in the fight for Open Access data. Talis, a company with a strong history in the open science data movement, played a key role in birthing Jordan’s work, and we’re pleased to work with them as well.

We are also pleased to announce that the Open Knowledge Foundation has certified the Protocol as conforming to the Open Knowledge Definition. We think it’s important to avoid legal fragmentation at the early stages, and that one way to avoid that fragmentation is to work with the existing thought leaders like the OKF.

We will be launching a wiki for comments on the Protocol soon, and will announce a strategy for versioning the Protocol in 2008.

Video from Berlin 5 now online

December 7th, 2007

Photos and video from the Berlin 5 Open Access conference are now online. The event was held in Padova, Italy this past September.

If you watch nothing else, I’d recommend watching the closing session of the conference.

Alma Swan takes on the task of the closing session, weaving together beautifully the main points touched on over the course of the event, ending on a positive note. She begins with a brief refresher on OA before delving into the meat of the session: strategies and tactics for Open Access. In this, she asks the ever-important question, “Can we make them work?” (Part 1, Part 2)

There are many other clips worth recommending, covering a wide spectrum of topics. A few to note: Peter Murray-Rust on Open Data, Ilaria Capua’s tale of her own “access” issues and Salvatore Mele on CERN’s new project, SCOAP3.

Looking for more information about the sessions? I blogged a short rundown of the event, as did Peter Murray-Rust (in much more detail). Abstracts can also be found on the conference Web site.

Introducing BMC Proceedings

December 7th, 2007

As announced yesterday on BioMed Central’s blog, the publishing group will soon launch a new open access journal for conference proceedings – BMC Proceedings. According to the post, the journal will go live later this month, and will accept an array of conference material, ranging from abstracts to review articles and other related content from events in biology or medicine.

Proceedings material will also be indexed and submitted to PubMed and PubMed Central. All content will be peer-reviewed, as well.

This is wonderful news. BioMed Central is one of more than 250 peer-reviewed scholarly journals that license their content under a CC license. Proceedings will follow suit, adding another useful OA resource to the arsenal. Stay tuned for the launch!

Update: Soon after writing about BMC Proceedings, BioMed Central announced (via their blog) another open access journal they have in the queue. The new OA journal, BMC Research Notes, will “provide a home for short publications, case series, incremental updates to previous work, results of individual experiments and similar material that currently lack a suitable outlet. The intention is to reduce the loss suffered by the research community when such results remain unpublished.” The launch is said to take place in early 2008. You can read more about it here.