Weblog

Science Commons Symposium

March 9th, 2010

The videos from Science Commons Symposium are now live online!

As many of you know, we recently held an event called Science Commons Symposium at the Microsoft campus in Redmond, Washington. It was a day full of presentations from leaders in the fields of Open Science, Open Access and Open Data. The conversations that took place during the breaks and the post-event reception were just as stimulating as the presentations. I was thoroughly exhilarated by the exciting ideas and bright, passionate people discussing them.

One topic that was heavily discussed in both the speaker presentations and the hallway conversations was the Panton Principles. The Panton Principles are a set of best practices for open data, and were officially launched at the symposium by Cameron Neylon in the opening talk. Cameron Neylon, Peter Murray-Rust and John Wilbanks were all instrumental in crafting the Panton Principles and all three addressed them in their presentations. You can find links to video of the presentations below.

Though John Wilbanks did address the Panton Principles, he did so with the perspective of a bigger picture. John’s keynote summed up the history and the future goals of the Open Science movement. John said that the real goal is generative science and defined generative with a quote from Jonathan Zittrain:

“Generativity is a system’s capacity to produce unanticipated change through unfiltered contributions from broad and varied audiences.”


There are several excellent blog posts by people who attended the symposium.

Jean-Claude Bradley, one of the speakers, provides a personal perspective on the day, a brief summary of the presentations and his presentation slides on his blog Useful Chem.

Another speaker, Antony Williams, blogged about how the generative science message in John Wilbanks’ keynote resonated powerfully with him and is very well-aligned with his motivation for ChemSpider. You can read Antony’s post and see his presentation slides on his ChemSpider blog.

There were quite a few science librarians in attendance at the symposium. Alison Aldrich sums up the presentations from a librarian’s point of view on her blog Dragonfly. Leave it to a librarian to provide clear, concise summaries of all the presentations along with links to supplemental information.

If you weren’t able to attend in person – or if you were but just need a refresher on the overwhelming number of exciting ideas that were discussed – you will really appreciate the detailed notes taken by Brian Glanz of the Open Science Foundation. Brian diligently took notes on his pad throughout the day and has posted them in an open blog on the Open Science Foundation website.

The Microsoft Research team was our gracious host. In addition to providing us with a wonderful space for the event, keeping everyone well fed and caffeinated, and giving out copies of the new CC-licensed book The Fourth Paradigm, they did an excellent job of capturing the presentations on video. The video is now live and can be watched online at:

Session 1 Featuring Lisa Green, Lee Dirks, Stewart Tansley & Kris Tolle, Cameron Neylon and Jean-Claude Bradley

Session 2 Featuring Antony Williams and Peter Murray-Rust

Session 3 Featuring Heather Joseph and Stephen Friend

Session 4 Featuring Peter Binfield and John Wilbanks

The videos are a wonderful resource – for those of us who were there as well as the people who were not able to be physically present at the symposium. I know that I will be watching them more than once in the near future. The ideas presented at Science Commons Symposium are among the most important in science; they are the ideas that will shape the future of science.

Data Sharing on the Web

February 5th, 2010

The February issue of Talis’ Nodalities magazine focuses on data sharing and includes an article by Science Commons’ own Kaitlin Thaney.

Last October, Kaitlin joined Jordan Hatcher, Leigh Dodds and Tom Heath to give a four-hour tutorial titled “Legal and Social Frameworks for Sharing Data on the Web” at the International Semantic Web Conference (ISWC). The tutorial covered the legal and social issues commonly found in data publishing, using the Linked Data Cloud as a leading example of how copyright restrictions, complex licenses and lack of clarity can quickly exacerbate problems for data sharing efforts.

Wish you could have been there? Me too. Luckily for us, all four of the tutorial leaders are composing their thoughts on data sharing for articles in Nodalities. Kaitlin produced a great piece for this month’s issue called Data Sharing on the Web. Be sure to check it out! The hardcopy version of the February Nodalities is not out yet, but you can read it online here.

Nodalities is licensed under CC-BY-SA — but Kaitlin’s article, per her request, is under CC-BY.

T-shirt Contest Goes Global

February 3rd, 2010

Our Science Commons t-shirt design contest is now open to entrants outside of the US.  We realized that while we might not be able to afford to fly someone to Seattle from Australia or Germany, we would be missing out on too much talent if we limited the contest to US residents. The prize will be slightly different and negotiated separately, but all are welcome to submit their designs.

You can read more about the contest here in last week’s post, but here is a reminder about the quickly approaching deadline: designs are due by February 12th – less than two weeks away – at 5 pm PST. Image files must be formatted as .jpg, .png, .ai, .psd or .svg. An image of the Science Commons logo can be downloaded at our Logo Page. Contest submissions and any questions can be directed to Lisa Green.

So tell all your friends, get those pens to paper and jumpstart your design software. Time’s a-tickin’.

MichiganView releases remote sensing data under CC0 waiver

January 29th, 2010

Puneet Kishor is a Science Commons Fellow, specializing in geospatial issues and open data, and a guest blogger here at Science Commons.

Starting Jan 28, 2010, MichiganView is making available all of its more than 93 Gigabytes of Landsat 5 and 7, and NAIP imagery data in the public domain using the new CC0 Waiver provided by Creative Commons. The MichiganView consortium makes available aerial photography and satellite imagery of Michigan to the public for free over the Web. As part of the AmericaView consortium, MichiganView supports access and use of these imagery collections through education, workforce development, and research. CC0 (pronounced CC-Zero) waives any rights in a dataset, ensuring that all of the dataset is available to anyone without encumbrance of any kind.

More information on CC0 is available here, and the reasoning behind the protocol is described here. Further questions about MichiganView may be directed to Dr. Tyler Erickson, Director, MichiganView at tyler.erickson@mtu.edu and questions about the CC0 waiver may be directed to Puneet Kishor, Science Commons Fellow (Geospatial Data) at punkish@creativecommons.org.

Design a new t-shirt for Science Commons and win a trip to Seattle to attend Science Commons Symposium – Pacific Northwest!

January 27th, 2010

Science Commons needs a new t-shirt design. Sure, the “e=mc(shared)” shirts are great – mine gets worn more than any other shirt in the dresser – but after years of having that design, we think it would be nice to have a new one. We are turning to the crowd to find a great new design and giving away a prize to the person with the winning design.

We’re calling on all of the science-minded Commons fans out there – from the design savvy to the slightly more artistic engineers and bench scientists. All are welcome to submit, so why not take a break from the lab and research, and take a stab at t-shirt design? Get geeky. We dare you.

The person with the winning design will win a trip to Seattle and a ticket to attend Science Commons Symposium – Pacific Northwest where the new t-shirt will be officially launched. The symposium is an exhilarating event that you don’t want to miss! It will be a full day of stellar speakers presenting on: data accessibility and sharing, open access publishing, social norms of the scientific community, web-enabled research and other exciting ideas. Click here to check out the speaker list.

Submit your entries via email by Friday February 12th. More details on the rules below.

Rules:
Prize consists of round trip airfare from a US city to Seattle, two nights’ hotel and a ticket to Science Commons Symposium.
If you are from outside of the US and would like to participate, let us know and we can work out a different prize for you.
Image files must be formatted as .jpg, .png, .ai, .psd or .svg.
Entries must be received by Friday, February 12th at 5 pm PST.
An image of the Science Commons logo can be downloaded at our Logo Page.
Contest submissions and any questions can be directed to Lisa Green.

Remembering Babel: Open Data Sharing & Integration

November 19th, 2009

Since the release of CC0, I’ve been talking to many people about when and how to use it. A group of scientists and science policy experts recently endorsed public domain data sharing, and the use of CC0 to do so, in a letter to Nature. This is a significant affirmation of our approach to data sharing. But a question that inevitably arises in many discussions is: What about data providers that are unable or unwilling to commit their data to the public domain? Will Creative Commons support providing a flexible set of licensing options, intermediate between public domain, on the one hand, and full control (secrecy), on the other?

First, I have to clarify what I mean by “data” in this discussion. “Data” by itself can mean anything, including music, movies, pictures, and other things that are clearly copyrightable. But in this discussion, I will use the term “data” in a narrower and more specific sense: I mean facts, ideas, and concepts that are not copyrightable by themselves. Examples would be Einstein’s E=mc^2 equation, the height of Mount Everest, or the coordinates of a particular star. The unprotected status of these data was affirmed in Feist Publications v. Rural Telephone Service, where the U.S. Supreme Court found that originality is a basic Constitutional prerequisite for copyright to exist, or as Justice O’Connor, writing for the majority, said: “It is this bedrock principle of copyright that … No one may claim originality as to facts.” (emphasis added) The U.S. Copyright Act further codifies this principle as a limitation on the scope of copyright protection (at Section 102(b)). Likewise, other countries recognize this limitation in their originality requirements.

This basic limitation on the scope of copyright acknowledges that copyright is inherently a social compromise between the desire to reward authors for creative output and the need to protect a reservoir of facts and ideas available for everyone to draw upon. Without this “commons” of facts and ideas, social discourse and creativity would suffer. As Lawrence Lessig writes, in The Future of Ideas, “Free resources have been crucial to innovation and creativity… without them, creativity is crippled. Thus, and especially in the digital age, the central question becomes not whether government or the market should control a resource, but whether a resource should be controlled at all. Just because control is possible, it doesn’t follow that it is justified. Instead, in a free society, the burden of justification should fall on him who would defend systems of control.”

And yet, over time, copyright control has expanded dramatically in scope and duration, straining this delicate social compromise. Ironically, it is the growth and success of the Internet, with its extraordinary power and freedom, that has spurred renewed interest in extending copyright-like controls even beyond the traditional realms of copyright itself. Databases containing myriad facts and ideas, once considered public domain if shared publicly, are now the subject of efforts to create new systems of control. In Europe, by E.U. Directive, countries have implemented “sui generis” database rights that protect databases and their contents even if they are too unoriginal to merit copyright protection. Other countries grant copyright protection to databases under relaxed copyright standards that demand less than full originality or creativity.

Finally, there are attempts to create systems of control based on contract law (like click-wrap agreements, Web site terms of use, etc.), premised not on the existence of any copyright or statutory right, but merely on voluntary agreement. Contracts can expand copyright-like controls well beyond the boundaries of traditional copyright or even sui generis protections, and indeed have no inherent limits other than the enforceability of the agreement (which can be problematic in itself). Not only do such contracts apply to uncopyrightable data, but they can also impose controls on data already otherwise in the public domain, since the issue is not the status of the data but whether you consented to abide by a contract. A recent example is the Open Data Commons’ Open Database License (ODbL), which is being considered for adoption by the OpenStreetMap community, among others. The Open Data Commons not only has been a strong supporter and advocate for open data sharing, but it has provided important community tools, including the Public Domain Dedication and License (PDDL). But unlike the previously released PDDL, the new ODbL contains attribution and share-alike obligations, among other requirements. Its terms and conditions are imposed on copyright or sui generis database rights, but it also purports to act as a contract in the absence of these protections. As a result, it attempts to impose obligations on data that even copyright and sui generis rights do not reach.

With CC0, Creative Commons has chosen to take a different approach (or rather, to stick with an approach similar to the PDDL). CC0 is a way to give up controls and dedicate data to the public domain (or as close to it as we can legally achieve). As I have explained elsewhere, we were concerned about the practical impact of “attribution stacking” and license compatibility problems for data sharing communities. Attribution stacking can burden large-scale data sharing projects that draw on many sources and license compatibility problems can shut down data integration efforts altogether.

In science, an area that I focus on, sharing data in the public domain is in fact part of a long and honored tradition. Before the Internet, data was published, if at all, in journals in print. The articles themselves may be copyrightable, but the facts and ideas revealed there were presumed to be in the public domain. Only with the advent of the Internet and digital technology has there been interest in “licensing” contents of databases including such facts and ideas. Thus, where there is an established tradition of public domain data sharing that has worked well for a community–and continues to work well– any new system of control must meet a high burden of justification. But based on our experiences with other licensing schemes, we know that such controls carry risks. Even a simple requirement like attribution, when aggregated over thousands or millions of data elements, can become a very serious burden. Scientists should provide attribution (and citation) for valid scientific reasons, and no legal requirement may be flexible enough to replace common sense or professional judgment, an important ingredient in deciding what to attribute and how. In addition, license incompatibility problems, which are especially relevant with share-alike licenses, can prevent databases or data sources from being combined or integrated or data from being reused. All of this can have a negative impact on the usability of scientific data.

In light of such risks, what could justify departures from the public domain? One argument, made to me eloquently by several data project organizers, is that unless we grant providers the flexibility to impose some controls–rather than none–they will be reluctant or unwilling to grant any access. And even restricted sharing, with some conditions, is better than no sharing at all. Further, they argue that some extremely valuable data sets might fall into this category, because the more valuable the data, the less likely it is that someone would consider simply releasing it into the public domain. And so by not offering a graduated system of controls, like the CC suite of copyright licenses, important opportunities to share are being missed, with serious consequences for those communities and perhaps for all of us. I have to admit that it’s a powerful argument against being too dogmatically attached to the public domain, and if true, it might justify other approaches.

At issue is whether more data would be made available under a more restrictive system than the public domain and to what extent those restrictions impair the value of that data to the community. I don’t think we know the answer fully yet. It’s a question that undoubtedly deserves more research by sociologists and other scholars, based on empirical evidence. But, when in doubt, what should be done? I come back to Lessig’s admonition that, “the burden of justification should fall on him who would defend systems of control.” I think the best that can be said for more restrictive systems of sharing data is “not yet proven.” And that’s why we will continue to advocate public domain and CC0 for data sharing.

Ontology sharing and copyright considerations

November 3rd, 2009

Important (and exciting) news in the world of shared vocabularies at Science Commons, a key component of our technical work to make knowledge sharing more efficient.

As of last week, OWL 2 – a standard web ontology language – was formally recommended by the World Wide Web Consortium (W3C) as part of their Semantic Web activity. Science Commons’ Alan Ruttenberg has been diligently working with the OWL working group specifying OWL 2 at the W3C to push this recommendation through. (Ruttenberg is the co-chair with Ian Horrocks at Oxford.) The W3C says that the transition to OWL 2 is a reflection of user experience with OWL, and the need to enable seamless integration and scalability.

From the W3C’s announcement:

“[OWL 2] allows people to capture their knowledge about a particular domain (say, energy or medicine) and then use tools to manage information, search through it, and learn more from it. Furthermore, as an open standard based on Web technology, it lowers the cost of merging knowledge from multiple domains.”

Also, building off of our existing work around the application of copyright licenses to content and data, there is now a resource available at sciencecommons.org that sheds light on copyright considerations for ontologies. We have long been asked what is the best means to license (or not) ontologies, a topic that’s not always easy to discern in terms of applicable rights regimes.

The resource explores when copyright may apply to an ontology as well as a number of other concerns regarding protection and the means to achieve that.

You can find this resource – “Ontology Copyright Licensing Considerations” – in our Reading Room.

Nguyen on ‘hidden legal barriers’ to research

October 22nd, 2009

The latest edition of Genomics Law Report features a guest commentary on the roadblocks to knowledge sharing and scientific research, penned by Science Commons Counsel Thinh Nguyen. The report is a publication out of the law firm Robinson Bradshaw & Hinson.

In the commentary, “The Hidden Legal Barriers to Scientific Research”, Nguyen details some of the lesser discussed hurdles to scientific research, from materials transfer agreements to the legal implications surrounding the sharing of data and interoperability.

He writes:

“The increasing extent to which legal formalities, and thus lawyers, intermediate scientific sharing, not only between academia and industry but increasingly between academics themselves, represents a cultural sea-change with important consequences for how science is conducted in the future. The public and unrestricted availability of genomic databases and resources, or the “commons”, must be cultivated and defended, through a robust community consensus, or it can easily fragment into legally constructed “walled gardens” without public right of way.”

Well said.

For more writing on barriers to scientific research, visit our Reading Room and check out Nguyen’s “Freedom to Research”.

Wilbanks named one of Utne Reader’s 50 visionaries

October 20th, 2009

John Wilbanks, VP of Science at Creative Commons, has been named one of Utne Reader’s 50 visionaries, along with others ranging from the Dalai Lama to Cory Doctorow and Brewster Kahle.

The piece, “50 Visionaries Who Are Changing Your World”, is the second of its kind coming from Utne Reader, an alternative news bi-monthly magazine based in the US. The list also drills down into some of the background on each of the “visionaries” — in this case crediting the work at Science Commons as pushing forth a “nerdy but important message” of access and sharing in the sciences to spur innovation and discovery.

“We have a network of knowledge,” Wilbanks says. “We need to liberate it enough that it can actually take off.”

To learn more about the other visionaries, click here.

CC0 endorsed in Nature opinion piece

September 9th, 2009

A new opinion piece in Nature on post-publication sharing of tools explicitly recommends open sharing and the use of CC0 to put data in the public domain. The special issue of Nature focuses on data sharing and is now online and accessible free of charge.

The piece “Post-publication sharing of data and tools” comes out of this year’s CASIMIR conference in Rome, and discusses the sharing of biological materials, specifically but not limited to mice and embryonic stem cells.

As you may recall, we initially wrote about this meeting back in June, following the publication of a similar opinion piece calling for better and more efficient sharing practices for physical materials. That article also stemmed from this meeting in Rome.

This opinion piece takes those ideas one step further, moving from discussion to a formal recommendation for open sharing under the least restrictive terms possible.

“[T]he Rome meeting strongly encouraged sharing behaviours that promote a ‘research commons’. The heart of a research commons is one in which academic research is not impeded by restrictions on use and access to data and materials, in line with the principles of the Creative Commons.”

The piece is chock-full of stellar recommendations that Science Commons supports, from better and more explicit resource sharing policies at journals and funding bodies, to the use of standard MTAs, to making data open and putting it in the public domain using CC0, our public domain waiver.

“Although it is usual practice for major public databases to make data freely available to access and use, any restrictions on use should be strongly resisted and we endorse explicit encouragement of open sharing, for example under the newly available CC0 public domain waiver of Creative Commons.”

We highly encourage a deeper read of the article for more tips on how to share resources more efficiently, and we also recommend giving this article a read for more on pre-publication data sharing.