Why we need to figure out what we already know

January 4th, 2008 by dwentworth

Over at Corante’s In the Pipeline, organic chemist Derek Lowe has a post that vividly demonstrates one of the unfortunate realities of the research environment: researchers can spend years “discovering” what’s already been discovered. In a recent case that Lowe cites, a group of researchers somehow managed to publish two papers documenting a chemical reaction that was discovered no less than a century ago:

Professor Manfred Cristl of Wurzburg, who apparently knows his pyridinium chemistry pretty well, recognized [the reaction] as an old way to make further pyridinium salts…. He recounts how over the last couple of months he exchanged awkward e-mails with the two sets of authors, pointing out that they seem to have rediscovered a 100-year-old reaction, and have they really looked at their spectral data closely, eh?

This kind of mistake is incredibly embarrassing; you can easily imagine a chemistry professor using it as a cautionary tale to help students understand the importance of a thorough literature search. But it also has deeper implications. One of the most common myths about the Web is that it makes mistakes like this impossible. If Google knows all, why not ask it to tell us everything there is to know about pyridinium chemistry?

Of course, it doesn’t work like that, for a number of reasons. One is that Google uses text-based matching to find web pages, and its method for determining whether a particular page is relevant relies in large part on the number of people who have linked to it. This is useful for finding lots of things, especially consumer products. It doesn’t work nearly as well for scientific research.
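To make the limitation concrete, here is a minimal sketch of that style of search: match pages on the query word, then rank them by how many inbound links they have. It is a toy illustration, not how Google actually works. The page names, texts, and link counts are all made up, and `keyword_search` is a hypothetical helper.

```python
# Toy corpus. Every page name, text, and link count here is invented
# for illustration; none of it is real data.
PAGES = {
    "store": {"text": "buy pyridinium salts online", "inbound_links": 120},
    "review": {"text": "a review of pyridinium ring-opening chemistry", "inbound_links": 3},
    "blog": {"text": "notes on organic chemistry careers", "inbound_links": 45},
}


def keyword_search(query, pages):
    """Return names of pages containing every query word, most-linked first."""
    words = query.lower().split()
    hits = [
        name
        for name, page in pages.items()
        if all(word in page["text"].lower().split() for word in words)
    ]
    # Rank purely by popularity (inbound links), not by scientific relevance.
    return sorted(hits, key=lambda name: pages[name]["inbound_links"], reverse=True)


results = keyword_search("pyridinium", PAGES)
print(results)
```

In this made-up corpus the heavily linked shop page outranks the review article, even though the review is the page a chemist checking for prior work would want. Nothing in the ranking knows what the text means.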

At Science Commons, we want to enable scientists to use the Web to get precise answers to complex research questions, not 388,000+ search results that contain the word pyridinium. We hope to do much more than help researchers avoid embarrassing mistakes. We want to empower them to build on existing knowledge, so they can take the next steps toward discovery without having to repeatedly double back.

John Wilbanks, who leads Science Commons, has a personal blog over on Nature Network, where he’s been sharing his ideas for realizing this vision. He writes: “[We] have to date published our knowledge in formats designed for a different world. One idea, one lab, one gene, one protein, one paper, one database. [If] we can mark up the knowledge better, we can do a lot better without a gee-whiz theoretical breakthrough, just by better using what we do indeed already know.”

If you’d like to learn about how Science Commons is “marking” knowledge, check out our Neurocommons project — an initiative aimed at demonstrating how a Semantic Web approach to making information useful can help us figure out what we already know.
