In a recent BBC article, Google’s Chris DiBona talked about a new program under development to help ameliorate some of the transfer problems in moving enormous data sets – up to 120,000 gigabytes worth.
The project has not been released to the public, but it would involve copying massive data sets and keeping the data open, whether under a Creative Commons license or in some other open format.
Now, how exactly does this work? Via hard drive recording systems, DiBona says.
He goes on to explain in the BBC piece:
“We have a number of machines about the size of brick blocks, filled with hard drives. We send them out to people who copy data on them and ship them back to us. We dump them on to one of our data systems and ship it out to people.”
Google would then make a copy of the data transmitted, keeping it open either by putting it under a Creative Commons license or by posting it in another open format.
DiBona, open source program manager at Google, sees initiatives like this as a way to ease the burden on researchers moving data sets far too large for network transmission. To date, the project has copied and distributed data sets from the Hubble telescope and the Archimedes Palimpsest, and has worked with other institutions.
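To get a feel for why shipping drives beats the network at this scale, here is a rough, illustrative calculation (the function name and link speeds are our own assumptions, not figures from Google): moving 120,000 gigabytes over a typical research network link would take weeks or months, while a courier delivers a box of drives in days.

```python
def transfer_days(size_gb: float, link_mbps: float) -> float:
    """Days needed to move size_gb gigabytes over a link_mbps link,
    ignoring protocol overhead and assuming the link is fully saturated."""
    bits = size_gb * 8e9            # gigabytes -> bits
    seconds = bits / (link_mbps * 1e6)
    return seconds / 86400          # seconds -> days

# 120,000 GB over a 100 Mbps connection: roughly 111 days
print(round(transfer_days(120_000, 100)))
# Even over a full gigabit link it is still about 11 days
print(round(transfer_days(120_000, 1000)))
```

Numbers like these are why "sneakernet" approaches remain practical: the drives spend most of their time being copied, not carried.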
“I wished people were doing that for biology, genetic research and antiquities research,” he said.
For more information about Creative Commons licenses and databases, check out our FAQ.