Amongst the many conferences I went to in 2008, Science Foo Camp 08 (SciFoo) has to be the highlight. This extraordinary gathering of scientists, engineers, geeks and technologists is put together by Nature, O'Reilly and Google, and is hosted in building 40 at the Googleplex (Mountain View, CA). The meeting is in its third year and I was lucky enough to be there for 2008. Thanks to a MacBook Air that I borrowed for the trip, I made notes on the sessions I attended. However, true to form, I never got around to summarizing these notes so that they could be blogged. Here (in a slightly edited form) are some comments on the sessions I attended. I've also listed a few of the sessions I couldn't make because of clashes in the schedule. For some pics, check out the SciFoo tag on my Flickr account:
FRIDAY 8th Aug.
5pm to Midnight
Registration, dinner, initial sessions and demos. Highlights included
watching Dan Janzen enthrall a crowd with a box of pinned insects,
chatting with Brian Cox and Gia Milinovich about Sunshine and The
X-Files, and chatting with Lincoln Wallen about UK science funding,
DreamWorks and patronage in science.
SATURDAY 9th Aug.
9.30am. Author ID, Microattribution, Collaboration, Scientific Careers
(Ernst Hafen & Myles Axton, Ed. Nature Genetics)
The discussion focused on impact factors and citations in scholarly
communications. There was talk of an "open discussion platform" for
science (I'm not quite sure what this might look like, above and beyond
things like Nature Network) and the need to incentivize contribution.
Frankly, I think the discussion missed the bigger point. Scientific
advances are made in small, granular steps (data), but the scholarly
communication process doesn't handle granularity (data) very well.
Journals focus on publishing less granular, synthesized stories about
data (papers), which are easy to cite and link, but not the underlying
data itself. We (the scientific community) delude ourselves into
thinking that these papers are sufficient to make science repeatable,
and most journals are crap at publishing the data required to make them
repeatable (arguably this isn't their business anyway). We need systems
that publish data and tools that help users synthesize these data into
stories. Publishers are not geared toward doing this (they don't know
our disciplines well enough to do it) and funding agencies don't seem
to understand the problem. There was some talk of assigning unique
identifiers (GUIDs) to authors (which it seems many publishers are
doing, thus they are no longer unique!) and some discussion on the lack
of tools, open peer review and discussion platforms, but few solutions
on how to credit contributors for engaging in these endeavors.
10.30am. Google research datasets / Science in
the cloud (Robert Tansley, Google)
Google have been dabbling with storing research datasets
for a while now and Google Research Datasets is (now was - see my note
below!) part of their way of tackling this. Specifically, they had the
laudable goal of providing a space for researchers to store data on the
web, and making these data citable, versionable, and linkable, by
sniffing data formats. If Google could recognize the data format
uploaded by a user, they'd add value in some way (e.g. links); otherwise
it would simply remain citable and linkable. I remember there being lots
of enthusiasm in the room about this project, but personally I was
skeptical, not least because the value-added bits (GUIDs for citing,
links and version control) were not there yet (i.e. didn't exist) and
it wasn't clear to me how Google could leverage any value for themselves
out of this. Science has so much domain specificity that there are many
thousands of heterogeneous data formats, most of which evolve rapidly
or go extinct, and most of which have a tiny pool of specialist users.
The prospect of Google doing all this work for a relatively small
audience made me think this was unlikely to be successful. Turns out I
was right, as I received an e-mail in December 08 announcing that the
project was canned. See this post on Wired for more info.
11.30am. Open Science:
Overcoming Barriers (multiple speakers)
Usual stuff for this area, but the basic message is that we are not
there (i.e. open) yet! Tell me something I didn't already know.
2.00pm. What science can learn from Google, aka
the end of science is wrong, sorta (Chris Anderson, Wired)
For the background to this session, check out Chris Anderson's Wired article "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete". The piece came in for a fair amount of criticism on Wired's website, but I thought it was an inspired article. Chris's spin was toned down a little for SciFoo, but the message was essentially the same. Research is traditionally hypothesis-driven, but in the data deluge of the web we can analyze data to find patterns without preconceived hypotheses about what it might show. Correlation does not equal causation, but with petabytes of data now available in many scientific disciplines, finding correlations may (in some cases) be sufficient. The session prompted much discussion, with numerous references to Craig Venter's efforts to sequence DNA from ocean water in search of new species. As a taxonomist I spoke up for Craig's approach, but I also got to have the last word in the session! In answering the session's title, I said that Google lowers the transaction costs of doing science. This approach does not fundamentally change science; it just means we can find our answers more quickly.
3.00pm. 7 billion readers: DNA barcode the world (Dan Janzen)
Dan and his colleagues at the Consortium for the Barcode of Life (arguably along with all GenBank contributors) are using DNA for the identification and discovery of biological species. The term "DNA barcode" is often used in reference to a 648 base pair region of the mitochondrial cytochrome c oxidase subunit I gene (COI), which is common to most animal taxa and shows sufficient variation to discriminate between most species (a combination of other genes is typically used for plants). Dan's talk focused on developments toward a cheap hand-held device that could barcode tissue in minutes, providing a species identification tool for everybody. Many taxonomists have long feared that such a device would herald the extinction of traditional systematics. I've always thought this was poppycock. Putting such a tool in the hands of everybody would initiate a resurgence of interest in natural history the likes of which we have never seen, and it would be the taxonomic community that would reap the rewards of this renewed interest in their work. I've been involved on the sidelines of the barcoding debate for a few years (see for example the output from a debate I chaired on the subject back in 2004), but Dan has been one of its most ardent supporters, based on his work on the flora and fauna of Costa Rica. It was great to finally see him in action!
4.00pm. Future of quantum computing (Eleanor Rieffel)
Apologies but this session was a little too heavy for
me, so I stepped out to catch my breath.
5.00pm. Voyage of the (new) Beagle - flagship for
science (Karen James)
Karen and co. are building a replica of HMS Beagle - the ship that took
the young naturalist Charles Darwin on a five-year voyage around South
America and the Pacific. This trip was fundamental to the development
of Darwin's ideas on natural selection and (amongst other things)
eventually led to the publication with Alfred Russel Wallace of two
papers on the evolution of species. This version of the Beagle will
have a few enhancements over the original (diesel auxiliary engines,
radar, GPS navigation etc.), and will be an ocean-going science vessel
used for research. In addition to the metagenomics projects (with
NASA) and DNA barcoding studies, Karen, as science officer, was looking
for additional ideas on the kinds of research and education uses for
the ship's laboratories. Karen was also doing a spot of fundraising. As
I recall, it is going to take about £3 million to build the ship.
8.00pm Google Scholar Sucks (Michael Eise)
My bottom line on this session is that Google Scholar does not suck; the only people who suck are the library scientists who think Google Scholar is evil and who somehow think they know what researchers want or how they behave when doing research. I think some members of this group could use the help of some sociologists! Doubtless others came away from this session with a different view, but regardless, this was a fun session.
9pm - 3am. Charlie's Cafe demos and drinking
at Wild Palms Hotel.
Free beer and discussions late into the night.
SUNDAY 10th Aug.
9.30am. Second Life (Jean-Claude Bradley)
More cool stuff from Jean-Claude on the scientific (and not so
scientific) applications of Second Life. Nature Publishing Group
(co-organizers of SciFoo) have got into Second Life big time, and most
recently have built Elucian Islands, a virtual archipelago dedicated to
hosting scientific events and meetings. I have toyed with the idea of
getting people at the NHM interested in Second Life (and specifically
Elucian Islands) as a virtual venue for the NHM's Nature Live
presentations. This is where NHM scientists give daily presentations to
the public about their research and the natural world. My concern is
that the demographic of Second Life users is not the demographic of our
Nature Live audiences. Nevertheless, it would be fun to try this,
especially if we can get some schools involved, since many of them
cannot physically get to the NHM, and the studio where the
presentations are given (even the new one in our new Darwin Center 2
building) could not fit them in anyway. Also, our current Nature Live
coverage on the web is crap.
10.30am. The future of the scientific method (Kevin Kelly)
Kevin Kelly has an awesome reputation and I am an avid
reader of his work (amongst other things Kevin was involved in the now
defunct All Species Foundation - arguably the precursor to the
Encyclopedia of Life). However, this session left me frustrated,
perhaps because my expectations were too high. It descended into a
largely philosophical discussion (something I generally try to avoid).
But I think my frustrations lie in the fact that much of the future for
the scientific method seems pretty obvious to me, and Chris Anderson
largely nailed it in his previous session (and his Wired article - "The
End of Theory"). We can finesse around the edges of the scientific
method - issues about peer review, publication of negative results,
changes to funding mechanisms etc. - but for me computer technologies
will drive a lot of this change.
11.30am. Pathogen biobanks, bioinformatics,
crowdsourcing, "harness the demon in the freezer" (Frank Rijsberman,
Google.org)
This session left me even more frustrated, despite the
caliber of the participants (I sat next to Eric Schmidt). At the time I
partially penned a blog post entitled "Will Google Save us?", in
reference to how Google appeared to be interested in biodiversity
science, but I refrained from posting this on the grounds that it might
not be in the best interests of my discipline (i.e. systematics and
taxonomy). Suffice to say that there is a lot Google could do to help
the life sciences community, and especially the biodiversity sciences,
but getting too involved in the idiosyncrasies of our particular
discipline would be too parochial and unsustainable for Google, and
ultimately this would damage us (our field). Google knows about links,
identifiers and search. Its efforts must scale to all scientific
endeavors, and not just the whims of relatively small research
communities, no matter how much I care about them. Otherwise, Google's
efforts will not be sustainable, and sustainability is essential to our
discipline, whose research data has a half-life measured in centuries.
2-4pm. Closing session and wrapping up
Final session, including five-minute summaries of
selected sessions.
4pm until late.
Chatting by the hotel pool, followed by dinner with
various luminaries - life is hard!
Here are a few of the sessions I really wanted to attend but could not get to because they clashed with those above: