SciFoo Camp 08 - looking back at 2008, part 2

Amongst the many conferences I went to in 2008, Science Foo Camp 08 (SciFoo) has to be the highlight. This extraordinary gathering of scientists, engineers, geeks and technologists is put together by Nature, O'Reilly and Google, and is hosted in building 40 at the Googleplex (Mountain View, CA). The meeting is in its third year and I was lucky enough to be there for 2008. Thanks to a MacBook Air that I borrowed for the trip, I made notes on the sessions I attended. However, true to form, I never got around to summarizing these notes so that they could be blogged. Here (in a slightly edited form) are some comments on the sessions I attended. I've also listed a few of the sessions I couldn't make because of clashes in the schedule. For some pics, check out the SciFoo tag on my Flickr account.

FRIDAY 8th Aug.
5pm to Midnight

Registration, dinner, initial sessions and demos. Highlights included watching Dan Janzen enthrall a crowd with a box of pinned insects, chatting with Brian Cox and Gia Milinovich about Sunshine and The X-Files, and chatting with Lincoln Wallen about UK science funding, DreamWorks and patronage in science.

SATURDAY 9th Aug.

9.30am. Author ID Microattribution Collaboration Scientific Careers (Ernst Hafen & Myles Axton, Ed. Nature Genetics)
The discussion focused on impact factors and citations in scholarly communications. There was talk of an "open discussion platform" for science (I'm not quite sure what this might look like, above and beyond things like Nature Networks) and the need to incentivize contribution. Frankly, I think the discussion missed the bigger point. Scientific advances are made in small, granular steps (data), but the scholarly communication process doesn't handle granularity (data) very well. Journals focus on publishing less granular, synthesized stories about data (papers), which are easy to cite and link, but not the underlying data itself. We (the scientific community) delude ourselves into thinking that these papers are sufficient to make science repeatable, and most journals are crap at publishing the data required to make them repeatable (arguably this isn't their business anyway). We need systems that publish data and tools that help users synthesize these data into stories. Publishers are not geared toward doing this (they don't know our disciplines well enough to do it) and funding agencies don't seem to understand the problem. There was some talk of assigning unique identifiers (GUIDs) to authors (which it seems many publishers are doing, thus they are no longer unique!) and some discussion of the lack of tools, open peer review and discussion platforms, but few solutions on how to credit contributors for engaging in these endeavors.

10.30am. Google research datasets / Science in the cloud (Robert Tansley, Google)
Google have been dabbling with storing research datasets for a while now and Google Research Datasets is (now was - see my note below!) part of their way of tackling this. Specifically, they had the laudable goal of providing a space for researchers to store data on the web, and making these data citable, versionable, and linkable, by sniffing data formats. If Google could recognize the data format uploaded by a user, they'd add value in some way (e.g. links); otherwise the data would simply remain citable and linkable. I remember there being lots of enthusiasm in the room about this project, but personally I was skeptical, not least because the value-added bits (GUIDs for citing, links and version control) didn't exist yet and it wasn't clear to me how Google could leverage any value for themselves out of this. Science has so much domain specificity that there are many thousands of heterogeneous data formats, most of which evolve rapidly or go extinct, and most of which have a tiny pool of specialist users. The prospect of Google doing all this work for a relatively small audience made me think the project was unlikely to succeed. Turns out I was right, as I received an e-mail in December 08 announcing that the project was canned. See this post on Wired for more info.

11.30am. Open Science: Overcoming Barriers (multiple speakers)
Usual stuff for this area, but the basic message is we are not there (i.e. open) yet! Tell me something I didn't already know.

2.00pm. What science can learn from Google, aka the end of science is wrong, sorta (Chris Anderson, Wired)

For the background to this session, check out Chris Anderson's Wired article "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete". The piece came in for a fair amount of criticism on Wired's website but I thought it was an inspired article. Chris's spin was toned down a little for SciFoo, but the message was essentially the same. Research is traditionally hypothesis-driven, but in the data deluge of the web, we can analyze data to find patterns without preconceived hypotheses about what it might show. Correlation does not equal causation, but with petabytes of data now available in many scientific disciplines, finding correlations may (in some cases) be sufficient. The session prompted much discussion, with numerous references to Craig Venter's efforts to sequence DNA from ocean water in search of new species. As a taxonomist I spoke up for Craig's approach, but I also got to have the last word in the session! In answering the session's title, I said that Google lowers the transaction costs of doing science. This approach does not fundamentally change science; it just means we can find our answers more quickly.

3.00pm. 7 billion readers: DNA barcode the world (Dan Janzen)

Dan and his colleagues at the Consortium for the Barcode of Life (arguably along with all GenBank contributors) are using DNA for the identification and discovery of biological species. The term "DNA barcode" is often used in reference to a 648 base pair region of the cytochrome oxidase mitochondrial gene (COI), which is common to most animal taxa and shows sufficient variation to make it discriminatory for most species (a combination of other genes is typically used for plants). Dan's talk focused on developments toward a cheap hand-held device that could barcode tissue in minutes, providing a species identification tool for everybody. Many taxonomists have long feared that such a device would herald the extinction of traditional systematics. I've always thought this was poppycock. Putting such a tool in the hands of everybody would initiate a resurgence of interest in natural history the likes of which we have never seen, and it would be the taxonomic community that would reap the rewards of that renewed interest in their work. I've been involved on the sidelines of the barcoding debate for a few years (see for example the output from a debate I chaired on the subject back in 2004), but Dan has been one of its most ardent supporters, based on his work on the flora and fauna of Costa Rica. It was great to finally see him in action!

4.00pm. Future of quantum computing (Eleanor Rieffel)
Apologies but this session was a little too heavy for me, so I stepped out to catch my breath.

5.00pm. Voyage of the (new) Beagle - flagship for science (Karen James)
Karen and co. are building a replica of HMS Beagle - the ship that took the young naturalist Charles Darwin on a five-year voyage around South America and the Pacific. This trip was fundamental to the development of Darwin's ideas on natural selection and (amongst other things) eventually led to the publication, with Alfred Russel Wallace, of two papers on the evolution of species. This version of the Beagle will have a few enhancements over the original (diesel auxiliary engines, radar, GPS navigation etc.), and will be an ocean-going science vessel used for research. In addition to the metagenomics projects (with NASA) and DNA barcoding studies, Karen, as science officer, was looking for additional ideas on the kinds of research and education uses for the ship's laboratories. Karen was also doing a spot of fundraising. As I recall, it is going to take about £3 million to build the ship.

8.00pm. Google Scholar Sucks (Michael Eise)

My bottom line on this session is that Google Scholar does not suck. The only people that suck are the library scientists who think Google Scholar is evil, and who somehow think they know what researchers want or how they behave when doing research. I think some members of this group could use the help of some sociologists! Doubtless, others came away from this session with a different view, but regardless, this was a fun session.

9pm - 3am. Charlie's cafe demos and drinking at Wild Palms Hotel.
Free beer and discussions late into the night.

SUNDAY 10th Aug.
9.30am. Second Life (Jean-Claude Bradley)
More cool stuff from Jean-Claude on the scientific (and not so scientific) applications of Second Life. Nature Publishing Group (co-organizers of SciFoo) have got into Second Life big time, and most recently have built Elucian Islands, a virtual archipelago dedicated to hosting scientific events and meetings. I have toyed with the idea of getting people at the NHM interested in Second Life (and specifically Elucian Islands) as a virtual venue for the NHM's Nature Live presentations. This is where NHM scientists give daily presentations to the public about their research and the natural world. My concern is that the demographic of Second Life users is not the demographic of our Nature Live audiences. Nevertheless, it would be fun to try this, especially if we can get some schools involved, since many of them cannot physically get to the NHM, and the studio where the presentations are given (even the new one in our new Darwin Center 2 building) could not fit them in anyway. Also, our current Nature Live coverage on the web is crap.

10.30am. The future of the scientific method (Kevin Kelly)
Kevin Kelly has an awesome reputation and I am an avid reader of his work (amongst other things Kevin was involved in the now defunct All Species Foundation - arguably the precursor to the Encyclopedia of Life). However, this session left me frustrated, perhaps because my expectations were too high. It descended into a largely philosophical discussion (something I generally try to avoid). But I think my frustrations lie in the fact that much of the future for the scientific method seems pretty obvious to me, and Chris Anderson largely nailed it in his previous session (and his Wired article - "The End of Theory"). We can finesse around the edges of the scientific method - issues about peer review, publication of negative results, changes to funding mechanisms etc. - but for me computer technologies will drive a lot of this change.

11.30am. Pathogen biobanks, bioinformatics, crowdsourcing, "harness the demon in the freezer" (Frank Rijsberman, Google.org)
This session left me even more frustrated, despite the caliber of the participants (I sat next to Eric Schmidt). At the time I partially penned a blog post entitled "Will Google Save us?", in reference to how Google appeared to be interested in biodiversity science, but I refrained from posting this on the grounds that it might not be in the best interests of my discipline (i.e. systematics and taxonomy). Suffice to say that there is a lot Google could do to help the life sciences community, and especially the biodiversity sciences, but getting too involved in the idiosyncrasies of our particular discipline would be too parochial and unsustainable for Google, and ultimately this would damage us (our field). Google knows about links, identifiers and search. Its efforts must scale to all scientific endeavors, and not just the whims of relatively small research communities, no matter how much I care about them. Otherwise, Google's efforts will not be sustainable, and sustainability is essential to our discipline, whose research data has a half-life measured in centuries.

2-4pm. Closing session and wrapping up
Final session, including five minute summaries of selected sessions.

4pm until late.
Chatting by the hotel pool, followed by dinner with various luminaries - life is hard!

Here are a few of the sessions I really wanted to attend but could not get to because they clashed with those above:

  • Models of Scientific Communication (Johan Bollen)
  • 5 minute talks by smart people about web 2.0 tools for scientists: science 2.0
  • Settling Mars how? (Pete Worden)
  • Bundled Value: Commodities, Carbon, Water, Biodiversity, Poverty, etc (Jason Clay)
  • Can & should machines think? technical + ethical discussion (Stan Williams)
  • The marketplace of ideas (or why the academic system sucks) (Sabine Hossenfelder)
  • Is there junk DNA? what is all that RNA doing? (Tom Gingeras)
  • Extraterrestrials - why haven't we seen any? (Martin Rees and Nick Bostrom)
  • Why whales are weird (Joy Reidenberg)
  • Deflecting asteroids (yes, it's possible) (Ed Lu)
  • Data integration biology (Helen Berman and Shankar Subramaniam)
  • What to do with the iSequencer? A soon to be real tool for sequencing DNA and RNA samples from any material for $0.01 in 2 minutes (Andy Fire)
  • Desktop 2.0: can we please forget the browser for a second? (Alex Griekspoor)
  • Bioacoustics: A window into science and connection to the natural world (Bernie Krause)
  • Talking science to the religious: Why *should* the Pope have an astronomer? (Brother Guy Consolmagno)
