I like lots of things, but I'm mostly a scientist (microbial ecology). I've tried separating the science bits of me from the other bits of me, but it didn't work (messy) so it's all in the one place. You can always search the tags if you're terribly allergic to [science] or [not science].
OTU tables contain a lot of data, but they’re also sparse - mostly zeroes. Sample numbers are going up and up and displaying the data is more of a challenge. In an exchange on Twitter with Nick Loman and Rachel Poretsky a little while ago, we bitched about stacked bar plots. Here is my attempt to improve on one in R using the ggplot package and starting with a standard OTU table from QIIME. It’s an iterated stacked bar plot - so you’ve deconstructed your stacked bar (like a posh cheesecake) and each OTU now has the same baseline.
For me, I think it’s easier to spot differences in particular OTUs across the whole sample set. I quite like it for longitudinal data too and I add a white geom_vline between samples from each individual to delineate.
The code is a real bodge, I’m no coder, please suggest improvements on both that and the plot itself - we can definitely do better! Thanks to Saffron for her help resolving some stickier bits.
Starting off in QIIME you will have a .biom format OTU table; Mothur also supports this format. I haven’t come across an R package that can use .biom yet though (I’m sure there will be one - perhaps via picante?), so we have to convert it back to a more straightforward table. I use QIIME for this, making sure I keep the taxa string (why you would ever want to lose this when you’ve spent time making sure you get it in the first place, I’ve no idea).
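To make the "same baseline for every OTU" idea concrete, here is a rough sketch of the reshaping step that kind of plot needs: one row per OTU per sample, with the taxa string kept. It is in Python rather than R and is not the code from this post. The layout it assumes (comment lines starting with `#`, a header starting with `#OTU ID`, and a final `taxonomy` column) is my assumption about what the converted table looks like, and `otu_table_to_long` is a made-up name.

```python
def otu_table_to_long(text):
    """Reshape a tab-delimited OTU table into (otu, taxonomy, sample, count) rows."""
    header = None
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("#OTU ID"):
            header = line.split("\t")  # the header row names the samples
            continue
        if line.startswith("#"):
            continue  # any other comment line
        if header is None:
            raise ValueError("no '#OTU ID' header line found before the data")
        # Assumption: the taxa string, if kept, is the last column.
        has_taxonomy = header[-1].strip().lower() == "taxonomy"
        samples = header[1:-1] if has_taxonomy else header[1:]
        fields = line.split("\t")
        otu_id = fields[0]
        taxonomy = fields[-1] if has_taxonomy else ""
        counts = fields[1:1 + len(samples)]
        for sample, count in zip(samples, counts):
            rows.append((otu_id, taxonomy, sample, float(count)))
    return rows


# A tiny invented table, only to show the shape of the output.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\ttaxonomy\n"
    "OTU1\t10\t0\tk__Bacteria; g__Haemophilus\n"
    "OTU2\t0\t3\tk__Bacteria; g__Streptococcus\n"
)
long_rows = otu_table_to_long(example)
for row in long_rows:
    print(row)
```

With the data in this long shape, each OTU can be drawn as its own little bar plot across all samples, which is all the deconstructed stacked bar really is.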
Getting better identification from 16S rRNA amplicon sequencing data (part 1)
Microbial ecologists don’t necessarily care what name a bacterium has been given when they’re looking at molecular data. Names change over time, change with the database you’re using and don’t tie organisms back definitively to the one you have in culture - all that requires a lot of extra effort. Also, the standard species concept just doesn’t fit organisms that are as polyamorous as bacteria.
That’s why we’ve ended up with the uncharismatic OTU, operational taxonomic unit, as our unit of organism. An OTU is a group of sequences that share sequence identity at a particular level. 97 % identity is often chosen as close to whatever ‘species’ might mean in the 16S rRNA gene, but you’ll occasionally see 99 % and it will vary for other genes. For more detail on how 97 % was chosen see Stackebrandt and Goebel (not open access).
The important thing is that an OTU has a set definition that’s straightforward to understand and somewhat comparable between studies (let’s not get into that just yet - be aware that this is a white lie).
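If the definition feels abstract, here is a toy sketch of it in Python. It assumes the sequences are already aligned to the same length and simply counts matching positions; the greedy "join the first OTU whose seed you match" rule is one simple choice of mine, not what QIIME or any real OTU picker does, and the function names are invented for illustration.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def greedy_otus(seqs, threshold=97.0):
    """Put each sequence in the first OTU whose seed it matches at >= threshold."""
    otus = []  # list of (seed sequence, member sequences)
    for seq in seqs:
        for seed, members in otus:
            if percent_identity(seed, seq) >= threshold:
                members.append(seq)
                break
        else:
            # No existing seed was close enough, so this sequence seeds a new OTU.
            otus.append((seq, [seq]))
    return otus


# Invented 100-base sequences: 'near' differs from the seed at 2 positions,
# 'far' at 5 positions.
seed = "ACGT" * 25
near = "NN" + seed[2:]
far = "NNNNN" + seed[5:]

clusters = greedy_otus([seed, near, far])
print(len(clusters))              # number of OTUs at 97 %
print(len(greedy_otus([seed, near, far], threshold=99.0)))  # stricter cut-off
```

Changing the threshold from 97 to 99 splits the same three sequences into more OTUs, which is part of why results are only somewhat comparable between studies.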
Now we get to the problem of application. I would quite happily barcode all bugs, starting at OTU1 and make lots of very lovely observations about how OTU9267 does really interesting things in your armpit. Clinicians don’t like this, they need a name because they can treat names. If it’s Streptococcus, give them penicillin; Pseudomonas? Maybe cipro.
We’ve generally been using the QIIME pipeline, so for naming a representative sequence from each OTU we use the RDP classifier, with reference to a 16S rRNA sequence database. Generally you get a range of identifications, mostly family or genus, not often to the species level, some unclassified and everything else in between. If you’re lucky the OTU that best associates with your clinical issue isn’t unclassified.
Lots of 16S rRNA studies leave it at this and end up with phrases like “Axillococcus spp., a genus that includes known pathogens of the armpit, was significantly associated with armpit hair infections”, which is a little unsatisfying. Which of the umpteen Axillococci is it? Can I use pitomycin, which kills A. bodyodouri, but A. whiffus is completely resistant to? Is it actually an Axillococcus at all or is it actually a Pitobacter that’s been misnamed in the database?
It is possible to wring a little bit more information out of your OTUs by using old school phylogenetic trees. It’s much more intensive and will not work for every OTU - so work out which your favourites are and brush up on your ARB skills. That’s what we did with Paul’s paper that came out this week.
For Haemophilus it worked beautifully, multiple treeing methods gave nice stable phylogenies and we were able to determine that the OTUs formerly known as Haemophilus spp. were in fact H. haemolyticus, H. influenzae and H. parahaemolyticus. We’re only sequencing the V3-V5 region and our reads are about 550 bp (sequenced in from V5 and before trimming etc.), but in this group of bacteria that’s enough.
Amongst the Streptococci, it didn’t work very well - there’s not enough information in that partial 16S sequence to discriminate amongst the different species. I got as far as, “it’s probably not S. pneumoniae”, but that’s vague and unhelpful. This probably won’t work for the Enterobacteriaceae either, but if you’ve got other interesting OTUs, give it a go, you might get lucky!
The next post is going to be all about databases and why people who just BLAST their 16S sequences are facing an afterlife in the malebolges with the other panderers and seducers.
I’m studying the bacteria living in (or infecting, depending on how sick you are and your definitions of those words) the respiratory tract. One of the projects I’m working on is called MOSAIC and is looking at last year’s swine flu outbreak, and I have lots of samples of various types from people who were hospitalised during the outbreak and gave their consent to be involved in the study. I don’t do any of the sampling, just receiving anonymous pots and swabs of secretions, so I volunteered to be a control, and out of interest decided to record the respiratory sampling.
Film contains tasteful scenes of tongue and nostrils, mild suspense and bad editing. No researchers were harmed in making this production, but one did get watery eyes.
I had an interview recently where one of the interviewers turned their nose up at two of my publications because they were in PLoS ONE. I found myself having to defend the journal (oddly not the actual articles, despite the fact that one was of direct relevance to the position) and our decision to publish there (the reasons are numerous: open access, quick, broad readership, new and shiny, online only, metrics etc.).
None of my defence made any impact, excuse the pun.
I know there are people around who obsess over impact factor, but really, JUST READ THE PAPER!! Then you know whether it’s good or bad. There are forgotten gems in any number of cut-price, flimsy pamphlets and the occasional lumps of steaming dog food in the big name journals.
Just READ IT! Simple.
A week or so after the interview PLoS ONE got its first impact factor, a rather healthy 4.351 (it’s that 0.001 that makes all the difference). I’m much too fair-minded to point this out. After all, who cares about impact factor?
A good while ago I decided to revisit my publications and attempt to make them more accessible and (hopefully) more interesting with some background information on what actually went into them. This is my first attempt at that with my very first publication. It’s mainly to see whether I can do it and to practise writing in different ways. If you’re a scientist reading this, please let me know if I’ve dumbed down to the point of inaccuracy; if you’re a real person, is it interesting, do you care, what else do you want to know that I’ve missed? The original publication is embedded below in all its slightly dry (but look - pictures!) glory.
The sea is a soup of bacteria, viruses and algae. In a teacup of seawater you can find thousands of algae, millions of bacteria, and thousands of millions of viruses. They form complex ecosystems of organisms competing for food, preying upon each other and reproducing in near invisibility that belies their importance to life on the planet.
In 1999 I was a student studying microbiology and spotted an advert for an eight-week summer placement at the Marine Biological Association laboratory in Plymouth. Figuring that it sounded much more interesting than my usual summer job of serving cream teas and pasties to coach-loads of tourists (for American readers pasties are like empanadas, not the things that attach tassels to strippers’ nipples) I applied and got the position.
It was with Willie Wilson, a fellow at the MBA, and I was supposed to study viruses that infect algae, particularly Emiliania huxleyi* and Phaeocystis pouchetii. Both these algae form large blooms of organisms and then die off; Phaeocystis blooms are thought to be responsible for the foam that you can sometimes see along the seashore. In the summer of 1999 there was a large bloom of E.hux (as there frequently is) off the coast of Plymouth in the English Channel looking, from satellite images, something like this:
The turquoise patches of sea are light being reflected by the coccoliths, shield-like constructions of the mineral calcite, that E.hux surrounds itself with. When E.hux dies these are released into the water and sparkle.
Willie was particularly interested in how these blooms die, and suspected that viruses played a role, though no viruses to coccolithophores (the group of organisms that E.hux belongs to) had been isolated. He and a student of his had been out to the bloom in a boat, collected seawater from various areas of living and dead algae within the bloom and brought these back to the lab. My job was to use these samples to attempt to kill E.hux that had been grown in the laboratory. This was a good thing as I’ve always been bad at keeping things alive, particularly houseplants and pets.
First the water was passed through fine filters to exclude everything except viruses and then this filtrate was added to cultures of different types of E.hux. I would then monitor them to see whether they died or not. Luckily for me it’s really easy to tell when an E.hux culture has died: it goes from being a milky green flask to a clearer flask with fine white dust at the bottom and releases a strong seaside smell (a chemical called dimethyl sulfide or DMS). I would take samples from any cultures that died and reinfect fresh cultures, building up the numbers of viruses I had and also purifying them.
I also got to take some pictures of samples from dead cultures with an electron microscope (which I only broke once) revealing, to our relief, large numbers of particles that looked like viruses. Only one of my pictures made it to the publication, it’s figure 5 B in the paper and shows what is presumably a single E.hux cell that has burst apart from the mass of viruses that have been multiplying inside it. You can tell it’s my image as it’s so heavily over-stained, black and blobby compared to the others in the paper.
At the time these were extremely large viruses (christened coccolithoviruses), around 170 nanometres in diameter, some of the largest ever found, and I’d been lucky that they passed through the filter at all. Since then some truly enormous viruses, the mimiviruses, have been discovered that are at least four times the size (though thousands would sit comfortably in a neat line across the head of a pin).
This was the first time that a virus that infected E.hux had been isolated, and other people’s experiments reported in the paper provided evidence that they may have been responsible for the demise of the E.hux bloom. Since then EHV86 (Emiliania huxleyi virus 86, clever name huh? There were a lot more than 86, I can’t remember why we focused on it…must have been the best at killing E.hux) has had its genome sequenced revealing some unusual genes including those for making ceramide (erm, because they’re worth it?), and methods have been developed to track it in the environment so that its role in killing blooms can be more deeply investigated.
One question that I haven’t yet addressed, and probably a valid one, is: who cares? Sometimes phrased as “why should we give you money to study this?”. E.hux viruses may have a significant effect on the world around them. This goes back to that seaside-smelling chemical I mentioned above produced by the dead E.hux cultures. When viruses kill E.hux, DMS is released into the atmosphere where it is converted to other compounds that act as cloud condensation nuclei. These increase the number of water droplets in clouds, which then reflect more sunlight and have a greater cooling effect…something as minuscule (or in this case, slightly larger than minuscule) as a virus potentially affecting the Earth’s climate.
It’s hard to find much information about the science policies of the three main parties, so I’ll collect some information sources here.
The New Scientist has an election blog here: The S Word, where they’re collating information and commenting on science policies.
CaSE, the Campaign for Science and Engineering in the UK, also has a blog, The Science Vote; they aim to make science an election issue.
I’ve also come across two debates between the three science representatives, Paul Drayson (Labour), Adam Afriyie (Conservative) and Evan Harris (Liberal Democrats), one hosted by CaSE (link) and one by the Royal Society of Chemistry (link)
I did two sciencey things yesterday and I’m pretty sure that the approach of one would help the other out.
A friend of mine had tickets for the Ig Nobel Prize Roadshow, short talks from previous winners of the prize given for improbable research, and kindly invited me along. The 2009 winners included Catherine Bertenshaw* and Peter Rowlinson for finding that naming cows makes them produce more milk and Elena Bodnar and friends who make bras that double as gas masks (demonstrated on the night - mmm, still warm). It was a fun night, with some good short talks, an eye-watering sword swallower and plenty of silly sounding research. I could have done without the operettas rewritten with cringeworthy science lyrics (all a bit primary school assembly), but it did exactly what it set out to do, made the audience laugh, then think. You can see previous years’ shows at the Imperial Graduate School website (sidebar links) and this year’s should be up soon.
I arrived for the talks early and thought I’d spend the time I had in the Science Museum. I hadn’t been for years and was looking forward to seeing how it had changed. Having only an hour, I did a bit of targeting and headed for the biological bits that I could find. Given that the Wellcome Wing is being refurbished, this left Health Matters on the 3rd floor, Glimpses of Medical History on the 4th and The Science and Art of Medicine on the 5th (map).
Utterly disappointing. The 4th floor consisted entirely of the same tedious dioramas (burn them!) I saw on my last visit (at least 10 years ago, probably more like 15), the 5th was a jam-packed, confusing maze of medical instruments and by the time I’d got through my trip to Health Matters and its slight, unengaging exhibits I was thoroughly grumpy.
In the Science and Art of Medicine they have Jenner’s own inoculating equipment: tiny pen-knives used for transferring cowpox pus into healthy people to give them immunity to the killer smallpox, the origin of vaccination. Of course, Jenner tested this out on a child, James Phipps, giving him smallpox to see whether his vaccination had worked. He must have missed the medical ethics sessions when he was training. None of this history was really mentioned, let alone the legacy (eradication of smallpox, near eradication of polio and dozens of other diseases significantly reduced) and ethical issues of his work.
Surely these stories are a gift for a museum? Visitors could learn about the history and impact of science, whilst also understanding the ethical choices that are central, perhaps by being presented with this choice themselves (not unlike the ethical decision the Science Museum themselves made by cancelling James Watson’s talk). What about Lister, antiseptics, nursing and infection control in hospitals? What about test-tube babies and fertility coming from the obstetrics collection? Nope. Stuff in boxes with neat little labels.
I did enjoy Listening Post, but it feels more like a Tate Modern installation, and I didn’t have time to go to their more recent exhibition 1001 Inventions, which I’ll definitely go back for, but what I saw was all a bit sad.
Laughing then thinking is definitely what those collections needed. I hope the refurbished Wellcome Wing delivers and that they can then tackle the empty, lifeless floors above.
*Perplex City players reading, coincidentally number9dream’s wife :D. I would have said hello, but the explanation of how I knew her husband played out badly (and not briefly) in my head so I didn’t.