An introduction

Well, it’s about time that I actually got around to using the blog portion of my website (which is still, and likely forever will be, a work in progress). To kick things off I figured I would simply give a brief description of myself and what I believe the aims of this blog will be.

As you’ve probably noticed, my full and publication name is Michael C. Nelson, but I almost always go by Mike. 

I’ve always had an interest in biology; however, it was through pure serendipity that I ended up majoring in Microbiology, having pretty much no idea what that actually meant when I first told my advisor that it was the degree I wanted. He was the chair of the Botany-Microbiology department and taught most of the microbiology courses, so I figured what the heck and signed up! As luck would have it, microbiology proved to be immensely more interesting to me than standard eukaryotic cell biology, let alone studying anatomy. Things got more interesting when I took a “bioinformatics” course, which combined my studies with my hobbyist interests in computers and programming.

It wasn’t until the summer of 2004, however, that I truly figured out what I actually wanted to do with my eventual degree. That summer I was invited to participate in a 2.5-month-long REU program run by the University of Tennessee, but taking place at the University of the Free State in Bloemfontein, South Africa. The whole REU was based on the study of microbes living in the deep subsurface, which could only be accessed by sampling deep within some of the country’s gold mines. I won’t give a full recount of the experience in this post, but needless to say it was phenomenal.

Upon my return to the US, I started applying to graduate school, mostly looking at programs that had people working on environmental/extremophile microbiology. After applying to about a half dozen programs and being accepted to three, I chose to join the Microbiology Dept. at The Ohio State University. Unfortunately, things took a bit of a twist: I ended up joining the lab of Dr. Mark Morrison, who as it happened was just about to move to Australia as part of that country’s program to repatriate elite scientists (Phil Hugenholtz, Gene Tyson, and a few others were all big names associated with that program), but was keeping an appointment in the Animal Science Dept. and thus his lab there. So, I ended up transferring from Micro into the Environmental Science Graduate Program (ESGP), which in hindsight was actually a good thing as it forced me to broaden my horizons, so to speak.

My dissertation research was focused on analyzing the microbial communities and their dynamics in anaerobic digestion systems, which involved a lot of 16S rRNA gene sequencing and analysis. Fortunately for me, my long-time love of computers meant that I was rather comfortable with bioinformatics and so I took to the topic like a fish to water. Another fortunate development from this time was that the first models of “next-generation” sequencing instruments from Roche/454 and Solexa were just starting to become available. Long story short, I graduated in 2011 with my PhD and a handful of manuscripts.

One of the biggest problems I had while finishing up my dissertation was that people would ask me if I knew what any of the microbes in the anaerobic digestion systems I was studying were doing. Of course I would hem and haw about how members of species X all do metabolism Y while members of genus A are believed to all consume metabolite B, but basically my answers were “No, we’re just guessing”, even if they were educated guesses. This led me to seek out a post-doc where I could actually try to answer those questions, so I joined Joerg Graf’s lab at the University of Connecticut, with the aim of analyzing the meta-transcriptomes of the juvenile leech microbiome as it develops.

Once again, the science gods decided to throw a wrench into the plan; it turns out we still don’t have the capabilities to get sufficient amounts of RNA from the microbial community in the digestive tract of a baby leech (new methods of whole transcriptome amplification could probably do that for us now, but the methods are still too expensive and prone to contamination to be reliable for such a project just yet). Funnily enough though, Illumina announced they were releasing the MiSeq at about the same time UConn announced they would be shutting down our unreliable HiSeq core. Joerg was able to convince the university, the MCB department and some fellow PIs to put up funds to get a MiSeq, which we got in April of 2012. As the post-doc, I was de facto in charge of the system, which turned out to be both a good and a bad thing: it gave me the freedom to explore a new sequencing method and technology, but it also became a bit of a time-suck as other users started needing my assistance.

While microbial genome sequencing was our first and primary use of the MiSeq, we quickly realized, along with several other groups, that it had the potential to displace the much more expensive 454 system for 16S amplicon sequencing. We initially started to develop our own system in early 2013 for sequencing 16S amplicons, but when Joerg met Rob Knight at a conference and got access to a pre-print of his group’s paper describing such sequencing, we quickly adopted their methods. Having a background in such analyses, I quickly started putting together an analysis scheme that others in the lab and department could utilize. Because of the scale of the datasets, the first Illumina-based 16S analyses all used reference mapping of the reads to known 16S datasets, ignoring all of the reads that weren’t mapped. Doing a quick analysis, I found that for some environments with good representation in the databases (e.g., human and mouse microbiomes) this was OK, but for more diverse environments roughly half to two-thirds of the data was being thrown away. This led to the development of what I called the RDS pipeline, in which we use reference mapping followed by de novo OTU creation with chimera checking, and which produces nearly identical results to pure de novo analyses. This pipeline has since been published, along with our findings regarding “dataset” contamination that is due to fundamental limitations of the Illumina sequencing method.

Alongside my 16S work, I have been actively exploring new areas of genomics, transcriptomics, and bioinformatics. Initially this was within the European leech (Hirudo verbana) and termite (Reticulitermes flavipes) systems that Joerg works on, but I have since moved on to focusing more on the North American leech Macrobdella decora. I expect that I’ll cover this area more fully in future blog posts, as well as with excited mentions of new publications, but for now I’ll simply say that with M. decora I have a system that I can not only call my own but that also poses some unique and interesting research questions that I can hopefully build a career on.

So, that’s my not-so-short introduction and background about myself. My plan for future posts is to be more focused on some particular topic that either directly relates to work that I’m currently focused on (such as why submitting annotated genomes to an INSDC repository is such a pain in the ass) or perhaps some general musings about the world of science and how it all works (is the grad school:post-doc:research professor track really just a pyramid scheme?).

Hopefully you’ll stay tuned for future updates, and that at least some of them prove interesting.