Data management and Bioinformatics

18 Mar 2021 Sam Silverbrand

The discussion yesterday on data management, along with the introduction to bioinformatics analysis for sequencing data and the difference between OTUs and ASVs, was awesome. Data management is something that’s not taught to us very well in undergrad (in my opinion), and it’s been quite the learning curve to understand organizational methods and record keeping that make things clearer rather than more confusing when storing data. This conversation was super helpful for hearing other people’s experiences of how and where they store their data and what strategies might work best. Additionally, all of the information on cybersecurity was completely new for me. I had little to no idea about data security methods, VPNs, or encrypting your data and your hard drives. This lecture will change a lot of my practices when working with my data and push me to reorganize everything (yikes for me, that’s a big task).

On the note of OTUs vs. ASVs and the paper by Callahan et al. (2017), I thought both the group discussion leaders and the paper did a great job of teaching us the differences and helping us understand the benefits and drawbacks of using OTUs and why the field has changed a bit. Additionally, the walkthrough of dada2 was a great supplement to the discussion. Although I’ve worked through a dada2 pipeline before, it’s always nice to see how other people have their workflows set up and how I might change some of my own analysis and quality control methods in a pipeline. I also think seeing and learning things hand in hand with a lecture or paper is a great way to let the knowledge sink in.

Great job guys!!