Reflection from bioinformatics module
Learning about the process of sequencing analysis this week reinforced my understanding of how the Illumina MiSeq works, which I did not fully grasp before. This module covered the topics at the heart of Sue’s amplicon sequencing analysis, including bioinformatic analysis, the merits of ASVs over OTUs, and a walk-through of the dada2 pipeline. We even read the Callahan et al. 2017 paper for that class. One topic I notice keeps coming up, both in this class and in discussions across different groups, is just how important good reference databases are, and how hard they are to get. Is assignTaxonomy a limiting step for a lot of amplicon analysis projects because of the difficulty of obtaining them? How do people typically go about putting them together when using genes for which there are no easily downloadable pre-made databases?
Also, using encryption or other means to protect data from being stolen is not something I have ever had to think about. I just don’t think I have data so desirable that anyone would try to steal it. But I have thought about backing up data and making sure it saves to more than just a local hard drive, after losing months of work myself. I now love Dropbox and just blindly trust it to solve all my problems.