Survey Design Reflection
This week in class we discussed the many considerations that go into study design. One of my major takeaways was the emphasis on choosing an analytical approach before data collection begins. Doing so reduces the need to sort through large amounts of data in search of some pattern or correlation, and it focuses the work on deliberately collecting data to test a targeted hypothesis (or multiple competing hypotheses!). By taking the time to identify variables of interest and analysis tools up front, data collection can be more efficient and will produce more useful data. I thought Melissa’s explanation of the issue of ‘double dipping’ with highly correlated variables was interesting and very clear. Essentially, if two highly correlated variables are both used as predictors, they explain much of the same variation in the response, so that shared effect is counted twice and the strength of the relationship in the dataset is overstated. It is important to examine all of the variables, including their interactions and their dependence on one another, before running any analyses.
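As a rough illustration of checking variables for dependence before an analysis, here is a minimal Python sketch that computes pairwise Pearson correlations and flags highly correlated pairs. The variable names, the values, and the 0.7 cutoff are all assumptions made up for this example; they do not come from class or from any study, and the right threshold would depend on the analysis being planned.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def flag_correlated_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| at or above threshold.

    The default threshold of 0.7 is an assumed rule of thumb, not a fixed standard.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Invented values for illustration only; not data from any real survey.
predictors = {
    "water_temp_c": [10.1, 12.3, 14.8, 16.2, 18.9, 21.0],
    "air_temp_c": [11.0, 13.1, 15.2, 17.5, 19.4, 22.3],
    "salinity_ppt": [30.5, 28.1, 31.2, 29.0, 30.8, 28.7],
}

for name_a, name_b, r in flag_correlated_pairs(predictors):
    print(f"{name_a} and {name_b} are highly correlated (r = {r})")
```

In this made-up example the two temperature variables would be flagged, which suggests keeping only one of them (or combining them) rather than letting both explain the same variation in the response.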
My second major takeaway from class this week is the importance of knowing the natural history of the study organism. Whether through literature searches or observational studies, it is important to learn about the organism’s reproduction, distribution, age, migration, and feeding behavior, and then use that information to decide whether environmental DNA (eDNA) is the most useful tool for studying that specific species. In our breakout room, we discussed the recent push in the biological sciences to use eDNA as the solution for surveying everything. In reality, however, we don’t know the full capabilities of eDNA, and depending on the species it may not be the most effective method. This was emphasized by the study Erin shared, which was unable to detect any green crab eDNA in a water sample from a bucket even though the crabs were clearly visible in the bucket. There are many considerations that go into study design, and it is important to commit time to planning this part of the process in order to collect useful data.