Microbial Ecology: Introduction into Data Analysis

Nina Dombrowski

Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam (UvA)

n.dombrowski@uva.nl

Introduction

In this tutorial, you will learn how to work on the command line and in a High-Performance Computing (HPC) environment. With these skills, you will analyse 16S long-read sequencing data and turn it into a count table of microbial taxa.

This course is designed for students in microbial ecology with little or no prior experience using the command line. During this tutorial, you will learn how to:

  1. Install a Terminal

  2. Document your Code

    • Decide what software to use for code documentation
    • Use Markdown and document workflows
  3. Navigate the Command Line

    • Navigate the filesystem
    • Work with files
    • Search and filter data
    • View and understand sequence files
  4. Understand the setup of an HPC

  5. Work with an HPC

    • Connect to an HPC
    • Understand job schedulers (e.g., SLURM)
    • Run command-line tools (e.g., fastqc, seqkit)
    • Set up and activate conda environments
  6. Analyse long-read data

    • Analyse sequence read quality
    • Perform quality filtering of sequence read data
    • Align quality-filtered reads to a reference database
    • Generate a count table

The accompanying slides can be found here: