Explore the intersection of biology and data science in our Bioinformatics course. Master the art of analyzing next-generation sequencing (RNA-seq) data and developing pipelines to solve complex biological problems efficiently.
The rapid advancement of high-throughput technologies has revolutionized biomedical research. The growing volume and complexity of the resulting data call for scalable and reproducible methods for both experiments and computational analysis.
Transforming this data into useful information requires numerous tools, parameter optimization, and integration of evolving reference data. To address these challenges, workflow managers were created. These tools streamline pipeline development, optimize resource usage, manage software installations and versions, and ensure workflows can run across various computing platforms, promoting portability and sharing. In this course, you will learn how to develop such workflows.
No prior knowledge is required. Learn through practice.
- Bioinformatics data files, structured and unstructured files
- Bioinformatics tools
- How to leverage workflow management tools to develop a reproducible and scalable RNA-seq data analysis pipeline
- Programming languages: R, Python, and shell
- How to build and run your own custom analysis pipelines
Module 1: Introduction to shell
This module introduces students to the fundamental concepts of Unix shell scripting, tailored for those analyzing data in a scientific setting. Module 1 begins with an overview of shell scripts, file management, directories, and the processes of editing, viewing, and concatenating files. It also covers how to connect to servers using the terminal.
Module 2: Hands-on Python environment
In this module, students will configure the environment for the core software used by the pipeline management tool. The primary focus is on understanding the bioinformatics tools in the canonical RNA-seq data analysis pipeline and setting up their environment dependencies.
Module 3: Introduction to Bioinformatics tools for RNA-seq data analysis
Students will learn how to handle different types of RNA-seq data, including structured and unstructured data, and how to use Bioinformatics tools.
Module 4: Advanced bioinformatics pipeline development
The final module will delve deep into the development of the pipeline. Topics covered will include how to process the raw FASTQ files: trimming them, checking their quality, mapping the reads, obtaining raw read counts, and finally normalizing the read counts. In addition, students will learn to write Python and R scripts for data analysis and visualization. Finally, they will share the pipeline through an online code repository.
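To give a flavor of the last processing step, here is a minimal Python sketch of normalizing raw read counts. The course description does not say which normalization method is taught, so counts per million (CPM) is used here purely as an assumption. The function name `counts_per_million` and the gene names and counts are hypothetical, made up for illustration.

```python
def counts_per_million(raw_counts):
    """Scale each gene's raw read count to counts per million (CPM).

    raw_counts: dict mapping gene name -> raw read count for one sample.
    CPM is an assumed normalization choice, not necessarily the one
    taught in the course.
    """
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {gene: count * 1_000_000 / total for gene, count in raw_counts.items()}


# Hypothetical raw counts for a single sample (illustrative values only)
sample_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}

cpm = counts_per_million(sample_counts)
for gene, value in cpm.items():
    print(f"{gene}\t{value:.1f}")
```

CPM only corrects for differences in sequencing depth between samples. Real analyses often use methods that also account for gene length or library composition.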