Found 1 recipes
What is the quality of my RNA-seq reads? Are my RNA-seq reads of good quality or are there artifacts and errors I should remove and/or be aware of?
This recipe provides an outline of one method to preprocess and determine the quality of RNA-seq libraries. An example use of this recipe is a case where an investigator has completed sequencing of samples, and wants to evaluate and check the quality of the sequencing results before going through the time-consuming process of aligning the reads against a genome.
Given a set of raw RNA-seq reads, the goal is to align the reads to a reference genome and assess the quality of the read alignments by obtaining metrics such as depth of coverage, rRNA contamination, continuity of coverage, and GC bias. The purpose of this recipe is to process raw RNA-seq reads prior to aligning them against a reference genome. In particular, this recipe uses a dataset which has RNA-seq reads contaminated with adapter sequences. First we identify and remove adapter sequences using Galaxy, then we process the data and align it to a reference genome using GenePattern.
Why remove adapter sequences? Adapter sequences are short pieces of DNA with known sequence, which are used to link DNA molecules together. In particular, these are used to link an unknown DNA sequence to a sequencing primer, in order to sequence the unknown DNA strand. Most of the time adapter sequences are removed during pre-processing when data is sent to a researcher; however, sometimes some adapter sequences remain. It is important to remove these sequences before aligning reads to a genome.