A common goal of RNAseq analysis is to determine how RNA expression differs between two states of an organism or a cell.
What you need:
- At least 3 biological replicates of RNAseq data for each state of the organism.
- The adapter sequences that the sequencing facility used to generate the RNAseq data (read 1 and read 2).
- A transcriptome or cDNA data file (in FASTA format) of the organism, used to match the sequenced RNA reads to genes.
Source: Ensembl
- A GTF file that supplies metadata for the genes. The GTF file and the transcriptome data file should come from the same source so that the gene names match. The GTF format is identical to GFF version 2.
Source: Ensembl
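To illustrate what kind of metadata a GTF file carries, the following is a minimal Python sketch that splits one GTF line into its nine tab-separated columns and reads the attribute column into a dictionary. The example line, including the gene identifier `GENE0001` and the gene name `abc-1`, is hypothetical and only follows the general layout of GTF files; the function name `parse_gtf_line` is also an assumption made for this sketch. It assumes that every attribute value is quoted.

```python
import re

# Hypothetical GTF line (made-up gene, for illustration only).
EXAMPLE_LINE = (
    "I\texample_source\tgene\t1000\t2000\t.\t+\t.\t"
    'gene_id "GENE0001"; gene_name "abc-1"; gene_biotype "protein_coding";'
)


def parse_gtf_line(line):
    """Split one GTF line into its fixed columns and an attribute dictionary."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF line must have 9 tab-separated columns")
    seqname, source, feature, start, end, score, strand, frame, attributes = fields
    # Attributes look like: key "value"; key "value"; ...
    # Assumption: every value is enclosed in double quotes.
    attrs = dict(re.findall(r'(\S+) "([^"]*)"', attributes))
    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attrs,
    }


record = parse_gtf_line(EXAMPLE_LINE)
print(record["attributes"]["gene_id"], record["attributes"]["gene_name"])
```

The `gene_id` values read this way are the names that have to match the gene names in the transcriptome file, which is why both files should come from the same source.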
At least 3 biological replicates are needed to estimate the natural variance in the expression of each gene, and so to judge whether a difference observed between the two states lies within that variance or exceeds it.
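The role of the replicates can be sketched with a small Python example. The expression values below are hypothetical, and the calculation (difference of the means divided by a pooled standard deviation) is only a simplified illustration of comparing a difference against the variance, not the statistical model that a differential expression tool would apply.

```python
from statistics import mean, stdev

# Hypothetical normalized expression values for one gene,
# 3 biological replicates per state.
no_starvation = [100.0, 110.0, 90.0]
starvation = [200.0, 220.0, 180.0]


def difference_in_sd_units(state_a, state_b):
    """Difference of the means, expressed in units of the pooled standard deviation."""
    sd_a = stdev(state_a)  # needs at least 2 values per state
    sd_b = stdev(state_b)
    pooled_sd = ((sd_a ** 2 + sd_b ** 2) / 2) ** 0.5
    return (mean(state_b) - mean(state_a)) / pooled_sd


print(round(difference_in_sd_units(no_starvation, starvation), 2))
```

With only one value per state, `stdev` raises a `StatisticsError`, because no variance can be estimated from a single measurement.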
Example Data:
.gz-compressed data files can be used directly and do not need to be decompressed. .zip files should be decompressed first.
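For readers who want to inspect such files themselves, the following Python sketch shows how a .gz-compressed file can be read without decompressing it first. It assumes that the read data is in FASTQ format with four lines per read, which is not stated above; the function names and the file name in the usage comment are hypothetical.

```python
import gzip


def open_maybe_gzip(path):
    """Open a plain or .gz-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")  # "rt" = read as text, decompress on the fly
    return open(path, "r")


def count_fastq_reads(path):
    """Count reads, assuming the usual 4-lines-per-read FASTQ layout."""
    with open_maybe_gzip(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Usage (hypothetical file name):
# print(count_fastq_reads("reads_1.fastq.gz"))
```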
1. Data trimming exercise (paired-end data):
Read 1 Data: rnaseq test1
Read 2 Data: rnaseq test2
Adapter Sequences:
Read 1: GTCAACTTCAGTGACAGTGGTCAAACCGGTGGTGACTGGAACTTCAGTAC,
CTCGTACTTGCTCCCCAGGTTACAGCTGAACAAAAAAGAATGTGCTTGTA
Read 2: CCGAAATGCAATACTCAAAAGTCGCTATCTTGTCTGCCGTTGCTGGTTCA,
GCTTAATCTACTACTAATCAATAATATCCGAAATGCAATACTCAAAAGTC
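What trimming does to a single read can be sketched in Python as follows. This is a deliberately simplified illustration that uses exact matching only and assumes the adapter sits at the 3' end of the read; actual trimming tools also tolerate sequencing errors and take base quality into account. The function name `trim_adapter`, the `min_overlap` parameter and the insert sequence are hypothetical; the adapter is the first read 1 adapter listed above.

```python
# First read 1 adapter from the list above.
ADAPTER = "GTCAACTTCAGTGACAGTGGTCAAACCGGTGGTGACTGGAACTTCAGTAC"


def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter from a read using exact matching only."""
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with the beginning of the adapter.
    # Try the longest possible overlap first.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    # No adapter found: return the read unchanged.
    return read


insert = "ACGTACGTAC"  # hypothetical RNA-derived sequence
print(trim_adapter(insert + ADAPTER, ADAPTER))       # complete adapter present
print(trim_adapter(insert + ADAPTER[:12], ADAPTER))  # only the adapter start present
```

In both calls the adapter part is removed and only the insert remains. The `min_overlap` threshold keeps very short chance matches at the read end from being cut off.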
2. Data quantification exercise:
Organism: Caenorhabditis elegans
States: Starvation - no starvation
Data (single-end, trimmed):
No Starvation: n-starv1, n-starv2, n-starv3
Starvation: starv1, starv2, starv3
Transcriptome database: caenorhabditis_elegans
Genome metadata file (GTF file): caenorhabditis_elegans.gtf