Genomics data

This article is a draft

This is not a complete article. It is a draft, a work in progress intended to be published as an article, and it may or may not be ready for inclusion in the main wiki. It should not necessarily be considered factual or authoritative.



In partnership with C3G, we maintain several genome databases that are available on general-purpose systems (Béluga, Cedar, Graham) as well as Guillimin and Mammouth. In addition to the FASTA sequence, many genomes include aligner indices and annotation files.

Where available, the genomics data are always located at /cvmfs/ref.mugqic/genomes.

We encourage you to browse the directory to get more information.

[user@cedar5 ~]$ ls -1 /cvmfs/ref.mugqic/genomes
blast_db
chimera_gold_db
chimera_unite_db
greengenes_db
mirbase
pfam_db
silva_db
species
temp
unite_db
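
Because the repository is not present on every system, it can help to check for it before listing anything. The following is a minimal sketch, not an official tool: the function name list_species is hypothetical, and the only path it relies on is the one given above.

```shell
# Hypothetical helper: list the entries under species/ in the genome repository.
# The default root is the path documented on this page; pass another root as $1.
list_species() {
    root="${1:-/cvmfs/ref.mugqic/genomes}"
    if [ -d "$root/species" ]; then
        # One entry per line, as in the listing above.
        ls -1 "$root/species"
    else
        echo "No species directory found under $root" >&2
        return 1
    fi
}
```

Calling list_species with no argument on a system where the repository is mounted should print the contents of species/; on other systems it prints a message to standard error and returns a non-zero status.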


Available genomes in species/

Common name  Species                                  Builds
Human        Homo sapiens                             GRCh38, GRCh37, hg19
Mouse        Mus musculus                             GRCm38, mm10, mm9, NCBIM37
Rat          Rattus norvegicus                        rn5, Rnor_5.0, Rnor_6.0
Monkey       Macaca mulatta                           MMUL_1
Chimpanzee   Pan troglodytes                          panTro4, CHIMP2.1.4
Baboon       Papio anubis                             PapAnu2.0
Dog          Canis familiaris                         CanFam3.1
Cow          Bos taurus                               UMD3.1
Chicken      Gallus gallus                            Galgal4
Fly          Drosophila melanogaster                  BDGP5
C. elegans   Caenorhabditis elegans                   WBcel235
Yeast        Saccharomyces cerevisiae                 R64-1-1
             Schizosaccharomyces pombe                ASM294v2
Bacteria     Escherichia coli str k_12 substr dh10b   ASM1942v1
             pseudomonas aeruginosa pa14              Pseu aeru PA14_V1
             Pseudomonas aeruginosa UCBPP_PA14        ASM1462v1
Plants       Arabidopsis thaliana                     TAIR10
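
To locate the directory for one of the builds listed above, one option is to search species/ by build name. This is only a sketch under a stated assumption: it supposes that directory names under species/ contain the build identifier (for example GRCh38), which this page does not confirm. The function name find_build_dirs is hypothetical.

```shell
# Hypothetical helper: print directories directly under species/ whose name
# contains the given build identifier.
# ASSUMPTION: directory names include the build identifier; browse the
# directory to confirm the actual layout.
find_build_dirs() {
    build="$1"
    root="${2:-/cvmfs/ref.mugqic/genomes}"
    # Look only at the top level of species/ and sort for stable output.
    find "$root/species" -mindepth 1 -maxdepth 1 -type d -name "*${build}*" | sort
}
```

For example, find_build_dirs GRCh38 would print any matching directories, one per line, and print nothing if no directory name contains that identifier.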