layout: true
class: inverse

---
class: special, center
![GATC Logo](../shared-images/AdminTraining2016-250.png)

# Reference Genomes in Galaxy

**Slides: @blankenberg, @Slugger70**

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
layout: true
class: left, inverse

---
class: left, middle, center
![GATC Logo](../shared-images/AdminTraining2016-100.png)

## Please interrupt

*We are here to answer questions!*

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
class: left

## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Overview

.large[
* **Intro to built in datasets**
* Built in data hierarchy
* Some problems
* Data Managers
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
class: left

## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Built in Data

![List_of_data.png](images/i06-List_of_data.png)

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
class: left

## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Data, what data?

.large[
* Some genomes are large! Human, Mouse, Coral
* Some tools require indices of the genomes.
* The indices take a long time to build!
* Better to pre-build the indices.
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
class: left

## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Overview

.large[
* Intro to built in datasets
* **Built in data hierarchy**
* Some problems
* Data Managers
]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Data schematics in Galaxy

![schematic](images/data_managers_schematic_overview.png)

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Using reference data in a tool

#### bwa.xml

``` xml
<option value="cached">Use a built-in genome index</option>
<option value="history">Use a genome from history and build index</option>
```

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Where are the data tables?

#### tool_data_table_conf.xml

(Usually located in `galaxy/config/`)

``` xml
<columns>value, dbkey, name, path</columns>
```

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) "loc" files

- Short for location! Not *"sending me loco"*

bwa_index.loc

``` text
...
#
#
#
...
bosTau7	bosTau7	Cow (bosTau7)	/mnt/galaxyIndices/genomes/bosTau7/bwa_mem_index/bosTau7/bosTau7.fa
ce10	ce10	C. elegans (ce10)	/mnt/galaxyIndices/genomes/ce10/bwa_mem_index/ce10/ce10.fa
danRer7	danRer7	Zebrafish (danRer7)	/mnt/galaxyIndices/genomes/danRer7/bwa_mem_index/danRer7/danRer7.fa
dm3	dm3	D. melanogaster Apr. 2006 (BDGP R5/dm3) (dm3)	/mnt/galaxyIndices/genomes/dm3/bwa_mem_index/dm3/dm3.fa
hg19	hg19	Human (hg19)	/mnt/galaxyIndices/genomes/hg19/bwa_mem_index/hg19/hg19.fa
hg38	hg38	Human (hg38)	/mnt/galaxyIndices/genomes/hg38/bwa_mem_index/hg38/hg38.fa
mm10	mm10	Mouse (mm10)	/mnt/galaxyIndices/genomes/mm10/bwa_mem_index/mm10/mm10.fa
...
```

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
class: left

## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Overview

.large[
* Intro to built in datasets
* Built in data hierarchy
* **Some problems**
* Data Managers
]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Some Problems!

.large[
* Time consuming!
  * ~30 minutes work just to add a new genome to 1 tool!
* Administrator needs to know:
  * how to index **every** tool
  * expected format of the reference data
  * format of the .loc file
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Typical conversation

.middle[![ref-problem-1.png](images/Ref-problem-1.png)]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Typical conversation

.middle[![ref-problem-2.png](images/Ref-problem-2.png)]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Typical conversation

.middle[![ref-problem-3.png](images/Ref-problem-3.png)]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Typical conversation

.middle[![ref-problem-4.png](images/Ref-problem-4.png)]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Other concerns

.large[
* **Accessible?**
  * Manually download genome FASTA files
  * Download, compile, run bwa index; which options?
* **Reproducible?**
  * Only if the person performing manual steps keeps good notes
* **Transparent?**
  * Send email to sysadmin asking for notes
* Restart Galaxy server for new entries
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
class: left

## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Overview

.large[
* Intro to built in datasets
* Built in data hierarchy
* Some problems
* **Data Managers**
]

(now we're onto the good stuff!)

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Data Managers

.large[
* Allows for the **creation of built-in** (reference) data
  * underlying data
  * data tables
  * \*.loc files
* Specialized Galaxy tools that can only be accessed by an admin
* Defined **locally** or installed from **ToolShed**
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Data Managers

.large[
* **Flexible** framework
  * Not just genomic data
* Run Data Managers through UI
  * Workflow compatible
  * API
* Examples
  * Adding new genome builds (dbkeys)
  * Fetching genome (fasta) sequences
  * Building short read mapper indices for genomes
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Special class of Galaxy tool

Looks just like a normal Galaxy tool!

![Data-manager-ui.png](images/Data-manager-ui.png)

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) What does it do?

The output of the data manager is a JSON description of the new data table entry

![data_table_JSON.png](images/data_table_JSON.png)

This gets turned into a new data table entry

![data_table_entry.png](images/data_table_entry.png)

The index files themselves get placed in the appropriate location.
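
As a rough illustration of the JSON-to-entry step above, here is a minimal Python sketch. It assumes the `data_tables` JSON layout and the column order `value, dbkey, name, path` from the data table slide. The table name `bwa_mem_indexes` and both helper functions are hypothetical, and Galaxy performs the real conversion itself.

```python
import json

# Column order as listed on the tool_data_table_conf.xml slide.
COLUMNS = ["value", "dbkey", "name", "path"]

# Hypothetical table name; use whatever your tool_data_table_conf.xml declares.
TABLE_NAME = "bwa_mem_indexes"


def data_manager_json(table_name, entry):
    """Return JSON text describing one new data table entry (assumed layout)."""
    return json.dumps({"data_tables": {table_name: [entry]}}, indent=2)


def loc_line(entry, columns=COLUMNS):
    """Render one entry as a tab-separated .loc line."""
    return "\t".join(entry[column] for column in columns)


# Values copied from the hg19 row of the .loc example.
entry = {
    "value": "hg19",
    "dbkey": "hg19",
    "name": "Human (hg19)",
    "path": "/mnt/galaxyIndices/genomes/hg19/bwa_mem_index/hg19/hg19.fa",
}

print(data_manager_json(TABLE_NAME, entry))
print(loc_line(entry))
```

The second `print` shows the same four fields in the tab-separated form used by a .loc file.
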
.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Data Managers Admin

.large[
* Located on Galaxy's Admin tab under **Local Data**
]

![data_managers_tool_list.png](images/data_managers_tool_list.png)

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Data Managers Admin

.large[
* UI tools to fetch reference genomes/build indices
* View progress of index build jobs
* View contents of tool data tables
]

![data_table_ui.png](images/data_table_ui.png)

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Resources / further reading

.large[
* Galaxy Wiki Page on Data Managers
  * Details
  * Building
  * Examples

https://wiki.galaxyproject.org/Admin/Tools/DataManagers
]

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]

---
## ![GATC Logo](../shared-images/AdminTraining2016-100.png) Exercise Time!

.footnote[\#usegalaxy \#GAT2017 / @galaxyproject]