GATC Logo

All these Clouds

It's positively meteorological...

Slides: @Slugger70, @nuwang, @afgane

#usegalaxy #GAMe2017 / @galaxyproject

Please interrupt

We are here to answer questions!

Overview

  • Galaxy in the Clouds?
  • AWS and other Clouds
  • CloudMan and CloudLaunch (& cloud agnosticism)
  • CloudMan Galaxy
  • Architecture
  • Persistence
  • Taking it further - The GVL
  • Other cloud usage - burst!

Help!

  • Galaxy server flat out?
  • Queue longer than a Grateful Dead concert?
  • An urgent job to run?

What do you do now?

Use the cloud man!

cloudman

Clouds?

Cloud computing ... is a model for enabling ubiquitous, on-demand access to a shared pool of configurable computing resources ... which can be rapidly provisioned and released with minimal management effort. Cloud computing and storage ... may be located far from the user – ranging in distance from across a city to across the world. - Wikipedia, Cloud Computing.

[Logos: Amazon Web Services, OpenStack, Google Compute Engine]

Available Clouds

  • Amazon Web Services
    • Pay-per-time/machine etc.
    • Reasonably priced, but keep an eye on the costs
    • Large range of machine (i.e., instance) types
    • Education grants
  • OpenStack
    • Open source community project
    • NeCTAR in Australia, Jetstream in USA, CLIMB in UK, lots of others
    • Some free for researchers (NeCTAR, CLIMB), some with project grants (Jetstream)

Why Clouds?

  • Elastic compute!
    • Resources can be scaled dynamically to match the analyses
  • No need to maintain the hardware
    • Provider takes on cost of hardware and maintenance
    • Cost is shared between all users
  • Move the compute to the data
    • Data on East Coast?
    • Start compute there. Save on data transfer.

Galaxy on the Cloud

  • There are cloud images (VM blueprints) available
    • with Galaxy pre-installed
    • with different sets of tools installed
    • with access to reference data
    • for different clouds (AWS globally, Jetstream, NeCTAR, CLIMB etc.)
  • You just need credentials for the cloud you want to "launch" on.
    • Credentials are generally strings
    • An access key and a secret key, or a username and password with project details
    • They are obtained from the account admin page of the cloud you want to use

CloudLaunch

  • CloudLaunch is a system for launching Galaxy (and other applications) on cloud resources
  • CloudLaunch provisions a computer in the cloud for you, with Galaxy installed and ready to go.
    • Depending on your choices and availability you will also have access to reference data and various tools
    • It should only take 2-3 minutes for everything to be set up.

Launch Demo

Cloud Manager

cloudman

  • Cloud manager: middleware to control cloud clusters
  • Can be used to control system and application services, such as Galaxy
  • Can mount filesystems, dynamically add/remove worker nodes, start/stop services

CloudMan

CloudMan Admin

Cluster on the Cloud?

  • Your CloudMan instance is a single machine
    • It is the "Head node" of a cluster
  • CloudMan can start "worker" nodes.
    • These are additional cloud instances (of any size)
    • They are automatically connected to the file system
    • They are registered in the Slurm setup
    • A node will take ~2-3 minutes to start and configure.

Auto-Scaling

  • Dynamic scaling can be set up to respond to the system load
    • Set upper and lower limits on the number of nodes
    • When the queue is full and jobs have waited a certain time, new nodes are launched.
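
The scaling rule above can be sketched as a small decision function. This is only an illustration, not CloudMan's actual implementation: the function name, its parameters, and the default limits are all assumptions made for the example.

```python
def scaling_decision(queued_wait_seconds, idle_nodes, current_nodes,
                     min_nodes=1, max_nodes=4, max_wait_seconds=300):
    """Return +1 to launch a node, -1 to remove one, or 0 to do nothing.

    queued_wait_seconds: how long each queued job has been waiting.
    All names and thresholds here are hypothetical.
    """
    longest_wait = max(queued_wait_seconds, default=0)

    # Always respect the lower limit on the number of nodes.
    if current_nodes < min_nodes:
        return 1
    # Jobs have waited too long and the upper limit is not yet reached.
    if longest_wait > max_wait_seconds and current_nodes < max_nodes:
        return 1
    # Nothing is queued and a node sits idle: shrink down to the lower limit.
    if not queued_wait_seconds and idle_nodes > 0 and current_nodes > min_nodes:
        return -1
    return 0
```

For example, `scaling_decision([400], idle_nodes=0, current_nodes=2)` asks for one more node, while the same call with `current_nodes=4` does nothing because the upper limit has been reached.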

CloudMan Galaxy

  • Your Galaxy server is set up and ready to go!
    • Includes a large list of pre-installed tools.
    • Includes access to reference data files
    • Zero-to-go in less than 10 minutes.
  • Toolsets and reference sets can be tailored to suit needs
    • Will be discussed in architecture section

CloudMan Galaxy

  • Configured for Slurm out of the box
    ...
    <destinations default="default_dynamic_job_wrapper">
        <destination id="slurm_cluster" runner="slurm"/>
        <destination id="slurm_4slots" runner="slurm">
            <param id="nativeSpecification">--ntasks=4</param>
        </destination>
        <destination id="default_dynamic_job_wrapper" runner="dynamic">
            <param id="type">python</param>
            <param id="function">default_dynamic_job_wrapper</param>
        </destination>
        ...
    </destinations>
    <tools>
        <tool id="toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa/0.3.1" destination="slurm_4slots" />
    ...
    # COMPUTE NODES
    NodeName=master NodeAddr=45.113.232.91 CPUs=15 RealMemory=64431 Weight=10 State=UNKNOWN
    NodeName=w1 NodeAddr=45.113.232.83 CPUs=16 RealMemory=64431 Weight=5 State=UNKNOWN
    NodeName=w2 NodeAddr=45.113.232.92 CPUs=8 RealMemory=32176 Weight=5 State=UNKNOWN
    NodeName=w3 NodeAddr=45.113.232.93 CPUs=8 RealMemory=32176 Weight=5 State=UNKNOWN
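
The `default_dynamic_job_wrapper` destination in the job configuration above delegates to a Python function that picks a destination at run time. The sketch below shows the general shape such a rule could take. It is not the function shipped with CloudMan. It assumes that Galaxy passes the tool ID to the rule as an argument named `tool_id` and accepts a destination ID string as the return value. The `MULTICORE_TOOLS` set and the function name are hypothetical.

```python
# Hypothetical set of tool names that should get several Slurm slots.
MULTICORE_TOOLS = {"bwa", "bowtie2"}


def example_dynamic_rule(tool_id):
    """Return the id of a destination defined in the job configuration."""
    # Tool Shed ids look like host/repos/owner/repo/tool/version,
    # so the tool name is the second-to-last path component.
    parts = tool_id.split("/")
    name = parts[-2] if len(parts) > 1 else tool_id

    if name in MULTICORE_TOOLS:
        return "slurm_4slots"   # destination with --ntasks=4
    return "slurm_cluster"      # plain Slurm destination
```

With this rule, the BWA tool ID from the configuration above would be sent to `slurm_4slots`, and any tool not in the set would go to `slurm_cluster`.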

CloudMan Architecture

[Figure: architecture.png]

Persistence

  • Cloud instances are typically transient
    • Can be terminated and resources returned to the pool
  • However, user data and cluster configuration can be persisted
    • These can then be attached to new instances when they start
  • CloudMan stores an instance's setup in an object store container for persistence
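
As an illustration of the idea, the sketch below serialises a cluster description to JSON and reads it back when a new instance starts. A plain dictionary stands in for the object store container, and the object name and configuration fields are assumptions, not CloudMan's actual format.

```python
import json

CONFIG_KEY = "cluster_config.json"  # hypothetical object name


def save_cluster_config(container, config):
    """Store the cluster description as a JSON object in the container."""
    container[CONFIG_KEY] = json.dumps(config, sort_keys=True)


def load_cluster_config(container):
    """Return the saved description, or None for a brand-new cluster."""
    raw = container.get(CONFIG_KEY)
    return json.loads(raw) if raw is not None else None


# A dict stands in for the object store container in this sketch.
container = {}
save_cluster_config(container, {
    "cluster_name": "demo",
    "volumes": ["vol-data"],
    "services": ["Galaxy", "Slurm"],
})

# Later, a freshly started instance reads the same container.
restored = load_cluster_config(container)
```

The point of the design is that the instance itself holds no state that cannot be rebuilt: terminating it loses nothing as long as the container and the data volumes survive.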

Looking to the future

  • An all-new system is under development
  • We already saw the new CloudLaunch
    • No longer Galaxy-only: any application and multiple clouds can be plugged in
  • It is powered by CloudBridge
    • http://cloudbridge.readthedocs.io/
    • An abstraction layer for mature clouds
    • Will let the new version of CloudMan run on any cloud
    • Therefore, we can use our Galaxy images on any cloud
  • New CloudMan is planned
    • Container-based so no/minimal building necessary per cloud
    • Powered by CloudBridge, so natively cross-cloud
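
What an abstraction layer buys can be shown with a toy example. This is not the CloudBridge API: the classes, method names, and return values below are hypothetical and exist only to show how application code can stay the same while the provider changes.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Hypothetical common interface that every cloud backend implements."""

    @abstractmethod
    def launch_instance(self, name, image, size):
        """Start an instance and return an identifier for it."""


class FakeAWS(CloudProvider):
    def launch_instance(self, name, image, size):
        # A real backend would call the provider's API here.
        return f"aws:{name}:{image}:{size}"


class FakeOpenStack(CloudProvider):
    def launch_instance(self, name, image, size):
        return f"openstack:{name}:{image}:{size}"


def launch_galaxy(provider, name="galaxy-head"):
    """Application code: identical regardless of the cloud behind it."""
    return provider.launch_instance(name, image="galaxy-image", size="medium")
```

Calling `launch_galaxy(FakeAWS())` and `launch_galaxy(FakeOpenStack())` runs the same application code against two different backends, which is the property the slide describes as "natively cross-cloud".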

Taking it Further: GVL

[Figure: GVL-evolution.png]

GVL applications

[Figure: GVL-dash.png]
