elasticHPC is a package and a library for supporting High Performance Computing over cloud computing infrastructures. The current version of elasticHPC supports Amazon’s cloud infrastructure (Amazon Web Services, AWS). elasticHPC offers a range of functionalities that support different use cases for individual users and bioinformatics service providers wishing to exploit the advantages of cloud computing.
elasticHPC includes the following functions:
Functionalities related to computing machines
Establishment and management of a computer cluster on the cloud, including a MapReduce cluster (the Elastic MapReduce product of Amazon). The cluster can be of any virtual machine type in AWS and of any size, provided that the user account permits this.
Addition and removal of compute nodes of the cluster at run time.
Automatic configuration of the cluster middleware, including PBS Torque for job scheduling and MPI for parallel programming.
Automatic setting of security options to facilitate the communication between the machines.
Functionalities related to storage and data transfer
Mounting EBS volumes to compute nodes (EBS stands for Elastic Block Store; a volume behaves like a hard disk or flash disk.)
Creating EBS volumes from EBS snapshots
Automatic configuration of the shared file system (NFS)
Associating S3 storage to compute nodes as a shared file system
Transfer of data from the client’s local machine to the cloud machines or the S3 account.
Transfer of data among cloud nodes in an efficient way.
Functionalities related to cost optimization
Running the cluster only if the price falls below a certain user-defined threshold. (This uses the spot instance option of AWS.)
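The threshold rule above can be illustrated with a minimal Python sketch. It is not part of the elasticHPC API: the function names `should_run_cluster` and `estimated_hourly_cost` and their parameters are hypothetical, and the sketch assumes the current spot price has already been obtained elsewhere.

```python
def should_run_cluster(current_price: float, max_price: float) -> bool:
    """Return True only when the spot price is strictly below the threshold.

    Hypothetical helper for illustration; prices are in USD per
    instance-hour and are assumed to be supplied by the caller.
    """
    if current_price < 0 or max_price <= 0:
        raise ValueError("prices must be non-negative and the threshold positive")
    return current_price < max_price


def estimated_hourly_cost(current_price: float, node_count: int) -> float:
    """Rough hourly cost of a cluster of identical nodes at the given price."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return current_price * node_count


if __name__ == "__main__":
    # Example values chosen for illustration only.
    price, threshold, nodes = 0.05, 0.10, 8
    if should_run_cluster(price, threshold):
        print("start cluster, estimated cost per hour:",
              estimated_hourly_cost(price, nodes))
    else:
        print("wait: price is not below the threshold")
```

The comparison is strict because the rule is stated as the price falling below the threshold; a price equal to the threshold does not start the cluster in this sketch.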
Functionalities related to remote job submission
Submission of jobs from the client’s local machine to the cloud cluster based on a protocol similar to REST, where the user can monitor the job status and retrieve the data back to their local machine.
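The monitoring step of such a workflow can be sketched as a simple polling loop. This is an illustration under stated assumptions, not the elasticHPC client: the function `wait_for_job`, the `fetch_status` callback, and the status strings are all hypothetical, and the actual request that asks the cluster for the job status is left to the caller.

```python
import time
from typing import Callable

# Hypothetical status names; the real protocol may use different ones.
TERMINAL_STATES = {"finished", "failed"}


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], str],
    poll_seconds: float = 5.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it reaches a terminal state and return that state.

    `fetch_status` is supplied by the caller and is assumed to query the
    remote cluster for the status of `job_id`.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status in TERMINAL_STATES:
            return status
        # Job is still queued or running: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Stand-in for a remote status query, for illustration only.
    fake_states = iter(["queued", "running", "finished"])
    result = wait_for_job("job-1", lambda _job_id: next(fake_states), poll_seconds=0)
    print("final status:", result)
```

Passing the status query in as a callback keeps the loop independent of the transport, so the same logic applies whether the status comes from an HTTP request or from another channel.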
Interface
Command-line interface.
Web interface that can be 1) accessed from this site to use the package and the bioinformatics tools, or 2) downloaded for use in your own applications.
Bioinformatics Specific Options
Pre-configured cloud machine image (see download page for AMI ID) including elasticHPC and around 200 bioinformatics tools
Prepared nucleotide and protein databases as well as a copy of the human genome and its Bowtie indices.
Web-server interface for all installed tools that can be accessed upon creation of the cluster. This web-interface can be accessed from this web-page.
Standalone usage through this web-site where the user can use all elasticHPC features and execute the tools through the built-in interface.
elasticHPC can be used to support individual users seeking computational power and service providers wishing to offer cloud-based analysis services.
Individual users can use the “Use elasticHPC” tab in this web-site to utilize elasticHPC with all its functionalities. The pages under this tab cover creating a cluster, running generic jobs, and running tools from a repository of 200 bioinformatics tools through their web-interface.
Users can download the client module to their local machine and invoke individual elasticHPC APIs.
Service providers can use the library in a similar way to the available elasticHPC web-interface in this site to offer cloud-based services for their tools. Here the users are forwarded to an instance of the provider services on the cloud, and the tasks run on the cloud infrastructure. In this case, the user is the one who is charged and not the provider. This scenario requires that the provider prepares an image of their system on the cloud with elasticHPC also installed in this image.
Service providers can use the elasticHPC APIs to expand their computational resources in a hidden manner from the service users. That is, the users still interact with the main provider web-site but their tasks run on the cloud infrastructure. In this case, the provider takes over the cost of computation.