What is BOSH? Why is it useful?
BOSH creates and deploys virtual machines (VMs) on top of a physical computing infrastructure, and deploys and runs Cloud Foundry on top of this cloud. To configure the deployment, BOSH follows a manifest document.
BOSH is a recursive acronym for BOSH Outer Shell. In contrast to the “Outer Shell”, the system being deployed and managed by BOSH is called the “Inner Shell”. The diagram below illustrates a simplified model of BOSH.
BOSH can be thought of as a server, or a robot, that orchestrates the deployment of a distributed system. Users interact with it through the BOSH Command Line Interface (CLI), a tool written in Ruby. Before BOSH can deploy a system, it needs three prerequisites: a stemcell, a release (the software to be installed), and a deployment manifest. Let’s look at these three items in more detail.
Stemcells: A stemcell is a VM template containing a standard Ubuntu or CentOS distribution. A BOSH agent is also embedded in the template so that BOSH can take control of VMs cloned from the stemcell. The name “stemcell” comes from the biological term “stem cells”, which refers to undifferentiated cells that can later grow into diverse cell types. Similarly, VMs created from a BOSH stemcell are identical at the beginning. After creation, the VMs are configured with different CPU, memory, storage, and network settings, and different software packages are installed on them. Hence, VMs built from the same stemcell template end up behaving differently.
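The stemcell idea can be modelled in a few lines of Python. This is only a toy sketch of the concept, not BOSH code: the `VM` class, its fields, the `clone` helper, and all the values are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VM:
    """Toy model of a VM; every field here is an illustrative assumption."""
    os: str
    agent_installed: bool
    cpus: int = 1
    memory_mb: int = 1024
    packages: tuple = ()


# Hypothetical stemcell: a base OS image with the BOSH agent embedded.
STEMCELL = VM(os="ubuntu", agent_installed=True)


def clone(stemcell: VM) -> VM:
    """Return a fresh VM that is identical to the stemcell template."""
    return replace(stemcell)


# Both VMs start out identical, then are differentiated by their settings.
web = replace(clone(STEMCELL), cpus=2, memory_mb=4096, packages=("web-server",))
db = replace(clone(STEMCELL), cpus=4, memory_mb=8192, packages=("database",))
```

Both `web` and `db` keep the operating system and agent they inherited from the template, but differ in resources and installed packages.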
Releases: A release is a collection of software bits, configuration properties, configuration templates, scripts, and binaries that will be installed onto the target system. Each VM is deployed with a collection of software, which is called a job. Configurations are usually templates containing parameters such as IP address, port number, user name, password, and domain name. At deploy time, these parameters are replaced by the properties defined in a deployment manifest file.
Deployments: A deployment is what turns a static release into runnable software on VMs. A Deployment Manifest defines the actual values of the parameters needed by a deployment. During the deployment process, BOSH substitutes these values for the parameters in the release and runs the software with the planned configuration.
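The parameter substitution described above can be sketched in Python. This is a minimal illustration of the idea, not how BOSH itself renders templates (BOSH releases use ERB templates, and Python's `string.Template` is swapped in here only for brevity). The template text, the property names, and the `render_job_config` helper are all hypothetical.

```python
from string import Template

# Hypothetical configuration template shipped in a release.
# The $-placeholders stand in for parameters filled in at deploy time.
CONFIG_TEMPLATE = Template(
    "listen_address = $ip:$port\n"
    "domain = $domain\n"
)

# Hypothetical property values, as a deployment manifest might define them.
MANIFEST_PROPERTIES = {
    "ip": "10.0.0.5",
    "port": 8080,
    "domain": "example.com",
}


def render_job_config(template: Template, properties: dict) -> str:
    """Replace every placeholder with its value from the manifest.

    substitute() raises KeyError when a placeholder has no value, so a
    missing property fails loudly instead of producing a broken config.
    """
    return template.substitute(properties)


print(render_job_config(CONFIG_TEMPLATE, MANIFEST_PROPERTIES))
```

The same release template can be rendered with a different set of properties for each deployment, which is what keeps the release itself static.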
When these three items are ready, they are uploaded to BOSH with the BOSH CLI tool. After that, a BOSH installation of a distributed system typically goes through the following major steps:
1) If some packages in the release require compilation, BOSH first creates a few temporary VMs (worker VMs) to compile them. After compiling the packages, BOSH destroys the worker VMs and stores the binaries in its internal blobstore.
2) BOSH creates a pool of VMs that will serve as the nodes on which the release is deployed. These VMs are cloned from the stemcell, so each has a BOSH agent installed.
3) For each job of the release, BOSH picks a VM from the pool and updates its configuration according to the Deployment Manifest. The configuration may include the IP address, the persistent disk size, and so on.
4) When the reconfiguration of the VM is complete, BOSH sends commands to the agent inside the VM, telling it to install software packages. During the installation, the agent may download packages from BOSH and install them. When the installation finishes, the agent runs the start script to launch the job on the VM.
5) BOSH repeats steps 3 and 4 until all jobs are deployed and launched. The jobs can be deployed simultaneously or sequentially, as controlled by the value “max_in_flight” in the manifest file. When it is 1, the jobs are deployed one by one, which is useful on a slow system to avoid timeouts caused by resource congestion. When it is greater than 1, jobs are deployed in parallel.
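The effect of “max_in_flight” in step 5 can be illustrated with a small Python simulation. This is a toy model under stated assumptions, not BOSH's actual scheduler: `deploy_job` and `deploy_all` are hypothetical helpers, and a thread pool simply stands in for BOSH working on several jobs at once.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def deploy_job(job_name: str) -> str:
    """Stand-in for steps 3 and 4: configure a VM, install, start the job."""
    return f"{job_name}: running"


def deploy_all(jobs: List[str], max_in_flight: int) -> List[str]:
    """Deploy every job with at most max_in_flight in progress at once.

    With max_in_flight == 1 the pool has a single worker, so jobs are
    processed one by one; larger values let several proceed in parallel.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() yields results in the same order as the input list.
        return list(pool.map(deploy_job, jobs))


# Hypothetical job names, deployed sequentially and then in parallel.
sequential = deploy_all(["router", "api", "database"], max_in_flight=1)
parallel = deploy_all(["router", "api", "database"], max_in_flight=3)
```

Both calls produce the same final state; only the degree of concurrency differs, which is why a low value trades speed for less load on a slow system.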