Pivotal Engineering Journal

Technical articles from Pivotal engineers.

Using Terraform, Concourse, and om to Continuously Deploy Pivotal Cloud Foundry's Elastic Runtime

A look at how Release Engineering deploys and tests the Elastic Runtime Tile

Posted on by
Categories:   Infrastructure    Elastic Runtime Tile    Release Engineering    Terraform    Ops Manager    om   
Edit this post on GitHub.

At Pivotal, we strive for empathy with our customers. One way we achieve this: running and managing our own products, just as a customer would. Here’s a look at how our Release Engineering team maintains the Elastic Runtime Tile (ERT). The ERT includes both open-source and proprietary components that power Pivotal Cloud Foundry. Ops Manager is a web application that platform operators use to configure and deploy tiles, such as the ERT, MySQL, Redis, RabbitMQ, etc.

Ops Manager Web App

Ops Manager Web App

Our team works in the space between platform component teams and customers. Our core responsibilities are to consume platform components, expose configuration options to operators in a user-friendly manner via forms on Ops Manager, and test the platform. Furthermore, the team applies bug fixes and critical vulnerability patches to previous versions of the ERT.

OSS and proprietary components flow into the ERT

OSS and proprietary components flow into the ERT

It is critical that we deploy and test the product in production-like situations on the infrastructures our customers use. After all, customers depend on us for long-term support. Our testing matrix below details the versions, upgrade paths, and infrastructures we test.

ERT Testing Matrix (N, M are latest versions)

ERT Testing Matrix (N, M are latest versions)

This post explains the Concourse-driven system we use to deploy and test the ERT. We use terraform to create infrastructure and an Ops Manager instance. We built a tool called om that interacts with Ops Manager to configure and deploy the ERT. To see the pipelines in action go to https://releng.ci.cf-app.com.

Pipeline deploying ERT 1.11 on GCP

Pipeline deploying ERT 1.11 on GCP

Creating Infrastructure

Terraform automates the creation and modification of infrastructure. It supports all major cloud providers. Users define their infrastructure components (load balancer, DNS, network configuration, etc.) in template .tf files. After running terraform apply, the user is given a .tfstate state file that contains information about the infrastructure. This is an important file - it allows you to manipulate and destroy your infrastructure in a painless, reentrant way. As long as you have the state file, terraform can do its job.

terraform creates resources in Google Cloud

terraform creates resources in Google Cloud

Our team maintains a series of repos with terraform templates for Google Cloud, AWS, and Azure. We use these templates to stand up infrastructure on each cloud. Since templates do not contain credentials, we share them freely. Describing infrastructure with sharable template files (instead of documentation or writing a client program) is one of the major benefits of terraform.

To create the infrastructure, our pipeline pulls templates from one of the aforementioned repos. A terraform variables file .tfvars with environment specific values (credentials, SSL certs, etc.) is created and then terraform apply is executed. The terraform concourse resource is an easy, safe way to run terraform templates. As mentioned earlier, the state file is important and confidential. The resource takes templates and variables as inputs, performs the terraform apply command, and outputs your state file to an Amazon S3 bucket of your choice. This puts the state file in a safe, secure location for further operations.

Check out the README.md in the terraforming repo of your choice for specific info on how to get started.

Configuring Ops Manager

After the infrastructure is stood up, it’s time to configure authentication so we can interact with Ops Manager. We automate with Concourse, so we use the aforementioned om program for this purpose.

Some of you may be familiar with opsmgr - a similar tool that used Capybara to automate form submission. Unfortunately, opsmgr was susceptible to a high rate of false negatives, like failing to find elements on a page because they were overlapping. As a result, we created om to improve the reliability and speed of our pipelines.

Configuring Ops Manager authentication

Configuring Ops Manager authentication

To start interacting with Ops Manager via the API, we set up authentication using the following command:

$ om --target https://pcf.example.com configure-authentication \
  --user desired-username --password desired-password \
  --decryption-passphrase desired-passphrase

Uploading Artifacts

Next, the pipeline uploads the ERT and its stemcell to Ops Manager. The ERT contains compiled releases of open-source Cloud Foundry components like loggregator, UAA, and Diego, as well as Pivotal Cloud Foundry components like App Autoscaler and Apps Manager. The tile also contains metadata that describes its properties, and a manifest template that is populated with user-provided configuration (see Configuring ERT).

Uploading stemcell for ERT

Uploading stemcell for ERT

Since our focus is automation, we use om for the upload. The ERT bits are on Pivotal Network, a site that contains many Pivotal products like Redis, Spring Cloud Services, and Pivotal Cloud Foundry Runtime for Windows.

Stemcells can be found on bosh.io. The required stemcell version for an ERT is found on the tile’s download page. For automation purposes, our pipeline unzips the tile, finds the metadata file, and extracts the required stemcell version from the stemcell_criteria section of the metadata.

Once we have the ERT and stemcell, they are uploaded to Ops Manager via the following om commands:

$ om --target https://pcf.example.com --user some-user --password password upload-product \
  --product /path/to/product/file.pivotal

$ om --target https://pcf.example.com --user some-user --password password stage-product \
  --product-name cf --product-version 1.11.1

$ om --target https://pcf.example.com --user some-user --password password upload-stemcell \
  --stemcell /path/to/stemcell/file.tgz

Configuring BOSH

Cloud Foundry uses BOSH for deployment; so does Ops Manager in Pivotal’s commercial distribution. Much of the same configuration that is required when using bosh-init or bbl is applicable to configuring BOSH via Ops Manager.

Configuring BOSH director

Configuring BOSH director

To configure the BOSH Director, you need to provide Ops Manager with details about the infrastructure the platform will be deployed into (see Creating infrastructure). Terraform to the rescue! Our pipeline extracts this information from the terraform state file .tfstate via the following command:

$ terraform output -state terraform.tfstate | jq -r ‘map_values(.value)’
{ 
    "vm_tag": "some-id-tag-for-vms",
    "project": "some-gcp-project",
    "network_name": "some-network",
    "azs": ["us-central1-a", "us-central1-b", "us-central1-c"],
    ...
}

These values are provided to the configure-bosh command. Here is an example that sets the IaaS and network configuration for the Director.

$ om --target https://pcf.example.com --username some-user --password some-password configure-bosh \
  --iaas-configuration '{"default_deployment_tag": "some-id-tag-for-vms", "project": "some-gcp-project"}'

$ om --target https://pcf.example.com --username some-user --password some-password configure-bosh \
  --az-configuration '{"availability_zones": [{"name": "us-central1-a"}, {"name": "us-central1-b"}, {"name": "us-central1-c"}]}'
...

To fully configure the BOSH Director, check out the examples in configure-bosh command documentation.

Configuring ERT

Configuration for open-source Cloud Foundry is provided via a manifest file. However, configuration of the ERT is exposed by a series of forms on Ops Manager and these values are populated into the manifest. The forms allow operators to enable features like container networking, TCP routing, and specify values like SSL certificates for the routers, a destination for external system logging, desired location for the Cloud Controller database, etc. Values that are provided in these forms are translated into properties for their respective BOSH jobs.

Configuring ERT

Configuring ERT

Here’s an example of part of an om configure-product command that is used to configure the system_domain value that would be provided to the cloud_controller job:

$ om --target https://pcf.example.com --username some-user --password some-password configure-product \
  --product_properties='{".cloud_controller.system_domain": {"value": "sys.example.com"}, ... }'

...

To fully configure the ERT, check out the examples in the configure-product command documentation.

Note: Ops Manager has a useful API method that displays all configurable properties of a tile. This can be useful when crafting an om command to configure a property not covered in the om examples and documentation.

Deploying ERT

Now that BOSH is ready to deploy VMs and the ERT is configured, the final step is to deploy the platform!

Ops Manager uses bosh-init to deploy the Director with the configuration provided to the om configure-bosh command in the Configuring BOSH section. It issues a bosh deploy to deploy a manifest with the configuration provided to the om configure-product command in Configuring the ERT section. The lengthy compilation step is skipped; components in the ERT have already been compiled.

The pipeline applies changes by issuing the following command:

$ om --target https://pcf.example.com --user some-user --password some-password apply-changes

Running the apply-changes command will tail the Ops Manager installation output and exit 0 for a successful deployment and 1 for a failed deployment. The command is also reentrant meaning it will re-attach to an installation in progress.

Conclusion

Automation is critical to testing the Elastic Runtime Tile. To ensure we release the highest quality software, it is imperative that we deploy and test different versions, upgrade paths, and configurations on numerous IaaSes. Tools like terraform make creating and managing infrastructure state simple. We created om to automate a stable, reentrant interaction with Ops Manager. Tooling is at the heart of Release Engineering. It has allowed us to maintain long-term support for the products we ship to customers.