Auto Scaling Instances with OpenStack

scale

Overview

Intelligently and automatically scaling applications based on resource requirements is at the heart of the OpenStack value proposition. It is one of the key capabilities that differentiate cloud vs traditional infrastructure such as VMware. For those of us who have been in IT a while auto scaling was always a white unicorn, often discussed but never actually seen. In this article we will talk about how to do this in OpenStack using Heat and Ceilometer.

Heat

Orchestration and automation within OpenStack is handled by Heat. It is the brains of your cloud. Heat provides a declarative structure for defining IT processes using YAML. You could say the value of OpenStack is implemented by Heat as this is where your business processes exist. Heat will let you automatically provision infrastructure (compute, network and storage)  based on YAML templates. In addition Heat also lets you create policies around running infrastructure. One such policy is around auto scaling.

Ceilometer

Collecting and persisting utilization measurements within OpenStack is handled by Ceilometer. OpenStack attempts to handle IT infrastructure as a utility and as such metering is a critical aspect. Furthermore, enabling billing systems to provide pay-as-you-go consumption models, requires exact metering. Beyond billing, such metering is also key for auto scaling. Decisions made by Heat on when to scale applications, are based on data collected by Ceilometer.

Auto Scaling Heat Templates

A lot of the configuration information in this article comes from article written by Christian Berendt. In this auto scaling example, Heat and Ceilometer will be used to scale CPU bound virtual machines. Heat has the concept of a stack which is simply the environment itself. The Heat stack template describes the process or logic around how a Heat stack will be built and managed. This is where you can create an auto scaling group and configure Ceilometer thresholds. The environment template explains how to create the stack itself, what image or volume to use, network configuration, software to install and everything an instance or instances need to properly function. You can put everything into the Heat stack template, but separating the Heat stack template from the environment is much cleaner, at least in more complex configurations such as auto scaling.

Environment Template

Below we will create an environment template for a cirros image. The instance template will create an instance based on Cirros image, configure a cinder volume, add IP from private network, add floating IP from public network, add security group, private ssh-key and generate 100% CPU load through user-data. Note: you will need to make changes below depending on your OpenStack configuration. The OpenStack installation and configuration used for these examples can be found in this article.

#vi /etc/heat/templates/cirros.yaml

heat_template_version: 2014-10-16
description: A base Cirros 0.3.4 server

resources:
  server:
    type: OS::Nova::Server
    properties:
      block_device_mapping:
        - device_name: vda
          delete_on_termination: true
          volume_id: { get_resource: volume }
      flavor: m1.nano
      key_name: admin
      networks:
        - port: { get_resource: port }

  port:
    type: OS::Neutron::Port
    properties:
      network: private
      security_groups:
        - all

  floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network: public

  floating_ip_assoc:
    type: OS::Neutron::FloatingIPAssociation
    properties:
      floatingip_id: { get_resource: floating_ip }
      port_id: { get_resource: port }

  volume:
    type: OS::Cinder::Volume
    properties:
      image: 'Cirros 0.3.4'
      size: 1

Now that we have an environment template, we need to create a Heat resource type and link it to above file /etc/heat/templates/cirros.yaml.

#vi /root/environment.yaml

resource_registry:

    "OS::Nova::Server::Cirros": "file:///etc/heat/templates/cirros.yaml"

Stack Template

Below we will define our Heat stack template. We will create the following resources: scaleup_group, scaleup_policy and cpu_alarm_high. The scaleup_group explains how the instance should be scaled and also defines the environment (OS::Nova::Server::Cirros) that points to the environment yaml file “/etc/heat/templates/cirros.yaml”. The scaleup_policy defines how to handle a scale-up event. Finally we have a threshold, the cpu_alarm_high resource is used to trigger a scale-up event. Here we define the threshold and actions provided by Ceilometer.

#vi /root/example.yaml

heat_template_version: 2014-10-16
description: Example auto scale group, policy and alarm
resources:
  scaleup_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      cooldown: 60
      desired_capacity: 1
      max_size: 3
      min_size: 1
      resource:
        type: OS::Nova::Server::Cirros

  scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: scaleup_group }
      cooldown: 60
      scaling_adjustment: 1

  scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: scaleup_group }
      cooldown: 60
      scaling_adjustment: -1

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      alarm_actions:
        - {get_attr: [scaleup_policy, alarm_url]}
      comparison_operator: gt

  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 10
      alarm_actions:
        - {get_attr: [scaledown_policy, alarm_url]}
      comparison_operator: lt

Update Ceilometer Collection Interval

By default Ceilometer will collect CPU data from instances every 10 minutes. For this example we want to change that to 60 seconds. Change the interval to 60 in the pipeline.yaml file and restart OpenStack services.

#vi /etc/ceilometer/pipeline.yaml

- name: cpu_source
interval: 60
meters:
- "cpu"
sinks:
- cpu_sink

#openstack-service restart

Running Heat Stack

At this point we are ready to run our auto scaling Heat stack. The expected results should be that a single Cirros instance is launched. It should have private and floating IPs.

#heat stack-create example -f /root/example.yaml -e /root/environment.yaml
 +--------------------------------------+------------+--------------------+----------------------+
 | id | stack_name | stack_status | creation_time |
 +--------------------------------------+------------+--------------------+----------------------+
 | 6fca513c-25a1-4849-b7ab-909e37f52eca | example | CREATE_IN_PROGRESS | 2015-08-31T16:18:02Z |
 +--------------------------------------+------------+--------------------+----------------------+

Heat will create the stack and launch the one cirros instance.

#nova list
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
 | ID | Name | Status | Task State | Power State | Networks |
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
 | 3f627c84-06aa-4782-8c12-29409964cc73 | ex-qeki-3azno6me5gvm-pqmr5zd6kuhm-server-gieck7uoyrwc | ACTIVE | - | Running | private=10.10.1.156, 192.168.122.234 |
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+

Heat will also create two cpu alarms which are used to trigger scale-up or scale-down events.

ceilometer alarm-list
+--------------------------------------+-------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID | Name | State | Severity | Enabled | Continuous | Alarm condition | Time constraints |
+--------------------------------------+-------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+
| 04b4f845-f5b6-4c5a-8af0-59e03c22e6fa | example-cpu_alarm_high-rd5kysmlahvx | ok | low | True | True | cpu_util > 50.0 during 1 x 60s | None |
| ac81cd81-20b3-45f9-bea4-e51f00499602 | example-cpu_alarm_low-6t65kswutupz | ok | low | True | True | cpu_util < 10.0 during 1 x 60s | None |
+--------------------------------------+-------------------------------------+-------------------+----------+---------+------------+--------------------------------+------------------+

Automatically Scaling Up

Heat will scale instances up based on the cpu_alarm_high threshold. Once CPU utilization is above 50% instances will be scaled up. In order to generate CPU load, log into the instance and run the “dd” command.

$ssh -i admin.pem cirros@192.168.122.232
$sudo -i
#dd if=/dev/zero of=/dev/null &
#dd if=/dev/zero of=/dev/null &
#dd if=/dev/zero of=/dev/null &

Upon running “dd” commands we should have close to 100% CPU utilization in our cirros instance. After 60 seconds we should see that Heat has scaled and we have two instances.

#ceilometer alarm-list
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID | Name | State | Severity | Enabled | Continuous | Alarm condition | Time constraints |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| 04b4f845-f5b6-4c5a-8af0-59e03c22e6fa | example-cpu_alarm_high-rd5kysmlahvx | ok | low | True | True | cpu_util > 50.0 during 1 x 60s | None |
| ac81cd81-20b3-45f9-bea4-e51f00499602 | example-cpu_alarm_low-6t65kswutupz | alarm | low | True | True | cpu_util < 10.0 during 1 x 60s | None |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
#nova list
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
 | ID | Name | Status | Task State | Power State | Networks |
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
 | 3f627c84-06aa-4782-8c12-29409964cc73 | ex-qeki-3azno6me5gvm-pqmr5zd6kuhm-server-gieck7uoyrwc | ACTIVE | - | Running | private=10.10.1.156, 192.168.122.234 |
 | 0f69dfbe-4654-474f-9308-1b64de3f5c18 | ex-qeki-qmvor5rkptj7-krq7i66h6n7b-server-b4pk3dzjvbpi | ACTIVE | - | Running | private=10.10.1.157, 192.168.122.235 |
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+

After additional 60 seconds we should see that Heat has scaled again and we have three instances. Since three is the maximum for this configuration, we will not scale any higher.

#nova list
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
 | ID | Name | Status | Task State | Power State | Networks |
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+
 | 3f627c84-06aa-4782-8c12-29409964cc73 | ex-qeki-3azno6me5gvm-pqmr5zd6kuhm-server-gieck7uoyrwc | ACTIVE | - | Running | private=10.10.1.156, 192.168.122.234 |
 | 0e805e75-aa6f-4375-b057-2c173b68f172 | ex-qeki-gajdwmu2cgm2-vckf4g2gpwis-server-r3smbhtqij76 | ACTIVE | - | Running | private=10.10.1.158, 192.168.122.236 |
 | 0f69dfbe-4654-474f-9308-1b64de3f5c18 | ex-qeki-qmvor5rkptj7-krq7i66h6n7b-server-b4pk3dzjvbpi | ACTIVE | - | Running | private=10.10.1.157, 192.168.122.235 |
 +--------------------------------------+-------------------------------------------------------+--------+------------+-------------+--------------------------------------+

Automatically Scaling Down

Heat will scale instances down based on the cpu_alarm_low threshold. Once CPU utilization is below 10% instances will be scaled down. We can simply kill the “dd” processes and watch Heat scale instances back down.

After stopping “”dd” processes we should see that the cpu_alarm_low event is triggered. This will cause Heat to scale down and remove instance.

#ceilometer alarm-list
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| Alarm ID | Name | State | Severity | Enabled | Continuous | Alarm condition | Time constraints |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+
| 04b4f845-f5b6-4c5a-8af0-59e03c22e6fa | example-cpu_alarm_high-rd5kysmlahvx | ok | low | True | True | cpu_util > 50.0 during 1 x 60s | None |
| ac81cd81-20b3-45f9-bea4-e51f00499602 | example-cpu_alarm_low-6t65kswutupz | alarm | low | True | True | cpu_util < 10.0 during 1 x 60s | None |
+--------------------------------------+-------------------------------------+-------+----------+---------+------------+--------------------------------+------------------+

After a few minutes we should be back to a single instance.

Summary

In this article we talked about how Heat and Ceilometer work together within OpenStack, providing the brains behind your cloud. We looked at a typical cloud use case around auto scaling and how to configure auto scaling through Heat. Hopefully you found this article informative. As always any feedback is greatly appreciated and as we say at Red Hat, sharing is caring.

Happy Auto Scaling!

(c) 2015 Keith Tenzer

10 thoughts on “Auto Scaling Instances with OpenStack

  1. Thank you for the article. I am currently dealing with using Autoscaling in Openstack, but I have issue here. I can see in heat-engine.log that alarm is in new state, and that the autoscaling group is being adjusted by 1, but after this there is an authentication error and as a result no new virtual machine is spawned. Do you have an idea what could be wrong?

    Like

      • Actually I was not owner of the Open stack cloud, but they moved to another Red Hat Open Stack, where the autoscaling is working using webhook. However, cpu util alarm is not working well, the alarm is still in “insufficient data” state. I configured the “cpu_alarm_low” and “cpu_alarm_high” configuration period to 60, and ceilometer interval in pipeline.yaml to 60, as you say in the article. Do you know what could be causing the problem?

        Like

      • Hi Michal,

        The issue you refer to I have also experienced. It was caused by Heat CFN missing. You Need Heat, Ceilometer, Heat CloudWatch and Heat CFN in order for scaling policies to work. Let me know if this helps?

        Regards,

        Keith

        Like

  2. autoscaling Expecting to find username or userId in passwordCredentials – the server could not comply with the request since it is either malformed or otherwise incorrect

    I faced above nova error..
    what’s the matter?..
    Help me

    Like

    • I did this originally on kilo but it should work in liberty and even newer releases. The only thing that might change is heat template syntax but you newer osp versions support older heat templates which is why you have template version.

      Like

      • It can take a while for ceilometer to gather data so I wonder if that is issue? You can change default polling to something more aggressive. If you look at my blog on autoscaling this is documented. Let me know how it goes?

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s