Deploy steps tutorial
In this tutorial I'm showing how to create a custom deploy step for Ironic, how to build a ramdisk with it and how to use it when deploying a node.
Deploy steps are an answer to the question "how do I run non-standard actions during deployment?". Out-of-band steps run from your control plane and can talk to the BMC. More interesting for us are in-band steps, which run from within the machine and offer nearly infinite opportunities for customization.
Today we'll create a solution for the following story:
As an operator, I would like to inject small files into the root partition of the final instance through the bare metal API.
There are, of course, numerous ways of implementing it, cloud-init being probably the most popular. But we will concentrate on using a deploy step. The complete source code for this tutorial can be found here: https://github.com/dtantsur/ironic-inject-files/
Prerequisites
What will we need for this exercise? A functional Ironic installation, of course! If you don't have one, create it with Bifrost. If you use another way to install Ironic, everything should work, but the paths may be different.
I will use a CentOS 8 image created with:
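(The exact command depends on your tooling; a purely illustrative diskimage-builder invocation could look like this.)

```bash
# Illustrative example only: any CentOS 8 cloud image in qcow2 format will work
DIB_RELEASE=8-stream disk-image-create centos vm -o centos8
```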
This guide assumes some familiarity with Ironic and its CLI. I also recommend reading my post on scheduling first.
Writing a step
We are creating an in-band deploy step, thus the code will be executed inside the deployment ramdisk on the target node. We will not have access to the control plane, but we will have unlimited access to hardware and will be able to mount the disks.
Before we dive into code, we need to decide on the priority, which defines when exactly the step will run. Looking at the existing steps, we need to run our step after the image is written (priority 80) and before the ramdisk is shut down (priority 40). In theory the new deploy step may modify files that affect the bootloader installation, so let's pick 50.
Finally, we need to understand what we can and cannot use. We cannot use Ironic itself. We can use any Python library that can be installed with pip. What about the agent API?
The more stable API is offered by ironic-lib - the internal shared library used by Ironic components. A somewhat less stable API is offered by ironic-python-agent itself. Try to use ironic-lib whenever possible, falling back to the ironic-python-agent internal API when required. Hardware manager calls (the ones from HardwareManager) must always be made via dispatch_to_managers, not directly!
Initial structure
We are building a Python package, and we'll use some standard OpenStack libraries, for example pbr. Let's start with this layout:
```
$ tree .
├── ironic_inject_files.py
├── LICENSE
├── setup.cfg
└── setup.py

0 directories, 4 files
```
setup.py will be a stub that tells pbr to look at setup.cfg:
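(This is the standard pbr boilerplate; the repository's actual file should be equivalent.)

```python
import setuptools

setuptools.setup(
    setup_requires=['pbr'],
    pbr=True)
```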
We'll start with a very simple setup.cfg:
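(Presumably just the metadata and the module list; the complete version with the entry point is shown in the Entry point section below.)

```ini
[metadata]
name = ironic-inject-files
summary = File injection deploy step for Ironic
author = Dmitry Tantsur
python-requires = >=3.6

[files]
modules =
    ironic_inject_files
```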
Hardware manager
Hardware managers are a kind of ironic-python-agent plugin. They are extremely powerful, but today we're only interested in get_deploy_steps.
Let us start working on ironic_inject_files.py by adding a skeleton for a hardware manager:
```python
from ironic_python_agent import hardware


class InjectFilesHardwareManager(hardware.HardwareManager):

    HARDWARE_MANAGER_NAME = 'InjectFilesHardwareManager'
    HARDWARE_MANAGER_VERSION = '1'

    def evaluate_hardware_support(self):
        return hardware.HardwareSupport.SERVICE_PROVIDER
```
First, we specify the hardware manager name and version. Then we provide the mandatory evaluate_hardware_support call that tells the ramdisk whether this hardware manager is suitable for this node. Our hardware manager is considered suitable for all nodes and receives the highest priority (SERVICE_PROVIDER).
Then we need to declare our future deploy step, let's call it inject_files:
```python
    def get_deploy_steps(self, node, ports):
        return [
            {
                'interface': 'deploy',
                'step': 'inject_files',
                'priority': 0,
                'reboot_requested': False,
                'abortable': True,
                'argsinfo': {
                    'files': {
                        'required': True,
                        'description': 'Mapping between file paths and their '
                                       'base64 encoded contents'
                    }
                }
            }
        ]

    def inject_files(self, node, ports, files):
        pass
```
Most of the values are obvious, but why is the priority 0? Haven't we agreed to use 50? Well, I could put 50 here, and the step would always run for all nodes. But I want to show you how to enable optional steps during scheduling. A priority of 0 makes a step optional.
The inject_files call accepts two standard arguments, the node as a dictionary and a list of ports belonging to it, and one custom argument files. If we wanted per-node customization, we could have used the node["instance_info"] or node["extra"] dictionaries for passing information. But I would like to showcase passing arguments via deploy templates, since this approach is more cloud-style and is compatible with using OpenStack Nova if you need it.
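For example, a files argument could look like this (the value is the base64 encoding of "Hello Ironic", which we will also use later in the deploy template):

```python
# Destination paths mapped to base64-encoded contents (illustrative values)
files = {
    "/etc/motd": "SGVsbG8gSXJvbmljCg==",  # "Hello Ironic\n"
}
```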
Looking for a partition
We continue by writing a helper function to find a partition with the given directory. I will cut a corner here and look for the /etc directory as a marker of the root partition. Real code would rather allow specifying the target partition.
Using a few existing primitives from ironic-lib and ironic-python-agent:
```python
import base64
import contextlib
import os
import tempfile

from ironic_lib import disk_utils
from ironic_lib import utils
from oslo_concurrency import processutils
from oslo_log import log

from ironic_python_agent import hardware

# The logger and the base64 import are needed by the code below;
# oslo_log is what ironic-python-agent itself uses.
LOG = log.getLogger(__name__)


# This is being moved to ironic-lib:
# https://review.opendev.org/c/openstack/ironic-lib/+/774502
def partition_index_to_name(device, index):
    part_delimiter = ''
    if 'nvme' in device:
        part_delimiter = 'p'
    return device + part_delimiter + str(index)


@contextlib.contextmanager
def partition_with_path(path):
    root_dev = hardware.dispatch_to_managers('get_os_install_device')
    partitions = disk_utils.list_partitions(root_dev)
    local_path = tempfile.mkdtemp()
    for part in partitions:
        if 'esp' in part['flags'] or 'lvm' in part['flags']:
            LOG.debug('Skipping partition %s', part)
            continue

        part_path = partition_index_to_name(root_dev, part['number'])
        try:
            with utils.mounted(part_path) as local_path:
                found_path = os.path.join(local_path, path)
                LOG.debug('Checking for path %s on %s', found_path, part_path)
                if not os.path.isdir(found_path):
                    continue

                LOG.info('Path found: /%s on %s', found_path, part_path)
                yield found_path
                return
        except processutils.ProcessExecutionError as exc:
            LOG.warning('Failure when inspecting partition %s: %s', part, exc)

    raise RuntimeError("No partition found with path %s, scanned: %s"
                       % (path, partitions))
```
- dispatch_to_managers calls into another hardware manager that implements get_os_install_device - a call to find out which disk device is used for the root file system.
- list_partitions returns a list of dictionaries with partition information.
- We filter out the UEFI boot partition and ignore LVM.
- Then we mount each partition (requires ironic-lib 4.5.0 from Wallaby) and look for a path there.
- If the path exists, yield it and stop.
Deploy step
Now we are ready to finish our deploy step:
```python
class InjectFilesHardwareManager(hardware.HardwareManager):

    HARDWARE_MANAGER_NAME = 'InjectFilesHardwareManager'
    HARDWARE_MANAGER_VERSION = '1'

    def evaluate_hardware_support(self):
        return hardware.HardwareSupport.SERVICE_PROVIDER

    def get_deploy_steps(self, node, ports):
        return [
            {
                'interface': 'deploy',
                'step': 'inject_files',
                'priority': 0,
                'reboot_requested': False,
                'abortable': True,
                'argsinfo': {
                    'files': {
                        'required': True,
                        'description': 'Mapping between file paths and their '
                                       'base64 encoded contents'
                    }
                }
            }
        ]

    def inject_files(self, node, ports, files):
        with partition_with_path('etc') as path:
            for dest, content in files.items():
                content = base64.b64decode(content)
                fname = os.path.normpath(
                    os.path.join(path, '..', dest.lstrip('/')))
                LOG.info('Injecting %s into %s', dest, fname)
                with open(fname, 'wb') as fp:
                    fp.write(content)
```
The code is pretty straightforward: we find the root partition, decode the content, recalculate the path and write the file.
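To illustrate the path recalculation with a hypothetical mount point: partition_with_path yields the /etc directory on the mounted partition, so stepping one level up and re-appending the destination lands inside the partition root:

```python
>>> import os
>>> os.path.normpath(os.path.join('/tmp/mountXYZ/etc', '..', '/etc/motd'.lstrip('/')))
'/tmp/mountXYZ/etc/motd'
```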
Entry point
The package needs one last touch before we can work on building a ramdisk with it. Let us update setup.cfg to help ironic-python-agent find our new hardware manager:
```ini
[metadata]
name = ironic-inject-files
summary = File injection deploy step for Ironic
author = Dmitry Tantsur
python-requires = >=3.6

[files]
modules =
    ironic_inject_files

[entry_points]
ironic_python_agent.hardware_managers =
    ironic_inject_files = ironic_inject_files:InjectFilesHardwareManager
```
The last line creates an entry point called ironic_inject_files in the ironic_python_agent.hardware_managers namespace pointing at our new class.
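As an optional sanity check, you can install the package into any virtualenv and list the entry points in that namespace with pkg_resources:

```bash
pip install .
python -c "import pkg_resources; print([ep.name for ep in pkg_resources.iter_entry_points('ironic_python_agent.hardware_managers')])"
```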
Building a ramdisk
In this part we are building a deployment ramdisk with ironic-python-agent and our new hardware manager.
Writing an element
The standard way of building production-ready ramdisks for Ironic is with diskimage-builder (DIB). We will create a new element to install our hardware manager automatically (using the same repository for convenience):
```
$ tree elements/
elements/
└── ironic-inject-files
    ├── element-deps
    ├── install.d
    │   └── 80-ironic-inject-files-install
    └── source-repository-ironic-inject-files

2 directories, 3 files
```
The diskimage-builder documentation explains these files in greater detail; here is what we need:
- element-deps - lists dependencies on other elements:
```
ironic-python-agent-ramdisk
source-repositories
```
The first item simply declares a dependency on ironic-python-agent itself, the second - on a helper element for checking out git repositories.
- source-repository-ironic-inject-files - is a configuration for checking out the ironic-inject-files repository:
```
ironic-inject-files git /tmp/ironic-inject-files https://github.com/dtantsur/ironic-inject-files
```
This line specifies the destination directory and the source repository. It is convenient that DIB allows overriding these via environment variables, so you can, for example, use your local repository instead of my GitHub one.
- install.d/80-ironic-inject-files-install - is the actual installation script. The leading number is a priority. Since ironic-python-agent itself is installed with priority 60, we need a higher value (in DIB, priorities work the opposite way from deploy steps).
```bash
#!/bin/bash

if [ "${DIB_DEBUG_TRACE:-0}" -gt 0 ]; then
    set -x
fi
set -eu
set -o pipefail

/opt/ironic-python-agent/bin/pip install /tmp/ironic-inject-files
```
The first lines are boilerplate typical for DIB elements; the last line installs our project (cloned by the source-repositories element) into the virtual environment inside the image where ironic-python-agent lives.
It is mandatory to make install scripts executable!
chmod +x elements/ironic-inject-files/install.d/80-ironic-inject-files-install
Building and configuring
Now we are ready to build the ramdisk! Assuming you want to use Debian, that ironic-python-agent-builder is cloned in /opt/stack (as is the case in Bifrost) and your project in /home/user, run:
```bash
# Bifrost-specific
source /opt/stack/bifrost/bin/activate
cd /opt/stack/ironic-python-agent-builder

# Build the image
export DIB_REPOLOCATION_ironic_inject_files=/home/$USER/ironic-inject-files
ironic-python-agent-builder -o ~/ipa-debian-inject-files debian-minimal \
    --elements-path /home/$USER/ironic-inject-files/elements \
    -e dhcp-all-interfaces -e ironic-inject-files
```
What is happening here?
- DIB_REPOLOCATION_ironic_inject_files overrides the location of our project (note underscores instead of dashes),
- debian-minimal creates a minimal Debian image,
- dhcp-all-interfaces (optional) runs DHCP on all interfaces on start-up,
- ironic-inject-files requests our new element to be used.
The output will be the files ipa-debian-inject-files.kernel and ipa-debian-inject-files.initramfs that we will supply to Ironic. On Bifrost, images are kept in /httpboot, thus:
```bash
sudo cp ~/ipa-debian-inject-files.kernel /httpboot/ipa.kernel
sudo cp ~/ipa-debian-inject-files.initramfs /httpboot/ipa.initramfs
```
Lastly, Bifrost uses a feature called fast track, in which it keeps the node powered on with the ramdisk running, waiting for commands. A node in fast track mode won't recognize your new hardware manager until you restart it. Power the nodes off with:
baremetal node power off <node>
Deploy templates
Our deploy step is disabled by default (priority = 0). The ability to request non-default deploy steps directly via the provisioning API/CLI is planned for the Wallaby release, but in our case we can benefit from the older procedure that involves deploy templates.
Let us create a new deploy template called CUSTOM_INJECT_FILES that runs our new step at priority 50 (as agreed before). We will inject a message of the day:
Hello Ironic
or in base64:
SGVsbG8gSXJvbmljCg==
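You can reproduce this value with the standard base64 utility:

```bash
$ echo "Hello Ironic" | base64
SGVsbG8gSXJvbmljCg==
```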
```bash
baremetal deploy template create CUSTOM_INJECT_FILES \
    --steps '[{"interface": "deploy", "step": "inject_files", "priority": 50,
               "args": {"files": {"/etc/motd": "SGVsbG8gSXJvbmljCg=="}}}]'
```
Here:
- interface must match the interface of the step (usually deploy),
- step is the step name,
- priority is the priority to run the step at,
- args are used to pass additional arguments to the step.
Next we assign a trait CUSTOM_INJECT_FILES to some nodes. Traits represent a certain ability of a node, in this case - to execute the deploy template with the same name. See my post on scheduling for a much more detailed explanation.
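With the standalone baremetal CLI used throughout this post, this can be done roughly as follows:

```bash
baremetal node add trait <node> CUSTOM_INJECT_FILES
```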
Deployment
Finally, we need to request an allocation with this trait and deploy on the resulting node. The manual procedure is somewhat verbose (check the deploy templates documentation), so we will use metalsmith, which has native support for traits. Assuming you have a CentOS 8 cloud image in /httpboot/centos8.qcow2:
```bash
metalsmith deploy --resource-class baremetal --trait CUSTOM_INJECT_FILES \
    --image file:///httpboot/centos8.qcow2 \
    --ssh-public-key ~/.ssh/id_ed25519.pub
```
- The first two arguments concern scheduling: we request resource class baremetal and trait CUSTOM_INJECT_FILES. The trait request automatically engages the corresponding deploy template.
- The last two arguments provide the image to deploy and a public key to connect to the node later.
Under the hood metalsmith creates an allocation like this:
```
$ baremetal allocation show 840a2966-e8da-4337-b2e7-e400c81c4fe9
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| candidate_nodes | []                                   |
| created_at      | 2021-02-08T18:13:27+00:00            |
| extra           | {}                                   |
| last_error      | None                                 |
| name            | testvm1                              |
| node_uuid       | 4e41df61-84b1-5856-bfb6-6b5f2cd3dd11 |
| owner           | None                                 |
| resource_class  | baremetal                            |
| state           | active                               |
| traits          | ['CUSTOM_INJECT_FILES']              |
| updated_at      | 2021-02-08T18:13:28+00:00            |
| uuid            | 840a2966-e8da-4337-b2e7-e400c81c4fe9 |
+-----------------+--------------------------------------+
```
and populates instance_info like this (more fields in reality):
```
$ baremetal node show <node> --fields instance_info -f json
{
  "instance_info": {
    "traits": [
      "CUSTOM_INJECT_FILES"
    ],
    "capabilities": {
      "boot_option": "local"
    },
    "image_source": "file:///httpboot/centos8.qcow2"
  }
}
```
Once the deployment is successful, we can wait a bit to let the operating system boot, find out the IP address via the dnsmasq logs and try SSH access using the provided IP and our SSH key:
```
$ ssh centos@192.168.122.78
...
Hello Ironic
```
Conclusion
In this post you've learned how to create a deploy step that injects arbitrary files into the final image. Fortunately, you won't have to: I'm planning on upstreaming a more sophisticated version of this deploy step in the Wallaby release. But this knowledge will definitely help you build your own deploy steps.