Ephemeral workloads with Ironic
In this post I'm presenting the ramdisk deploy interface, explaining how to use it to run ephemeral workloads and how to provide configuration data for them.
Ironic has been actively explored by the scientific community as a way to automate running calculations without incurring the costs of virtualization. This sort of workload does not necessarily require installing anything on the machine's hard drive, which may instead be used for caching, swap or not used at all. The results are posted back via HTTP(S) or stored on a network share.
Ramdisk deploy
Ironic has a concept of deploy interfaces. They can be configured per node and define how exactly the provisioning process happens. One of the deploy interface implementations is the ramdisk deploy interface, which essentially bypasses the whole deployment process and boots the provided ramdisk or ISO image directly. The hard drive, if present at all, is not touched, and the operating system runs fully in RAM.
Two use cases influenced the development of this interface:
Scientific workloads that don't need persistent local state.
3rd party installers.
The latter is a relatively new idea currently being researched by the OpenShift community to reuse the same installer across platforms.
Configuring
The ramdisk deploy is configured in ironic.conf the same way as other deploy interfaces.
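For example (a sketch; the exact list of enabled interfaces depends on your deployment):

[DEFAULT]
enabled_deploy_interfaces = direct,ramdisk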
and can be set per node:
baremetal node set <node> --deploy-interface ramdisk
Starting with the Wallaby cycle, Bifrost enables the ramdisk deploy by default, and I will use it throughout this guide.
Boot interfaces
Another Ironic concept is the boot interface, which specifies how exactly a ramdisk (either our service ramdisk or a user-provided one) gets to a node. Since boot interface implementations have a direct impact on booting ramdisks, not all of them support the ramdisk deploy or all of its options. The two main implementations are ipxe, which network-boots the kernel and ramdisk via iPXE, and redfish-virtual-media, which attaches a bootable ISO through the machine's BMC.
Deploying
First, nodes must be configured with the right deploy and boot interfaces. For iPXE:
baremetal node set <node> --boot-interface ipxe --deploy-interface ramdisk
For Redfish virtual media:
baremetal node set <node> --boot-interface redfish-virtual-media --deploy-interface ramdisk
Then for each deployment you need to provide links to the kernel and ramdisk.
Using the file protocol:
baremetal node set <node> \
    --instance-info kernel=file:///httpboot/ramdisk.kernel \
    --instance-info ramdisk=file:///httpboot/ramdisk.initramfs \
    --instance-info image_source=file:///httpboot/ramdisk.initramfs
Using HTTP:
baremetal node set <node> \
    --instance-info kernel=http://<bifrost IP>/ramdisk.kernel \
    --instance-info ramdisk=http://<bifrost IP>/ramdisk.initramfs \
    --instance-info image_source=http://<bifrost IP>/ramdisk.initramfs
If you have an ISO image, provide it alone:
baremetal node set <node> --instance-info boot_iso=http://<bifrost IP>/myimage.iso
Finally, deploy as usual:
baremetal node deploy <node>
After a few seconds your node will become active and start booting your ramdisk of choice.
To clean or not to clean?
Automated cleaning normally runs before a node is first available for deployment and between deployments. While it's highly encouraged to leave cleaning enabled, it may not make much sense for ephemeral workloads. Starting with Ironic 16.1.0 (Wallaby) and soon-to-be-released ironicclient 4.5.0 (also Wallaby), it is possible to disable cleaning for a node:
baremetal node set <node> --no-automated-clean
and enable it back afterwards:
baremetal node set <node> --automated-clean
If your workloads leave any temporary data (or you're using an older Ironic), it's highly recommended to keep cleaning enabled. If the data is not sensitive, you can limit cleaning to only removing metadata (partitioning) in ironic.conf:
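A sketch of the relevant options (disabling full disk erasure while keeping metadata cleaning; double-check the priorities against the cleaning documentation for your release):

[deploy]
erase_devices_priority = 0
erase_devices_metadata_priority = 10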
Building images
Ironic does not place any restrictions on the content of the operating system,
other than a few caveats mentioned for boot interfaces. Starting with
version 2.4.0, ironic-python-agent-builder contains the
ironic-ramdisk-base
element that can be used by diskimage-builder to build
general purpose ramdisks.
Assuming you want to use a minimal Debian image and that
ironic-python-agent-builder is cloned in /opt/stack
(as is the case in
Bifrost), run:
export ELEMENTS_PATH=/opt/stack/ironic-python-agent-builder/dib
disk-image-create -o ~/ramdisk debian-minimal \
    ironic-ramdisk-base devuser simple-init openssh-server
What is happening here?
- debian-minimal - creates a minimal Debian image,
- ironic-ramdisk-base - creates a ramdisk instead of a normal image,
- devuser - (optional) configures authorized keys for the user devuser (can also create a user of your choice - see the devuser documentation),
- simple-init - (optional) installs Glean, which handles network configuration,
- openssh-server - ensures the image has an SSH server enabled.
Instead of simple-init you can use:
- dhcp-all-interfaces - (optional) runs DHCP for all interfaces on start-up.
Depending on your workloads you may want to add:
- stable-interface-names - (optional) forces network interface names to be stable using biosdevname.
The output will be the files ramdisk.kernel and ramdisk.initramfs that you supply to Ironic. On Bifrost, images are kept in /httpboot, thus:
sudo cp ~/ramdisk.kernel ~/ramdisk.initramfs /httpboot
Disconnected deploy
In the previous example we relied on DHCP for networking and devuser for SSH access. Using DHCP may be problematic for nodes that do not have L2 connectivity to the control plane; you may also need predictable IP addresses or different SSH keys per instance. For normal deployments this is handled by providing a config drive to first-boot scripts like Glean or cloud-init. Starting with the Wallaby release of Ironic, this is also possible for ramdisk deployments.
Config drives
A config drive is a file system (usually ISO 9660, the one used for data CDs) that contains first-boot configuration. The Bare Metal provisioning API accepts config drives either as a gzipped and base64-encoded blob or as JSON sources to build one from:
- meta_data - This structure contains generic information about the instance. The most useful fields are:
  - name (Glean) or hostname (cloud-init) - the host name for the node. Ironic defaults name to the node name.
  - public_keys - a dictionary with SSH public keys as values. They will be added as authorized keys for the root user.
Example (an illustrative sketch with placeholder values):
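{
    "name": "node-1",
    "hostname": "node-1",
    "public_keys": {
        "default": "ssh-ed25519 AAAA... user@example.com"
    }
}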
- network_data - Network configuration. The exact capabilities differ between Glean and cloud-init, but both are capable of setting IPv4 addresses (Glean has only limited support for IPv6), configuring DNS and creating bonds.
Example:
{ "links": [ { "id": "port-023c6a90-1e4b-4e02-a119-131d8a729b60", "type": "phy", "ethernet_mac_address": "52:54:00:1f:79:7e" } ], "networks": [ { "id": "network0", "type": "ipv4", "link": "port-023c6a90-1e4b-4e02-a119-131d8a729b60", "ip_address": "192.168.122.42", "netmask": "255.255.255.0", "network_id": "network0", "routes": [] } ], "services": [] }
Ramdisk with a config drive
Adding a config drive to a ramdisk deployment is only supported for Redfish virtual media and works by attaching the config drive to a virtual USB slot (the hardware must have one, which is quite common).
If your hardware supports redfish-virtual-media boot, everything else works the same way as for normal deployments: either build the config drive image yourself or let Ironic build it for you:
baremetal node deploy <node> --config-drive '{"meta_data": {...}, "network_data": {...}}'
Scripting
The deployment command becomes quite long with larger config drives, so you may want to script the deployment in Python instead. As a nice side effect, you'll be able to populate MAC addresses automatically.
Start by installing openstacksdk and following its instructions on populating the environment. For Bifrost it should be enough to point OS_CLOUD at the generated clouds.yaml entry (bifrost is the default name; adjust if yours differs):
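# name of the cloud entry Bifrost writes to clouds.yaml
export OS_CLOUD=bifrost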
Now you can create a connection and fetch the required objects:
import os
import openstack

# Using environment variables for connection parameters
conn = openstack.connect().baremetal
# node_id is the name or UUID of the target node (e.g. taken from the command line)
node = conn.get_node(node_id)
port = next(conn.ports(node=node_id))
Then build meta data and network configuration:
# authorize your public SSH key for the root user
meta_data = {
    "public_keys": {
        "0": open(os.path.expanduser("~/.ssh/id_ed25519.pub"), "rt").read(),
    }
}
# static IPv4 configuration for the node's first port; ip holds the address to assign
network_data = {
    "links": [
        {
            "id": f"port-{port.id}",
            "type": "phy",
            "ethernet_mac_address": port.address,
        }
    ],
    "networks": [
        {
            "id": "network0",
            "type": "ipv4",
            "link": f"port-{port.id}",
            "ip_address": ip,
            "netmask": "255.255.255.0",
            "network_id": "network0",
            "routes": []
        }
    ],
    "services": []
}
Finally, configure the node (we set boot and deploy interfaces just in case) and deploy it:
conn.update_node(
    node,
    boot_interface="redfish-virtual-media",
    deploy_interface="ramdisk",
    instance_info={"kernel": kernel, "ramdisk": initramfs,
                   "image_source": initramfs},
)
conn.set_node_provision_state(
    node,
    'active',
    config_drive={'meta_data': meta_data, 'network_data': network_data},
    wait=True,
    timeout=300,
)
A complete example script can be found here: https://gist.github.com/dtantsur/7e614963d48cd929ef39fa60c0b34a3d.
See it in action
This short demo (no sound) showcases what I've just explained: