Ironic Cleaning Deep Dive

The Pike Release Version



Dmitry Tantsur (Principal Software Engineer, Red Hat)
owlet.today/talks/pike-ironic-cleaning-deep-dive

Agenda

  • Cleaning process
  • Clean steps
  • Manual cleaning and RAID

Cleaning process

Overview

Cleaning is the process of preparing a node for carrying tenant workloads.

Common examples are wiping hard drives, updating firmware, resetting BIOS variables, etc.

It runs every time before a node becomes available for deployment.

Cleaning process

Ironic state machine

Ironic provision states are organized into a formal state machine (see the states diagram).

Nodes enter cleaning on the way from manageable or deleting to available.

There is also a way to initiate manual cleaning via API.
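The cleaning-related part of the state machine can be sketched as a small lookup table. This is an illustrative subset, not the code Ironic uses: the event names and the helpers next_state and walk are assumptions made for this example.

```python
# Illustrative subset of the provision state machine around cleaning.
# Keys are (current state, event); values are the resulting state.
# The event names are assumptions made for this sketch.
TRANSITIONS = {
    ("manageable", "provide"): "cleaning",   # automated cleaning
    ("manageable", "clean"): "cleaning",     # manual cleaning via API
    ("deleting", "clean"): "cleaning",       # automated cleaning after tear down
    ("cleaning", "wait"): "clean wait",      # asynchronous step or reboot
    ("clean wait", "resume"): "cleaning",    # IPA heartbeat arrived
    ("cleaning", "done"): "available",       # automated cleaning finished
    ("cleaning", "manage"): "manageable",    # manual cleaning finished
    ("cleaning", "fail"): "clean failed",
    ("clean wait", "fail"): "clean failed",
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {state!r}") from None


def walk(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

For example, walk("manageable", ["provide", "wait", "resume", "done"]) follows automated cleaning with one asynchronous pause.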

Cleaning process

Overview

  1. Order the boot interface to configure booting the IPA ramdisk.
  2. Create port(s) on the cleaning network.
  3. Power on the machine to boot IPA.
  4. On the first heartbeat, collect the clean steps and store them on the node.
  5. Take the first step, execute it.
  6. Reboot, if needed. Wait for IPA heartbeat again.
  7. Repeat, until all steps are done.
  8. Remove ports from the cleaning network.
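The eight steps above can be sketched as one function operating on a stand-in node. FakeNode and clean_node are hypothetical names for this example; the real logic is spread across the conductor and the boot, network and deploy interfaces.

```python
class FakeNode:
    """Hypothetical stand-in for a node; records what was done to it."""

    def __init__(self, ramdisk_steps):
        self.ramdisk_steps = ramdisk_steps  # steps the ramdisk would report
        self.clean_steps = []               # steps stored on the node
        self.log = []


def clean_node(node):
    """Walk through the cleaning sequence, logging each action."""
    node.log.append("prepare ramdisk boot")    # 1. boot interface
    node.log.append("create cleaning ports")   # 2. cleaning network
    node.log.append("power on")                # 3. boot IPA
    node.log.append("heartbeat")               # 4. first heartbeat ...
    node.clean_steps = list(node.ramdisk_steps)  # ... stores the steps
    for step in node.clean_steps:              # 5-7. run steps one by one
        node.log.append(f"run {step['step']}")
        if step.get("reboot_requested"):
            node.log.append("reboot")
            node.log.append("heartbeat")       # wait for IPA to come back
    node.log.append("remove cleaning ports")   # 8. clean up networking
    return node.log
```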

Cleaning process

Boot and networking

The boot interface prepare_ramdisk is used again, the same as with deployment.

However, during cleaning there are no Neutron ports created in advance by Nova, so Ironic has to create them itself, even in the flat network case: ironic/drivers/modules/network/flat.py, ironic/common/neutron.py.

Nearly the same procedure is used for neutron networking: ironic/drivers/modules/network/neutron.py.

Cleaning process

Running cleaning

As with deployment, clean steps are run in response to IPA heartbeats: ironic/drivers/modules/agent_base_vendor.py.

Then Ironic fetches the next clean step, and runs it: ironic/conductor/manager.py [1], ironic/conductor/manager.py [2].

Here we either run out of further steps, or encounter an asynchronous step, and move back to clean wait.
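A minimal sketch of this loop, assuming a step signals that it is asynchronous by returning the clean wait state. The dictionary-based node and the names do_next_clean_step and on_heartbeat are simplifications for this example, not the conductor code itself.

```python
CLEANING = "cleaning"
CLEANWAIT = "clean wait"
AVAILABLE = "available"


def do_next_clean_step(node, step_index, execute):
    """Run steps from step_index on, pausing at the first asynchronous one."""
    steps = node["clean_steps"]
    for index in range(step_index, len(steps)):
        node["clean_step_index"] = index
        result = execute(steps[index])
        if result == CLEANWAIT:
            # Asynchronous step: stop here and wait for the next heartbeat.
            node["provision_state"] = CLEANWAIT
            return
    # No further steps: cleaning is complete.
    node["clean_step_index"] = None
    node["provision_state"] = AVAILABLE


def on_heartbeat(node, execute):
    """Resume cleaning after an asynchronous step when IPA checks in."""
    if node["provision_state"] != CLEANWAIT:
        return
    node["provision_state"] = CLEANING
    do_next_clean_step(node, node["clean_step_index"] + 1, execute)
```

The sketch skips checking whether the asynchronous step actually succeeded, which a real handler has to do before moving on.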

Clean steps

Overview

Clean steps are actions to run during cleaning. They can be provided by power, management, deploy and raid interfaces.

Each clean step has a priority and an "abortable" flag. For manual cleaning, clean steps can also have arguments.

With IPA-based deploy, clean steps can also be provided by IPA hardware managers.

Priority defines the order in which the steps are run: higher priorities run first. Priority 0 means the step is skipped during automated cleaning and only runs as part of manual cleaning, explained later.
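The ordering rule can be sketched in a few lines. The function name and the priority values in the usage note are illustrative assumptions; steps are treated as plain dictionaries.

```python
def order_for_automated_cleaning(steps):
    """Drop steps with priority 0 and sort the rest, highest priority first."""
    enabled = [step for step in steps if step["priority"] > 0]
    return sorted(enabled, key=lambda step: step["priority"], reverse=True)
```

Given steps with priorities 10, 0 and 99, only the steps with 99 and 10 remain, in that order.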

Clean steps

Collecting steps

By default, clean steps are collected from interface methods, marked by a special decorator: ironic/drivers/base.py [1], ironic/drivers/base.py [2].
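A simplified imitation of this mechanism: a decorator tags methods, and the base class finds the tagged methods by introspection. The attribute names and FakeManagement are assumptions made for this sketch and need not match ironic/drivers/base.py.

```python
import inspect


def clean_step(priority, abortable=False):
    """Mark a method as a clean step (simplified imitation)."""
    def decorator(func):
        func._is_clean_step = True
        func._clean_step_priority = priority
        func._clean_step_abortable = abortable
        return func
    return decorator


class BaseInterface:
    interface_type = "base"

    def get_clean_steps(self):
        """Collect every method tagged by the clean_step decorator."""
        steps = []
        for name, method in inspect.getmembers(self, inspect.ismethod):
            if getattr(method, "_is_clean_step", False):
                steps.append({
                    "step": name,
                    "priority": method._clean_step_priority,
                    "abortable": method._clean_step_abortable,
                    "interface": self.interface_type,
                })
        return steps


class FakeManagement(BaseInterface):
    """Hypothetical interface with one clean step and one ordinary method."""
    interface_type = "management"

    @clean_step(priority=10)
    def reset_bios(self):
        pass

    def get_boot_device(self):
        pass
```

Calling FakeManagement().get_clean_steps() reports only reset_bios, because get_boot_device carries no tag.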

More advanced interfaces (e.g. IPA-based deploy) can override get_clean_steps.

The IPA-based deploy interfaces also load clean steps from the ramdisk. For this to be possible, Ironic starts the ramdisk unconditionally before starting cleaning.

Clean steps

IPA steps

The clean steps are cached after the first heartbeat, so that get_clean_steps does not block later: ironic/drivers/modules/agent_base_vendor.py.

On the IPA side, each hardware manager that is enabled exposes its clean steps: ironic_python_agent/hardware.py, ironic_python_agent/extensions/clean.py.

Then Ironic can run any of them via another IPA API call: ironic_python_agent/extensions/clean.py.
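The two IPA-side operations, listing steps and running one of them, can be sketched as follows. GenericManager, VendorManager and both functions are hypothetical, and picking the first manager that implements a step is a simplification of how IPA chooses between hardware managers.

```python
class GenericManager:
    """Hypothetical hardware manager exposing one clean step."""

    def get_clean_steps(self):
        return [{"step": "erase_devices", "priority": 10,
                 "interface": "deploy", "reboot_requested": False}]

    def erase_devices(self):
        return "devices erased"


class VendorManager:
    """Hypothetical vendor-specific hardware manager."""

    def get_clean_steps(self):
        return [{"step": "update_firmware", "priority": 0,
                 "interface": "deploy", "reboot_requested": True}]

    def update_firmware(self):
        return "firmware updated"


def collect_clean_steps(managers):
    """Return the steps of every enabled manager, keyed by manager name."""
    return {type(m).__name__: m.get_clean_steps() for m in managers}


def execute_clean_step(managers, step):
    """Run the step on the first manager that implements it."""
    for manager in managers:
        method = getattr(manager, step["step"], None)
        if callable(method):
            return method()
    raise LookupError(f"no hardware manager implements {step['step']!r}")
```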

Manual cleaning

Overview

Manual cleaning is a way to run driver-specific actions on manageable nodes. The mechanism is the same as with automated cleaning described previously.

It is not limited to actually cleaning nodes. For example, RAID is implemented through it.

Manual cleaning is started by invoking the clean provisioning action and providing a list of clean steps to run, along with their arguments.
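A sketch of what such a request might look like. It only builds the request and does not send it. The helper build_manual_clean_request is hypothetical, and the API microversion 1.15 in the header is an assumption about the minimum version that accepts the clean target.

```python
import json


def build_manual_clean_request(node_ident, clean_steps):
    """Build (but do not send) a request that starts manual cleaning."""
    for step in clean_steps:
        missing = {"interface", "step"} - step.keys()
        if missing:
            raise ValueError(
                f"clean step {step!r} lacks {sorted(missing)}")
    return {
        "method": "PUT",
        "path": f"/v1/nodes/{node_ident}/states/provision",
        "headers": {
            # Assumed minimum microversion for manual cleaning.
            "X-OpenStack-Ironic-API-Version": "1.15",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"target": "clean", "clean_steps": clean_steps}),
    }
```

A step that takes arguments would carry them in an extra "args" dictionary next to "interface" and "step".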

Manual cleaning

RAID

Ironic can build RAID on nodes in advance (while the nodes are in the manageable state).

The raid interface validates and processes the requested RAID layout: ironic/drivers/raid_config_schema.json.

Then the actual configuration happens during manual cleaning, for example, for drac: ironic/drivers/modules/drac/raid.py.
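A sketch of a requested RAID layout and the manual clean steps that would apply it. The validator covers only a few rules and is not a substitute for the JSON schema; the list of RAID levels and the function name are assumptions made for this example.

```python
# Assumed set of accepted RAID levels; the schema is the authority.
RAID_LEVELS = {"0", "1", "2", "5", "6", "1+0", "5+0", "6+0"}


def validate_logical_disk(disk):
    """Check a few basic rules for one logical disk; raise ValueError if bad."""
    missing = {"size_gb", "raid_level"} - disk.keys()
    if missing:
        raise ValueError(f"logical disk lacks {sorted(missing)}")
    if disk["raid_level"] not in RAID_LEVELS:
        raise ValueError(f"unknown RAID level {disk['raid_level']!r}")
    size = disk["size_gb"]
    if size != "MAX" and not (isinstance(size, int) and size > 0):
        raise ValueError("size_gb must be a positive integer or 'MAX'")


# Requested layout: a 100 GiB RAID 1 root volume, the rest as RAID 5.
target_raid_config = {
    "logical_disks": [
        {"size_gb": 100, "raid_level": "1", "is_root_volume": True},
        {"size_gb": "MAX", "raid_level": "5"},
    ]
}

# Manual clean steps that drop the old layout and build the new one.
raid_clean_steps = [
    {"interface": "raid", "step": "delete_configuration"},
    {"interface": "raid", "step": "create_configuration"},
]

for logical_disk in target_raid_config["logical_disks"]:
    validate_logical_disk(logical_disk)
```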

Questions?