Ironic node multi-tenancy


This is a guest post from Tzu-Mainn Chen, Principal Software Engineer at Red Hat, who is working on the Elastic Secure Infrastructure initiative.

Ironic node multi-tenancy is a relatively new feature in Ironic. What is it and how can it be used?

What is node multi-tenancy?

The concept is pretty simple. Ironic was originally designed for admin-only usage: only a user with administrative privileges could access its API, and every such user had full access to all nodes (bare metal machines in Ironic terminology). Node multi-tenancy allows regular users to gain selective API access to a subset of nodes.

Why would I use it?

Imagine a bare metal cloud with several projects. Node multi-tenancy makes it easy to give each project an exclusive set of nodes and prevent its members from accessing nodes from other projects.

There's an even more ambitious use case, however. An organization that employs a bare metal cloud may have frequent periods of time when hardware is left idle. Is there a way to increase utilization during those moments?

The Elastic Secure Infrastructure (ESI) takes this one step further: create a bare metal cloud consisting of hardware owned by multiple independent contributing organizations. Hardware owners have exclusive access to their machines by default. However, they also have the option of leasing their nodes to lessees. Organizations can thus lend out and borrow nodes, maximizing bare metal utilization and allowing everyone to benefit!

How is it implemented?

The implementation steps for node multi-tenancy are straightforward:

  1. Set owner and lessee (optional) fields on the nodes.
  2. Expose owner and lessee information to node API policy checks.

A node can be assigned to a project by setting the owner field (e.g. to a Keystone project ID):

openstack baremetal node set <node> --owner <project ID>

These changes allow us to define two new policy rules:

"is_node_owner": "project_id:%(node.owner)s"
"is_node_lessee": "project_id:%(node.lessee)s"

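To make the rule syntax concrete, here is a minimal Python sketch of how a check such as project_id:%(node.owner)s could be evaluated: the part before the colon names a credential of the requesting user, and the part after it is filled in from the node being accessed. This is a simplified model of that style of check, not the actual Ironic or oslo.policy code; the function name check_generic and the sample project IDs are hypothetical.

```python
IS_NODE_OWNER = "project_id:%(node.owner)s"
IS_NODE_LESSEE = "project_id:%(node.lessee)s"


def check_generic(check: str, target: dict, creds: dict) -> bool:
    """Return True if the named credential equals the value taken from the target."""
    cred_key, _, template = check.partition(":")
    # Drop unset fields so that an empty owner or lessee never matches anything.
    known = {key: value for key, value in target.items() if value is not None}
    try:
        expected = template % known  # fills in e.g. %(node.owner)s
    except KeyError:
        return False
    actual = creds.get(cred_key)
    return actual is not None and str(actual) == expected


# Hypothetical node owned by project-a with no lessee set.
node = {"node.owner": "project-a", "node.lessee": None}
print(check_generic(IS_NODE_OWNER, node, {"project_id": "project-a"}))   # True
print(check_generic(IS_NODE_LESSEE, node, {"project_id": "project-a"}))  # False
```

In this sketch an unset lessee never matches, so a node without a lessee is only reachable through the owner rule.
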
Ironic operators can then use these rules to grant access to certain API endpoints. For example:

"baremetal:node:update": "rule:is_admin or rule:is_node_owner"
"baremetal:node:set_power_state": "rule:is_admin or rule:is_node_owner or rule:is_node_lessee"

This policy file modification is all that's needed to allow for node multi-tenancy!
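
The effect of the two example policy entries above can be sketched in Python as a small access matrix. This is only an illustration of how the rules combine with "or"; the Node and User classes, the any_of helper, and the authorize function are hypothetical names and are not part of Ironic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Node:
    owner: Optional[str] = None
    lessee: Optional[str] = None


@dataclass
class User:
    project_id: str
    is_admin: bool = False


Rule = Callable[[User, Node], bool]


def is_admin(user: User, node: Node) -> bool:
    return user.is_admin


def is_node_owner(user: User, node: Node) -> bool:
    return node.owner is not None and user.project_id == node.owner


def is_node_lessee(user: User, node: Node) -> bool:
    return node.lessee is not None and user.project_id == node.lessee


def any_of(*rules: Rule) -> Rule:
    """Combine rules the way 'rule:a or rule:b' does in the policy file."""
    return lambda user, node: any(rule(user, node) for rule in rules)


POLICY: Dict[str, Rule] = {
    "baremetal:node:update": any_of(is_admin, is_node_owner),
    "baremetal:node:set_power_state": any_of(is_admin, is_node_owner, is_node_lessee),
}


def authorize(action: str, user: User, node: Node) -> bool:
    return POLICY[action](user, node)


node = Node(owner="project-a", lessee="project-b")
lessee = User(project_id="project-b")
print(authorize("baremetal:node:set_power_state", lessee, node))  # True
print(authorize("baremetal:node:update", lessee, node))           # False
```

With these entries a lessee can power a node on and off but cannot update it, while the owner and admins can do both.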

There are a few other details to the implementation:

  1. One of our goals was to allow non-admins to use standalone Ironic to provision a node. To do so, a non-admin has to be able to update a node's extra and instance_info fields. However, we don't want a node lessee to modify other attributes of a node. To handle this case, we added two additional policy rules: baremetal:node:update_extra and baremetal:node:update_instance_info.
  2. In order to allow non-admins to view and potentially manage a node's associated baremetal ports, we updated port API operations to expose the node's owner and lessee. This change allows us to define the following policy rule:
    "baremetal:port:get": "rule:is_admin or rule:is_observer or rule:is_node_owner or rule:is_node_lessee"
  3. We added an owner field to node allocations. A non-admin creating an allocation will be assigned as that allocation's owner, and the allocation conductor will only match nodes that the allocation owner can access.
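
The allocation behavior described in the last item can be sketched as a filter over candidate nodes. This is a hedged illustration, not the conductor's actual code: it assumes that "can access" means the allocation owner is the node's owner or lessee, and that an allocation without an owner (for example, one created by an admin) is unrestricted. The Node class and candidate_nodes function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    owner: Optional[str] = None
    lessee: Optional[str] = None


def candidate_nodes(nodes: List[Node], allocation_owner: Optional[str]) -> List[Node]:
    """Return the nodes an allocation with the given owner may be matched to."""
    if allocation_owner is None:
        # Assumption: an allocation with no owner is not filtered at all.
        return list(nodes)
    # Assumption: access means being the node's owner or its lessee.
    return [n for n in nodes if allocation_owner in (n.owner, n.lessee)]


nodes = [
    Node("node-1", owner="project-a"),
    Node("node-2", owner="project-b", lessee="project-a"),
    Node("node-3", owner="project-b"),
]
print([n.name for n in candidate_nodes(nodes, "project-a")])  # ['node-1', 'node-2']
```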

We tested these changes with MetalSmith - a client-side Python library for provisioning Ironic nodes - and a modified policy file, and everything worked exactly as expected! Owners and lessees were able to provision nodes that they owned or leased, and were unable to access other nodes.

When can I use it?

The initial implementation of node multi-tenancy has already been released as part of OpenStack Ussuri.

Where can I see it in action?

This ESI presentation and demo video shows node multi-tenancy in action.

ESI is also preparing to perform a trial rollout of a hardware leasing system in the Massachusetts Open Cloud (MOC).

What's next?

Although the base implementation of node multi-tenancy is complete, there's still work to be done. Some of it is directly related to multi-tenancy: we'd like to enable non-admins to use boot-from-volume. Other features are indirect. For example, we're working on integrating Keylime with Ironic, allowing for node attestation that gives assurances that a node has not been tampered with.