Ironic Wallaby meetup - Russian edition

We've had a very successful meetup about Ironic in Russian. This post is a short summary of what we discussed and what I learned.

If you understand Russian and have 3 hours of free time, you can just follow the recording:

The slides (also in Russian) are here: https://owlet.today/talks/ironic-wallaby-ru/

Audience

The meetup had 80 registrations, and up to 30 attendees were online at the same time (not counting those who followed the live stream). We did not have a rigid agenda, but rather a natural flow with a lot of input from the participants.

We haven't conducted a formal poll of the attendees, but we've had people with different relationships with Ironic, including operators of:

Wallaby summary

I presented my personal top 9 most interesting features. Somewhat surprisingly to me, the new Anaconda deploy interface did not attract any interest, while the news from the deploy steps area did cause some excitement and follow-up questions.

Inspired by the deploy steps news, there was a small discussion about integration with inventory systems like NetBox. The specific request was to validate the inspected LLDP information against the expected wiring. We found that this could be implemented with an introspection hook, although the matching itself seemed problematic.
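To make the wiring check more concrete, here is a minimal sketch of the comparison step only. It is not an actual ironic-inspector hook and does not use the NetBox API: the `Link` type, the `normalize` rules and the dictionary layout (node NIC name mapped to the switch and port on the other end) are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple


class Link(NamedTuple):
    """The far end of a cable as seen from a node NIC (hypothetical shape)."""
    switch: str
    port: str


def normalize(link: Link) -> Link:
    """Remove cosmetic differences before comparing (assumed rules)."""
    return Link(link.switch.strip().lower(),
                link.port.strip().lower().replace(" ", ""))


def check_wiring(expected: Dict[str, Link],
                 observed: Dict[str, Link]) -> List[str]:
    """Return human-readable mismatches, one per node NIC."""
    problems = []
    for nic, want in sorted(expected.items()):
        got = observed.get(nic)
        if got is None:
            problems.append(f"{nic}: no LLDP data received")
        elif normalize(got) != normalize(want):
            problems.append(
                f"{nic}: expected {want.switch} {want.port}, "
                f"got {got.switch} {got.port}")
    return problems
```

The comparison is the easy part. The `normalize` function is where the difficulty hides: switches may report abbreviated port names or MAC-based chassis IDs that do not match the inventory record character for character, so a real implementation would likely need vendor-specific rules.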

NVMe secure erase proved to be something that quite a few operators were interested in. A related request was for the iDRAC hardware type to support the out-of-band secure erase that some RAID controllers offer. I need to follow up with the Dell team on this topic.

Last but not least, a participant with thousands of nodes voiced a request for better scaling. We had a short discussion about the issues with scaling the Compute Resource Tracker. I pointed at the approach CERN took to resolve them and announced that performance is one of the main topics for Xena.

Redfish and features

Redfish topics attracted a lot of interest. I explained in depth what virtual media is and how it works. One participant was interested in using 3rd party installation ISOs with Ironic, which is something we have already implemented in Metal3 using the ramdisk deploy interface. This functionality proved unfamiliar to the attendees, and I wonder how we can make such somewhat exotic features more obvious to Ironic operators.

There was a request for graphical console access via Redfish. They may even get someone working on that if the stars align. Another request concerned attaching Fibre Channel volumes to nodes via Redfish. Apparently, there is support for that in Redfish (in network interfaces), but it's unclear how widely it is actually implemented. As with most wishlist items, we need volunteers to work on it.

Standalone

There was a good share of attendees who have either standalone Ironic or one of the projects that use it internally (Bifrost, Metal3, Airship) in their fleet. I presented the recently introduced bifrost-cli as a way for newcomers to try Ironic.

There were two complaints related to Bifrost:

  • The code responsible for downloading IPA is fragile and very ugly. Apparently, we may end up with a half-downloaded image.

  • It is hard to customize Bifrost (provide parameters) when it's used inside of Kolla inside of Kayobe.
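The half-downloaded image problem from the first complaint is usually avoided by writing to a temporary file and renaming it only after verification. The sketch below shows that generic pattern. It is not how Bifrost (which is Ansible-based) downloads IPA today; the `save_atomically` helper and its parameters are hypothetical.

```python
import hashlib
import os
import tempfile
from typing import Iterable


def save_atomically(chunks: Iterable[bytes], dest: str,
                    expected_sha256: str) -> None:
    """Write chunks to dest only if the full content matches the checksum."""
    directory = os.path.dirname(os.path.abspath(dest))
    digest = hashlib.sha256()
    # Keep the temporary file next to dest so the rename stays on one
    # filesystem, which is what makes os.replace atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                digest.update(chunk)
                tmp.write(chunk)
        if digest.hexdigest() != expected_sha256:
            raise ValueError("checksum mismatch, keeping the previous image")
        os.replace(tmp_path, dest)
    except BaseException:
        # Any failure (including an interrupted download) removes the
        # partial file and leaves an existing dest untouched.
        os.unlink(tmp_path)
        raise
```

The chunks can come from any streaming source, for example an HTTP response read in blocks. Readers of `dest` see either the old complete image or the new complete image, never a partial one.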

We got some positive feedback about Metal3. The only request was to consider running Ironic in an HA configuration (e.g. on all 3 controllers rather than on only one). There is similar interest in the community, but we probably need to finish support for JSON RPC in ironic-inspector first.

Networking

One of the panelists presented their networking architecture. They make heavy use of networking-ansible to provide network separation, and it works pretty well for them. The only issue is scaling to more than a hundred parallel provisions. They're interested in contributing their fixes to the networking-ansible upstream, so maybe we should reopen the discussion on accepting the project under the Ironic umbrella.

A discussion of port security ended with the expected conclusion that it's not usable for bare metal. Smart NICs were mentioned as a potential fix, but nobody had really tried them with OpenStack, nor was it clear to what extent they actually work. The price of these devices clearly puts operators off.

Some attendees were interested in DHCP-less deployments, and one even took it further with an idea to provide deploy steps via the ISO and avoid any interaction between Ironic and the machine. This does not seem to align with our plans but is an interesting idea nonetheless.

Communication

An important discussion happened around the question of how we can bridge the gap between developers and operators.

Of the three people who voiced their opinion on the PTG, two found the virtual format too difficult to participate in. Lack of personal contact was mentioned as one of the blockers for participation, the other being an overwhelming number of sessions happening in parallel, with few breaks and an unclear and inconsistent schedule.

When it comes to contributing changes, the number of revisions required is putting operators off. Additionally, there is a lot of frustration around backports: operators want to patch the version they use, but we only apply feature changes to the master branch. The difference between their version (often Ussuri or older at this point) and the upstream (Xena right now) forces them to write two versions of a patch. As a counter-argument, one of the panelists noted that this work has to be done anyway if they plan on ever upgrading.

"Features just appear out of thin air" was the perception of roadmap planning in many OpenStack projects. We may need to advertise our priorities process and its artifacts more clearly, as well as document it. We probably do document it somewhere in the contributor guide, but as it is now, it's too hard to navigate even for me. One request was to record a video of making a patch from zero to submission.

Overall, some operators would like to get more insight into the internal processes of the projects. Someone wanted to have a clear idea who in the core team is responsible for which area of the project - this is something we could document on a new "Core team" page on ironicbaremetal.org. Another action item was proposed for us to specify database changes in the release notes.

There was clearly some interest in mentoring, but I could not quite understand whether anyone was interested in more committed long-term mentoring or rather in one-off help with a patch or two. One attendee argued that a more global mentoring initiative must come from the TC.

Last but not least, documentation was a standard source of complaints. More specifically, we put quite a lot of good effort into documenting each feature, but much less into explaining how to make everything work together. I used this opportunity to sales-pitch my blog, but we may need to rethink our approach to the documentation. The future blog section of ironicbaremetal.org may help if we keep populating it.

Conclusion

The meetup was definitely a success. With around ten active participants and two dozen listeners, it clearly surpassed my expectations, and there is a possibility that these events will be recurring. There is interest in similar meetups for other projects, and we should consider organizing them in other languages as well.