(This isn't strictly an openshift/os change; it's more about how the node image is used.)
Starting in 4.19, we switched the installer to directly overlay the node image on top of the initial boot to avoid paying for another reboot (see openshift/installer#8742).
We could consider doing the same for the scale-up case, where saving a reboot is also extremely valuable. With the bootimage skew work, we can enforce that this feature can only be turned on if the bootimage in use is within some version range, which also limits the CI coverage needed for this.
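A minimal sketch of what such a skew gate could look like. The function names, the `max_minor_skew` default of 2, and the rule that the bootimage must not be newer than the node image are all assumptions for illustration; the actual limits would come from the bootimage skew work.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.19' or '4.19.3'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def overlay_allowed(bootimage_version: str,
                    node_image_version: str,
                    max_minor_skew: int = 2) -> bool:
    """Return True if the bootimage is close enough to the node image.

    Hypothetical policy: same major version, bootimage not newer than the
    node image, and at most `max_minor_skew` minor versions behind it.
    """
    boot_major, boot_minor = parse_major_minor(bootimage_version)
    node_major, node_minor = parse_major_minor(node_image_version)
    if boot_major != node_major:
        return False
    skew = node_minor - boot_minor
    return 0 <= skew <= max_minor_skew
```

With a bounded skew, CI only has to cover the (bootimage, node image) pairs inside that window rather than every historical bootimage.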
There would be some major caveats that users would have to accept before turning this on. The primary one is the kernel mismatch: until the next reboot, the node keeps running the bootimage's kernel rather than the node image's. Again, I think skew limitations allow us to keep this testable, but any operator (or driver) that has a tighter binding to the node image kernel than our skew check wouldn't be usable on the first boot.
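To illustrate the kernel caveat, here is a rough sketch of the check a kernel-bound component could make before starting on an overlaid first boot. The function name and the idea of passing in the node image's kernel release as a string are hypothetical; how that value would actually be discovered is not specified here.

```python
import platform
from typing import Optional


def kernel_matches_node_image(node_image_kernel: str,
                              running_kernel: Optional[str] = None) -> bool:
    """Return True if the running kernel is the one the node image ships.

    `running_kernel` defaults to the kernel release reported by the OS
    (the same value as `uname -r`). On an overlaid first boot this would be
    the bootimage kernel, so the comparison is expected to fail until the
    node reboots into the node image.
    """
    if running_kernel is None:
        running_kernel = platform.release()
    return running_kernel == node_image_kernel
```

A component with a strict kernel binding could use a check like this to defer its work until after the first real reboot instead of failing outright.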
There's also soft-reboot nowadays of course. It'd be slower than the overlay approach but probably not by much. Worth looking at the timing. But we do still need to keep the overlay approach anyway for the live case (but ideally eventually extend bootc's soft-reboot feature to support this use case).
The simplest approach would probably be to move all the node image pulling and overlaying logic out of the openshift/installer repo and into the base image, where it can be more easily reused.