
osbuild: use bootc install to deploy the container#4224

Open
jbtrystram wants to merge 7 commits into coreos:main from jbtrystram:osbuild-bootc-install-fs

Conversation

@jbtrystram
Member

Instead of deploying the container to the tree and then copying all the contents to the disk image, use bootc to manage the installation directly on the target filesystems.

Right now this requires using the image as the buildroot, which means the image must ship python (for osbuild). This is tracked in [1].

[1] bootc-dev/bootc#1410
Requires osbuild/osbuild#2149

@openshift-ci

openshift-ci bot commented Jul 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces changes to use bootc install to deploy the container, which simplifies the image build process. There are a few critical issues in the YAML manifest related to copy-paste errors that lead to incorrect configurations for the 4k image builds and missing options for loopback devices. These issues need to be addressed.

@dustymabe
Member

dustymabe commented Jul 17, 2025

I switched the CI on this to run against rawhide (contains python) so we could actually test the change.

@dustymabe
Member

A few diffs picked up by cosa diff --metal from #4226

cosa-diff-metal.txt

We should probably profile each diff (maybe in coreos/fedora-coreos-tracker#1827) and evaluate whether it's a change we want to make or not.

@dustymabe
Member

I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line?

@jbtrystram
Member Author

I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line?

Do you mind sharing more logs? What I am getting locally is Ignition failing on coreos/fedora-coreos-tracker#1250

@dustymabe
Member

Ahh. I see that too now:

[    4.726843] ignition[875]: Ignition failed: failed to create users/groups: failed to configure users: failed to create user "core": exit status 10: Cmd: "useradd" "--root" "/sysroot" "--create-home" "--password" "*" "--comment" "CoreOS Admin" "--groups" "adm,sudo,systemd-journal,wheel" "core" Stdout: "" Stderr: "useradd: cannot lock /etc/group; try again later.\n"

@jbtrystram

This comment was marked as outdated.

@jbtrystram

This comment was marked as outdated.

@jbtrystram
Member Author

I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line?

Looks like removing those makes the boot process go further (Ignition completes) and leave the initramfs, but it then fails to mount the boot partition.

@jbtrystram
Member Author

Blocked on bootc-dev/bootc#1441

@jbtrystram
Member Author

OK, this works with the following PRs:

The bootc PR can be built and then added to the image through overrides/rootfs. Make sure to build rawhide.

@jbtrystram
Member Author

Follow-up: either find a way to get the boot components inside cosa, or change the bootc code to call bootupd from the deployed root. I think the latter is preferable.
I filed bootc-dev/bootc#1455

@jbtrystram
Member Author

jbtrystram commented Jul 29, 2025

Follow-up: either find a way to get the boot components inside cosa, or change the bootc code to call bootupd from the deployed root. I think the latter is preferable. I filed bootc-dev/bootc#1455

Made bootc-dev/bootc#1460
With this, we no longer need to use the container as the buildroot; cosa works, so we could do this on all streams.

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch 4 times, most recently from bb4270f to 310bd60 Compare July 30, 2025 07:38
@jbtrystram
Member Author

Alright, marking this as ready for review as all the bits are in place.
I guess I need to update the osbuild manifests for the other arches as well, but I'll do that after a review to reduce the amount of back and forth.

This will need a release of bootc.

@jbtrystram jbtrystram marked this pull request as ready for review July 30, 2025 07:44
Member

@dustymabe dustymabe left a comment


Some comments.

I think there are a few things we need to iron out before we can really move forward with this:

  1. supporting both old and new paths at the same time

Do we need to? Usually when we make a change this large we roll it out slowly, which means we have to support both ways for some time.

This PR is ignoring that fact, but TBH looking at OSBuild configs that support both would be pretty intimidating, so I'm not excited about trying to do that either. I'd be interested in @jlebon or @travier's thoughts.

  2. We need to make sure every diff between images generated this way and the old way is considered and acknowledged as acceptable before we make this change.

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch from bb4e221 to 3772cfb Compare March 10, 2026 10:34
@jbtrystram
Member Author

If this can save a bit of time reviewing this, attached to this comment is the output for the following:

cosa build
# omitting output...
cosa osbuild metal
# omitting output...
cosa build --force
# omitting output...
COSA_OSBUILD_USE_BOOTC_INSTALL=1 cosa osbuild --force
# omitting output...
cosa list
43.20260309.20.dev1
   Timestamp: 2026-03-10T14:38:38Z (0:06:09 ago)
   Artifacts: ostree oci-manifest metal
      Config: 3c4e6215245ab19055a8d143895eb4069095d19f

43.20260309.20.dev0
   Timestamp: 2026-03-10T14:29:52Z (0:14:55 ago)
   Artifacts: ostree oci-manifest metal
      Config: 3c4e6215245ab19055a8d143895eb4069095d19f

# where --from is the first build and --to is the image built via bootc install
sudo cosa diff --from 43.20260309.20.dev0 --to 43.20260309.20.dev1 --metal > metal-diff

metal-diff.txt

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch from 3772cfb to 99384c3 Compare March 25, 2026 20:12
@travier travier requested review from alicefr and travier March 30, 2026 15:42
@dustymabe
Member

I did get back to this today. Doing some local testing.

Comment on lines +101 to +103
# TODO move this to an overlay in fedora-coreos-config
# so it get baked into the container at build time. We
# want the container to be the source of truth as much as possible.
Member


agree. I guess this is something we should go ahead and do?

Though I do have a question, will bootc work either way?

  • /usr/lib/bootc/install/10-ostree.toml in buildroot
  • /usr/lib/bootc/install/10-ostree.toml in target container

Does one take precedence over the other?

Member Author


@jbtrystram jbtrystram Apr 3, 2026


OK, I did some more testing on this to answer Dusty's question above.
Until we move to image-builder we have to keep those in COSA: we call bootc from COSA and not from the target container, so bootc won't read configs shipped in the container.

See coreos/fedora-coreos-config#4093 (comment) for more details.
I filed bootc-dev/bootc#2122

By the way, this is yet another thing that would be easy if we had python in the image.
I will update that comment to be clearer.

Though I do have a question, will bootc work either way?

/usr/lib/bootc/install/10-ostree.toml in buildroot
/usr/lib/bootc/install/10-ostree.toml in target container

In this PR, only buildroot matters. With image-builder, the target container one will override the buildroot. (Note: only if they have the same name, otherwise they'll merge)
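To make the precedence rule concrete, here is a minimal sketch of the behavior described above (same file name: the target container replaces the buildroot copy; different names: both are merged). It is not bootc's implementation. The function name, the lexical ordering, the section-level merge, and the placeholder option names are all assumptions made for illustration.

```python
def resolve_dropins(buildroot, container):
    """Combine drop-in configs keyed by file name.

    `buildroot` and `container` map a file name (e.g. "10-ostree.toml")
    to its already-parsed content, a dict of section -> options.
    Assumption: a file present in both places is taken from the container
    only, and the remaining files are merged in lexical order.
    """
    by_name = dict(buildroot)
    by_name.update(container)  # same name: the container copy wins outright

    merged = {}
    for name in sorted(by_name):
        for section, options in by_name[name].items():
            merged.setdefault(section, {}).update(options)
    return merged


# Hypothetical contents; "option-a" and "option-b" are placeholders.
buildroot = {
    "10-ostree.toml": {"install": {"option-a": "from-buildroot"}},
    "20-extra.toml": {"install": {"option-b": "from-buildroot"}},
}
container = {
    "10-ostree.toml": {"install": {"option-a": "from-container"}},
}
print(resolve_dropins(buildroot, container))
```

In this PR only the buildroot mapping would be populated, since bootc runs from COSA.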

@dustymabe
Member

OK did a deep dive here today. Got distracted with a few things like python :( (added a comment to 1730) and also simplifying our manifests just in general to make them more maintainable (AI actually gave me some good insights today on this).

Here's some general comments:


Trying to do cosa diff --metal I got an error:

2026-03-31 16:26:08,111 INFO - Running command: ['tar', '-xf', '/tmp/tmpait9uwqf.tar', '-C', 'tmp/diff-cache/metal/43.20260331.20.dev1', '--transform', 's|[[:xdigit:]]{64}|XXXXXXXXXXXXXXXX|gx']
tar: ./ostree/bootc/storage/overlay/backingFsBlockDev: Cannot mknod: Operation not permitted

The file inside the disk image is:

[core@cosa-devsh ~]$ sudo ls -l --color /ostree/bootc/storage/overlay/backingFsBlockDev                                
brw-------. 1 root root 259, 3 Mar 31 15:24 /ostree/bootc/storage/overlay/backingFsBlockDev

and I had to hack cmd-diff to get the diff to work:

diff --git a/src/cmd-diff b/src/cmd-diff
index 5424fa497..78051e544 100755
--- a/src/cmd-diff
+++ b/src/cmd-diff
@@ -568,6 +568,7 @@ def diff_metal_helper(diff_from, diff_to):
                 # in filenames with XXXXXXXXXXXXXXXX so that we can get a real diff between
                 # two of the same files in different builds.
                 runcmd(['tar', '-xf', tmp_tar.name, '-C', diff_dir,
+                        '--exclude', '*backingFsBlockDev',
                         '--transform', 's|[[:xdigit:]]{64}|XXXXXXXXXXXXXXXX|gx'])
 
         except Exception as e:
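
An alternative to matching the file by name would be to skip every device node during extraction, since creating them is what needs privileges. The following is only a sketch using Python's standard `tarfile` module; `extract_without_devices` is a hypothetical helper, it is not what cmd-diff does today (cmd-diff shells out to `tar`), and it does not perform the `--transform` renaming.

```python
import tarfile


def extract_without_devices(tar_path, dest_dir):
    """Extract tar_path into dest_dir, skipping block/character device nodes.

    Returns the names of the skipped members so the caller can log them.
    """
    skipped = []
    with tarfile.open(tar_path) as tar:
        members = []
        for member in tar.getmembers():
            if member.ischr() or member.isblk():
                # mknod needs privileges; record the entry and move on
                skipped.append(member.name)
            else:
                members.append(member)
        tar.extractall(dest_dir, members=members)
    return skipped
```

Skipping by type rather than by name would also cover any other device node that ends up in the image, at the cost of hiding such entries from the diff entirely.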

The most disturbing thing I see in the actual diff is the origin file has changed:

diff --git a/tmp/diff-cache/metal/43.20260309.20.dev0/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin b/tmp/diff-cache/metal/43.20260309.20.dev1/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin
index b1f437b..60d27fc 100644
--- a/tmp/diff-cache/metal/43.20260309.20.dev0/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin
+++ b/tmp/diff-cache/metal/43.20260309.20.dev1/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin
@@ -1,2 +1,2 @@
 [origin]
-container-image-reference=ostree-image-signed:docker://quay.io/fedora/fedora-coreos:testing-devel
+container-image-reference=ostree-unverified-registry:ostree-image-signed:docker://quay.io/fedora/fedora-coreos:testing-devel

If I'm not mistaken doesn't zincati use the origin file and this might throw it off?
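
A quick way to spot this kind of regression in a test could be to parse the origin file and count how many transport prefixes are stacked in front of the image reference. This is a sketch only: the helper names are hypothetical, and the prefix list contains just the two prefixes that appear in the diff above, not an exhaustive list.

```python
import configparser

# Only the two prefixes seen in the diff above; not an exhaustive list.
KNOWN_PREFIXES = ("ostree-unverified-registry:", "ostree-image-signed:")


def read_imgref(origin_text):
    """Return container-image-reference from the text of an .origin file."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(origin_text)
    return parser["origin"]["container-image-reference"]


def count_leading_prefixes(imgref):
    """Count how many known prefixes are stacked at the start of imgref."""
    count = 0
    rest = imgref
    while True:
        for prefix in KNOWN_PREFIXES:
            if rest.startswith(prefix):
                rest = rest[len(prefix):]
                count += 1
                break
        else:
            # no known prefix left at the front
            return count


origin = (
    "[origin]\n"
    "container-image-reference=ostree-unverified-registry:"
    "ostree-image-signed:docker://quay.io/fedora/fedora-coreos:testing-devel\n"
)
if count_leading_prefixes(read_imgref(origin)) > 1:
    print("origin file has stacked transport prefixes")
```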

@dustymabe
Member

and also simplifying our manifests just in general to make them more maintainable (AI actually gave me some good insights today on this).

The insight it gave me was this:


mpp-if already works at a lower level than stage/node level. Looking at the implementation, there's nothing restricting it to stages. The _process_format method (line 1317) recursively walks the entire YAML tree (dicts and lists alike) and evaluates mpp-if anywhere it encounters it as a dict value. Specifically:

  1. In a dict value (lines 1403-1418): For every key in a dict, if the value is an mpp-if node, it evaluates it and either replaces the value with the then/else result, or deletes the key entirely if the chosen branch doesn't exist (remove=True on line 1358 when neither then nor else matches).

  2. In a list element (lines 1420-1432): For every element in a list, if it's an mpp-if node, it evaluates it and either replaces the element or removes it from the list (lines 1431-1432).


In other words I think there may be a way to break up all the duplicated code in each of the architecture files into something more shared. I started working on this today but ran out of time.
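
The recursive behavior described in that insight can be sketched roughly as follows. This is not the osbuild-mpp code: `resolve` is a hypothetical function, and the condition is simplified to a variable name looked up in a dict rather than an evaluated expression.

```python
_REMOVE = object()  # sentinel meaning "drop this key or list element"


def resolve(node, variables):
    """Recursively replace mpp-if nodes by their chosen branch."""
    if isinstance(node, dict):
        if "mpp-if" in node:
            # Simplification: the condition is just a variable name.
            branch = "then" if variables.get(node["mpp-if"]) else "else"
            if branch not in node:
                return _REMOVE
            return resolve(node[branch], variables)
        out = {}
        for key, value in node.items():
            resolved = resolve(value, variables)
            if resolved is not _REMOVE:  # missing branch: delete the key
                out[key] = resolved
        return out
    if isinstance(node, list):
        resolved_items = [resolve(item, variables) for item in node]
        # missing branch: remove the element from the list
        return [item for item in resolved_items if item is not _REMOVE]
    return node


# Hypothetical manifest fragment shared between the legacy and bootc paths.
manifest = {
    "stages": [
        {"type": "common"},
        {"mpp-if": "use_bootc", "then": {"type": "bootc"}, "else": {"type": "legacy"}},
        {"mpp-if": "use_bootc", "then": {"type": "bootc-only"}},
    ],
    "options": {"sector-size": {"mpp-if": "is_4k", "then": 4096, "else": 512}},
}
print(resolve(manifest, {"use_bootc": False, "is_4k": True}))
```

With conditionals allowed at any depth, a single shared manifest could carry the per-architecture or per-path differences inline instead of duplicating whole files.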

@jbtrystram
Member Author

jbtrystram commented Apr 1, 2026

Thanks Dusty for looking into this.

If I'm not mistaken doesn't zincati use the origin file and this might throw it off?

Zincati uses the output of rpm-ostree status --json, IIRC. A quick test shows that rpm-ostree probably derives that output from the origin file, because zincati trips on the new imgref:
Apr 01 08:30:41 cosa-devsh zincati[1668]: [WARN zincati::cincinnati] booted deployment ostree-unverified-registry:ostree-image-signed:docker://quay.io/fedora/fedora-coreos:stable not found in the update graph

sigh. I should have tested this

@jbtrystram
Member Author

In other words I think there may be a way to break up all the duplicated code in each of the architecture files into something more shared. I started working on this today but ran out of time.

It's not worth spending time on this, IMHO. We will switch to generating those manifests at build time in the near future anyway.

@jbtrystram
Member Author

jbtrystram commented Apr 1, 2026

The most disturbing thing I see in the actual diff is the origin file has changed:

diff --git a/tmp/diff-cache/metal/43.20260309.20.dev0/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin b/tmp/diff-cache/metal/43.20260309.20.dev1/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin
index b1f437b..60d27fc 100644
--- a/tmp/diff-cache/metal/43.20260309.20.dev0/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin
+++ b/tmp/diff-cache/metal/43.20260309.20.dev1/ostree/deploy/fedora-coreos/deploy/XXXXXXXXXXXXXXXX.0.origin
@@ -1,2 +1,2 @@
 [origin]
-container-image-reference=ostree-image-signed:docker://quay.io/fedora/fedora-coreos:testing-devel
+container-image-reference=ostree-unverified-registry:ostree-image-signed:docker://quay.io/fedora/fedora-coreos:testing-devel

If I'm not mistaken doesn't zincati use the origin file and this might throw it off?

OK, I quickly hacked this:

cat /ostree/deploy/fedora-coreos/deploy/ee7d26d8db4f0ad533ed73239098849b206f34366af443d16dd2a17cb72dc68e.0.origin 
[origin]
container-image-reference=ostree-image-signed:docker://quay.io/fedora/fedora-coreos:testing-devel

This will require bootc-dev/bootc#2112
and a config change like this:

diff --git a/image-base.yaml b/image-base.yaml
index cbda166b..b61801ec 100644
--- a/image-base.yaml
+++ b/image-base.yaml
@@ -38,7 +38,10 @@ platform-compressor:
     digitalocean: gzip
 
 # Set container-imgref
-container-imgref: "ostree-image-signed:docker://quay.io/fedora/fedora-coreos:{stream}"
+# container-imgref: "ostree-image-signed:docker://quay.io/fedora/fedora-coreos:{stream}"
+# For bootc install we just specify the pullspec, because bootc automatically prefixes
+# it with `ostree-image-signed:docker://`
+container-imgref: "quay.io/fedora/fedora-coreos:{stream}"

Comment on lines +197 to +200
should_use_bootc_install() {
_should_enable_feature "COSA_OSBUILD_USE_BOOTC_INSTALL" "use_bootc_install"
}

Member


I'm thinking we should enable this via an image.yaml setting rather than a manifest or environment variable. WDYT?

If we do change this to image.yaml, honestly all of this code can go away, because the build_with_buildah config knob is obsolete: we should be able to delete all of the relevant code and overwrite the old code with mv cmd-build-with-buildah cmd-build.
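
For comparison, a knob that honors both an environment variable and an image.yaml key could look like the sketch below. The precedence (environment first, then config), the accepted falsy strings, and the function signature are all assumptions; the thread does not show how `_should_enable_feature` actually resolves its two arguments.

```python
import os


def should_enable_feature(env_var, config_key, image_config, environ=None):
    """Decide whether a feature is on.

    Assumed precedence: an explicitly set environment variable wins,
    otherwise fall back to the parsed image.yaml mapping.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_var)
    if value is not None:
        return value.strip().lower() not in ("", "0", "false", "no")
    return bool(image_config.get(config_key, False))


# image_config stands in for the parsed image.yaml content.
image_config = {"use_bootc_install": True}
print(should_enable_feature(
    "COSA_OSBUILD_USE_BOOTC_INSTALL", "use_bootc_install", image_config))
```

Dropping the environment lookup entirely, as suggested above, would reduce this to a single dictionary read.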

@dustymabe
Member

dustymabe commented Apr 2, 2026

ok I opened #4519 to simplify the manifests and reduce duplication...

and rebased this PR on top (since I know that's extra work I don't want you to have to do because of something I decided) if you'd like to use it: https://github.com/dustymabe/coreos-assembler/tree/dusty-bootc-install

jbtrystram added a commit to jbtrystram/fedora-coreos-config that referenced this pull request Apr 2, 2026
Introduce a new overlay to ship configuration files for bootc and
image-builder. These files are sourced from the container during
`bootc install to-filesystem`.
We can also use this later to ship other bits as we make the container
more and more the source of truth, e.g. the partition table definition.

This is prep work for [1]

[1] coreos/fedora-coreos-tracker#1827
See also coreos/coreos-assembler#4224

The log disk usage message coming every 10 seconds is quite noisy,
hide it when we are in a shell in osbuild.

I also added a couple of helpful tips in comments given by @dustymabe
to work with osbuild.

Prep work to add a knob for using bootc install in osbuild.
Refactor the override logic into a helper function so we can easily add
such knobs down the line.

This adds raw-{,4k}-image-bootc manifests that are alternative versions
of the raw-{,4k}-image manifests. This allows keeping the legacy build
path alongside a new path that leverages bootc install to-filesystem.

In this mode, instead of deploying the container to the tree and then copying
all the contents to the disk image, use bootc to directly manage the
installation to the target filesystems.

We can conditionalize this until we are confident enough to roll it out to
all streams or move to image-builder.

Requires:
bootc-dev/bootc#1460
bootc-dev/bootc#1451
osbuild/osbuild#2149
osbuild/osbuild#2152
bootc-dev/bootc#1978
bootc-dev/bootc#1909

Create symlinks to the aleph file created by bootc so our tests and
tooling find the aleph at the expected path.

Note that when moving to image-builder we will likely move this to
an overlay in the config; that's way easier than having to wire up a
blueprint option to allow creating arbitrary symlinks.

By default bootc calls bootupd with the `--write-uuid` option, which
writes a stamp file with the boot partition UUID in the UEFI partition.
We want to restamp those UUIDs at first boot, so adding this option makes
sure bootc does not pass that flag to bootupd.

See bootc-dev/bootc#1978

Bootc is looking for the prepare-root config file in the buildroot
environment because the main assumption is that it's run from the
target container.
However, in osbuild, it's run from the buildroot, because podman inside
bwrap (inside supermin in our case) causes issues.
It's fine for RHCOS and SCOS where we use the target container as the
buildroot but we cannot do that for FCOS because we require python in
the buildroot.

For now, insert a prepare-root file in the supermin VM (used as the
buildroot for osbuild) until either:
- bootc learns to look into the container for it [1]
- we ship python in our images and can use them as buildroot.

Another approach would be to layer python and the osbuild dependencies
on top of our image and use that as the buildroot, but that would create
room for package drift (what was in the repos at build time?). At least
using COSA it's easier to keep track of versions.

[1] bootc-dev/bootc#1410

Add a bootc install config file [1] to set ostree repo options so we inject
the `grub_users` config on non-default entries.

[1] https://bootc-dev.github.io/bootc/man/bootc-install-config.5.html#ostree

See bootc-dev/bootc#1909
@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch from 99384c3 to e39271a Compare April 2, 2026 12:56

7 participants