fix: clear stale apt lists before apt-get update to fix ARM64 build #54
Agent-Logs-Url: https://github.com/dappnode/DAppNodePackage-lighthouse-generic/sessions/9599f824-9c9d-425f-8b02-1cbd96bf5c8b
Co-authored-by: Marketen <36164126+Marketen@users.noreply.github.com>
✅ SYNC TEST REPORT - PASSED

This CI runs on a real DAppNode using the same RPC calls a user would make to configure a staker setup: execution client, consensus client, web3signer, MEV-boost, and relays. The self-hosted runner is pre-loaded with synced (or near-synced) execution client volumes, enabling fast sync and attestation tests with any client combination. Consensus clients use checkpoint sync, so no pre-synced volume is required.

Sync Test: Verifies that both execution and consensus clients reach a synced state. This test ensures the staker configuration is valid and clients can synchronize with the network.

📦 Clients Used

🔖 Version Tracking

Execution Client Versions

Consensus Client Versions

⏱️ Timing Measurements

Environment Setup

Test Execution

Total Duration: 2m5s

🔴 Container Error Logs

DAppNodePackage-validator.lighthouse-hoodi.dnp.dappnode.eth:

DAppNodePackage-beacon-chain.lighthouse-hoodi.dnp.dappnode.eth:
ARM64 (`linux/arm64`) builds were failing during `apt-get update` with a GPG signature error against `ports.ubuntu.com`. Root cause: stale/corrupted APT metadata cached inside the `docker buildx` builder instance.

Changes

`beacon-chain/Dockerfile` and `validator/Dockerfile`: prepend `rm -rf /var/lib/apt/lists/*` to the `RUN` step that runs `apt-get update`, ensuring package lists are always fetched fresh regardless of builder cache state.

    RUN rm -rf /var/lib/apt/lists/* && \
        apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

Original prompt
Problem

The ARM64 (`linux/arm64`) Docker build is failing for both the `beacon-chain` and `validator` targets during the `apt-get update` step. The error is:

This occurs at lines 19-20 in `beacon-chain/Dockerfile` and lines 18-19 in `validator/Dockerfile`. The root cause is stale or corrupted APT repository metadata cached in the buildx builder instance, causing GPG signature verification to fail during cross-architecture builds.

Failing job: https://github.com/dappnode/DAppNodePackage-lighthouse-generic/actions/runs/24122984921/job/70406505104
Solution

Add `rm -rf /var/lib/apt/lists/*` before `apt-get update` in both Dockerfiles to clear any stale cached package metadata, making the build resilient to corrupted builder caches.

`beacon-chain/Dockerfile`: change the RUN command (lines 19-20) from:

    RUN apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

to:

    RUN rm -rf /var/lib/apt/lists/* && \
        apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

`validator/Dockerfile`: change the RUN command (lines 18-19) from:

    RUN apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

to:

    RUN rm -rf /var/lib/apt/lists/* && \
        apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.
Assistant:
I'll start by fetching the job logs and the workflow definition to understand the failure.
Assistant: Let me look at the Dockerfiles involved in this build.
Assistant:

## Failure Analysis

The build is failing on the ARM64 (`linux/arm64`) Docker build for both the `beacon-chain` and `validator` targets. The error occurs at line 18-19 in both Dockerfiles during `apt-get update`:

This is a well-known issue with cross-architecture Docker builds using `docker buildx` with the `docker-container` driver. The GPG signature verification fails because of stale or corrupted apt cache/lists inside the buildx builder instance. The `ports.ubuntu.com` URLs confirm this is the ARM64 build specifically (x86 uses `archive.ubuntu.com`).

## Solution
There are two approaches to fix this:
### Option 1 (Recommended): Clean apt lists before updating

Add `rm -rf /var/lib/apt/lists/*` before `apt-get update` in both Dockerfiles to clear any stale cached package metadata:

`beacon-chain/Dockerfile`: change line 19:

    RUN rm -rf /var/lib/apt/lists/* && \
        apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

`validator/Dockerfile`: change line 18:

    RUN rm -rf /var/lib/apt/lists/* && \
        apt-get update && apt-get --yes install curl && apt-get clean && \
        chmod +rx /usr/local/bin/entrypoint.sh /etc/profile.d/consensus_tools.sh

### Option 2: Prune the buildx builder on the runner

If the issue is a corrupted buildx builder cache on the `staking-test-hoodi` runner, you can reset it by adding a step before the build in `.github/workflows/release.yml`:

## Why this happens
The `sigp/lighthouse` base image uses Ubuntu Jammy (22.04). When building for `linux/arm64` via buildx with a `docker-container` driver, the builder instance can cache stale or corrupted APT repository metadata. The GPG signatures then fail to verify because the cac...

This pull request was created from Copilot chat.
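To keep the fix from regressing, a small shell check could confirm that a Dockerfile clears the cached package lists before it refreshes them. The sketch below is not part of this PR: `clears_apt_lists_first` is a hypothetical helper name, and the check is a line-based heuristic that assumes the cleanup is written exactly as `rm -rf /var/lib/apt/lists/*`. It does not parse Dockerfile syntax or distinguish between build stages.

```shell
# Hypothetical helper, not part of this PR: reports whether a Dockerfile
# removes /var/lib/apt/lists/* before its first `apt-get update`.
# Line-based heuristic only; it does not parse Dockerfile syntax.
clears_apt_lists_first() {
    awk '
        {
            rm_pos = index($0, "rm -rf /var/lib/apt/lists/*")
            up_pos = index($0, "apt-get update")
            # At the first apt-get update, record whether a cleanup was
            # already seen on an earlier line or earlier on this line.
            if (up_pos && !seen_update) {
                seen_update = 1
                ok = (seen_rm || (rm_pos && rm_pos < up_pos)) ? 1 : 0
            }
            if (rm_pos) seen_rm = 1
        }
        END {
            # Files without apt-get update have nothing to guard.
            exit ((!seen_update || ok) ? 0 : 1)
        }
    ' "$1"
}
```

Assuming the function is sourced into the current shell, it could be run from the repository root as `clears_apt_lists_first beacon-chain/Dockerfile && clears_apt_lists_first validator/Dockerfile`. A non-zero exit status means an `apt-get update` was found without a preceding cleanup.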