Multi platform images #2273

bdovaz · 2023-01-17T19:39:30Z

We continue the discussion from #1553

nvuillam · 2023-01-21T23:03:10Z

🦙 MegaLinter status: ⚠️ WARNING

Descriptor	Linter	Files	Fixed	Errors	Elapsed time
✅ BASH	bash-exec	5		0	0.01s
✅ BASH	shellcheck	5		0	0.09s
✅ BASH	shfmt	5	0	0	0.37s
✅ COPYPASTE	jscpd	yes		no	3.42s
✅ DOCKERFILE	hadolint	123		0	17.86s
✅ JSON	eslint-plugin-jsonc	23	0	0	2.28s
✅ JSON	jsonlint	21		0	0.19s
✅ JSON	v8r	23		0	14.3s
✅ MAKEFILE	checkmake	1		0	0.01s
⚠️ MARKDOWN	markdownlint	255	0	254	28.05s
✅ MARKDOWN	markdown-link-check	255		0	5.89s
✅ MARKDOWN	markdown-table-formatter	255	0	0	29.08s
✅ OPENAPI	spectral	1		0	1.41s
⚠️ PYTHON	bandit	200		61	3.18s
✅ PYTHON	black	200	0	0	4.78s
✅ PYTHON	flake8	200		0	1.93s
✅ PYTHON	isort	200	0	0	0.84s
✅ PYTHON	mypy	200		0	10.74s
✅ PYTHON	pylint	200		0	14.4s
⚠️ PYTHON	pyright	200		319	22.8s
✅ PYTHON	ruff	200	0	0	0.5s
✅ REPOSITORY	checkov	yes		no	35.64s
✅ REPOSITORY	git_diff	yes		no	0.37s
⚠️ REPOSITORY	grype	yes		1	10.34s
✅ REPOSITORY	secretlint	yes		no	11.58s
✅ REPOSITORY	trivy	yes		no	25.56s
✅ REPOSITORY	trivy-sbom	yes		no	1.07s
✅ SPELL	cspell	665		0	26.81s
✅ SPELL	lychee	335		0	3.6s
✅ XML	xmllint	3	0	0	0.37s
✅ YAML	prettier	160	0	0	4.92s
✅ YAML	v8r	102		0	161.76s
✅ YAML	yamllint	161		0	1.58s

See detailed report in MegaLinter reports

MegaLinter is graciously provided by

echoix · 2023-01-29T15:34:33Z

It isn't the best for us, but I think that what is done elsewhere to workaround this is instead of loading (--load), they push to the registry (--push). What that brings is what to do when we aren't ready to have a release, that we are building it, we would have to push it either way... that may mean that doing some retagging and removing the tags used for building would be needed.

nvuillam · 2023-01-29T16:44:16Z

Internal contributors can use push because they have access to credentials, but I wouldn't like forks to be able to push to docker hub mega linter images ^^

echoix · 2023-01-29T18:10:39Z

Of course, but the fork would need to log on the registry to be able to do that, and they couldn't.

bdovaz · 2023-01-29T18:47:13Z

@echoix I would appreciate anything you can work on and advance this PR because right now I am quite busy with #2294 which is quite tedious because of having to go through all the linters that do format/autofix.

bdovaz · 2023-02-26T19:46:15Z

@echoix I have done rebase and I am working on this PR, to see if I can at least clear the doubt of which ones will be compatible with arm64 and which ones will not.

My idea is to create from build.py a page with a table that says for each linter the compatibility by platform since I see impossible today (or maybe never) to get all linters to work in arm64.

echoix · 2023-02-26T19:49:08Z

Yep, that's a good idea. Last weekend I stumbled to look again at status, and saw that there was some progress in the last weeks on the Docker issue, docker side, but it was just released, but not a complete fix. I think it will start to work easily one day, without manually playing with manifests :)

echoix · 2023-02-26T19:53:16Z

Once we got the infrastructure sorted out correctly, I plan that we could start with only one flavor, even without all linters included, and ramp up from there. It'll let some time to sort things up. What is still bugging me is that I didn't find a way that could allow us to test the flavors quickly in CI. It's a bit risky that we ship untested images.

echoix · 2023-02-26T20:55:04Z

https://github.com/oxsecurity/megalinter/actions/runs/4276668304/jobs/7444925057

Using too much memory?

echoix · 2023-02-26T20:56:50Z

How come pyright needs node, by python?

echoix · 2023-02-26T21:21:00Z

We can maybe take a look at why the two powershell jobs are taking more than an hour to build...
https://github.com/oxsecurity/megalinter/actions/runs/4276668304/jobs/7444926852

nvuillam · 2023-02-26T21:28:38Z

@echoix there's probably a blocking prompt somewhere
or a 1000 times retry when github api limits are reached

echoix · 2023-02-26T21:31:08Z

Maybe.. but even the raw logs don't contain anything else than waiting for a runner to pick the job

Kurt-von-Laven · 2023-02-27T01:25:24Z

I see #30 [linux/arm64 stage-1 7/19] RUN pwsh -c 'Install-Module -Name PSScriptAnalyzer -RequiredVersion ${PSSA_VERSION} -Scope AllUsers -Force' as the last step in the logs before the job was cancelled. I have tended to rerun our CI a few times to make sure the issue is reproducible. Where are the raw logs you're looking at?

echoix · 2023-02-27T01:36:35Z

During the execution we didn't have any more output if we didn't have that section opened when it ran, and for the last 10 mins (where I had some opened), no new log lines were outputted. In the gear menu, there's a link to see raw logs, but it wasn't out yet. We wanted to have other CI jobs to run, and they were blocked there

echoix · 2023-02-27T01:54:37Z

Now, we can see what was running:

https://github.com/oxsecurity/megalinter/actions/runs/4276668304/jobs/7444926825

https://github.com/oxsecurity/megalinter/actions/runs/4276668155/jobs/7444927563

On many of the cancelled tasks, we see that there wasn't any output for a long time, more than an hour. I see that the rest of the image was building fine, the amd64 steps were practically finished inside of a minute (started at 20:17:11, and 20:18:03 was the step before last for amd64), even when interleaving steps for both platforms together (since the emulation for arm64 takes longer, it will always finish last, but still).

Theres probably some crash or some really long emulated step for it to not have any output in an hour.

waterfoul · 2023-04-10T14:53:04Z

Not sure about your powershell issue but I've been trying to do something similar. We have an internal requirement which causes us to re-build most tools (so we can patch cves before the upstream project) and I noticed that many of the tools which don't support native arm can be easily compiled. For example any of the GO binaries (which there are many) can be built using the following dockerfile snippet:

# syntax=docker/dockerfile:1.5-labs
FROM --platform=$BUILDPLATFORM scratch as gitleaks-download
ADD --chown=65532 https://github.com/gitleaks/gitleaks.git#v8.16.2 /gitleaks

FROM --platform=$BUILDPLATFORM golang:alpine as build-gitleaks
COPY --link --from=gitleaks-download --chown=65532:65532 /gitleaks /gitleaks
WORKDIR /gitleaks
USER 65532
ENV GOCACHE=/gitleaks/.cache
ARG TARGETARCH
ENV GOARCH=$TARGETARCH
RUN --mount=type=cache,id=gopkg-${BUILDARCH},sharing=locked,target=/go/pkg,uid=65532 \
    --mount=target=/gitleaks/.cache,id=go-${BUILDARCH},sharing=shared,type=cache,uid=65532 \
## Patch CVEs
## END Patch
    go mod tidy \
 && go mod vendor

RUN --mount=target=/gitleaks/.cache,id=gitleaks-cache,sharing=shared,type=cache,uid=65532 \
    CGO_ENABLED=0 go build -mod=vendor -o /tmp/gitleaks

FROM scratch as gitleaks-fs
COPY --link --from=build-gitleaks /tmp/gitleaks /usr/bin/gitleaks
RUN ["/usr/bin/gitleaks", "--help"]

FROM scratch as gitleaks
COPY --link --from=gitleaks-fs /

This will use the native CPU to build both binaries, then check them using the target architecture (the final from just cleans up the output layers if you build that target). If you don't want to use docker/dockerfile:1.5-labs you can replace the first layer with any os and use git clone. To translate this into another proejct you just need to change out the repo, project name, and some of the test scripts (--help isn't always supported). If you would like I could help with this effort, I understand it's WIP though so I didn't want to step on any toes.

echoix · 2023-04-10T17:03:41Z

I like your pattern!
Indeed, go tools are so easy to cross-build!
If you look in the main branch, certain tools written in go are built in multi-layered approach in case their latest releases aren't released automatically by CI. If the BUILDPLATFORM wasn't written in what is in main, it was my plan to use it if it was needed. You seem to have a written and complete example here of all the cross compilation concepts all at the same place, so it is a great reference and help.

One of my concerns with the multi-arch images is build time. Ideally we would use published and compiled versions for the correct architecture, and only fallback to compiling if needed.

Some of the problems we might need help:

Finding a way to be able to build and test multi-arch images in CI. At the surface, almost everything is there, but in practice, we hit a gray spot with Docker that wasn't quite resolved last time I looked. We can build multi-arch images, but cannot load them. It seems we need to push to registry, then pull one in the local docker to run it. We would like to avoid the old manifest-based approach to publishing individual images with tags per arch, then having a third tag that is only a manifest linking the two images.

Is there a way to run an image of another arch through QEMU to run a test in the container? (Like making sure it runs. Like tools that aren't static, single files like Go based linters, making sure they can actually start)

Lastly, what is missing from starting to enable linters for other arches (once the minimal pushing of the images can be done, pushing directly to the registry is a workaround), is to really finish up parsing the yaml descriptor files to follow the default behaviour when overrides aren't used (to allow for exceptions to accommodate each linter), then write out the Dockerfiles (with that build.py script, we usually know what Dockerfile we want to be outputted). I assume that's out of your specialty. 😃

echoix · 2023-04-10T17:25:13Z

@waterfoul, I forgot to ask about your chown handling in your example. I know about the Linux permissions styles like 755, 655, 777, 400 etc, but is it a permission or a user that is 65532. What does it solve and what were the drawbacks when you implemented this?

I know that we have had and might still have issues especially with some node modules that downloads with high user ids that cause problems, and the workaround solution of changing the owner to root, is another source of problems, especially when the files created by megalinter, mounted to the computer of a user running it locally, becomes undeletable. Is this something that you and your team have encountered and have solved?

waterfoul · 2023-04-10T18:51:33Z

One of my concerns with the multi-arch images is build time. Ideally we would use published and compiled versions for the correct architecture, and only fallback to compiling if needed.

I understand the concern but there are quite a few things you can leverage if you build it yourself to shrink the image. This example doesn't do those (yet) since I'm still working on shrinking our internal image. What is the longest build time which would be acceptable?

It seems we need to push to registry, then pull one in the local docker to run it. We would like to avoid the old manifest-based approach to publishing individual images with tags per arch, then having a third tag that is only a manifest linking the two images.

Buildkit already does this for you if the environment is configured correctly. Try using the flag "--platform linux/amd64,linux/arm64/v8" which should automatically use a local qemu to run the arm layers and the final tagged image should already contain both images with the manifest wired up correctly. If you want to build them separately you can run both builds with different --export-cache/--import-cahe (or --cache-from/--cache-to with docker buildx build) registries and no outputs. You can then add flags to import both separate registries for the final build and it should import the caches and merge the images.

Finding a way to be able to build and test multi-arch images in CI. At the surface, almost everything is there, but in practice, we hit a gray spot with Docker that wasn't quite resolved last time I looked. We can build multi-arch images, but cannot load them. It seems we need to push to registry, then pull one in the local docker to run it. We would like to avoid the old manifest-based approach to publishing individual images with tags per arch, then having a third tag that is only a manifest linking the two images.

Is there a way to run an image of another arch through QEMU to run a test in the container? (Like making sure it runs. Like tools that aren't static, single files like Go based linters, making sure they can actually start)

docker run --platform linux/arm64 should run the arm version of the image

I assume that's out of your specialty.

Yes and no. I'll take a look. I have a pretty broad set of skills at this point so I might be able to help (I was a Full stack AppDev before I moved into DevSecOps). There are quite a few things I've found in the dockerfile build process that I'd like to adjust so I'll take a look at that.

I forgot to ask about your chown handling in your example. I know about the Linux permissions styles like 755, 655, 777, 400 etc, but is it a permission or a user that is 65532. What does it solve and what were the drawbacks when you implemented this?

We have a policy that we only run using root unless explicitly necessary, we even do this when building since any process running in a build is just running inside a container. If you run everything as root a poisoned binary can be used to escape the build process and potentially be used for nefarious purposes like installing remote shells inside build infrastructure. While difficult it's not impossible and running as a non root user thwarts 99% of attempts since you need root to access most of what could be used to escape the container (including most docker engine exploits). 65532 is just a generic userid we used for the megalinter build. Making the container run under a non-root user would have issues (including those you listed and others) which could be solved but the building user in a different stage which has a static binary output would have no impact on it except for the owner id being set on the binary. I'd like to eventually help you tackle the final user eventually but that should be a separate effort

Kurt-von-Laven · 2023-04-10T20:21:36Z

This is an exciting thread to see! When the time comes, I have been very slowly making progress on setting a non-root user and running the container as a non-root user in #1985.

bdovaz · 2023-04-10T21:29:00Z

@waterfoul I see you have a lot of experience and ideas so it makes perfect sense for you to continue this PR if you want to.

echoix · 2024-02-03T13:47:53Z

@bdovaz https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/ !

nvuillam · 2024-02-07T21:27:48Z

Can this announce be a wake up call for this long time new feature ? :p

echoix · 2024-02-07T23:46:08Z

But we need to manage to do all this with 14 Gb storage, 7Gb ram...

nvuillam · 2024-02-08T04:36:49Z

@echoix if you need more space I could ask @oxsecurity :)

https://docs.github.com/en/actions/using-github-hosted-runners/about-larger-runners/about-larger-runners

echoix · 2024-02-08T04:51:18Z

I'm not sure yet, but at least they're fast. That kind of fast: OSGeo/grass#3395 (comment). To speed up serial-only tests like that, unoptimized, the raw speed and disk access speed must be for something.

bdovaz · 2024-06-01T13:03:55Z

@echoix @nvuillam I just merged main in this branch to get it back up to date.

What do we do to unlock it?

On one hand there is the problem of identifying which linters don't work on arm which I had more or less under control (I have to revisit this build) and with the schema changed with “supported_platforms” we control which linters run on which architecture.
At the moment it runs on the ubuntu runner which is not ARM and the build is done with QEMU, it would be better to do it with the ARM runners that @echoix commented Multi platform images #2273 (comment)
Related to the above we have to see what strategy we follow to try that the build time is not multiplied x2 or more by having to build for amd64/arm64.
What do we do with @waterfoul's PR Fixing Multiplatform Builds #2549 that was abandoned? It was related to this PR if I remember correctly

nvuillam · 2024-06-02T09:18:29Z

@bdovaz @echoix I've never been really involved in this enhancement, I let you decide what you want to do with it ;)

All I want is that is doesn't break current behavior ^^

echoix · 2024-06-10T23:15:09Z

/build

Command run output
Build command workflow started.
Installing dependencies
Running script ./build.sh
Build command workflow completed updating files.

bdovaz · 2024-10-16T10:36:03Z

@echoix @nvuillam I just merged main in this branch to get it back up to date.

What do we do to unlock it?

On one hand there is the problem of identifying which linters don't work on arm which I had more or less under control (I have to revisit this build) and with the schema changed with “supported_platforms” we control which linters run on which architecture.

At the moment it runs on the ubuntu runner which is not ARM and the build is done with QEMU, it would be better to do it with the ARM runners that @echoix commented Multi platform images #2273 (comment)

Related to the above we have to see what strategy we follow to try that the build time is not multiplied x2 or more by having to build for amd64/arm64.

What do we do with @waterfoul's PR Fixing Multiplatform Builds #2549 that was abandoned? It was related to this PR if I remember correctly

I want to unlock this, what do we do @echoix?

cc @nvuillam

nvuillam · 2024-10-16T17:26:22Z

This PR is probably the oldest PR of MegaLinter history 🥲

This was referenced Feb 5, 2023

Add GITHUB_AUTH_TOKEN as arg to Dockerfile to RUN commands using phive #2314

Merged

Use docker/build-push-action@ in CI instead of upload-docker.sh #2342

Merged

This was referenced Feb 17, 2023

Change clj-kondo installation from Docker image to installation script #2368

Merged

Remove apk go requirement when possible #2318

Merged

bdovaz force-pushed the dev/multi-platform-images branch from cdc6de6 to 14f6821 Compare February 26, 2023 12:52

bdovaz force-pushed the dev/multi-platform-images branch from af6a7ed to c0d5dbe Compare March 10, 2023 08:50

bdovaz force-pushed the dev/multi-platform-images branch 2 times, most recently from 9b31731 to 5e9c6a1 Compare March 26, 2023 20:26

bdovaz temporarily deployed to dev January 5, 2024 21:10 — with GitHub Actions Inactive

bdovaz had a problem deploying to dev January 5, 2024 21:10 — with GitHub Actions Failure

Merge branch 'main' into dev/multi-platform-images

84f0eed

bdovaz temporarily deployed to dev June 1, 2024 12:54 — with GitHub Actions Inactive

[MegaLinter] Apply linters fixes

d5cae30

nvuillam temporarily deployed to dev June 1, 2024 13:00 — with GitHub Actions Inactive

nvuillam temporarily deployed to dev June 1, 2024 13:01 — with GitHub Actions Inactive

Merge branch 'main' into dev/multi-platform-images

0a83116

echoix temporarily deployed to dev June 10, 2024 23:14 — with GitHub Actions Inactive

[build-command] Update generated files

aa664d3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi platform images #2273

Multi platform images #2273

bdovaz commented Jan 17, 2023

nvuillam commented Jan 21, 2023 •

edited by github-actions bot

Loading

echoix commented Jan 29, 2023

nvuillam commented Jan 29, 2023

echoix commented Jan 29, 2023

bdovaz commented Jan 29, 2023

bdovaz commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

nvuillam commented Feb 26, 2023

echoix commented Feb 26, 2023

Kurt-von-Laven commented Feb 27, 2023

echoix commented Feb 27, 2023

echoix commented Feb 27, 2023

waterfoul commented Apr 10, 2023

echoix commented Apr 10, 2023

echoix commented Apr 10, 2023

waterfoul commented Apr 10, 2023

Kurt-von-Laven commented Apr 10, 2023

bdovaz commented Apr 10, 2023

echoix commented Feb 3, 2024

nvuillam commented Feb 7, 2024

echoix commented Feb 7, 2024

nvuillam commented Feb 8, 2024 •

edited

Loading

echoix commented Feb 8, 2024

bdovaz commented Jun 1, 2024

nvuillam commented Jun 2, 2024

echoix commented Jun 10, 2024 •

edited by github-actions bot

Loading

bdovaz commented Oct 16, 2024

nvuillam commented Oct 16, 2024

Multi platform images #2273

Are you sure you want to change the base?

Multi platform images #2273

Conversation

bdovaz commented Jan 17, 2023

nvuillam commented Jan 21, 2023 • edited by github-actions bot Loading

🦙 MegaLinter status: ⚠️ WARNING

echoix commented Jan 29, 2023

nvuillam commented Jan 29, 2023

echoix commented Jan 29, 2023

bdovaz commented Jan 29, 2023

bdovaz commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

echoix commented Feb 26, 2023

nvuillam commented Feb 26, 2023

echoix commented Feb 26, 2023

Kurt-von-Laven commented Feb 27, 2023

echoix commented Feb 27, 2023

echoix commented Feb 27, 2023

waterfoul commented Apr 10, 2023

echoix commented Apr 10, 2023

echoix commented Apr 10, 2023

waterfoul commented Apr 10, 2023

Kurt-von-Laven commented Apr 10, 2023

bdovaz commented Apr 10, 2023

echoix commented Feb 3, 2024

nvuillam commented Feb 7, 2024

echoix commented Feb 7, 2024

nvuillam commented Feb 8, 2024 • edited Loading

echoix commented Feb 8, 2024

bdovaz commented Jun 1, 2024

nvuillam commented Jun 2, 2024

echoix commented Jun 10, 2024 • edited by github-actions bot Loading

bdovaz commented Oct 16, 2024

nvuillam commented Oct 16, 2024

nvuillam commented Jan 21, 2023 •

edited by github-actions bot

Loading

nvuillam commented Feb 8, 2024 •

edited

Loading

echoix commented Jun 10, 2024 •

edited by github-actions bot

Loading