Isn't giving Traefik access to Kubernetes secrets potentially dangerous if Traefik were ever exploited? Am I missing something obvious?

I created a topic last week about problems upgrading from Traefik 2.0 to 2.1 or 2.2. I worked out that the problem was our config not allowing Traefik access to secrets. This config worked fine for us on 2.0, other than spamming the pod logs with errors about not being able to read secrets. With 2.1 and 2.2, however, Traefik doesn't get as far as obtaining the Ingress resources used to configure itself unless I grant it permission to list secrets. Since we don't use certificates for anything at the moment, I don't believe Traefik actually needs to access secrets?

Anyhow, this got me thinking... The docs suggest creating a ClusterRole that gives Traefik get, list and watch on secrets, then attaching that role to a service account for Traefik using a ClusterRoleBinding.

Fair enough, but since the list verb gives the service account the ability to see the contents of the secrets it lists, I think I'm now giving that service account, and therefore this instance of Traefik, access to every single secret in the cluster?
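To make the concern concrete, here is a minimal sketch of what a holder of list permission can do with the response. It assumes a trimmed-down SecretList body; the secret names and values are invented for illustration. The `data` values in a Secret are base64-encoded rather than encrypted, so decoding them needs nothing beyond the standard library.

```python
import base64
import json

# Hypothetical, trimmed-down body of a "list secrets" response.
# Names and values are invented for illustration only.
SECRET_LIST_JSON = """
{
  "kind": "SecretList",
  "items": [
    {"metadata": {"namespace": "production", "name": "db-credentials"},
     "data": {"password": "aHVudGVyMg=="}},
    {"metadata": {"namespace": "kube-system", "name": "cloud-credentials"},
     "data": {"api-key": "czNjcjN0"}}
  ]
}
"""


def decode_secret_list(body: str) -> dict:
    """Map "namespace/name" to the decoded key/value pairs of each secret."""
    decoded = {}
    for item in json.loads(body)["items"]:
        meta = item["metadata"]
        key = f'{meta["namespace"]}/{meta["name"]}'
        decoded[key] = {
            k: base64.b64decode(v).decode("utf-8")
            for k, v in item.get("data", {}).items()
        }
    return decoded


for name, values in decode_secret_list(SECRET_LIST_JSON).items():
    print(name, values)
```

Every item in the list carries its full payload, which is why list is effectively read access to everything it returns.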

Not a problem if Traefik is perfect. But what happens if an exploit were found in Traefik that somehow allowed an attacker to get those secrets? Even in my fairly simple cluster I have almost 100 secrets for things such as other service accounts, cloud credentials and application configs containing database passwords.

At that point I think an attacker could then potentially grab another service account token for a service account with more permissions that allow it to do who knows what? Create pods? Delete storage? Whatever it wants really. Which seems quite worrying.

I believe things could be limited somewhat by using a RoleBinding rather than a ClusterRoleBinding and restricting access to secrets in specific namespaces. However, that leads to a slightly less serious version of the same problem. If (for argument's sake) I have a namespace called production in which I have a service that I create an Ingress for, my exploited Traefik ingress controller still has access to all of my production secrets. It might not have access to quite so many now, but that is still a lot of potentially sensitive things.

I thought perhaps I had solved the problem by putting my ingress controllers into their own namespace and granting them access only to secrets in that namespace. But no: because the Ingress resources still go into my production namespace, I have to give Traefik access to secrets in my production namespace.
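For reference, the namespace-scoped arrangement described above might be generated like this. It is only a sketch: the role name, the service account name `traefik-ingress-controller` and the `traefik` namespace are assumptions, not anything taken from the Traefik docs. It relies on the fact that a RoleBinding in one namespace can name a ServiceAccount from another.

```python
import json


def secrets_reader_manifests(target_namespace: str,
                             sa_name: str,
                             sa_namespace: str) -> list:
    """Build a Role and RoleBinding granting secret access in one namespace."""
    role_name = "traefik-secrets-reader"  # hypothetical name
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": target_namespace},
        "rules": [{
            "apiGroups": [""],  # "" is the core API group, where secrets live
            "resources": ["secrets"],
            "verbs": ["get", "list", "watch"],
        }],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name, "namespace": target_namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
        # The subject lives in the controller's namespace, not the target one.
        "subjects": [{
            "kind": "ServiceAccount",
            "name": sa_name,
            "namespace": sa_namespace,
        }],
    }
    return [role, binding]


# Hypothetical names: controller in "traefik", Ingresses in "production".
manifests = secrets_reader_manifests(
    "production", "traefik-ingress-controller", "traefik")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests},
                 indent=2))
```

This narrows the exposure from the whole cluster to one namespace, but the rule still covers every secret in that namespace.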

It really feels like I'm missing something obvious here. I can't find a way to limit the access that Traefik needs to secrets.
I've looked for a way to limit access to specific secrets, but there's no way I can tell Traefik that.
I've looked for a way to tell it to look up secrets in a different namespace from the one the Ingresses live in.
I've looked for a way to tell Traefik to not bother looking for secrets, since right now we're not using them anyway.

FWIW: We originally switched to Traefik from Nginx Ingress controllers for this exact reason. It seems we were just unlucky that v2.0 allowed it to kinda do what we wanted. But now we're back to the same situation. We also looked into a bunch of other options for ingress controllers, most of which suffered from this same problem.

Is my understanding above correct?

If so, are there any plans to try to change this at all? Since if this is the case I can't understand how anyone could think they were safely using Traefik (or pretty much any of the other ingress controllers that exist today)?

If not, corrections to my understanding on how to deal with this are very welcome.

Thanks

I agree with the sentiment. There is a similar issue here: https://github.com/containous/traefik/issues/1784

The issue itself talks about basic auth, but of course there are other places where secrets are used:

to name a few.

One can argue that Kubernetes can also be compromised, but according to the principle of least privilege it would be nice if Traefik did not need access (to the point of refusing to work) to information that is not required for its functions.

Thanks for the reply. It's good (for my own mental health) that I'm not totally mistaken in my concerns.

In relation to the idea that Kubernetes itself could be compromised, you're of course right, but I would argue that my API server isn't available on the public Internet and isn't directly accessible to potentially malicious clients. Therefore I think they would need to find some other flaw in my systems or the software that I'm using to try to exploit that. Which is also entirely possible, but hopefully more difficult to achieve.

The one thing I'm really struggling with here, mostly from the Kubernetes point of view, is that the list verb gives access to the content of the secrets. I have even tried adjusting my role so that the service account used by Traefik only has list permission on specific secrets. This still doesn't help. List appears to be all or nothing; there's no middle ground.
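A toy model of how a rule with `resourceNames` is matched may explain the all-or-nothing behaviour. This is a simplification written for illustration, not the real authorizer. As I read the Kubernetes RBAC documentation, a list or watch restricted by `resourceNames` is only authorized when the request carries a matching `metadata.name` field selector, so a plain "list everything" request has no name to match against.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    verbs: set
    resources: set
    # Empty set means the rule applies to any object name.
    resource_names: set = field(default_factory=set)


@dataclass
class Request:
    verb: str
    resource: str
    # None models a collection-wide list/watch with no metadata.name selector.
    name: Optional[str] = None


def allows(rule: Rule, req: Request) -> bool:
    """Simplified RBAC check: verb, resource, then optional name restriction."""
    if req.verb not in rule.verbs or req.resource not in rule.resources:
        return False
    if not rule.resource_names:
        return True
    return req.name in rule.resource_names


# Hypothetical rule limited to a single named secret.
rule = Rule({"get", "list", "watch"}, {"secrets"}, {"some-secret-name"})

print(allows(rule, Request("list", "secrets")))                      # unnamed list
print(allows(rule, Request("get", "secrets", "some-secret-name")))   # named get
print(allows(rule, Request("watch", "secrets", "some-secret-name"))) # named watch
```

Under this model the unnamed list is denied while the named get and watch are allowed, which matches what a controller that insists on listing the whole collection would run into.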

The only obvious way I can see for Traefik to make this situation better would be to let you specify the list of secrets that you're happy for it to get and watch. I think that would mean listing secrets was no longer required, and I could create RBAC rules that give the service account used by Traefik access to only those specific secrets.

Part of me thinks this would already be possible with some pretty trivial changes. I obviously don't know the Traefik code, but if the various resources that are collected tell Traefik which secrets it needs to know about, why does it need to list them? Watching those secrets would be enough to trigger Traefik to update them if they change. Getting those secrets is enough to get the content of them. New resources requiring access to new secrets would be a case of just setting up new watches... Then as long as my Role specifies get and watch of the correct secrets everything would work as I would expect and Traefik would no longer have access to every secret in my cluster.
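The idea above can be sketched roughly as follows: collect the secret names that the Ingress resources reference, then derive a rule that grants only get and watch on those names. This is a sketch of the proposal, not of how Traefik works. It only looks at `spec.tls[].secretName` on Ingress objects; secrets referenced elsewhere (basic auth, for example) would need the same treatment. The Ingress and secret names are hypothetical.

```python
def referenced_tls_secrets(ingresses: list) -> dict:
    """Group TLS secret names referenced by Ingress objects by namespace."""
    by_namespace = {}
    for ing in ingresses:
        ns = ing["metadata"].get("namespace", "default")
        for tls in ing.get("spec", {}).get("tls", []):
            name = tls.get("secretName")
            if name:
                by_namespace.setdefault(ns, set()).add(name)
    return by_namespace


def minimal_rule(secret_names: set) -> dict:
    """RBAC rule granting get/watch (no list) on the named secrets only."""
    return {
        "apiGroups": [""],
        "resources": ["secrets"],
        "resourceNames": sorted(secret_names),
        "verbs": ["get", "watch"],
    }


# Hypothetical Ingress objects, reduced to the fields used here.
ingresses = [
    {"metadata": {"namespace": "production", "name": "shop"},
     "spec": {"tls": [{"hosts": ["shop.example.com"],
                       "secretName": "shop-tls"}]}},
    {"metadata": {"namespace": "production", "name": "api"},
     "spec": {"tls": [{"secretName": "api-tls"}]}},
    {"metadata": {"namespace": "production", "name": "plain-http"},
     "spec": {}},
]

for ns, names in referenced_tls_secrets(ingresses).items():
    print(ns, minimal_rule(names))
```

The catch is that the Role has to be kept in step with the Ingresses, so something with more privileges than Traefik would have to maintain it.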

Is there anything I can do to have someone think about these ideas? :slight_smile:

It would be interesting to see how this is implemented, and how it could be implemented, at the Kubernetes API level. This can be done by studying the Traefik code and the client-go API. While individual methods of the API are documented the same way as any public Go API, I have not seen a good description of the model the API uses.

In particular there are the notions of Client Sets, Informers and Listers, where Informers basically listen for watch notifications and put the objects into a cache, and Listers read information about objects from that cache.

This is the mode of operation that controllers are encouraged to use, as querying the API for data on an as-needed basis is deemed bad for performance.
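As a rough mental model of that pattern (not client-go itself, and not Traefik's code), an informer can be thought of as a loop that applies watch events to a local store, with the lister being read-only lookups against that store. The event types ADDED, MODIFIED and DELETED are the ones the watch API emits; the class and method names below are invented for illustration.

```python
class SecretCache:
    """Toy stand-in for an informer's local store, keyed by (namespace, name)."""

    def __init__(self):
        self._store = {}

    def handle_event(self, event_type: str, obj: dict) -> None:
        """Apply one watch event to the cache (the informer side)."""
        key = (obj["metadata"]["namespace"], obj["metadata"]["name"])
        if event_type in ("ADDED", "MODIFIED"):
            self._store[key] = obj
        elif event_type == "DELETED":
            self._store.pop(key, None)

    def get(self, namespace: str, name: str):
        """Look up one object without calling the API (the lister side)."""
        return self._store.get((namespace, name))

    def list_namespace(self, namespace: str) -> list:
        """Return all cached objects in a namespace."""
        return [o for (ns, _), o in self._store.items() if ns == namespace]


def secret(namespace: str, name: str, version: int) -> dict:
    """Hypothetical minimal secret object for the demo."""
    return {"metadata": {"namespace": namespace, "name": name},
            "version": version}


cache = SecretCache()
cache.handle_event("ADDED", secret("production", "shop-tls", 1))
cache.handle_event("MODIFIED", secret("production", "shop-tls", 2))
cache.handle_event("ADDED", secret("production", "api-tls", 1))
cache.handle_event("DELETED", secret("production", "api-tls", 1))

print(cache.get("production", "shop-tls"))
print(len(cache.list_namespace("production")))
```

Filling such a cache for a whole resource type is what needs list and watch on the collection, which is where the broad permission requirement comes from.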

I personally have not worked with client-go, so my understanding of it is superficial. In particular, while I understand that a watch can target individual objects, I'm not sure how to do this with client-go informers and factories. The current Traefik code seems to watch for everything and fails if permissions do not allow it.

Update: it appears that watches give you a stream of added / changed / deleted objects, such as secrets, so it may not really be possible to watch at the level of an individual object.

It would be interesting to find out what kind of code changes are required to watch selectively, and how much effort it would take to implement them.

I am unfamiliar with client-go too. I can watch an individual resource by name using kubectl, which I'd like to think is kinda the same thing and done in a very similar way to how anything else would do it:

kubectl get secrets -n production some-secret-name -w
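My understanding, which I have not verified against the kubectl source, is that a single-object watch like the one above is a watch on the collection narrowed with a `metadata.name` field selector. A sketch of the request path that would produce, using the same namespace and secret name as the command:

```python
from urllib.parse import urlencode


def single_secret_watch_path(namespace: str, name: str) -> str:
    """Build the API path for watching one secret via a field selector."""
    query = urlencode({
        "watch": "true",
        # Restrict the collection watch to a single object name.
        "fieldSelector": f"metadata.name={name}",
    })
    return f"/api/v1/namespaces/{namespace}/secrets?{query}"


print(single_secret_watch_path("production", "some-secret-name"))
```

Running the kubectl command with a higher verbosity (for example `-v=8`) should show the actual request it sends, which would confirm or correct this.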

I regularly watch individual pods when I'm doing something, and I get multiple lines of output as things happen. I assume these correspond to things being added/changed/deleted, since some of the lines are duplicated and the STATUS column doesn't always change.

I guess this is why it seems like what I'm describing would be relatively trivial. But like I said, I'm not very familiar with client-go, and I don't know what client-go informers and factories are. Perhaps I should do some reading on them to understand how some of this stuff actually works rather than making assumptions.