Sorry if this topic has already been addressed; I didn't find any history (but I might have used the wrong keywords).
I have been using Traefik as the main reverse proxy for a Docker infrastructure for a while now, and I've always wanted to know why there are so many 404 hits for the / resource, as you can see in the example below.
I do make some API requests to my Docker services, but not directly to Traefik. Furthermore, the vast majority of these requests (GET /) come from IP addresses I don't own.
I understand this behavior might be caused by bots, but doing what? And how is it possible to hit / if the request is not managed by any traefik router?
I understand this behavior might be caused by bots, but doing what?
If you expose a web server or a proxy on the internet, it will be scraped and crawled by a variety of bots for a variety of reasons (indexing, surveying IP address usage, etc.). Some bots are malicious, looking for exposed systems with vulnerabilities, but not all of them are.
And how is it possible to hit / if the request is not managed by any traefik router?
The HTTP request is processed whether there is a matching route or not. If there is no matching route, a 404 is returned, which is exactly what is happening in your log.
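To make that concrete, here is a minimal sketch of the idea in Python. It is a toy model, not Traefik's actual implementation; the router names and the `example.com` hostnames are made up for illustration.

```python
# Toy model of host-based routing; NOT how Traefik is implemented.
# Router names and hostnames are hypothetical.
ROUTERS = [
    ("app", lambda host: host == "app.example.com"),
    ("api", lambda host: host == "api.example.com"),
]

def handle(host):
    """Return (status, router_name); fall back to 404 when no rule matches."""
    for name, rule in ROUTERS:
        if rule(host):
            return 200, name
    # The request was still received and processed; it just matched nothing.
    return 404, "no router"

print(handle("app.example.com"))  # matched by the "app" router
print(handle("203.0.113.7"))      # bot using the raw IP address as Host
print(handle(None))               # request sent without a Host header
```

The last two calls fall through every rule, so the proxy answers with a 404 and the hit still shows up in the access log.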
Hm, okay... Sorry, I see you are giving me the explanation, but I still feel confused.
I forgot to mention that I have a maintenance service that should collect any request that doesn't match any route (but now I guess I'm wrong). The details of this configuration are below:
Yes, many bots will use raw IP addresses, since there are far fewer of them than domain names. You can also log additional headers if you feel it would help your investigation: (Access Logs - Traefik)
Many firewalls allow you to block requests that don't match your domain list to filter out these kinds of requests.
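As a rough illustration of that kind of filter (a sketch only, assuming a hypothetical domain list; a real firewall or middleware would do this at its own layer):

```python
# Hypothetical allow-list of domains you actually serve.
ALLOWED_HOSTS = {"app.example.com", "api.example.com"}

def is_allowed(host_header):
    """Return True only if the Host header names one of the allowed domains."""
    if not host_header:
        # Missing or empty Host header: reject.
        return False
    host = host_header.strip().lower()
    # Drop an optional ":port" suffix (IPv6 literals are not handled here).
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    return host in ALLOWED_HOSTS

for value in ["APP.example.com:443", "203.0.113.7", "", None]:
    print(repr(value), is_allowed(value))
```

Requests addressed to a raw IP address, or sent with no Host header at all, are rejected before they ever reach a router.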
In the end, what I hadn't understood was why the request could not be handled by any router when at least the maintenance service was up. My mistake was using + in the regex, which requires at least one character, even though it is possible (e.g. with telnet) to send a request without the HTTP Host header.
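For anyone landing here later, a small Python sketch of the difference. The patterns `.+` and `.*` are assumptions standing in for the catch-all regex, and an empty string stands in for a missing Host header:

```python
import re

# Hypothetical stand-ins for a catch-all host pattern.
PLUS_PATTERN = re.compile(r".+")  # requires at least one character
STAR_PATTERN = re.compile(r".*")  # also accepts an empty value

def host_matches(pattern, host):
    """Return True if the whole host value matches the pattern."""
    return pattern.fullmatch(host) is not None

# An empty string models a request sent without a Host header.
for host in ["example.com", "203.0.113.7", ""]:
    print(repr(host), host_matches(PLUS_PATTERN, host), host_matches(STAR_PATTERN, host))
```

Only the empty host behaves differently: `.+` rejects it while `.*` accepts it, which is why a catch-all written with + lets those requests fall through to a 404.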
Phew!