Access log: GET /

Hi,

Sorry if this topic has already been addressed; I didn't find any history (but I might have used the wrong keywords).

I have been using Traefik as the main reverse proxy for a Docker infrastructure for a while now, and I've always wanted to know why there are so many 404 hits for the / resource, as you can see in the example below.

traefik            | xxx.xxx.xxx.xxx - - [19/Jul/2021:11:59:26 +0000] "GET / HTTP/1.1" 404 19 "-" "-" 47981 "-" "-" 0ms

I did read this topic, but it is case-specific, so I can't apply its solution to my situation or use it for my investigation.

Does anyone have some clues for me to understand these requests?

Thanks a lot!

thopic

Hello @thopic,

This sort of entry is often caused by a health check of some sort. Have you configured HTTP health checks on your Traefik service?

Hello @daniel.tomcej,

I do make some API requests to my Docker services, but not directly to Traefik. Furthermore, the vast majority of these requests (GET /) come from IP addresses I don't own.
I understand this behavior might be caused by bots, but doing what? And how is it possible to hit / if the request is not managed by any traefik router?

Thank you for your help :slight_smile:

Hello @thopic,

> I understand this behavior might be caused by bots, but doing what?

If you expose a web server or a proxy on the web, it will get scraped and crawled by a variety of bots for a variety of reasons (indexing, IP address usage, etc.). Some bots are malicious and look for exposed systems with vulnerabilities, but not all of them are.

> And how is it possible to hit / if the request is not managed by any traefik router?

The HTTP request is processed whether there is a matching route or not. If there is no matching route, a 404 is returned, which is exactly what is happening in your log.

Hm, okay... Sorry, I see you are giving me the explanation, but I still feel confused...

I forgot to mention that I have a maintenance service that should collect any request that doesn't match any other route (but now I guess I'm wrong). The details of this configuration are below:


services:
  web:
      build: ..
      image: my_apache:7.4.21
      container_name: maintenance
      networks:
        - proxy
      expose:
        - "80"
      restart: unless-stopped
      volumes:
        - /usr/lib/locale/:/usr/lib/locale/
        - /etc/localtime:/etc/localtime:ro
        - ${DATA_PATH}/public-html:/var/www/html/
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.maintenance-http.rule=HostRegexp(`{catchall:.+}`)"
        - "traefik.http.routers.maintenance-http.entrypoints=web"
        - "traefik.http.routers.maintenance-http.priority=1"
        - "traefik.http.routers.maintenance-https.rule=HostRegexp(`{catchall:.+}`)"
        - "traefik.http.routers.maintenance-https.entrypoints=websecure"
        - "traefik.http.routers.maintenance-https.priority=1"
        - "traefik.http.routers.maintenance-https.tls=true"
        - "co.elastic.logs/module=apache"
        - "co.elastic.logs/fileset=access"
                        
networks:
  proxy:
    external: true

If my configuration is right, does it mean these requests are made directly to the IP address, like http://my.ip.add.ress?

Thank you again

Hello @thopic,

Yes, many bots use raw IP addresses, since there are far fewer of them than domain names. You can also log additional request headers if you feel it would help your investigation (see Access Logs in the Traefik documentation).
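
As a rough sketch of what logging extra headers could look like, assuming Traefik v2 with a file-based static configuration (the file name traefik.yml and the choice of headers are assumptions for illustration, not taken from your setup):

```yaml
# traefik.yml (static configuration; file name is an assumption)
accessLog:
  # Header fields are, as far as I know, only written in the JSON format,
  # not in the default common log format shown in the example line.
  format: json
  fields:
    # Keep the standard access log fields.
    defaultMode: keep
    headers:
      # Drop all request headers by default...
      defaultMode: drop
      names:
        # ...except the ones that help identify the client.
        User-Agent: keep
        X-Forwarded-For: keep
```

The JSON format should also record the requested host among its fields, which would show whether these hits target a raw IP address rather than one of your domains.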

Many firewalls allow you to block requests that don't match your domain list to filter out these kinds of requests.

It sounds clear enough. Thank you very much @daniel.tomcej for these tips :slight_smile:

In the end, I had still not understood why the request could not be handled by any router when at least the maintenance service was up. My mistake was using + in the regex: it requires at least one character, although it is possible (e.g. with telnet) to send a request without setting the HTTP Host header.
Phew!
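
For anyone landing here later, a minimal sketch of a catch-all that does not depend on the Host header, assuming the Traefik v2 rule syntax used in the labels above. It swaps the HostRegexp rule for PathPrefix(`/`), which is one possible approach rather than the only one; the router names are reused from my configuration and the rest of the service definition is omitted:

```yaml
services:
  web:
    labels:
      - "traefik.enable=true"
      # PathPrefix(`/`) matches on the request path only, so a request
      # with a missing or empty Host header is still caught.
      - "traefik.http.routers.maintenance-http.rule=PathPrefix(`/`)"
      - "traefik.http.routers.maintenance-http.entrypoints=web"
      # Lowest priority, so every more specific router is tried first.
      - "traefik.http.routers.maintenance-http.priority=1"
      - "traefik.http.routers.maintenance-https.rule=PathPrefix(`/`)"
      - "traefik.http.routers.maintenance-https.entrypoints=websecure"
      - "traefik.http.routers.maintenance-https.priority=1"
      - "traefik.http.routers.maintenance-https.tls=true"
```

Keeping priority=1 matters here: the catch-all has to lose to every other router and only pick up what nothing else matched.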
