Split Traefik Instances, new setup results in 404

When I initially switched from NPM to Traefik about 6 months ago, I had one service that needed external access; everything else was internal. I was recently thinking about how one could reach said internal services externally, given that I have internal DNS. I briefly thought "well, that's easy enough to fake, you just edit the hosts file" and didn't give it much thought beyond that. Then I recently saw a video on YouTube that addressed this exact concern, and it confirmed that simply editing the hosts file would do the trick. So I've been on a quest this whole week to:

  1. Implement some access control/brute force prevention/bad actor prevention
  2. Split access from internal vs. external sources
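For reference, the hosts-file trick mentioned above is just an override on the client machine; something like this (the IP is a placeholder for the internal Traefik address):

```
# /etc/hosts on the client (C:\Windows\System32\drivers\etc\hosts on Windows)
# Point the public hostname at the internal Traefik IP instead of the WAN IP:
192.168.10.6    myservice.mydomain.com
```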

As the video pointed out, there are a couple of ways to do this. One involved using a single Traefik instance and just switching the ports around. The other was to run two instances of Traefik: one for internal-only services and another for external. I opted for the latter.

To do this, I copied my existing, running Traefik folder containing the docker-compose, traefik.yml, config.yml and the general directory structure to a new directory and started editing files. Not much to change, right? Take the internal Traefik config.yml and remove any external references; take the original Traefik instance's config.yml and remove any internal references. Finally, create a new DNS A record for the new (internal) Traefik instance and repoint all the internal CNAME records from the old instance to the new one. The biggest hurdle was putting the new (internal) instance on an IPVLAN so I could reuse ports 80, 443, and 8080 on the same host, but I got that sorted out. Sounds simple, right?

Not so much. A really weird outcome popped up. I use Uptime Kuma to monitor a bunch of services, both internal and external, and I actually used that service/router as a test. For the internal Traefik instance, I set up the router and the service just like I would've previously, changed the DNS record, and I was still able to access it; all appeared to be working properly.

Then came the big change: fully splitting all the internal services & routers off to the internal instance and removing them from the external instance (leaving the external services & routers in place). docker-compose up -d --force-recreate for both, and SHTF. Uptime Kuma reported every single internal web service as returning a 404; external web services appeared to be fine. However, within my LAN, I could still access all of these services properly. Another service I run is Dashy, which has the option to provide a status check of these services: a bunch of little red dots on almost every service. I shelled into the Uptime Kuma instance and did a curl -vvvv https://myservice.mydomain.com. I got a bunch of lines like this:

* Expire in 1 ms for 1 (transfer 0x56167c2e20f0)

Then:

*   Trying [my external Traefik IP]...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x56167c2e20f0)
* Connected to myservice.mydomain.com (my external Traefik IP) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=mydomain.com
*  start date: Nov 28 13:53:58 2023 GMT
*  expire date: Feb 26 13:53:57 2024 GMT
*  subjectAltName: host "myservice.mydomain.com" matched cert's "*.mydomain.com"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56167c2e20f0)
> GET / HTTP/2
> Host: myhost.mydomain.com
> User-Agent: curl/7.64.0
> Accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 404 
< alt-svc: h3=":443"; ma=2592000
< content-type: text/plain; charset=utf-8
< x-content-type-options: nosniff
< content-length: 19
< date: Wed, 13 Dec 2023 23:08:56 GMT
< 
404 page not found
* Connection #0 to host myhost.mydomain.com left intact
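One way to take DNS out of the equation while testing (a generic curl technique, not something from the thread; IPs are placeholders) is to pin the hostname to each instance's IP with --resolve and compare the status codes:

```shell
# Force the request to the internal instance's IP (placeholder 192.168.10.6),
# regardless of what DNS says:
curl -sk -o /dev/null -w '%{http_code}\n' \
  --resolve status.mydomain.com:443:192.168.10.6 https://status.mydomain.com

# Same request pinned to the external instance's IP (placeholder 192.168.10.2):
curl -sk -o /dev/null -w '%{http_code}\n' \
  --resolve status.mydomain.com:443:192.168.10.2 https://status.mydomain.com
```

If only one of the two returns a 404, the router definition is probably fine and the traffic is simply landing on the wrong instance.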

If I switch solely back to the original config.yml, Uptime Kuma's happy, my internal LAN is happy, etc. So, at least outwardly, my traefik.yml and docker-compose files appear to be set up correctly, but the new config file throws 404s for Uptime Kuma and Dashy. Given that I'm getting a 404, I figured I'd check the access and Traefik logs. Even on debug, I don't see the requests come in at all, let alone the 404s, and there are no filters on either of the logs.

In the new config.yml, I removed the middlewares needed for cloudflarewarp (to get true IP addresses from Cloudflare proxying) and crowdsec, as these services should only be accessible from within my LAN, so neither applies. Other than that, it's just removing any services that would be isolated to external access.

So currently, given that the only difference between broken and functioning is the config.yml file, I'd imagine my problem is there, but I've reviewed these files extensively and can't figure out what's going on. The absence of logging for the 404s is a bit concerning as well; 200s show up no problem. But given that I'm getting 404s, that tells me DNS should be fine and it's the proxy that's the issue.

I don't even know what would be helpful to provide to the community as a starting point, given that the new config.yml file is a direct copy of a known functioning version with some definitions removed. If it were indentation or some other oversight on my part, I'd expect to see an error thrown in the Traefik logs, but there's nothing related near as I can tell.

Note that the middlewares exist in the original because that instance is intended to be the externally accessible one as the end goal. I also put ... as a placeholder for anything redundant, such as subsequent router and service definitions; they're almost completely identical.
Original:

http:
# Start of routers
  routers:
    myservice:
      entryPoints:
        - "websecure"
      rule: "Host(`myservice.mydomain.com`)"
      middlewares:
        - default-headers
        - https-redirectscheme
      # tls: {}
      service: myservice
    ...


# Transport definition to skip anything with a valid or a self-signed certificate
  serversTransports:
    skipVerify:
      insecureSkipVerify: true


# Start of services
  services:
    myservice:
      loadBalancer:
        servers:
          - url: "https://[proper IP address]"
        passHostHeader: true
        serversTransport: skipVerify
    ...

# Start of middlewares
  middlewares:
    my-cloudflarewarp:
      plugin:
        cloudflarewarp:
          disableDefault: "false"
          trustip:
            - 2400:cb00::/32

    https-redirectscheme:
      redirectScheme:
        scheme: https
        permanent: true

    default-headers:
      headers:
        frameDeny: true
        browserXssFilter: true
        contentTypeNosniff: true
        forceSTSHeader: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 15552000
        customFrameOptionsValue: SAMEORIGIN
        customRequestHeaders:
          X-Forwarded-Proto: https

    default-whitelist:
      ipWhiteList:
        sourceRange:
        - "10.0.0.0/8"
        - "192.168.0.0/16"
        - "172.16.0.0/12"
    
    secured:
      chain:
        middlewares:
        - default-whitelist
        - default-headers

    crowdsec-bouncer:
      forwardauth:
        address: http://crowdsec-bouncer-traefik:8080/api/v1/forwardAuth
        trustForwardHeader: true

New:

http:
  middlewares:
    my-cloudflarewarp:
      plugin:
        cloudflarewarp:
          disableDefault: "false"
          trustip:
            - 2400:cb00::/32
    https-redirectscheme:
      redirectScheme:
        scheme: https
        permanent: true
    default-headers:
      headers:
        frameDeny: true
        browserXssFilter: true
        contentTypeNosniff: true
        forceSTSHeader: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 15552000
        customFrameOptionsValue: SAMEORIGIN
        customRequestHeaders:
          X-Forwarded-Proto: https
    default-whitelist:
      ipWhiteList:
        sourceRange:
        - "10.0.0.0/8"
        - "192.168.0.0/16"
        - "172.16.0.0/12"
    secured:
      chain:
        middlewares:
        - default-whitelist
        - default-headers
    crowdsec-bouncer:
      forwardauth:
        address: http://crowdsec-bouncer-traefik:8080/api/v1/forwardAuth
        trustForwardHeader: true
  routers:
    myservice:
      entryPoints:
      - websecure
      middlewares:
      - default-headers
      - https-redirectscheme
      rule: Host(`myservice.mydomain.com`)
      service: myservice
    ...
  serversTransports:
    skipVerify:
      insecureSkipVerify: true
  services:
    myservice:
      loadBalancer:
        passHostHeader: true
        servers:
        - url: https://[proper IP address]
        serversTransport: skipVerify
    ...

If there's anything else I can provide that would be of assistance, I'm happy to share it. But after so many hours of trial and error, I'm at a loss. I have seen weird things with IPVLAN networks, such as traffic sourced from the same host not passing, but that's not the case here, as Uptime Kuma and my main desktop are on separate host machines.

Let’s try to condense the information:

  1. you run two Traefik instances in Docker
  2. one is listening on the host IP, the other on an IPVLAN IP
  3. you set up the hosts file / internal DNS so domains resolve internally to the IPVLAN IP

Check:

  1. domains internally resolve to the IPVLAN IP, on the host and inside the container (ping)
  2. make sure the "internal" Traefik is reading the correct dynamic config file
  3. the Traefik debug log has no errors or warnings
  4. requests go to the right Traefik instance; enable the access log

If still not resolved, share your full Traefik static and dynamic config, and docker-compose.yml if used.
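The checks above can be run roughly like this (hostnames and container names are placeholders):

```shell
# 1. Does the domain resolve to the IPVLAN IP, on the host and inside a container?
dig +short status.mydomain.com
docker exec uptime-kuma ping -c1 status.mydomain.com

# 2. Is the internal Traefik reading the dynamic config file you think it is?
docker exec traefik-internal cat /config.yml | head

# 3./4. Watch the debug and access logs while making a test request:
docker logs -f traefik-internal
```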

Geez, you're everywhere on this forum lol. Not a complaint, you're super helpful.

Seems you skipped the important part at the end :wink:

  routers:
    myservice:
      entryPoints:
      - websecure
      middlewares:
      - default-headers
      - https-redirectscheme
      rule: Host(`myservice.mydomain.com`)
      service: myservice
    ...

Hmmm, mild heart attack there. I wrote a script that would ingest the config.yml and rewrite it in alphabetical order (I really only wanted routers and services sorted, but hey, I'll take what I can get) and thought the script had dropped that definition. I can confirm the routers all have a service key/value pair. I'm not sure where you're seeing it missing in this thread, though :confused:

At any rate, something odd happened. I plugged in the external, revamped config.yml (you have no idea how many config.*.yml files I have floating around right now between the split and the script to sort them) and, despite there not being any changes, it appears everything is working. At least Uptime Kuma and Dashy are reporting similarly to the way they were originally. I have quite a bit of testing to do but I think for some reason it's working now.

The error indicates you use the same name (myservice?) for a router multiple times.

That error appeared 3-4x more in the log before I started adjusting things. I believe the root of the issue was that the definition for the Traefik instance itself was defined in both the config.yml and the labels in docker-compose. Once I trimmed that up a bit, I'm left with just these two entries. I'm wondering: since both instances are on the same host machine, one on a bridge network and one on an IPVLAN, are they picking up each other's containers dynamically?
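They likely are: both compose files mount the same /var/run/docker.sock, so by default each instance sees every container's labels. If that turns out to be the problem, one option (a sketch, not something from the thread) is to tag each container and use the Docker provider's constraints option so each instance only picks up its own:

```yaml
# traefik.yml of the internal instance (sketch; label name is an assumption):
providers:
  docker:
    exposedByDefault: false
    # Only pick up containers carrying the label traefik.instance=internal
    constraints: "Label(`traefik.instance`,`internal`)"
```

The external instance would use a matching constraint for its own label, and each container gets a `traefik.instance=internal` or `traefik.instance=external` label added to its compose file.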

I searched all the files in this instance for traefik-secure and they only appear in my labels in the docker-compose file:

      - "traefik.enable=true"
      - "traefik.http.routers.traefik.rule=Host(`traefik-int.mydomain.com`)"
      - "traefik.http.routers.traefik-secure.rule=Host(`traefik-int.mydomain.com`)"
      - "traefik.http.routers.traefik.entrypoints=websecure"
      - "traefik.http.routers.traefik.tls.certresolver=cf_production"
      - "traefik.http.middlewares.traefik-https-redirect.redirectscheme.scheme=websecure"
      - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"
      - "traefik.http.routers.traefik.middlewares=traefik-https-redirect"
      - "traefik.http.routers.traefik-secure.entrypoints=websecure"
    #  - "traefik.http.routers.traefik-secure.middlewares=traefik-auth"
      - "traefik.http.routers.traefik-secure.tls=true"
      - "traefik.http.routers.traefik-secure.tls.domains[0].main=mydomain.com"
      - "traefik.http.routers.traefik-secure.tls.domains[0].sans=*.mydomain.com"
      - "traefik.http.services.dashboard.loadbalancer.server.port=8080"
      - "traefik.http.routers.traefik-secure.service=api@internal"

Edit: I suppose it's likely traefik.http.routers.[traefik/traefik-secure].rule that's causing that error?

Maybe clean up your labels; you can do the http-to-https redirect on the entrypoint, see the simple Traefik example.
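The entrypoint-level redirect referred to above lives in the static config (traefik.yml) and replaces the per-router redirect middleware; roughly:

```yaml
# traefik.yml (static config): redirect everything on :80 to :443
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
```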

Ok, I got rid of that error. The issue was that both instances had routers named "traefik" and "traefik-secure" in their labels.

Unfortunately, when using the new config file, only external services are accessible; all the internal stuff still fails with a 404 error. I pulled up the dashboard from the internal instance and, near as I can tell, it looks right. It's definitely parsing my config.yml file.

Enable access log in JSON format to see where the error is coming from. OriginStatus is the code coming from the target service.
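Enabling the JSON access log is a static-config change; something like:

```yaml
# traefik.yml (static config)
accessLog:
  filePath: /logs/access.log
  format: json
```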

Well, that's part of the problem. The only things that get logged here are 200s, no 4xx errors at all. Looking at it, there's a pattern.

{"ClientAddr":"192.168.10.3:56902","ClientHost":"192.168.10.3","ClientPort":"56902","ClientUsername":"-","DownstreamContentSize":343,"DownstreamStatus":200,"Duration":4308400,"OriginContentSize":343,"OriginDuration":4177518,"OriginStatus":200,"Overhead":130882,"RequestAddr":"serviceA.mydomain.com","RequestContentSize":0,"RequestCount":698,"RequestHost":"serviceA.mydomain.com","RequestMethod":"GET","RequestPath":"/api/stat/sites","RequestPort":"-","RequestProtocol":"HTTP/1.1","RequestScheme":"https","RetryAttempts":0,"RouterName":"unifi@file","ServiceAddr":"192.168.10.30:8443","ServiceName":"unifi@file","ServiceURL":{"Scheme":"https","Opaque":"","User":null,"Host":"192.168.10.30:8443","Path":"","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""},"StartLocal":"2023-12-14T13:17:40.072068514-06:00","StartUTC":"2023-12-14T19:17:40.072068514Z","TLSCipher":"TLS_AES_128_GCM_SHA256","TLSVersion":"1.3","entryPointName":"websecure","level":"info","msg":"","request_User-Agent":"Go-http-client/1.1","time":"2023-12-14T13:17:40-06:00"}
{"ClientAddr":"192.168.10.3:56902","ClientHost":"192.168.10.3","ClientPort":"56902","ClientUsername":"-","DownstreamContentSize":3769,"DownstreamStatus":200,"Duration":2879374,"OriginContentSize":3769,"OriginDuration":2757489,"OriginStatus":200,"Overhead":121885,"RequestAddr":"serviceA.mydomain.com","RequestContentSize":0,"RequestCount":699,"RequestHost":"serviceA.mydomain.com","RequestMethod":"GET","RequestPath":"/api/s/default/stat/sta","RequestPort":"-","RequestProtocol":"HTTP/1.1","RequestScheme":"https","RetryAttempts":0,"RouterName":"unifi@file","ServiceAddr":"192.168.10.30:8443","ServiceName":"unifi@file","ServiceURL":{"Scheme":"https","Opaque":"","User":null,"Host":"192.168.10.30:8443","Path":"","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""},"StartLocal":"2023-12-14T13:17:40.077557262-06:00","StartUTC":"2023-12-14T19:17:40.077557262Z","TLSCipher":"TLS_AES_128_GCM_SHA256","TLSVersion":"1.3","entryPointName":"websecure","level":"info","msg":"","request_User-Agent":"Go-http-client/1.1","time":"2023-12-14T13:17:40-06:00"}
{"ClientAddr":"192.168.10.3:56902","ClientHost":"192.168.10.3","ClientPort":"56902","ClientUsername":"-","DownstreamContentSize":13724,"DownstreamStatus":200,"Duration":9446514,"OriginContentSize":13724,"OriginDuration":9357503,"OriginStatus":200,"Overhead":89011,"RequestAddr":"serviceA.mydomain.com","RequestContentSize":0,"RequestCount":700,"RequestHost":"serviceA.mydomain.com","RequestMethod":"GET","RequestPath":"/api/s/default/stat/device","RequestPort":"-","RequestProtocol":"HTTP/1.1","RequestScheme":"https","RetryAttempts":0,"RouterName":"unifi@file","ServiceAddr":"192.168.10.30:8443","ServiceName":"unifi@file","ServiceURL":{"Scheme":"https","Opaque":"","User":null,"Host":"192.168.10.30:8443","Path":"","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""},"StartLocal":"2023-12-14T13:17:40.082345855-06:00","StartUTC":"2023-12-14T19:17:40.082345855Z","TLSCipher":"TLS_AES_128_GCM_SHA256","TLSVersion":"1.3","entryPointName":"websecure","level":"info","msg":"","request_User-Agent":"Go-http-client/1.1","time":"2023-12-14T13:17:40-06:00"}
{"ClientAddr":"192.168.10.80:35070","ClientHost":"192.168.10.80","ClientPort":"35070","ClientUsername":"-","DownstreamContentSize":495,"DownstreamStatus":200,"Duration":112937,"GzipRatio":0,"OriginContentSize":0,"OriginDuration":0,"OriginStatus":0,"Overhead":112937,"RequestAddr":"192.168.10.6:8080","RequestContentSize":0,"RequestCount":701,"RequestHost":"192.168.10.6","RequestMethod":"GET","RequestPath":"/api/overview","RequestPort":"8080","RequestProtocol":"HTTP/1.1","RequestScheme":"http","RetryAttempts":0,"RouterName":"api@internal","StartLocal":"2023-12-14T13:17:41.871662894-06:00","StartUTC":"2023-12-14T19:17:41.871662894Z","entryPointName":"traefik","level":"info","msg":"","request_User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36","time":"2023-12-14T13:17:41-06:00"}

The serviceA.mydomain.com entries are the same service, three times in a row, with a 200 in OriginStatus. I suspect this is a poller populating Prometheus.

  • 192.168.10.3 is a separate host
  • 192.168.10.2 is this host, with 192.168.10.6 as the IPVLAN IP
  • 192.168.10.80 is my desktop that I'm testing from

So there are 30 total routers/services in this Traefik instance's config, as can be seen in the dashboard. When I try to hit serviceA.mydomain.com from my desktop, I get a 404 that doesn't get logged, which I think is the theme here. Uptime Kuma should be hitting a good chunk of these 30 services at least once per minute; alas, nothing. I shelled into this Traefik container and am able to ping serviceA.mydomain.com, so DNS is resolving, and I can also ping status.mydomain.com (Uptime Kuma); it also resolves and responds.

And switching back to my OG config.yml and restarting, everything's working.

External vols:

volumes:
  - /etc/localtime:/etc/localtime:ro
  - /home/docker/appdata/traefik/traefik.yml:/etc/traefik/traefik.yml
  - /home/docker/appdata/traefik/config.yml:/config.yml
  - /home/docker/appdata/traefik/letsencrypt:/letsencrypt
  - /home/docker/appdata/traefik/logs:/logs
  - /var/run/docker.sock:/var/run/docker.sock:ro

Internal vols:

volumes:
  - /etc/localtime:/etc/localtime:ro
  - /home/docker/appdata/traefik_int/traefik.yml:/etc/traefik/traefik.yml:ro
  - /home/docker/appdata/traefik_int/config.yml:/config.yml:ro
  - /home/docker/appdata/traefik/letsencrypt:/letsencrypt
  - /home/docker/appdata/traefik_int/logs:/logs
  - /var/run/docker.sock:/var/run/docker.sock:ro

So this config.yml works for both internal and external (abbreviated, but all other services/routers are structured nearly if not fully identically):

http:
# Start of routers
  routers:
    status: # Internal
      entryPoints:
        - "websecure"
      rule: "Host(`status.mydomain.com`)"
      middlewares:
        - default-headers
        - https-redirectscheme
      service: status
    notstatus: # External
      entryPoints:
      - websecure
      middlewares:
      - default-headers
      - https-redirectscheme
      rule: "Host(`notstatus.mydomain.com`)"
      service: notstatus


# Transport definition to skip anything with a valid or a self-signed certificate
  serversTransports:
    skipVerify:
      insecureSkipVerify: true


# Start of services
  services:
    status: # Internal
      loadBalancer:
        servers:
          - url: "http://[reachable IP address]:3002"
        passHostHeader: true
    notstatus: # External
      loadBalancer:
        passHostHeader: true
        servers:
          - url: https://[reachable IP address]
        serversTransport: skipVerify


# Start of middlewares
  middlewares:
    my-cloudflarewarp:
      plugin:
        cloudflarewarp:
          disableDefault: "false"
          trustip:
            - 2400:cb00::/32

    https-redirectscheme:
      redirectScheme:
        scheme: https
        permanent: true

When I switch the mount from config.yml to config_new.yml and restart, 404s on all internal services. config_new.yml (abbreviated, but all other services/routers are structured nearly if not fully identically):

http:
  middlewares:
    # crowdsec-bouncer:
    #   forwardauth:
    #     address: http://crowdsec-bouncer-traefik:8080/api/v1/forwardAuth
    #     trustForwardHeader: true
    default-headers:
      headers:
        browserXssFilter: true
        contentTypeNosniff: true
        customFrameOptionsValue: SAMEORIGIN
        customRequestHeaders:
          X-Forwarded-Proto: https
        forceSTSHeader: true
        frameDeny: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 15552000
    default-whitelist:
      ipWhiteList:
        sourceRange:
        - 10.0.0.0/8
        - 192.168.0.0/16
        - 172.16.0.0/12
    https-redirectscheme:
      redirectScheme:
        permanent: true
        scheme: https
    my-cloudflarewarp:
      plugin:
        cloudflarewarp:
          disableDefault: 'false'
          trustip:
          - 2400:cb00::/32
    secured:
      chain:
        middlewares:
        - default-whitelist
        - default-headers
  routers:
    notstatus:
      entryPoints:
      - websecure
      middlewares:
      - default-headers
      - https-redirectscheme
      rule: Host(`notstatus.mydomain.com`)
      service: notstatus
  serversTransports:
    skipVerify:
      insecureSkipVerify: true
  services:
    notstatus:
      loadBalancer:
        passHostHeader: true
        servers:
        - url: https://[reachable IP address]
        serversTransport: skipVerify

And then the config.yml for the internal instance:

http:
  middlewares:
    default-headers:
      headers:
        browserXssFilter: true
        contentTypeNosniff: true
        customFrameOptionsValue: SAMEORIGIN
        customRequestHeaders:
          X-Forwarded-Proto: https
        forceSTSHeader: true
        frameDeny: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 15552000
    default-whitelist:
      ipWhiteList:
        sourceRange:
        - 10.0.0.0/8
        - 192.168.0.0/16
        - 172.16.0.0/12
    https-redirectscheme:
      redirectScheme:
        permanent: true
        scheme: https
    secured:
      chain:
        middlewares:
        - default-whitelist
        - default-headers
  routers:
    status:
      entryPoints:
        - websecure
      middlewares:
        - default-headers
        - https-redirectscheme
      rule: Host(`status.mydomain.com`)
      service: status
  serversTransports:
    skipVerify:
      insecureSkipVerify: true
  services:
    status:
      loadBalancer:
        passHostHeader: true
        servers:
          - url: http://[internal DNS name]:3002

I can then reach notstatus.mydomain.com no problem, but all the internal services (status.mydomain.com, for example) are unreachable.

The thing is, status.mydomain.com is a CNAME pointing at traefik-int.mydomain.com, which in turn has an A record pointing to 192.168.10.6 via the IPVLAN config, and I get a 404 when I try to visit it. So it seems like the internal Traefik is getting the request and returning a 404, but I don't get any log entries to indicate why. It's the same DNS layout I've been using: one A record to point a domain at the Traefik instance, then CNAMEs that point to that A record.
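That CNAME-to-A chain can be checked end to end with dig; both hops should appear in the answer section (hostnames/IPs taken from the examples in this thread):

```shell
# Follow the chain: status.mydomain.com (CNAME) -> traefik-int.mydomain.com (A)
dig +noall +answer status.mydomain.com
# Expected shape of the answer:
#   status.mydomain.com.       300  IN  CNAME  traefik-int.mydomain.com.
#   traefik-int.mydomain.com.  300  IN  A      192.168.10.6
```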

I'm really starting to wonder if it's the IPVLAN and trying to switch it to MACVLAN and see if there's any change.

When you have the access log enabled, Traefik will log every request, be it OK (2xx) or error (3xx, 4xx, 5xx). If there is nothing in the access logs, you might be connecting to a different server, which is returning the 404.
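A quick way to see which server actually answered (a generic check, not from the thread) is to look at the IP curl reports on its "Connected to" line and compare it against the instance you expect:

```shell
# Show the IP actually connected to, plus the response status line:
curl -skv https://status.mydomain.com -o /dev/null 2>&1 \
  | grep -E 'Connected to|HTTP/'
```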

192.168.10.80 - - [15/Dec/2023:16:09:27 +0000] "GET /dashboard/16 HTTP/2.0" 404 19 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" 1 "-" "-" 0ms

This is in the external Traefik instance's access log after switching to the new config.yml, so the request is hitting the wrong instance.

  • Confirmed DNS is pointed to the correct instance
  • Confirmed that DNS record is working with dig/nslookup
  • Shelled into the external Traefik and pinged status.mydomain.com (the root of that log entry): pings echo and the IP is accurate
  • Shelled into the internal Traefik and pinged status.mydomain.com: pings do not echo, but the IP is accurate

DNS hasn't changed in days, and the service/router definitions in the internal vs. external config.yml files are identical. I'm really starting to question the notion of IPVLAN for this. It makes no sense that the request would hit the external Traefik instance, because DNS points it to a different IP (on the same host, but the IPVLAN IP); but even if it did, that's exactly the instance that should throw a 404, because there's no router or service definition for it there.

FYI, I switched the internal instance from IPVLAN to MACVLAN and there's no difference.

Friendly bump here. I'm not sure what else could be going on. From my perspective, it sure looks like the originally spun-up instance is, for lack of a better term, intercepting the requests destined for the second one off on its own IPVLAN/MACVLAN.

Friendly reference to a similar question (link), maybe it helps.