When I initially switched from NPM to Traefik about 6 months ago, I had one service that needed external access; everything else was internal. I was recently thinking about how one could reach those internal services externally, given that I have internal DNS. I briefly thought "well, that's easy enough to fake, you just edit the hosts file" and didn't give it much thought beyond that. Then I saw a video on YouTube that addressed this exact concern, and it confirmed that simply editing the hosts file would do the trick. So I've been on a quest all week to:
- Implement some access control/brute force prevention/bad actor prevention
- Split access from internal vs. external sources
As the video pointed out, there are a couple of ways to do this. One involves a single Traefik instance, just switching the ports around. The other is to run two instances of Traefik: one for internal-only services and another for external ones. I opted for the latter.
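For reference, the single-instance option would boil down to something like this in the static traefik.yml; the entrypoint names and ports here are purely illustrative, not something I'm running:

entryPoints:
  websecure:
    address: ":443"       # external entrypoint, forwarded through the firewall
  websecure-internal:
    address: ":8443"      # internal-only entrypoint, never exposed externally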
To do this, I copied my existing, running Traefik folder containing the docker-compose, traefik.yml, config.yml, and the general directory structure to a new directory and started editing files. Not much to change, right? Take the new internal instance's config.yml and remove any external references, then take the original instance's config.yml and remove any internal references. Finally, create a new DNS A record for the new internal Traefik instance and repoint all the internal CNAME records from the old instance to the new one. The biggest hurdle was putting the new internal instance on an IPVLAN so I could reuse ports 80, 443, and 8080 on the same host, but I got that sorted out. Sounds simple, right?
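For the IPVLAN part, the docker-compose for the internal instance looks roughly like this; it's a trimmed sketch rather than my exact file, and the image tag, interface name, subnet, and addresses are placeholders:

services:
  traefik-internal:
    image: traefik:v2.10                # illustrative version
    networks:
      lan_ipvlan:
        ipv4_address: 192.168.1.251     # the new internal A record points here
    volumes:
      - ./traefik.yml:/traefik.yml:ro
      - ./config.yml:/config.yml:ro

networks:
  lan_ipvlan:
    driver: ipvlan
    driver_opts:
      parent: eth0                      # host NIC on the LAN; adjust to the real interface
    ipam:
      config:
        - subnet: 192.168.1.0/24
          gateway: 192.168.1.1

Since the container gets its own LAN address, there's no clash with the ports the external instance already publishes on the host.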
Not so much. A really weird outcome popped up. I use Uptime Kuma to monitor a bunch of services, both internal and external, and I actually used that service/router as a test. For the internal Traefik instance, I set up the router and the service just like I would have previously, changed the DNS record, and I was still able to access it; everything appeared to be working properly.
Then came the big change: fully splitting all the internal services & routers off to the internal instance and removing them from the external instance (leaving the external services & routers in place). I ran
docker-compose up -d --force-recreate
for both, and SHTF. Uptime Kuma reported every single internal web instance as returning a 404, while external web services appeared to be fine. However, within my LAN, I could still access all of these services properly. Another service I run is Dashy, which has the option to provide a status check of these services: a bunch of little red dots on almost every service. I shelled into the Uptime Kuma instance, did a
curl -vvvv https://myservice.mydomain.com
and got:
A bunch of lines like this:
* Expire in 1 ms for 1 (transfer 0x56167c2e20f0)
Then:
* Trying [my external Traefik IP]...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x56167c2e20f0)
* Connected to myservice.mydomain.com (my external Traefik IP) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: none
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=mydomain.com
* start date: Nov 28 13:53:58 2023 GMT
* expire date: Feb 26 13:53:57 2024 GMT
* subjectAltName: host "myservice.mydomain.com" matched cert's "*.mydomain.com"
* issuer: C=US; O=Let's Encrypt; CN=R3
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56167c2e20f0)
> GET / HTTP/2
> Host: myservice.mydomain.com
> User-Agent: curl/7.64.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 404
< alt-svc: h3=":443"; ma=2592000
< content-type: text/plain; charset=utf-8
< x-content-type-options: nosniff
< content-length: 19
< date: Wed, 13 Dec 2023 23:08:56 GMT
<
404 page not found
* Connection #0 to host myservice.mydomain.com left intact
If I switch solely back to the original config.yml, Uptime Kuma is happy, my internal LAN is happy, etc. So, at least outwardly, my traefik.yml and docker-compose files appear to be set up correctly, but the new config file throws 404s for Uptime Kuma and Dashy. Given that I'm getting a 404, I figured I'd check the access and Traefik logs. Even on debug, I don't see the requests come in at all, let alone a 404, and there are no filters on either log.
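For context, debug and access logging are enabled along these lines in the static traefik.yml (the file paths here are just illustrative), and that access log is where I'd expect the 404s to land:

log:
  level: DEBUG
  filePath: "/var/log/traefik/traefik.log"
accessLog:
  filePath: "/var/log/traefik/access.log"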
In the new config.yml, I removed the middlewares needed for cloudflarewarp (to get true IP addresses from Cloudflare proxying) and crowdsec, as these services should only be accessible from within my LAN, so neither applies. Other than that, it's just a matter of removing any services that belong solely to external access.
So currently, given that the only difference between broken and functioning is the config.yml file, I'd imagine my problem is there, but I've reviewed these files extensively and can't figure out what's going on. The absence of the 404s in the logs is a bit concerning as well; 200s show up no problem. But given that I'm getting 404s, that tells me DNS should be fine and it's the proxy that's the issue.
I don't even know what would be helpful to provide to the community as a starting point, given that the new config.yml file is a direct copy of a known-functioning version with some definitions removed. If it were indentation or some other oversight on my part, I'd expect to see an error thrown in the Traefik logs, but there's nothing related as near as I can tell.
Note that the middlewares exist in the original because that instance is intended to be the externally accessible one as the end goal. I also put ... as a placeholder for anything redundant, such as subsequent router and service definitions; they're almost completely identical.
Original:
http:
  # Start of routers
  routers:
    myservice:
      entryPoints:
        - "websecure"
      rule: "Host(`myservice.mydomain.com`)"
      middlewares:
        - default-headers
        - https-redirectscheme
      # tls: {}
      service: myservice
    ...
  # Transport definition to skip anything with a valid or a self-signed certificate
  serversTransports:
    skipVerify:
      insecureSkipVerify: true
  # Start of services
  services:
    myservice:
      loadBalancer:
        servers:
          - url: "https://[proper IP address]"
        passHostHeader: true
        serversTransport: skipVerify
    ...
  # Start of middlewares
  middlewares:
    my-cloudflarewarp:
      plugin:
        cloudflarewarp:
          disableDefault: "false"
          trustip:
            - 2400:cb00::/32
    https-redirectscheme:
      redirectScheme:
        scheme: https
        permanent: true
    default-headers:
      headers:
        frameDeny: true
        browserXssFilter: true
        contentTypeNosniff: true
        forceSTSHeader: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 15552000
        customFrameOptionsValue: SAMEORIGIN
        customRequestHeaders:
          X-Forwarded-Proto: https
    default-whitelist:
      ipWhiteList:
        sourceRange:
          - "10.0.0.0/8"
          - "192.168.0.0/16"
          - "172.16.0.0/12"
    secured:
      chain:
        middlewares:
          - default-whitelist
          - default-headers
    crowdsec-bouncer:
      forwardauth:
        address: http://crowdsec-bouncer-traefik:8080/api/v1/forwardAuth
        trustForwardHeader: true
New:
http:
  middlewares:
    my-cloudflarewarp:
      plugin:
        cloudflarewarp:
          disableDefault: "false"
          trustip:
            - 2400:cb00::/32
    https-redirectscheme:
      redirectScheme:
        scheme: https
        permanent: true
    default-headers:
      headers:
        frameDeny: true
        browserXssFilter: true
        contentTypeNosniff: true
        forceSTSHeader: true
        stsIncludeSubdomains: true
        stsPreload: true
        stsSeconds: 15552000
        customFrameOptionsValue: SAMEORIGIN
        customRequestHeaders:
          X-Forwarded-Proto: https
    default-whitelist:
      ipWhiteList:
        sourceRange:
          - "10.0.0.0/8"
          - "192.168.0.0/16"
          - "172.16.0.0/12"
    secured:
      chain:
        middlewares:
          - default-whitelist
          - default-headers
    crowdsec-bouncer:
      forwardauth:
        address: http://crowdsec-bouncer-traefik:8080/api/v1/forwardAuth
        trustForwardHeader: true
  routers:
    myservice:
      entryPoints:
        - websecure
      middlewares:
        - default-headers
        - https-redirectscheme
      rule: Host(`myservice.mydomain.com`)
      service: myservice
    ...
  serversTransports:
    skipVerify:
      insecureSkipVerify: true
  services:
    myservice:
      loadBalancer:
        passHostHeader: true
        servers:
          - url: https://[proper IP address]
        serversTransport: skipVerify
    ...
If there's anything else I can provide that would be of assistance, I'm happy to share it. But after so many hours of trial and error, I'm at a loss. I have seen weird things with IPVLAN networks, like traffic sourced from the same host not passing, but that's not the case here, since Uptime Kuma and my main desktop are on separate host machines.