'499 Client Closed Request' caused by: context canceled

I'm running a Mender server in Kubernetes with Traefik as the ingress. When I upload an artifact to the Mender server and the upload takes more than one minute, I get a 499 Client Closed Request error. Smaller artifacts that upload in under a minute work fine; only large artifacts whose upload exceeds one minute fail.

The documentation says that this error can be resolved by overriding the response timeout argument. I did that, but I still get the same 499 error.

I have also checked the Mender server's configuration, and nothing there restricts size or time in a way that would explain this: the default upload timeout is 1 hour and the maximum upload size is 10 GB. So I think Traefik is causing the error.

The following are my Traefik deployment and Traefik ingress configurations:

Name:                   traefik
Namespace:              default
CreationTimestamp:      Wed, 23 Oct 2024 10:26:44 +0000
Labels:                 app.kubernetes.io/instance=traefik-default
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=traefik
                        helm.sh/chart=traefik-32.1.1
Annotations:            deployment.kubernetes.io/revision: 3
                        meta.helm.sh/release-name: traefik
                        meta.helm.sh/release-namespace: default
Selector:               app.kubernetes.io/instance=traefik-default,app.kubernetes.io/name=traefik
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  0 max unavailable, 1 max surge
Pod Template:
  Labels:           app.kubernetes.io/instance=traefik-default
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=traefik
                    helm.sh/chart=traefik-32.1.1
  Annotations:      prometheus.io/path: /metrics
                    prometheus.io/port: 9100
                    prometheus.io/scrape: true
  Service Account:  traefik
  Containers:
   traefik:
    Image:       docker.io/traefik:v3.1.6
    Ports:       9100/TCP, 9000/TCP, 8000/TCP, 8443/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP
    Args:
      --global.checknewversion
      --global.sendanonymoususage
      --entryPoints.metrics.address=:9100/tcp
      --entryPoints.traefik.address=:9000/tcp
      --entryPoints.web.address=:8000/tcp
      --entryPoints.websecure.address=:8443/tcp
      --api.dashboard=true
      --ping=true
      --metrics.prometheus=true
      --metrics.prometheus.entrypoint=metrics
      --providers.kubernetescrd
      --providers.kubernetescrd.allowEmptyServices=true
      --providers.kubernetesingress
      --providers.kubernetesingress.allowEmptyServices=true
      --entryPoints.websecure.http.tls=true
      --entryPoints.websecure.transport.respondingTimeouts.readTimeout=300
      --log.level=INFO
    Liveness:   http-get http://:9000/ping delay=2s timeout=2s period=10s #success=1 #failure=3
    Readiness:  http-get http://:9000/ping delay=2s timeout=2s period=10s #success=1 #failure=1
    Environment:
      POD_NAME:        (v1:metadata.name)
      POD_NAMESPACE:   (v1:metadata.namespace)
    Mounts:
      /data from data (rw)
      /tmp from tmp (rw)
  Volumes:
   data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
   tmp:
    Type:          EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:        
    SizeLimit:     <unset>
  Node-Selectors:  <none>
  Tolerations:     <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  traefik-6c56784b69 (0/0 replicas created), traefik-7858bc95cb (0/0 replicas created)
NewReplicaSet:   traefik-7d9bf4d54 (1/1 replicas created)
Events:          <none>
Name:             mender-ingress
Labels:           <none>
Namespace:        default
Address:          
Ingress Class:    traefik
Default backend:  <default>
TLS:
  new-tls-secret terminates mender.scanomat.com
Rules:
  Host                 Path  Backends
  ----                 ----  --------
  mender.scanomat.com  
                       /   mender-api-gateway:80 (10.42.0.172:9080)
Annotations:           cert-manager.io/issuer: letsencrypt
Events:                <none>

I got the following logs from mender-deployment when calling the Mender API to upload an artifact that takes more than one minute to upload:

time="2024-10-21T16:55:48Z" level=error msg="azblob PutObject: failed to upload object to blob: context canceled" caller="view.(*RESTView).RenderInternalError@view.go:72" request_id=66772fb8-f862-451a-83b7-046743424cc2 user_id=753afdfb-ee20-4fd3-985e-85c74fe4c56e
time="2024-10-21T16:55:48Z" level=info msg="500 59998118μs POST /api/management/v1/deployments/artifacts/generate HTTP/1.1 - Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" byteswritten=78 caller="accesslog.(*AccessLogMiddleware).MiddlewareFunc.func1@middleware.go:82" method=POST path=/api/management/v1/deployments/artifacts/generate qs= request_id=66772fb8-f862-451a-83b7-046743424cc2 responsetime=59.998118647 status=500 ts="2024-10-21 16:54:48.518920248 +0000 UTC" type=http user_id=753afdfb-ee20-4fd3-985e-85c74fe4c56e
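The response time in that access-log line is worth a closer look: the request was aborted almost exactly 60 s after it started, which points to a fixed timeout somewhere in front of the backend rather than a size limit. It also matches the 60 s default that recent Traefik versions document for respondingTimeouts.readTimeout (worth verifying against the documentation for v3.1). Below is a minimal Python sketch that extracts the duration from the log line; the shortened `LOG_LINE`, the helper names, and the 0.5 s tolerance are illustrative assumptions, not part of Mender or Traefik.

```python
import re

# Access-log line from the mender-deployment output above, shortened for
# readability (user agent and most key=value fields removed).
LOG_LINE = (
    'time="2024-10-21T16:55:48Z" level=info '
    'msg="500 59998118μs POST /api/management/v1/deployments/artifacts/generate HTTP/1.1" '
    'responsetime=59.998118647 status=500'
)


def response_time_seconds(line: str) -> float:
    """Return the value of the responsetime field (in seconds) of a log line."""
    match = re.search(r"\bresponsetime=([0-9.]+)", line)
    if match is None:
        raise ValueError("no responsetime field found")
    return float(match.group(1))


def near_cutoff(seconds: float, cutoff: float = 60.0, tolerance: float = 0.5) -> bool:
    """True if the duration lies within `tolerance` seconds of `cutoff`."""
    return abs(seconds - cutoff) <= tolerance


elapsed = response_time_seconds(LOG_LINE)
print(f"{elapsed:.3f} s, near the 60 s cutoff: {near_cutoff(elapsed)}")
```

Running the same check over several failed requests would show whether every failure clusters around 60 s, which is what a proxy-side read timeout would look like.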

I tried to upload using mender-cli as well and got the 499 error as shown below:

67.22 MiB / 535.74 MiB [------------>] 12.55% 1.11 MiB p/s
VERBOSE response: HTTP/1.1 499 status code 499
Connection: close
Content-Length: 21
Date: Wed, 23 Oct 2024 12:14:13 GMT
Referrer-Policy: no-referrer
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
Client Closed Request
FAILURE: artifact upload to 'mender.scanomat.com' failed with status 499
ERROR: exit status: 1
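The progress bar itself gives a rough estimate of when the connection was cut. The Python sketch below is back-of-the-envelope arithmetic on the numbers printed by mender-cli; it assumes the reported rate is an average over the whole transfer, which the output does not confirm.

```python
# Figures taken from the mender-cli progress line above.
MIB_SENT = 67.22     # MiB transferred when the 499 arrived
TOTAL_MIB = 535.74   # total artifact size in MiB
RATE_MIB_S = 1.11    # MiB/s as reported (assumed to be an average rate)


def seconds_for(size_mib: float, rate_mib_s: float) -> float:
    """Time needed to transfer `size_mib` at a constant `rate_mib_s`."""
    return size_mib / rate_mib_s


cut_off_after = seconds_for(MIB_SENT, RATE_MIB_S)
full_upload = seconds_for(TOTAL_MIB, RATE_MIB_S)

print(f"connection cut after roughly {cut_off_after:.1f} s")
print(f"full upload would need roughly {full_upload / 60:.1f} min")
```

Under that assumption the transfer was interrupted after about one minute, and the complete upload would take about eight minutes at this rate. A 300 s timeout would therefore still be too short for this artifact on this link, even once the override takes effect.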

When I used curl to upload artifact, I got the following response:

curl -v -X POST \
  https://mender.scanomat.com/api/management/v1/deployments/artifacts \
  -H "Authorization: Bearer ..." \
  -F "artifact=@boss-imx8mm-var-dart-0.0.0-dev.mender"
Note: Unnecessary use of -X or --request, POST is already inferred.
* Host mender.scanomat.com:443 was resolved.
* IPv6: (none)
* IPv4: 13.69.133.251
*   Trying 13.69.133.251:443...
* Connected to mender.scanomat.com (13.69.133.251) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=mender.scanomat.com
*  start date: May 17 06:19:53 2024 GMT
*  expire date: Jun 18 06:19:53 2025 GMT
*  subjectAltName: host "mender.scanomat.com" matched cert's "mender.scanomat.com"
*  issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
*  SSL certificate verify ok.
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://mender.scanomat.com/api/management/v1/deployments/artifacts
* [HTTP/2] [1] [:method: POST]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: mender.scanomat.com]
* [HTTP/2] [1] [:path: /api/management/v1/deployments/artifacts]
* [HTTP/2] [1] [user-agent: curl/8.7.1]
* [HTTP/2] [1] [accept: */*]
* [HTTP/2] [1] [authorization: Bearer ...]
* [HTTP/2] [1] [content-length: 561762549]
* [HTTP/2] [1] [content-type: multipart/form-data; boundary=------------------------qKlT1gdNLonJbvM5ahJJxT]
> POST /api/management/v1/deployments/artifacts HTTP/2
> Host: mender.scanomat.com
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer ...
> Content-Length: 561762549
> Content-Type: multipart/form-data; boundary=------------------------qKlT1gdNLonJbvM5ahJJxT
>
< HTTP/2 499
< date: Wed, 23 Oct 2024 12:54:45 GMT
< referrer-policy: no-referrer
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< vary: Accept-Encoding
< x-content-type-options: nosniff
< x-xss-protection: 1; mode=block
< content-length: 21
<
* HTTP error before end of send, stop sending
* abort upload after having sent 479788000 bytes
* Connection #0 to host mender.scanomat.com left intact
Client Closed Request

I have tried almost every Traefik- and Mender-related solution I could find online, but nothing helped. I don't know where this timeout value comes from, or whether the overridden values are simply not taking effect. I'm completely stuck and would be grateful for any suggestions.

Hey @Akshay, I am facing the same problems as you. Did you manage to figure anything out about this issue?

Hey @ianhgenesys, sorry, I haven't found a solution.

Hey @Akshay, I tried adding the unit suffix s to the readTimeout, like this:
- --entrypoints.websecure.transport.respondingTimeouts.readTimeout=300s
and it seemed to allow me to upload for longer than 60s.
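For context, Traefik's documentation describes timeout values as durations and states that a value without a unit is interpreted as seconds, so `300` and `300s` would be expected to mean the same thing. The Python sketch below is a simplified model of that rule, not Traefik's actual parser; the function name `to_seconds` and the supported unit set are assumptions made for illustration.

```python
import re

# Seconds per unit for the small subset of suffixes this sketch supports.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def to_seconds(value: str) -> float:
    """Convert '300', '300s', '5m' or '1h30m' to seconds.

    A bare integer is treated as seconds, mirroring the documented rule.
    """
    value = value.strip()
    if re.fullmatch(r"\d+", value):
        return float(value)
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value)
    # Reject input that the pattern only partially covers, e.g. '5x' or '5m!'.
    if not parts or "".join(number + unit for number, unit in parts) != value:
        raise ValueError(f"unrecognised duration: {value!r}")
    return sum(float(number) * _UNITS[unit] for number, unit in parts)


for raw in ("300", "300s", "5m", "1h"):
    print(raw, "->", to_seconds(raw), "seconds")
```

If both spellings resolve to the same 300 s, a difference in behaviour after adding the suffix may come from something else that changed at the same time, such as the pod actually being restarted with the new arguments.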

Maybe you could give it a go and see if that works for you?

@ianhgenesys, I tried this, but no luck.

I have the same issue: the session is always kept open for only 60s.
reverse-proxy-1 | 2025-02-09T17:37:06Z DBG github.com/traefik/traefik/v3/pkg/server/service/loadbalancer/wrr/wrr.go:207 > Service selected by WRR: 7de75da966cac97b
reverse-proxy-1 | 2025-02-09T17:38:06Z DBG github.com/traefik/traefik/v3/pkg/proxy/httputil/proxy.go:117 > 499 Client Closed Request error="context canceled"