Inconsistency with rate limit middleware

Hello,
I am fairly new to Traefik and have some questions about setting up a rate limit on my app. I read through the docs and set up a test rate limit based on them, but my app does not rate-limit these routes consistently: when I test the endpoints, I do not reliably get a 429 response. For example, with a basic rate limit of 10 requests per minute, I would expect to see 429s whenever I make more than 10 requests in a minute. With the middleware, though, sometimes I see a 429 on my 6th request, sometimes I don't see one at all, and sometimes I see one only after making 20 requests.

For my setup, I am using a plugin called htransformation to create a new header from the cookie set by Flask-Login, so that I can get the user ID.

apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
    name: set-ratelimit-header
spec:
    plugin:
        htransformation:
          Rules:
            - Header: 'X-User-Id'
              Name: 'Set User-Id-Token'
              Value: ''
              Type: 'Set'
            - Header: 'X-User-Id'
              HeaderPrefix: "^"
              Name: 'join cookie'
              Sep: ','
              Values:
                - '^cookie'
              Type: 'Join'
            - Header: 'X-User-Id'
              Name: 'User-Id-Token rewrite'
              Value: 'remember_token=(\d*)'
              ValueReplace: '$1'
              Type: 'RewriteValueRule'
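
To check what ends up in X-User-Id, the last rule can be approximated offline. This is a minimal Python sketch, assuming RewriteValueRule behaves like a plain regex substitution that replaces only the matched text (worth verifying against the plugin itself). The function name `rewrite_user_id` and the sample cookie strings are made up for illustration.

```python
import re

# Same pattern as the 'User-Id-Token rewrite' rule above.
# Assumption: the plugin substitutes only the matched part of the header value.
PATTERN = re.compile(r"remember_token=(\d*)")


def rewrite_user_id(cookie_header: str) -> str:
    """Approximate the rewrite rule: replace the match with capture group 1."""
    return PATTERN.sub(r"\1", cookie_header)


# Hypothetical cookie values, not taken from a real session.
only_token = rewrite_user_id("remember_token=42")
with_others = rewrite_user_id("session=abc; remember_token=42|deadbeef")

print(only_token)   # 42
print(with_others)  # session=abc; 42|deadbeef
```

If the plugin does substitute this way, any other cookies stay in the header value, so the rate limit key is only as stable as the full Cookie header.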

Then I add a rate limit whose sourceCriterion points at the header I just created.

apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: token-ratelimit
spec:
  rateLimit:
    average: 10
    burst: 8
    period: 60s
    sourceCriterion:
      requestHeaderName: X-User-Id

I am testing with a low rate limit to see whether rate limiting actually works. My goal is to allow 10 requests per minute with a burst of 8 (which I think of as slots that each refresh every minute). I am not sure that is correct, so please let me know if it is wrong.

Both middlewares are then added to the ingress:

    - match: ...
      kind: Rule
      ...
      middlewares:
        - name: set-ratelimit-header
        - name: token-ratelimit

Does this header persist throughout the session or is it being created before every request? If it is being recreated before every request, can this still be rate limited by the middleware?

Are you running multiple Traefik instances in parallel? I would expect rate limiting to be implemented per instance only, so with multiple instances it might not work very precisely.
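
The per-instance effect can be illustrated with a small simulation. This is a minimal Python sketch, assuming each instance keeps its own independent token bucket and a load balancer spreads requests round-robin. `TokenBucket` and `count_allowed` are hypothetical names, not Traefik code.

```python
class TokenBucket:
    """Simplified token bucket: starts full, refills continuously."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate            # tokens added per second
        self.burst = burst          # bucket capacity
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous request

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_allowed(instances: int, requests: int) -> int:
    """Send `requests` at t=0, round-robin over independent instances."""
    buckets = [TokenBucket(rate=10 / 60, burst=8) for _ in range(instances)]
    return sum(buckets[i % instances].allow(0.0) for i in range(requests))


single = count_allowed(instances=1, requests=20)
double = count_allowed(instances=2, requests=20)
print(single, double)  # 8 16
```

Under these assumptions, two instances let twice as many requests through before the first 429, because no state is shared between the buckets.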

Thank you for your reply. I think I misunderstood how the rate limit works. In my example, I set it up with the understanding that I could make 10 requests in a period of 60 seconds, with 8 slots that let me handle multiple requests within a short time. I thought I should start seeing 429s on my 11th, 12th, and later requests.

But that is not how it works. The rate limit I set has 8 slots (the burst), and they refill continuously at 10/60 (about 0.167) requests per second, or roughly one slot every 6 seconds, rather than all at once each minute. It was replenishing too fast for me to test the rate limit, and that's why it wasn't rate limiting like Flask-Limiter.
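
For anyone who wants to see this behaviour without a running cluster, here is a minimal Python sketch of a token bucket using the values from the middleware above (average 10, period 60s, burst 8). It assumes the limiter starts with a full bucket and refills continuously at average/period tokens per second. `simulate` is a hypothetical helper, not Traefik's implementation.

```python
from fractions import Fraction


def simulate(times, average=10, period=60, burst=8):
    """Return one HTTP status (200 or 429) per request timestamp in seconds."""
    rate = Fraction(average, period)  # tokens added per second
    tokens = Fraction(burst)          # assumption: the bucket starts full
    last = Fraction(0)
    statuses = []
    for t in times:
        now = Fraction(t)
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(Fraction(burst), tokens + (now - last) * rate)
        last = now
        if tokens >= 1:
            tokens -= 1
            statuses.append(200)
        else:
            statuses.append(429)
    return statuses


rapid = simulate([0] * 20)    # 20 requests at the same instant
steady = simulate(range(60))  # one request per second for a minute

print(rapid.index(429) + 1)   # first 429 in the rapid run
print(steady.count(200))      # requests allowed in the steady run
print(steady.index(429) + 1)  # first 429 in the steady run
```

In this model the rapid run hits its first 429 on the 9th request, while the steady run lets 17 requests through in the minute and rejects the 10th. The position of the first 429 depends on how full the bucket is when testing starts, which would fit the varying results described in the first post.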

Nonetheless, thank you for your time.