Portainer - Unable to retrieve endpoints

Hi,

I am trying to add Portainer as a service and am getting 'Unable to retrieve endpoints' when creating an account or trying to log in. I'm not sure if this is an issue with my Stack files related to Traefik (I just paste code till stuff finally works) or if this is a Portainer issue, but seeing as it references endpoints I thought I would post here and see if anyone can help.

The Portainer code below is set up in a separate Stack file from Traefik. HTTPS redirect is working and when going to the URL I get a Portainer page asking me to create an admin account. After submitting this I received an error, and if I refresh the page I then get a login page. Upon entering credentials I get the 'Unable to retrieve endpoints' error and no login occurs. I tried deleting the Portainer data and trying again but got no love second time around either.

Thanks for any help anyone can offer.

#Portainer

version: "3.7"

services:
      
  portainer:
    image: portainer/portainer:1.22.1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./portainer_data:/data
    networks:
      - traefik-public
    deploy:
      labels:
        traefik.enable: "true"
        traefik.http.routers.portainer.rule: "Host(`portainer.myurl.dev`)"
        traefik.http.routers.portainer.middlewares: auth-portainer
        traefik.http.routers.portainer.entrypoints: websecure
        traefik.http.routers.portainer.tls: "true"
        traefik.http.routers.portainer.tls.certresolver: leresolver
        # Swarm Mode
        traefik.http.services.portainer.loadbalancer.server.port: 9000
        # Basic Auth
        traefik.http.middlewares.auth-portainer.basicauth.users: "user:authcodehere"
        #traefik.docker.network: traefik-public
            
networks:
  traefik-public:
    external: true
Traefik

# ### INSTRUCTIONS ###
# Create Swarm network before deployment
  # docker network create --driver=overlay traefik-public
# acme.json needs chmod 600 acme.json

# Based on https://blog.containo.us/traefik-2-0-docker-101-fc2893944b9d
# Swarm Mode [How to install Traefik 2.x on a Docker Swarm](https://creekorful.me/how-to-install-traefik-2-docker-swarm/)

version: "3.7"

services:
  traefik:
    image: traefik:2.0.2
    networks:
      - traefik-public
    command:
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --providers.docker.exposedbydefault=false
      # Swarm
      - --providers.docker.swarmMode=true
      # Enables web UI and tells Traefik to listen to docker
      - --providers.docker
      - --api
      # Let's Encrypt
      - --certificatesresolvers.leresolver.acme.email=myemail@email.com
      - --certificatesresolvers.leresolver.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.leresolver.acme.tlschallenge=true
      # Logging
      - --log.level=DEBUG # DEBUG, ERROR, INFO???
      - --log.filePath=/traefik.log
      - --log.format=json
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./letsencrypt:/letsencrypt
      - ./logs/traefik.log:/traefik.log
      # So that Traefik can listen to the Docker events
      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      placement:
        constraints:
          - node.role == manager
      labels:
        traefik.enable: "true"
        # Dashboard
        traefik.http.routers.traefik.rule: "Host(`traefik.myurl.dev`)"
        traefik.http.routers.traefik.service: api@internal
        traefik.http.routers.traefik.tls.certresolver: leresolver
        traefik.http.routers.traefik.entrypoints: websecure
        traefik.http.routers.traefik.middlewares: auth-traefik
        # Swarm Mode
        traefik.http.services.traefik.loadbalancer.server.port: 80
        # Basic Auth
        traefik.http.middlewares.auth-traefik.basicauth.users: "user:authcodehere"
        # Global http to https redirect
        traefik.http.routers.http-catchall.rule: "hostregexp(`{host:.+}`)"
        traefik.http.routers.http-catchall.entrypoints: web
        traefik.http.routers.http-catchall.middlewares: redirect-to-https
        # Middleware redirect
        traefik.http.middlewares.redirect-to-https.redirectscheme.scheme: https

networks:
  traefik-public:
    external: true

Hi @mindgonemad, I found the following "issues" by reproducing your case, related to pure Docker Swarm configuration. Your Traefik configuration is valid for me.

  • For the traefik stack, you are mounting volumes from local directories (bind mounts), which prevented the stack from being created on my clusters. You might want to use named volumes for this (remember: Swarm is expected to be a remote distributed system):
version: '3'

volumes:
  # Declaration of named volume "traefik-letencrypt" with default settings
  traefik-letencrypt:
  # Declaration of named volume "traefik-logs" with default settings
  traefik-logs:

services:
  traefik:
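    # (...)
    volumes:
      - traefik-letencrypt:/letsencrypt
      # A named volume mounts as a directory, so mount it at /logs
      # and point --log.filePath at a file inside it (e.g. /logs/traefik.log)
      - traefik-logs:/logs
      # So that Traefik can listen to the Docker events
      - /var/run/docker.sock:/var/run/docker.sock:ro
    # (...)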
  • Exactly the same issue for portainer's data directory:
version: "3.7"

volumes:
  portainer-data:

services: 
  portainer:
    # (...)
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer-data:/data

I suppose the error message you saw came from Swarm not being able to schedule the services, because no Docker node could satisfy the "local" bind mount and so no container could be started.
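
One way to check that hypothesis is to ask Swarm why a task is not running. A minimal sketch, assuming the stack was deployed under the name `portainer` (the stack names are not given in this thread, and `why_not_running` is a hypothetical helper name):

```shell
# List every task of a stack with untruncated messages.
# A task that Swarm could not place or start shows its reason in the
# ERROR column (for example a bind-mount source path that does not exist).
why_not_running() {
  stack="$1"   # assumed stack name, e.g. "portainer"
  docker stack ps --no-trunc "$stack"
}

# Usage (run on a Swarm manager node):
#   why_not_running portainer
```

If every task shows as Running with an empty ERROR column, scheduling is probably not the problem and the service logs are the next place to look.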

Let us know if it fixes your issue :slight_smile:

Hi, I spent a few days messing around and figuring out how to get named volumes working but still can't get Portainer to run. I am using a local bind with the named volume but that shouldn't be an issue.

Volume appears to be working as after trying to run the Portainer site it creates a whole heap of files in the portainer_data folder.

Any more ideas???

volumes:

  portainer-data:
    name: portainer-data
    driver: local
    driver_opts:
      type: none
      o: bind
      device: $PWD/portainer_data

Thanks

Hi @mindgonemad, is there anything blocking you from using the named volume in its simplest form?

# ...
volumes:
  portainer-data:
# End

I want to understand why you need to define the bind mount manually, since you are using Swarm (i.e. orchestrating containers on multiple machines, so you cannot anticipate which machine the data will be on), and maybe provide alternative solutions.

The idea in Docker is the following: "let Docker manage data for you". Creating a bind mount adds a hard constraint on the host machine, whereas Docker tries to make things more portable.

Regarding Portainer not being able to run: what makes you say so? Is it because you see the container down or restarting? Something else? Could you provide the logs of the Portainer container here?

Hi @dduportal,

I've pasted the full set of Portainer code at bottom for reference.

  1. Nothing blocking me other than I'll use bind mounts for accessing files to see things and make changes. This may not be necessary with Portainer, it's more a carry over from accessing Traefik logs and Let's Encrypt. I switched to using the simple volume form as requested but get the same response, "authentication in progress" and a couple of error panels popping up. Any refresh just gives a repeat. Maybe I'm doing something else completely wrong but considering I had it working weeks ago back on DigitalOcean I'm not sure. Switched from DO to Linode as my single node VPS kept maxing and locking up and they couldn't help me, haven't had the issue since just one or two restarts while I am away.

  2. Honestly, I have spent sooo many hours trying to get persistent data working with Swarm and in the end gave up. Spent way too many hours trying to get RexRay working on DigitalOcean using block storage and just gave up as I needed to move on to other aspects. Volumes are starting to make a lot more sense now and in theory I could switch to letting Docker store them rather than the bind mount, but considering I'm only using a single node I don't see a difference other than making my life harder. Basically I would like access to all the files for backup purposes and reading of logs etc.
    I'm just doing small sites and pet projects, hence the single node, though I wanted the option of easily switching to a Swarm in the future just in case, which is why I'm leaving it in 'Swarm Mode'. Figuring out persistent storage may stop me though, as it's looking overly complex for someone who only has an hour or two here or there to mess around, and the longer I spend on this the longer before I can actually get a site up and running.

  3. I wouldn't say it isn't running it just won't finish setup/install. I'm not sure how to look at the logs of docker containers, that's still on my long list of shit I need to learn.

Thank you for your time and interest.
Craig

version: "3.7"

services:
      
  portainer:
    image: portainer/portainer:1.22.1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer-data:/data
    networks:
      - traefik-public
    deploy:
      placement:
        constraints:
          - node.role == manager
      labels:
        traefik.enable: "true"
        traefik.http.routers.portainer.rule: "Host(`portainer.craighofman.art`)"
        traefik.http.routers.portainer.middlewares: auth-portainer
        traefik.http.routers.portainer.entrypoints: websecure
        traefik.http.routers.portainer.tls: "true"
        traefik.http.routers.portainer.tls.certresolver: leresolver
        # Swarm Mode
        traefik.http.services.portainer.loadbalancer.server.port: 9000
        # Basic Auth
        traefik.http.middlewares.auth-portainer.basicauth.users: "user:<password>"
        #traefik.docker.network: traefik-public

volumes:

  portainer-data:
    # name: portainer-data
    # driver: local
    # driver_opts:
    #   type: none
    #   o: bind
    #   device: $PWD/portainer_data

networks:
  traefik-public:
    external: true

Hi @mindgonemad, thanks for this complete explanation, it gives a pretty good context!

Given this context, it totally makes sense to choose swarm and why you need the volume access.

A few elements to help you:

  • The easiest: retrieving logs from a Docker Swarm stack is done with the command docker service logs <service name>, where a service deployed by a stack is named <stack name>_<service name>. If you add the flag -f (short version of --follow), you can even follow it in real time in your terminal. You can read more on this here: https://docs.docker.com/engine/reference/commandline/service/

Example:

docker stack deploy -c portainer.yml portainer

docker service logs portainer_portainer
  • For accessing the volume:
    • If you have direct access to the machine where the volume lives (either the only swarm node you have, or any node of the swarm if you set up RexRay successfully), then you can find the volume's data by default in /var/lib/docker/volumes/<volume name>/_data. If it is not there, use the command docker volume inspect <volume name> to retrieve the full path instead
      (ref. https://docs.docker.com/engine/reference/commandline/volume_inspect/).
    • Alternatively, there is the "dockerize everything" pattern, where you even use a docker container for running your backup (or your interactive command line to peek in the data), with the volume attached:
docker stack deploy -c portainer.yml portainer
# Stack is deployed, with the volume referenced as "portainer-data"
# This volume is named after the stack and its reference in the docker-compose.yml file

# so retrieve the full name with:
docker volume ls # Name is "portainer_portainer-data" here

# Start an interactive and ephemeral container to browse the data in the volume:
docker run --rm -ti -v portainer_portainer-data:/DATA alpine:3.10 sh
> ls -l /DATA # Check content of volume

# Start a container that will backup the data to a remote server using rsync:
docker run -d -v portainer_portainer-data:/DATA backup_container rsync -av /DATA/ admin@backup-server:/backups/DATA/
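
The host path lookup described above can also be scripted. A rough sketch, assuming the stack is named `portainer` so that the volume is called `portainer_portainer-data` (`volume_path` is a hypothetical helper name):

```shell
# Print the host directory that backs a named volume.
# Volumes created by a stack are named <stack name>_<volume name>.
volume_path() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}

# Usage (reading the directory usually requires root on the Docker host):
#   sudo ls -l "$(volume_path portainer_portainer-data)"
```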

I was able to finish installation of portainer using your configuration and disabling user authentication + let's encrypt, locally on my Docker4Mac. Can you check the logs of the portainer service with the command line I gave you earlier so we can check together what is going wrong in your setup (I assume something related to the data volume on your VPS)?

Thank you for this, it's very helpful and I appreciate the effort putting it together.

Traefik:
I switched the Traefik logs over to using a standard named volume and will use the info you provided for accessing them. I must say though, using bind mounts and opening log files in a GUI is a hell of a lot nicer and easier than creating a temporary container and using tail :grin:.
I didn't switch over Let's Encrypt though, as trying to get the existing file into the named volume just sounds like too much trouble. In theory starting fresh might force a new file to be created and have Let's Encrypt create all new certificates, except that I blew out my usage trying to get all my issues resolved yesterday and have to wait a week. Once that's back maybe I'll give it a spin.
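
For reference, getting an existing acme.json into a named volume may be less work than it sounds. A minimal sketch, assuming the old file sits in ./letsencrypt and the new volume is called `traefik_traefik-letencrypt` (both names are assumptions, as is the helper name `seed_acme`; `docker volume ls` shows the real volume name):

```shell
# Copy an existing acme.json from a host directory into a named volume
# using a throwaway container, then set the 600 mode Traefik expects.
seed_acme() {
  src_dir="$1"   # absolute host directory containing acme.json
  volume="$2"    # target named volume
  docker run --rm \
    -v "$src_dir":/from:ro \
    -v "$volume":/to \
    alpine:3.10 \
    sh -c 'cp /from/acme.json /to/acme.json && chmod 600 /to/acme.json'
}

# Usage, ideally with Traefik stopped so it does not rewrite the file meanwhile:
#   seed_acme "$PWD/letsencrypt" traefik_traefik-letencrypt
```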

Portainer:
I didn't actually think about the auth and looking back when it ran originally it was probably without auth. If I remove the basic auth then it runs and finishes installing though if I add it back in it breaks again. I did some searching and found some people mentioning it and something about headers clashing but no solutions. Unfortunately running it without some kind of auth isn't acceptable. Any suggestions on possible fixes? See below for the data when I looked into the Portainer log on docker.
PS. Portainer is now using just a standard named volume as you suggest.

portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:33 server: Reverse tunnelling enabled
portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:33 server: Fingerprint 4b:12:e2:4b:32:13:0d:c1:d6:ca:6d:e5:68:2a:49:1a
portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:33 server: Listening on 0.0.0.0:8000...
portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:33 Starting Portainer 1.22.1 on :9000
portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:33 [DEBUG] [chisel, monitoring] [check_interval_seconds: 10.000000] [message: starting tunnel management process]
portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:36 http error: No administrator account found inside the database (err=Object not found inside the database) (code=404)
portainer_portainer.1.xcruwg1c5n0k@li1773-204    | 2019/11/16 01:14:36 http error: No administrator account found inside the database (err=Object not found inside the database) (code=404)

RexRay
And as for RexRay, Linode don't have a plugin like DO so even though I never got it working I have a feeling it's going to be a hell of a lot harder on Linode :grin:.

UPDATE
Ahhhh, what a cock up. Was messing around with trying to get files into the volume and thought I'd delete the acme.json file I assumed it had created so I could copy over mine, though I forgot to remove the existing volume, so while I had changed the named volume in yml and restarted Traefik it was still pointing to the file on my server. Now I have lost use of my dev domain for a week as I hit my rate limit :frowning:. No one's fault but my own... just venting :grin:.

ACME Permissions
I was expecting to run into permission issues with the ACME file when letting it be created automatically, as when I was creating the file on the bind mount I had to remember to change the permissions to 600, whereas when it gets auto-created on the minimal named volume it appears to be created as 600 already. It's great that I don't need to mess with permissions but it just seems strange.
Would it be ok to run Swarm and just allow each node running Traefik to create its own ACME file, or should the file really be moved to block storage and use a volume plugin???
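
On the permissions point, the mode can be checked rather than assumed. A small sketch, assuming GNU stat as found on typical Linux hosts (`has_mode_600` is a hypothetical helper name, and the path in the usage line is only an example):

```shell
# Succeeds only if the given file has mode 600 (owner read/write only).
has_mode_600() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}

# Usage:
#   has_mode_600 ./letsencrypt/acme.json && echo "mode ok"
```

Presumably the auto-created file ends up as 600 because Traefik creates it itself with that mode, whereas a file created by hand gets the shell's default permissions.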

Thanks

I can reproduce this issue. As soon as an auth middleware is enabled portainer fails. Did you open an issue on portainer or traefik?

The issue is explained here, without a solution: https://github.com/portainer/portainer/issues/1629
Also setting this to true does not help: https://docs.traefik.io/middlewares/basicauth/#removeheader
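
A quick way to look at the clash from the command line is to compare the status codes for a request carrying Basic credentials with one carrying a Bearer token, which, as far as I understand the linked issue, is the kind of Authorization header Portainer's frontend sends after login. A rough sketch; the host name, credentials, dummy token and the /api/status path are placeholders or assumptions, and `status_of` is a hypothetical helper name:

```shell
# Print only the HTTP status code of a request; any extra curl options
# are passed through unchanged.
status_of() {
  curl -s -o /dev/null -w '%{http_code}\n' "$@"
}

# Usage (placeholders throughout):
#   status_of -u 'user:password' https://portainer.myurl.dev/api/status
#   status_of -H 'Authorization: Bearer dummy' https://portainer.myurl.dev/api/status
```

If the second call comes back 401 while the first does not, that would suggest Traefik's basic auth is rejecting the Bearer header before the request ever reaches Portainer.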

I never thought to raise the issue anywhere but here. Good to know it's not just me, as I thought I had done something wrong.