Reconfiguring built-in Prometheus listeners (TCP ports 20201, 20202) #898

@jeremyvisser

Describe the bug

Prometheus exporter TCP ports (20201, 20202) are enabled by default on Ops Agent, which causes problems for users wanting to bind to those ports for other purposes, or reduce network exposure.

While the Prometheus listeners are minimal (a simple handler for /metrics), the daemons run as root, so users running Ops Agent in a security-sensitive environment will want to eliminate inbound requests.

Additionally, users wanting to run their own service binding to TCP ports 20201 or 20202 will run into conflicts.

There's no obvious way to reconfigure these ports, whether changing the binding address or port numbers. At the very least, it should be possible to bind these ports to localhost instead (::1 or 127.0.0.1).

I recognise that these ports are used by Ops Agent for self-monitoring, so avoiding listening on the ports entirely is likely infeasible.

To Reproduce
Steps to reproduce the behavior:

  1. Environment: RHEL 7, Ops Agent google-cloud-ops-agent-2.22.0-1.el7.x86_64
  2. Use default config
  3. Run netstat -anp | grep :2020:
    tcp6       0      0 :::20201                :::*                    LISTEN      15072/otelopscol
    tcp        0      0 0.0.0.0:20202           0.0.0.0:*               LISTEN      15123/fluent-bit
    tcp        0      0 127.0.0.1:20202         127.0.0.1:45340         ESTABLISHED 15123/fluent-bit
    tcp        0      0 127.0.0.1:45340         127.0.0.1:20202         ESTABLISHED 15072/otelopscol
    tcp        0      0 127.0.0.1:51388         127.0.0.1:20201         ESTABLISHED 15072/otelopscol
    tcp6       0      0 127.0.0.1:20201         127.0.0.1:51388         ESTABLISHED 15072/otelopscol
    
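    Observe that the listening sockets for ports 20201 and 20202 are bound to the wildcard address (:: and 0.0.0.0 respectively), so they accept connections on every interface.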
  4. Observe this config in /run/google-cloud-ops-agent-opentelemetry-collector/otel.yaml:
      telemetry:
        metrics:
          address: 0.0.0.0:20201
    
  5. Observe this config in /run/google-cloud-ops-agent-fluent-bit/fluent_bit_main.conf:
    [OUTPUT]
        Match *
        Name  prometheus_exporter
        host  0.0.0.0
        port  20202
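
The check in step 3 can be automated. The following is a minimal sketch, not part of Ops Agent: the function name `wildcard_listeners` and the `SAMPLE` constant are hypothetical, and the parser assumes the Linux `netstat -anp` column layout shown above (protocol, queues, local address, foreign address, state, PID/program).

```python
# Addresses that mean "listen on every interface" in netstat output.
WILDCARD_HOSTS = {"0.0.0.0", "::", "*"}


def wildcard_listeners(netstat_output, ports):
    """Return (port, process) pairs for LISTEN sockets bound to a wildcard address."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        # Split on the last colon so IPv6 forms such as ":::20201" parse correctly.
        host, _, port = fields[3].rpartition(":")
        if host in WILDCARD_HOSTS and port.isdigit() and int(port) in ports:
            found.append((int(port), fields[6]))
    return found


# Sample lines copied from the netstat output in step 3.
SAMPLE = """\
tcp6       0      0 :::20201                :::*                    LISTEN      15072/otelopscol
tcp        0      0 0.0.0.0:20202           0.0.0.0:*               LISTEN      15123/fluent-bit
tcp        0      0 127.0.0.1:20202         127.0.0.1:45340         ESTABLISHED 15123/fluent-bit
"""

if __name__ == "__main__":
    for port, proc in wildcard_listeners(SAMPLE, {20201, 20202}):
        print(f"port {port} is exposed on all interfaces by {proc}")
```

On a live host, the same function could be fed the output of `netstat -anp` instead of the sample text.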
    

Expected behavior

I would expect to be able to reconfigure the bind address to force the ports to bind to localhost only (::1 and 127.0.0.1), as well as change the port numbers.
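
To illustrate the difference being requested, here is a small sketch using Python's standard `socket` module. It is unrelated to how Ops Agent opens its listeners; the helper name `bind_listener` is hypothetical, and port 0 is used so the operating system picks a free port rather than colliding with 20201 or 20202.

```python
import socket


def bind_listener(host, port=0):
    """Open a TCP listener on the given host; port 0 lets the OS choose a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    # Loopback only: reachable from the local machine, not from the network.
    with bind_listener("127.0.0.1") as local_only:
        print("loopback listener:", local_only.getsockname())
    # Wildcard: reachable on every interface, as in the default Ops Agent config.
    with bind_listener("0.0.0.0") as all_interfaces:
        print("wildcard listener:", all_interfaces.getsockname())
```

A listener bound to 127.0.0.1 shows up in netstat with that address in the local column, whereas the wildcard listener shows 0.0.0.0.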

While iptables rules are additionally useful as a defense-in-depth method, avoiding binding unnecessarily in the first place may be preferable.

Environment (please complete the following information):

  • VM distro / OS: RHEL 7
  • Ops Agent version: google-cloud-ops-agent-2.22.0-1.el7.x86_64
  • Ops Agent configuration: default config

Additional context
https://issuetracker.google.com/251023934
