Description
See this issue described on Stack Overflow: https://stackoverflow.com/questions/79783245/google-cloud-run-error-with-opentelemetry-cloudmonitoringmetricsexporter-one-o
Background
I have a containerized Python Flask application that is deployed on Google Cloud Run. I want to extract custom metrics from this app and send them to Google Cloud Monitoring.
I followed the examples on these two websites, using CloudMonitoringMetricsExporter from opentelemetry.exporter.cloud_monitoring to export metrics directly to Google Cloud Monitoring (without a collector sidecar as described here); a condensed sketch of that pattern follows the links below:
- https://pypi.org/project/opentelemetry-exporter-gcp-monitoring/
- https://google-cloud-opentelemetry.readthedocs.io/en/latest/examples/cloud_monitoring/README.html
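In condensed form, the direct-export pattern from those examples looks roughly like this (a sketch based on the linked docs, not my production code):
from opentelemetry import metrics
from opentelemetry.exporter.cloud_monitoring import CloudMonitoringMetricsExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# Push metrics straight to the Cloud Monitoring API; no collector sidecar involved.
metrics.set_meter_provider(
    MeterProvider(
        metric_readers=[PeriodicExportingMetricReader(CloudMonitoringMetricsExporter())]
    )
)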
Error
Sometimes, but not always, almost exactly 15 minutes after my Cloud Run service logs its last activity, the logs show a termination signal from Cloud Run, followed by an error writing to Google Cloud Monitoring:
[2025-10-05 13:03:54 +0000] [1] [INFO] Handling signal: term
[2025-10-05 13:03:54 +0000] [2] [INFO] Worker exiting (pid: 2)
[ERROR] - Error while writing to Cloud Monitoring
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/google/api_core/grpc_helpers.py", line 75, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/grpc/_interceptor.py", line 277, in __call__
    response, ignored_call = self._with_call(
        request,
        ...<4 lines>...
        compression=compression,
    )
  File "/usr/local/lib/python3.13/site-packages/grpc/_interceptor.py", line 332, in _with_call
    return call.result(), call
  File "/usr/local/lib/python3.13/site-packages/grpc/_channel.py", line 440, in result
    raise self
  File "/usr/local/lib/python3.13/site-packages/grpc/_interceptor.py", line 315, in continuation
    response, call = self._thunk(new_method).with_call(
        request,
        ...<4 lines>...
        compression=new_compression,
    )
  File "/usr/local/lib/python3.13/site-packages/grpc/_channel.py", line 1195, in with_call
    return _end_unary_response_blocking(state, call, True, None)
  File "/usr/local/lib/python3.13/site-packages/grpc/_channel.py", line 1009, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
The specific error is then "One or more points were written more frequently than the maximum sampling period configured for the metric" (the same one called out here):
" details = "One or more TimeSeries could not be written: timeSeries[0-2] (example metric.type="workload.googleapis.com/<redacted>", metric.labels={"net_peer_name": "<redacted>", "environment": "prod", "webhook_label": "generic", "component": "forwarder", "http_status_code": "200", "http_status_bucket": "2xx", "user_agent": "<redacted>", "opentelemetry_id": "d731413a"}): write for resource=generic_task{namespace:cloud-run,location:us-central1,job:<redacted>,task_id:02f24696-0786-4970-a93b-02176d5f1d75} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric. {Metric: workload.googleapis.com/<redacted>, Timestamps: {Youngest Existing: '2025/10/05-06:03:53.004', New: '2025/10/05-06:03:54.778'}}""
The error log continues:
" debug_error_string = "UNKNOWN:Error received from peer ipv4:173.194.194.95:443 {grpc_message:"One or more TimeSeries could not be written: timeSeries[0-2] (example metric.type=\"workload.googleapis.com/<redacted>\", metric.labels={\"net_peer_name\": \"<redacted>\", \"environment\": \"prod\", \"webhook_label\": \"generic\", \"component\": \"forwarder\", \"http_status_code\": \"200\", \"http_status_bucket\": \"2xx\", \"user_agent\": \"<redacted>\", \"opentelemetry_id\": \"d731413a\"}): write for resource=generic_task{namespace:cloud-run,location:us-central1,job:<redacted>,task_id:02f24696-0786-4970-a93b-02176d5f1d75} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric. {Metric: workload.googleapis.com/<redacted>, Timestamps: {Youngest Existing: \'2025/10/05-06:03:53.004\', New: \'2025/10/05-06:03:54.778\'}}", grpc_status:3}""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/site-packages/opentelemetry/exporter/cloud_monitoring/__init__.py", line 371, in export
    self._batch_write(all_series)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/opentelemetry/exporter/cloud_monitoring/__init__.py", line 155, in _batch_write
    self.client.create_time_series(
        CreateTimeSeriesRequest(
            ...<4 lines>...
        ),
    )
  File "/usr/local/lib/python3.13/site-packages/google/cloud/monitoring_v3/services/metric_service/client.py", line 1791, in create_time_series
    rpc(
        request,
        ...<2 lines>...
        metadata=metadata,
    )
  File "/usr/local/lib/python3.13/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/google/api_core/timeout.py", line 130, in func_with_timeout
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.13/site-packages/google/api_core/grpc_helpers.py", line 77, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 One or more TimeSeries could not be written: timeSeries[0-2] (example metric.type="workload.googleapis.com/<redacted>", metric.labels={"net_peer_name": "<redacted>", "environment": "prod", "webhook_label": "generic", "component": "forwarder", "http_status_code": "200", "http_status_bucket": "2xx", "user_agent": "<redacted>", "opentelemetry_id": "d731413a"}): write for resource=generic_task{namespace:cloud-run,location:us-central1,job:<redacted>,task_id:02f24696-0786-4970-a93b-02176d5f1d75} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric. {Metric: workload.googleapis.com/<redacted>, Timestamps: {Youngest Existing: '2025/10/05-06:03:53.004', New: '2025/10/05-06:03:54.778'}} [type_url: "type.googleapis.com/google.monitoring.v3.CreateTimeSeriesSummary"
value: "\010\003\032\006\n\002\010\t\020\003
]
[2025-10-05 13:03:57 +0000] [1] [INFO] Shutting down: Master
My Code
I have a function called configure_metrics, which is (simplified; imports shown for context):
import logging

from opentelemetry import metrics
from opentelemetry.exporter.cloud_monitoring import CloudMonitoringMetricsExporter
from opentelemetry.resourcedetector.gcp_resource_detector import GoogleCloudResourceDetector
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import Resource

logger = logging.getLogger(__name__)

def configure_metrics(
    service_name=None,
    service_namespace=None,
    service_version=None,
    service_instance_id=None,
    export_interval_ms=60000,
    add_unique_identifier=True,
):
    """
    Configure OpenTelemetry metrics for Cloud Monitoring.
    """
    name, namespace, version, instance_id = _infer_service_identity(
        service_name, service_namespace, service_version, service_instance_id
    )  # Custom internal function
    # Base resource with service-specific attributes; avoid platform-specific hardcoding here.
    base_resource = Resource.create(
        {
            "service.name": name,
            "service.namespace": namespace,
            "service.version": version,
            "service.instance.id": instance_id,
        }
    )
    # Detect environment-specific resource (e.g., GCE VM, GKE Pod, Cloud Run instance) and merge.
    try:
        detected_resource = GoogleCloudResourceDetector().detect()
    except Exception as e:
        logger.debug(
            "GCP resource detection failed; continuing with base resource: %s", e
        )
        detected_resource = Resource.create({})
    resource = detected_resource.merge(base_resource)
    exporter = CloudMonitoringMetricsExporter(
        # Appends a unique "opentelemetry_id" label (visible in the error above) to each
        # time series; helps avoid "written more frequently than the maximum sampling
        # period" conflicts between concurrent writers.
        add_unique_identifier=add_unique_identifier
    )
    reader = PeriodicExportingMetricReader(
        exporter, export_interval_millis=export_interval_ms
    )
    provider = MeterProvider(metric_readers=[reader], resource=resource)
    # Sets the global MeterProvider.
    # After this, any metrics.get_meter(<any_name>) in this process gets a Meter from this provider.
    metrics.set_meter_provider(provider)
In main.py, I configure OpenTelemetry metrics as:
from flask import Flask

from app.telemetry import configure_metrics  # illustrative path; import from wherever configure_metrics lives

def create_app() -> Flask:
    app = Flask(__name__)
    # Initialize the OTel metrics provider once per process/worker.
    configure_metrics(
        export_interval_ms=60000
    )  # Export every minute, instead of the 5 seconds used in the linked examples
    # Only now import and register blueprints (routes) so instruments are created
    # against the meter provider installed in configure_metrics()
    from app.routes import webhook

    app.register_blueprint(webhook.bp)
    return app

app = create_app()

if __name__ == "__main__":
    app.run(port=8080)
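(For context: on Cloud Run the app is served by Gunicorn rather than Flask's development server; the "Handling signal: term", "Worker exiting", and "Shutting down: Master" lines in the logs above come from Gunicorn's master and worker processes. The container entrypoint is assumed to be something like gunicorn --bind :8080 main:app.)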
And in other files, such as the webhook.py referenced above, I define my own custom metrics, as in this example:
# ---------------------------------------------
# OpenTelemetry (OTel) metrics
# ---------------------------------------------
from opentelemetry import metrics

# Get a meter from the provider that was installed in main.py
meter = metrics.get_meter(
    "webhooks"
)  # Any stable string works for naming this meter

# Request counter
# The metric name maps to workload.googleapis.com/webhook_request_counter in Cloud Monitoring.
requests_counter = meter.create_counter(
    name="webhook_request_counter",
    description="Total number of HTTP requests processed by the webhooks blueprint",
    unit="1",
)
And the metric is updated where needed as:
requests_counter.add(1, attributes=attrs)
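Here attrs is a per-request dict of attribute key/value pairs. Judging from the metric labels visible in the error above, it looks roughly like the following (the exact keys and values here are illustrative):
attrs = {
    "environment": "prod",
    "component": "forwarder",
    "webhook_label": "generic",
    "http_status_code": "200",
    "http_status_bucket": "2xx",
}
requests_counter.add(1, attributes=attrs)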
Possible Explanation
I think something along these lines is happening:
- The exporter to Cloud Monitoring runs every 60 seconds.
- Suppose at time T a scheduled export occurs, sending new points for each time series.
- Some time later, the container is terminated (e.g., Cloud Run shutting down or scaling in), and before exiting, the application invokes a shutdown or signal handler that triggers a force flush of metrics.
- That flush occurs shortly (~1-2 seconds) after the last scheduled export, so some of the same time series get a new "point" whose timestamp is only 1-2 seconds newer than the previous one. Because that is less than the ~5 s minimum sampling period, Cloud Monitoring rejects the write (see the sketch after this list).
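A minimal sketch of that suspected sequence, assuming the SDK's default shutdown_on_exit=True behavior on MeterProvider (the timings in the comments are illustrative, not measured):
from opentelemetry import metrics

# The global MeterProvider installed by configure_metrics() above.
provider = metrics.get_meter_provider()

# t = 0 s: the PeriodicExportingMetricReader's timer fires and exports one
# point per time series (the scheduled 60-second export).
# t ~ +1-2 s: Cloud Run sends SIGTERM and Gunicorn starts shutting down.
# During interpreter exit, the SDK's atexit hook calls provider.shutdown(),
# which force-flushes the reader and exports one final batch of points:
provider.shutdown()
# Those flushed points are only ~1-2 s newer than the scheduled ones, so
# Cloud Monitoring rejects the write with INVALID_ARGUMENT.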
Help
I do not know how to handle this event in code in such a way as to avoid the error without losing data. What edits should I make? Or, alternatively, is this an issue that should instead be solved in the OpenTelemetry Python SDK for GCP?