cat_gateway/service/api/health/
live_get.rs

//! # Implementation of the `GET /health/live` endpoint.
//!
//! This module provides an HTTP endpoint to monitor the liveness of the API service using
//! a simple counter mechanism. It uses an atomic boolean named `IS_LIVE` to track whether
//! the service is operational. The `IS_LIVE` boolean is initially set to `true`.
//!
//! ## Key Features
//!
//! 1. **Atomic Counter**: The endpoint maintains an atomic counter that increments
//!    every time the endpoint is accessed. This counter helps track the number of
//!    requests made to the endpoint.
//!
//! 2. **Counter Reset**: Every 30 seconds, the counter is automatically reset to zero.
//!    This ensures that the count reflects recent activity rather than cumulative usage
//!    over a long period.
//!
//! 3. **Threshold Check**: If the counter reaches a predefined threshold (e.g., 100),
//!    the `IS_LIVE` boolean is set to `false`. This indicates that the service is no
//!    longer operational. Once `IS_LIVE` is set to `false`, it cannot be changed back to
//!    `true`.
//!
//! 4. **Response Logic**:
//!    - If `IS_LIVE` is `true`, the endpoint returns a `204 No Content` response,
//!      indicating that the service is healthy and operational.
//!    - If `IS_LIVE` is `false`, the endpoint returns a `503 Service Unavailable`
//!      response, indicating that the service is no longer operational.
//!
//! ## How It Works
//!
//! - When the endpoint is called, the atomic counter increments by 1.
//! - Every 30 seconds, the counter is reset to 0 to ensure it only reflects recent
//!   activity.
//! - If the counter reaches the threshold (e.g., 100), the `IS_LIVE` boolean is set to
//!   `false`.
//! - Once `IS_LIVE` is `false`, the endpoint will always respond with `503 Service
//!   Unavailable`.
//! - If `IS_LIVE` is `true`, the endpoint responds with `204 No Content`.
//!
//! ## Example Scenarios
//!
//! 1. **Normal Operation**:
//!    - The counter is below the threshold.
//!    - `IS_LIVE` remains `true`.
//!    - The endpoint returns `204 No Content`.
//!
//! 2. **Threshold Exceeded**:
//!    - The counter reaches 100.
//!    - `IS_LIVE` is set to `false`.
//!    - The endpoint returns `503 Service Unavailable`.
//!
//! 3. **After Threshold Exceeded**:
//!    - The counter is reset to 0, but `IS_LIVE` remains `false`.
//!    - The endpoint continues to return `503 Service Unavailable`.
//!
//! ## Notes
//!
//! - The `IS_LIVE` boolean is atomic, meaning it is thread-safe and can be accessed
//!   concurrently without issues.
//!
//! This endpoint is useful for monitoring service liveness and automatically marking it
//! as unavailable if it becomes overloaded or encounters issues.

use poem_openapi::ApiResponse;

use crate::service::{
    common::{responses::WithErrorResponses, types::headers::retry_after::RetryAfterOption},
    utilities::health::is_live,
};

/// Endpoint responses.
#[derive(ApiResponse)]
pub(crate) enum Responses {
    /// ## No Content
    ///
    /// Service is OK and can keep running.
    #[oai(status = 204)]
    NoContent,
}

/// All responses.
pub(crate) type AllResponses = WithErrorResponses<Responses>;

/// # GET /health/live
///
/// Liveness endpoint.
///
/// Kubernetes (and others) use this endpoint to determine if the service is able
/// to keep running.
///
/// In this service, liveness is assumed unless there are multiple panics generated
/// by an endpoint in a short window.
pub(crate) fn endpoint() -> AllResponses {
    if is_live() {
        Responses::NoContent.into()
    } else {
        AllResponses::service_unavailable_with_msg(
            "Service is not live, do not send other requests.".to_string(),
            RetryAfterOption::Default,
        )
    }
}
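
The counter, reset, and one-way `IS_LIVE` latch described in the module documentation can be illustrated with a minimal standalone sketch. This is not the code behind `utilities::health::is_live`, which is not shown in this file. The names `Liveness`, `record`, `reset_window`, `status_code`, and `THRESHOLD` are hypothetical, and the value 100 is only the example threshold from the docs. The sketch does not decide which event is counted (a request or a panic) and leaves the 30-second timer to the caller.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

/// Hypothetical threshold; the module docs only give 100 as an example.
pub const THRESHOLD: u64 = 100;

/// Hypothetical liveness state: a windowed event counter plus a one-way flag.
pub struct Liveness {
    /// Number of events recorded in the current window.
    counter: AtomicU64,
    /// Starts as `true`; once set to `false` it is never set back.
    is_live: AtomicBool,
}

impl Liveness {
    /// Creates a state that is live with an empty counter.
    pub const fn new() -> Self {
        Self {
            counter: AtomicU64::new(0),
            is_live: AtomicBool::new(true),
        }
    }

    /// Records one event and latches the flag to `false` once the counter
    /// reaches `THRESHOLD` within the current window.
    pub fn record(&self) {
        let count = self.counter.fetch_add(1, Ordering::SeqCst) + 1;
        if count >= THRESHOLD {
            self.is_live.store(false, Ordering::SeqCst);
        }
    }

    /// Starts a new window by zeroing the counter. The caller is assumed to
    /// invoke this periodically (every 30 seconds in the described design).
    /// It deliberately does not touch the liveness flag.
    pub fn reset_window(&self) {
        self.counter.store(0, Ordering::SeqCst);
    }

    /// Returns the current value of the liveness flag.
    pub fn is_live(&self) -> bool {
        self.is_live.load(Ordering::SeqCst)
    }

    /// Maps the flag to the HTTP status the endpoint is documented to return.
    pub fn status_code(&self) -> u16 {
        if self.is_live() {
            204
        } else {
            503
        }
    }
}
```

Under these assumptions, 99 calls to `record` leave `status_code()` at 204, the 100th call switches it to 503, and a later `reset_window()` clears the counter without restoring liveness, matching the three example scenarios in the module docs.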