cat_gateway/service/api/health/live_get.rs
//! # Implementation of the `GET /health/live` endpoint.
//!
//! This module provides an HTTP endpoint to monitor the liveness of the API service using
//! a simple counter mechanism. An atomic boolean named `IS_LIVE` tracks whether the
//! service is operational; it is initially set to `true`.
//!
//! ## Key Features
//!
//! 1. **Atomic Counter**: The endpoint maintains an atomic counter that increments
//!    every time the endpoint is accessed. This counter helps track the number of
//!    requests made to the endpoint.
//!
//! 2. **Counter Reset**: Every 30 seconds, the counter is automatically reset to zero.
//!    This ensures that the count reflects recent activity rather than cumulative usage
//!    over a long period.
//!
//! 3. **Threshold Check**: If the counter reaches a predefined threshold (e.g., 100),
//!    the `IS_LIVE` boolean is set to `false`. This indicates that the service is no
//!    longer operational. Once `IS_LIVE` is set to `false`, it cannot be changed back to
//!    `true`.
//!
//! 4. **Response Logic**:
//!    - If `IS_LIVE` is `true`, the endpoint returns a `204 No Content` response,
//!      indicating that the service is healthy and operational.
//!    - If `IS_LIVE` is `false`, the endpoint returns a `503 Service Unavailable`
//!      response, indicating that the service is no longer operational.
//!
//! ## How It Works
//!
//! - When the endpoint is called, the atomic counter increments by 1.
//! - Every 30 seconds, the counter is reset to 0 to ensure it only reflects recent
//!   activity.
//! - If the counter reaches the threshold (e.g., 100), the `IS_LIVE` boolean is set to
//!   `false`.
//! - Once `IS_LIVE` is `false`, the endpoint will always respond with `503 Service
//!   Unavailable`.
//! - If `IS_LIVE` is `true`, the endpoint responds with `204 No Content`.
//!
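//! The steps above can be sketched roughly as follows. This is a minimal, standalone
//! illustration and not the actual implementation: `COUNTER`, `THRESHOLD`,
//! `record_hit`, `reset_counter` and `status` are hypothetical names, and the local
//! `is_live` only stands in for the helper imported from `utilities::health`. The
//! 30 second timer is left out; `reset_counter` is what such a timer would call.
//!
//! ```rust
//! use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
//!
//! /// Hypothetical threshold; the description gives 100 only as an example.
//! const THRESHOLD: u32 = 100;
//!
//! /// Liveness flag: starts `true` and is never set back once cleared.
//! static IS_LIVE: AtomicBool = AtomicBool::new(true);
//! /// Hypothetical counter, incremented on each call to the endpoint.
//! static COUNTER: AtomicU32 = AtomicU32::new(0);
//!
//! /// Increment the counter and clear `IS_LIVE` once the threshold is reached.
//! fn record_hit() {
//!     // `fetch_add` returns the previous value, so add 1 to get the new count.
//!     let count = COUNTER.fetch_add(1, Ordering::SeqCst).saturating_add(1);
//!     if count >= THRESHOLD {
//!         IS_LIVE.store(false, Ordering::SeqCst);
//!     }
//! }
//!
//! /// Reset only the counter (assumed to run every 30 seconds); `IS_LIVE` is untouched.
//! fn reset_counter() {
//!     COUNTER.store(0, Ordering::SeqCst);
//! }
//!
//! /// Stand-in for the real `is_live` helper.
//! fn is_live() -> bool {
//!     IS_LIVE.load(Ordering::SeqCst)
//! }
//!
//! /// HTTP status code the endpoint would answer with.
//! fn status() -> u16 {
//!     if is_live() { 204 } else { 503 }
//! }
//!
//! fn main() {
//!     // Below the threshold the service reports as live.
//!     for _ in 0..THRESHOLD - 1 {
//!         record_hit();
//!     }
//!     assert_eq!(status(), 204);
//!
//!     // The hit that reaches the threshold clears the flag.
//!     record_hit();
//!     assert_eq!(status(), 503);
//!
//!     // Resetting the counter does not restore liveness.
//!     reset_counter();
//!     assert_eq!(status(), 503);
//! }
//! ```
//!
//! The reset touches only the counter, so clearing `IS_LIVE` is one-way, matching the
//! behaviour described above.
//!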
//! ## Example Scenarios
//!
//! 1. **Normal Operation**:
//!    - The counter is below the threshold.
//!    - `IS_LIVE` remains `true`.
//!    - The endpoint returns `204 No Content`.
//!
//! 2. **Threshold Reached**:
//!    - The counter reaches 100.
//!    - `IS_LIVE` is set to `false`.
//!    - The endpoint returns `503 Service Unavailable`.
//!
//! 3. **After Threshold Reached**:
//!    - The counter is reset to 0, but `IS_LIVE` remains `false`.
//!    - The endpoint continues to return `503 Service Unavailable`.
//!
//! ## Notes
//!
//! - The `IS_LIVE` boolean is atomic, so it can be read and written concurrently from
//!   multiple threads without additional locking.
//!
//! This endpoint is useful for monitoring service liveness and for automatically marking
//! the service as unavailable if it becomes overloaded or encounters issues.

use poem_openapi::ApiResponse;

use crate::service::{
    common::{responses::WithErrorResponses, types::headers::retry_after::RetryAfterOption},
    utilities::health::is_live,
};

/// Endpoint responses.
#[derive(ApiResponse)]
pub(crate) enum Responses {
    /// ## No Content
    ///
    /// Service is OK and can keep running.
    #[oai(status = 204)]
    NoContent,
}

/// All responses.
pub(crate) type AllResponses = WithErrorResponses<Responses>;

/// # GET /health/live
///
/// Liveness endpoint.
///
/// Kubernetes (and others) use this endpoint to determine if the service is able
/// to keep running.
///
/// In this service, liveness is assumed unless there are multiple panics generated
/// by an endpoint in a short window.
pub(crate) fn endpoint() -> AllResponses {
    if is_live() {
        Responses::NoContent.into()
    } else {
        AllResponses::service_unavailable_with_msg(
            "Service is not live, do not send other requests.".to_string(),
            RetryAfterOption::Default,
        )
    }
}