RSPEED-2943: add authorization monitoring metrics#1639
RSPEED-2943: add authorization monitoring metrics#1639major wants to merge 4 commits intolightspeed-core:mainfrom
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (26)
WalkthroughAuthorization latency metrics are now measured and recorded throughout the authorization middleware. A new metrics module component provides helper functions to normalize authorization labels and safely record both authorization check counts and duration observations. Metrics are recorded in a finally block to ensure emission even when authorization errors occur. Changes
Sequence DiagramsequenceDiagram
participant Client
participant AuthMiddleware as Auth Middleware
participant RoleResolver as Role Resolver
participant MetricsRecorder as Metrics Recorder
participant Prometheus as Prometheus Metrics
Client->>AuthMiddleware: Authorization Request
activate AuthMiddleware
Note over AuthMiddleware: Start timing (monotonic)
AuthMiddleware->>RoleResolver: Resolve Role
activate RoleResolver
RoleResolver-->>AuthMiddleware: Role/Result
deactivate RoleResolver
alt Authorization Success
AuthMiddleware->>AuthMiddleware: Set result = "success"
else Authorization Denied
AuthMiddleware->>AuthMiddleware: Set result = "denied"
Note over AuthMiddleware: Raise 403
end
AuthMiddleware->>MetricsRecorder: Record Check (action, result)
activate MetricsRecorder
MetricsRecorder->>Prometheus: Increment Counter
Prometheus-->>MetricsRecorder: OK
deactivate MetricsRecorder
AuthMiddleware->>MetricsRecorder: Record Duration (action, result, latency)
activate MetricsRecorder
MetricsRecorder->>Prometheus: Observe Histogram
Prometheus-->>MetricsRecorder: OK
deactivate MetricsRecorder
deactivate AuthMiddleware
AuthMiddleware-->>Client: Response/Exception
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
0469516 to
4d7fe24
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/metrics/__init__.py`:
- Around line 78-90: The new module-level metrics authorization_checks_total and
authorization_duration_seconds must be annotated as constants using Final;
import Final from typing if not present and add type annotations like
Final[Counter] for authorization_checks_total and Final[Histogram] for
authorization_duration_seconds so they follow the project's constant-typing
rule; leave the instantiation expressions unchanged and ensure the import of
Final is added at the top of the module if missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7933fdef-3790-4918-b1ff-b17e571bddad
📒 Files selected for processing (5)
src/authorization/middleware.pysrc/metrics/__init__.pysrc/metrics/recording.pytests/unit/authorization/test_middleware.pytests/unit/metrics/test_recording.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: Pylinter
- GitHub Check: integration_tests (3.12)
- GitHub Check: radon
- GitHub Check: Pyright
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (4)
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/unit/**/*.py: Use pytest for all unit and integration tests
Usepytest-mockfor AsyncMock objects in unit tests
Use markerpytest.mark.asynciofor async tests
Files:
tests/unit/authorization/test_middleware.pytests/unit/metrics/test_recording.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Do not use unittest - pytest is the standard testing framework for this project
Files:
tests/unit/authorization/test_middleware.pytests/unit/metrics/test_recording.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
All modules start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
Type aliases defined at module level for clarity
Use Final[type] as type hint for all constants
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types
Use Union types with modern syntax:str | int
UseOptional[Type]for optional type hints
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Useasync deffor I/O operations and external API calls
Uselogger.debug()for detailed diagnostic information
Uselogger.info()for general information about program execution
Uselogger.warning()for unexpected events or potential problems
Uselogger.error()for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes use ABC with@abstractmethoddecorators
Complete type annotations for all class attributes, use specific types, notAny
Follow Google Python docstring conventions for all modules, classes, and functions
Docstring Parameters section documents function parameters
Docstring Returns section documents function return values
Docstring Raises section documents exceptions that may be raised
Use black for code formatting
Use pylint for static analysis with source-roots configuration set to "src"
Use pyright for type checking
Use ruff for fast linting
Use pydocstyle for docstring style validation
Use mypy for additional type checking
Use bandit for security issue detection
Files:
src/metrics/__init__.pysrc/authorization/middleware.pysrc/metrics/recording.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles contain brief package descriptions
Files:
src/metrics/__init__.py
🔇 Additional comments (4)
tests/unit/authorization/test_middleware.py (1)
325-355: Good resilience coverage for metric failure in the success path.This test correctly verifies that authorization success is not masked by metric failures and that warning logging plus duration recording still occur.
tests/unit/metrics/test_recording.py (1)
191-290: Strong coverage for authorization recorder behavior.The new parametrized tests and normalization assertions validate both happy-path recording and swallowed metric-update failures cleanly.
src/authorization/middleware.py (1)
32-55: Resilient instrumentation infinallyis implemented correctly.The result-state transitions and isolated metric-recording error handling preserve authorization outcomes while still emitting observability data for all paths.
Also applies to: 154-195
src/metrics/recording.py (1)
18-33: Bounded-label normalization and record helpers look solid.The new normalization + recorder functions are clear, type-annotated, and aligned with the intended bounded-label metric strategy.
Also applies to: 35-61, 160-195
48e2c0b to
4569392
Compare
Move telemetry functions from responses.py into a dedicated responses_telemetry.py module. Centralize fire-and-forget dispatch logic in observability/splunk.py as dispatch_splunk_event(). No behavioral changes to request handling. Reduces responses.py from 1201 to 1057 lines. Signed-off-by: Major Hayden <major@redhat.com>
…t restores llama-stack (lightspeed-core#1628) * Add diagnostic pod logs on e2e failure and remove disrupt-once optimization * Increase vLLM max-model-len to 35936 (GPU memory limit) * Accept 503 as valid port-forward proof in e2e connectivity check Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Major Hayden <major@redhat.com>
Signed-off-by: Major Hayden <major@redhat.com>
|
Superseded by #1640 which now contains all three metrics commits as a stack. |
Description
Adds bounded Prometheus metrics for authorization checks and latency by protected action. This gives SLO dashboards visibility into authorization success, denial, and unexpected error paths without adding user or request identifiers as labels.
Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing