Models¶
Data models for rate limit configuration and status.
Limit¶
Limit
dataclass
¶
Token bucket rate limit configuration.
Refill rate is stored as a fraction (refill_amount / refill_period_seconds) to avoid floating point precision issues.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Unique identifier for this limit type (e.g., "rpm", "tpm") |
| capacity | int | Max tokens in the bucket (ceiling) |
| refill_amount | int | Numerator of refill rate |
| refill_period_seconds | int | Denominator of refill rate |
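The fraction representation matters because a rate such as 100 tokens per 60 seconds has no exact floating-point value. The following is a minimal sketch, not the library's implementation: the helper name `tokens_refilled` is hypothetical, and it only illustrates how keeping the numerator and denominator separate lets the refill arithmetic stay in integers.

```python
from fractions import Fraction


def tokens_refilled(elapsed_seconds: int, refill_amount: int, refill_period_seconds: int) -> int:
    """Whole tokens refilled after elapsed_seconds, using integer arithmetic only."""
    # Multiply before dividing so no intermediate value is rounded.
    return elapsed_seconds * refill_amount // refill_period_seconds


# 100 tokens per 60 seconds: the float 100 / 60 is not exactly representable,
# but the fraction 5/3 is.
exact_rate = Fraction(100, 60)
refilled = tokens_refilled(elapsed_seconds=90, refill_amount=100, refill_period_seconds=60)
print(exact_rate, refilled)
```

Multiplying first and dividing last means a partial period never loses tokens to rounding until the final floor division.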
per_second
classmethod
¶
Create a limit that refills `rate` tokens per second.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Limit name (e.g., "rps") | required |
| rate | int | Sustained tokens per second (also the refill amount) | required |
| burst | int \| None | Optional burst ceiling. When set, | None |
per_minute
classmethod
¶
Create a limit that refills `rate` tokens per minute.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Limit name (e.g., "rpm", "tpm") | required |
| rate | int | Sustained tokens per minute (also the refill amount) | required |
| burst | int \| None | Optional burst ceiling. When set, | None |
per_hour
classmethod
¶
Create a limit that refills `rate` tokens per hour.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Limit name (e.g., "rph") | required |
| rate | int | Sustained tokens per hour (also the refill amount) | required |
| burst | int \| None | Optional burst ceiling. When set, | None |
per_day
classmethod
¶
Create a limit that refills `rate` tokens per day.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Limit name (e.g., "rpd") | required |
| rate | int | Sustained tokens per day (also the refill amount) | required |
| burst | int \| None | Optional burst ceiling. When set, | None |
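The four period constructors above differ only in the refill period they imply. As a rough sketch, assuming each one maps onto the explicit capacity/refill fields with a fixed period: `SketchLimit` and `per_period` are hypothetical stand-ins, and the treatment of `burst` as the bucket ceiling (falling back to `rate` when unset) is an assumption, not something this page confirms.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SketchLimit:
    """Hypothetical stand-in for Limit, for illustration only."""

    name: str
    capacity: int
    refill_amount: int
    refill_period_seconds: int

    @classmethod
    def per_period(
        cls, name: str, rate: int, period_seconds: int, burst: int | None = None
    ) -> SketchLimit:
        # Assumption: burst, when given, becomes the ceiling; otherwise the ceiling equals rate.
        capacity = burst if burst is not None else rate
        return cls(name, capacity, rate, period_seconds)


# Period lengths in seconds for the four constructors.
PERIODS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

rpm = SketchLimit.per_period("rpm", rate=100, period_seconds=PERIODS["minute"])
tpm = SketchLimit.per_period("tpm", rate=10_000, period_seconds=PERIODS["minute"], burst=15_000)
print(rpm)
print(tpm)
```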
custom
classmethod
¶
Create a custom limit with explicit refill rate.
Example

    # Allow ceiling of 1000 tokens, sustained at 100/sec
    Limit.custom("requests", capacity=1000, refill_amount=100, refill_period_seconds=1)
from_bucket_state
classmethod
¶
Reconstruct a Limit from BucketState fields.
Entity¶
Entity
dataclass
¶
An entity that can have rate limits applied.
Entities can be parents (projects) or children (API keys). Children have a parent_id reference.
Note: This model does not validate in `__post_init__`, which supports DynamoDB deserialization and avoids performance overhead. Validation is performed in Repository.create_entity() at the API boundary.
LimitStatus¶
LimitStatus
dataclass
¶
LimitStatus(entity_id, resource, limit_name, limit, available, requested, exceeded, retry_after_seconds)
Status of a specific limit check.
Returned in RateLimitExceeded to provide full visibility into all limits that were checked.
Note: This is an internal model created by the limiter from validated inputs. No validation is performed here to avoid performance overhead.
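To show how the available, requested, exceeded, and retry_after_seconds fields relate, here is a minimal sketch. It assumes the retry delay is the token deficit divided by the refill rate; the function below is hypothetical and the library's actual computation is not shown on this page.

```python
def retry_after_seconds(
    available: int, requested: int, refill_amount: int, refill_period_seconds: int
) -> float:
    """Seconds until `requested` tokens are available, or 0.0 if they already are."""
    deficit = requested - available
    if deficit <= 0:
        return 0.0
    # The deficit refills at refill_amount / refill_period_seconds tokens per second.
    return deficit * refill_period_seconds / refill_amount


wait = retry_after_seconds(available=20, requested=50, refill_amount=100, refill_period_seconds=60)
exceeded = wait > 0
print(f"exceeded={exceeded}, retry_after_seconds={wait:.1f}")
```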
BucketState¶
BucketState
dataclass
¶
BucketState(entity_id, resource, limit_name, tokens_milli, last_refill_ms, capacity_milli, refill_amount_milli, refill_period_ms, total_consumed_milli=None)
Internal state of a token bucket.
All token values are stored in millitokens (x1000) for precision.
Note: This is an internal model. Validation of user-provided inputs is performed in from_limit() rather than in `__post_init__`, which supports DynamoDB deserialization and avoids performance overhead on frequent operations.
from_limit
classmethod
¶
Create a new bucket at full capacity from a Limit.
Note: This is an internal factory method. Validation of entity_id and resource is performed at the API boundary (RateLimiter public methods) before calling this method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| entity_id | str | Entity identifier (pre-validated by caller) | required |
| resource | str | Resource name (pre-validated by caller) | required |
| limit | Limit | Limit configuration (validated via `__post_init__`) | required |
| now_ms | int | Current time in milliseconds | required |
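As an illustration of "full capacity" combined with millitoken storage, the sketch below builds the BucketState field values from a limit's plain integers. The function name is hypothetical, and the x1000 scaling of the refill fields and period is an assumption inferred from the field names (`_milli`, `_ms`), not a copy of the library code.

```python
def bucket_fields_from_limit(
    entity_id: str,
    resource: str,
    limit_name: str,
    capacity: int,
    refill_amount: int,
    refill_period_seconds: int,
    now_ms: int,
) -> dict:
    """Build BucketState-style fields for a bucket that starts full."""
    capacity_milli = capacity * 1000  # tokens -> millitokens
    return {
        "entity_id": entity_id,
        "resource": resource,
        "limit_name": limit_name,
        "tokens_milli": capacity_milli,  # a new bucket starts at full capacity
        "last_refill_ms": now_ms,
        "capacity_milli": capacity_milli,
        "refill_amount_milli": refill_amount * 1000,
        "refill_period_ms": refill_period_seconds * 1000,  # seconds -> milliseconds
    }


state = bucket_fields_from_limit(
    "user-123", "gpt-4", "rpm", 100, 100, 60, now_ms=1_700_000_000_000
)
print(state["tokens_milli"], state["refill_period_ms"])
```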
AuditEvent¶
AuditEvent
dataclass
¶
Security audit event for tracking modifications.
Audit events are logged for security-sensitive operations:

- Entity creation and deletion
- Limit configuration changes
Attributes:
| Name | Type | Description |
|---|---|---|
| event_id | str | Unique identifier for the event (timestamp-based) |
| timestamp | str | ISO timestamp when the event occurred |
| action | str | Type of action (see AuditAction constants) |
| entity_id | str | ID of the entity affected |
| principal | str \| None | Caller identity who performed the action (optional) |
| resource | str \| None | Resource name for limit-related actions (optional) |
| details | dict[str, Any] | Additional action-specific details |
AuditAction¶
AuditAction
¶
Audit action type constants.
UsageSnapshot¶
UsageSnapshot
dataclass
¶
Aggregated usage for a time window.
Created by the aggregator Lambda from DynamoDB stream events. Tracks token consumption per limit type within a time window.
Attributes:
| Name | Type | Description |
|---|---|---|
| entity_id | str | Entity that consumed tokens |
| resource | str | Resource being rate-limited (e.g., "gpt-4") |
| window_start | str | ISO timestamp of window start (e.g., "2024-01-01T14:00:00Z") |
| window_end | str | ISO timestamp of window end |
| window_type | str | Window granularity ("hourly", "daily") |
| counters | dict[str, int] | Consumption by limit name (e.g., {"tpm": 5000, "rpm": 10}) |
| total_events | int | Number of consumption events in this window |
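The fields above can be pictured with a small aggregation sketch. It assumes hourly windows are aligned to the top of the hour in UTC and that window_end is the start of the next hour; the event tuples and the `hourly_window` helper are hypothetical and do not reflect the aggregator Lambda's actual input format.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def hourly_window(ts: datetime) -> tuple:
    """Return ISO start and end timestamps of the UTC hour containing ts."""
    start = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end = start + timedelta(hours=1)  # assumption: end is the next hour boundary
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return start.strftime(fmt), end.strftime(fmt)


# Hypothetical consumption events: (timestamp, tokens consumed per limit name).
events = [
    (datetime(2024, 1, 1, 14, 5, tzinfo=timezone.utc), {"tpm": 3000, "rpm": 6}),
    (datetime(2024, 1, 1, 14, 40, tzinfo=timezone.utc), {"tpm": 2000, "rpm": 4}),
]

counters = Counter()
for _, consumed in events:
    counters.update(consumed)  # adds each limit's consumption to the running total

window_start, window_end = hourly_window(events[0][0])
total_events = len(events)
print(window_start, window_end, dict(counters), total_events)
```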
UsageSummary¶
UsageSummary
dataclass
¶
Aggregated usage summary across multiple snapshots.
Returned by RateLimiter.get_usage_summary() to provide
total and average consumption statistics over a time range.
Attributes:
| Name | Type | Description |
|---|---|---|
| snapshot_count | int | Number of snapshots aggregated |
| total | dict[str, int] | Sum of consumption by limit name (e.g., {"tpm": 50000, "rpm": 100}) |
| average | dict[str, float] | Average consumption per snapshot by limit name |
| min_window_start | str \| None | Earliest snapshot window start (ISO timestamp) |
| max_window_start | str \| None | Latest snapshot window start (ISO timestamp) |
Example
    summary = await limiter.get_usage_summary(
        entity_id="user-123",
        resource="gpt-4",
        window_type="hourly",
    )
    print(f"Total tokens: {summary.total.get('tpm', 0)}")
    print(f"Average per hour: {summary.average.get('tpm', 0.0):.1f}")
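For intuition about how total and average relate to the underlying snapshots, here is a minimal sketch. The `summarize` helper is hypothetical, and it assumes the average divides each limit's total by the number of snapshots, which matches the "per snapshot" wording above but is not confirmed against the library source.

```python
def summarize(snapshots: list) -> tuple:
    """Sum counters across snapshots and average them per snapshot."""
    total = {}
    for counters in snapshots:
        for limit_name, consumed in counters.items():
            total[limit_name] = total.get(limit_name, 0) + consumed
    count = len(snapshots)
    # Avoid dividing by zero when there are no snapshots in the range.
    average = {name: value / count for name, value in total.items()} if count else {}
    return total, average


# Each dict plays the role of one snapshot's counters.
total, average = summarize([{"tpm": 5000, "rpm": 10}, {"tpm": 3000, "rpm": 6}])
print(total, average)
```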
LimiterInfo¶
LimiterInfo
dataclass
¶
LimiterInfo(stack_name, user_name, region, stack_status, creation_time, last_updated_time=None, version=None, lambda_version=None, schema_version=None, stack_type=None)
Information about a deployed rate limiter instance.
Represents a CloudFormation stack discovered in a region via
RateLimiter.list_deployed() or the zae-limiter list CLI command.
This is a READ-ONLY model describing observed infrastructure state.
Example
    # Discover all limiters in us-east-1
    limiters = await RateLimiter.list_deployed(region="us-east-1")
    for limiter in limiters:
        if limiter.is_failed:
            print(f"⚠️ {limiter.user_name}: {limiter.stack_status}")
Attributes:
| Name | Type | Description |
|---|---|---|
| stack_name | str | Full CloudFormation stack name (e.g., "my-app") |
| user_name | str | User-friendly name (e.g., "my-app") |
| region | str | AWS region where the stack is deployed |
| stack_status | str | CloudFormation stack status (e.g., "CREATE_COMPLETE") |
| creation_time | str | ISO 8601 timestamp of stack creation |
| last_updated_time | str \| None | ISO 8601 timestamp of last update (None if never updated) |
| version | str \| None | Value of zae-limiter:version tag (client version at deployment) |
| lambda_version | str \| None | Value of zae-limiter:lambda-version tag |
| schema_version | str \| None | Value of zae-limiter:schema-version tag |
StackOptions¶
StackOptions
dataclass
¶
StackOptions(snapshot_windows='hourly,daily', usage_retention_days=90, audit_retention_days=90, enable_aggregator=True, pitr_recovery_days=None, log_retention_days=30, lambda_timeout=60, lambda_memory=256, enable_alarms=True, alarm_sns_topic=None, lambda_duration_threshold_pct=80, permission_boundary=None, role_name_format=None, policy_name_format=None, enable_audit_archival=True, audit_archive_glacier_days=90, enable_tracing=False, create_iam_roles=False, create_iam=True, aggregator_role_arn=None, enable_deletion_protection=False, tags=None)
Configuration options for CloudFormation stack creation and updates.
When passed to RateLimiter constructor, triggers automatic stack creation. When None is passed (default), no stack creation is attempted.
Attributes:
| Name | Type | Description |
|---|---|---|
| snapshot_windows | str | Comma-separated list of snapshot windows (e.g., "hourly,daily") |
| usage_retention_days | int | Number of days to retain usage snapshots |
| audit_retention_days | int | Number of days to retain audit records in DynamoDB |
| enable_aggregator | bool | Deploy Lambda aggregator for usage snapshots |
| pitr_recovery_days | int \| None | Point-in-Time Recovery period (1-35, None for AWS default) |
| log_retention_days | int | CloudWatch log retention period in days (must be valid CloudWatch value) |
| lambda_timeout | int | Lambda timeout in seconds (1-900) |
| lambda_memory | int | Lambda memory size in MB (128-3008) |
| enable_alarms | bool | Deploy CloudWatch alarms for monitoring |
| alarm_sns_topic | str \| None | SNS topic ARN for alarm notifications |
| lambda_duration_threshold_pct | int | Duration alarm threshold as percentage of timeout (1-100) |
| permission_boundary | str \| None | IAM permission boundary (policy name or full ARN) |
| role_name_format | str \| None | Format template for role name, {} = default role name |
| policy_name_format | str \| None | Format template for managed policy name, {} = default policy name |
| enable_audit_archival | bool | Archive expired audit events to S3 via TTL |
| audit_archive_glacier_days | int | Days before transitioning archives to Glacier IR (1-3650) |
| enable_tracing | bool | Enable AWS X-Ray tracing for Lambda aggregator |
| create_iam_roles | bool | Create App/Admin/ReadOnly IAM roles (default: False). Managed policies are always created unless create_iam=False. |
| create_iam | bool | Create IAM resources (policies and roles). Set to False for restricted IAM environments (e.g., PowerUserAccess). When False, aggregator is disabled unless aggregator_role_arn is provided. |
| aggregator_role_arn | str \| None | ARN of an existing IAM role for the Lambda aggregator. Use this when deploying without iam:CreateRole permissions. |
| enable_deletion_protection | bool | Enable DynamoDB table deletion protection |
| tags | dict[str, str] \| None | User-defined tags to apply to the CloudFormation stack. Dict of key-value pairs. AWS tag constraints apply (max 50 total including managed tags, key 1-128 chars, value 0-256 chars). The |
get_role_name
¶
Get the final role name for a given stack name and component.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stack_name | str | Stack name | required |
| component | str | Role component (aggr, app, admin, read) | required |

Returns:
| Type | Description |
|---|---|
| str \| None | Final role name, or None if role_name_format not set |

Raises:
| Type | Description |
|---|---|
| ValidationError | If resulting name exceeds 64 characters |
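The template substitution and length check described for get_role_name can be sketched as follows. This is an illustration under assumptions: `sketch_role_name` is a hypothetical helper, the default role name is passed in rather than derived (this page does not state how the library builds it), and ValueError stands in for the library's ValidationError.

```python
from __future__ import annotations


def sketch_role_name(
    role_name_format: str | None, default_name: str, max_length: int = 64
) -> str | None:
    """Apply a '{}' template to a default role name, enforcing a length limit."""
    if role_name_format is None:
        return None
    name = role_name_format.replace("{}", default_name)
    if len(name) > max_length:
        # The library raises ValidationError here; ValueError keeps the sketch self-contained.
        raise ValueError(f"role name exceeds {max_length} characters: {name!r}")
    return name


# "my-app-aggr" is a made-up default role name used only for illustration.
formatted = sketch_role_name("corp-{}-role", "my-app-aggr")
unset = sketch_role_name(None, "my-app-aggr")
print(formatted, unset)
```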
get_policy_name
¶
Get the final policy name for a given stack name and component.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stack_name | str | Stack name | required |
| component | str | Policy component (app, admin, read) | required |

Returns:
| Type | Description |
|---|---|
| str \| None | Final policy name, or None if policy_name_format not set |

Raises:
| Type | Description |
|---|---|
| ValidationError | If resulting name exceeds 128 characters |
to_parameters
¶
Convert to stack parameters dict for StackManager.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stack_name | str \| None | Stack name for role_name_format substitution | None |

Returns:
| Type | Description |
|---|---|
| dict[str, str] | Dict with snake_case keys matching stack_manager parameter mapping. |
Status¶
Status
dataclass
¶
Status(available, latency_ms, stack_status, table_status, aggregator_enabled, name, region, schema_version, lambda_version, client_version, table_item_count, table_size_bytes, app_role_arn=None, admin_role_arn=None, readonly_role_arn=None)
Comprehensive status of a rate limiter instance.
Consolidates connectivity, infrastructure, identity, versions, and table
metrics into a single status object. Used by the CLI status command.
Attributes:
| Name | Type | Description |
|---|---|---|
| available | bool | Whether DynamoDB is reachable and responding |
| latency_ms | float \| None | Round-trip latency in milliseconds (None if unavailable) |
| stack_status | str \| None | CloudFormation stack status (e.g., 'CREATE_COMPLETE') |
| table_status | str \| None | DynamoDB table status (e.g., 'ACTIVE') |
| aggregator_enabled | bool | Whether Lambda aggregator is deployed |
| name | str | Resource name |
| region | str \| None | AWS region (None if using default) |
| schema_version | str \| None | Deployed schema version |
| lambda_version | str \| None | Deployed Lambda version |
| client_version | str | Current client library version |
| table_item_count | int \| None | Approximate item count in table |
| table_size_bytes | int \| None | Approximate table size in bytes |
| app_role_arn | str \| None | IAM role ARN for applications (None if roles disabled) |
| admin_role_arn | str \| None | IAM role ARN for administrators (None if roles disabled) |
| readonly_role_arn | str \| None | IAM role ARN for read-only access (None if roles disabled) |
BackendCapabilities¶
BackendCapabilities
dataclass
¶
BackendCapabilities(supports_audit_logging=False, supports_usage_snapshots=False, supports_infrastructure_management=False, supports_change_streams=False, supports_batch_operations=False)
Declares which extended features a backend supports.
Used by RateLimiter to gracefully degrade when features are unavailable.
Backend implementations should return an instance from their capabilities
property.
See ADR-109 for the capability matrix across backends.
supports_audit_logging
class-attribute
instance-attribute
¶
Whether the backend supports audit event storage and retrieval.
supports_usage_snapshots
class-attribute
instance-attribute
¶
Whether the backend supports usage snapshot aggregation.
supports_infrastructure_management
class-attribute
instance-attribute
¶
Whether the backend supports declarative infrastructure (e.g., CloudFormation).
supports_change_streams
class-attribute
instance-attribute
¶
Whether the backend supports real-time change notifications.
supports_batch_operations
class-attribute
instance-attribute
¶
Whether the backend supports batch_get_buckets() for optimized reads.
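Graceful degradation based on these flags might look like the sketch below. `SketchCapabilities` is a local, hypothetical mirror of two of the flags, and `plan_reads` is an invented helper; neither reflects how RateLimiter is actually structured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchCapabilities:
    """Local mirror of two BackendCapabilities flags, for illustration only."""

    supports_audit_logging: bool = False
    supports_batch_operations: bool = False


def plan_reads(capabilities: SketchCapabilities, keys: list) -> list:
    """Group keys into one batched call when supported, else one call per key."""
    if capabilities.supports_batch_operations:
        return [keys]
    return [[key] for key in keys]


keys = ["rpm", "tpm"]
batched = plan_reads(SketchCapabilities(supports_batch_operations=True), keys)
fallback = plan_reads(SketchCapabilities(), keys)
print(batched, fallback)
```

Because every flag defaults to False, a backend that declares nothing is treated as supporting only the core operations.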
ResourceCapacity¶
ResourceCapacity
dataclass
¶
Aggregated capacity info for a resource across entities.
EntityCapacity¶
EntityCapacity
dataclass
¶
Capacity info for a single entity.
SpeculativeResult¶
SpeculativeResult
dataclass
¶
SpeculativeResult(success, buckets=list(), cascade=False, parent_id=None, old_buckets=None, parent_result=None, shard_id=0, shard_count=1, failure_reason=None)
Result of a speculative UpdateItem attempt.
Attributes:
| Name | Type | Description |
|---|---|---|
| success | bool | True if the speculative write succeeded. |
| buckets | list[BucketState] | On success, deserialized BucketStates from ALL_NEW response. Includes the |
| cascade | bool | On success, whether the entity has cascade enabled. |
| parent_id | str \| None | On success, the entity's parent_id (if any). |
| old_buckets | list[BucketState] \| None | On failure, deserialized BucketStates from ALL_OLD response. None if the bucket doesn't exist (first acquire). |
| parent_result | SpeculativeResult \| None | On cache hit + cascade, nested parent SpeculativeResult from parallel UpdateItem. None for cache miss, non-cascade, or failure. |
| shard_id | int | The shard index targeted by this speculative write. |
| shard_count | int | The total shard count read from the bucket item. Used by the limiter to decide whether to retry on another shard or double shards when the |