Models¶

Data models for rate limit configuration and status.

Limit¶

Limit `dataclass` ¶

Limit(name, capacity, refill_amount, refill_period_seconds)

Token bucket rate limit configuration.

Refill rate is stored as a fraction (refill_amount / refill_period_seconds) to avoid floating point precision issues.

Attributes:

Name	Type	Description
`name`	`str`	Unique identifier for this limit type (e.g., "rpm", "tpm")
`capacity`	`int`	Max tokens in the bucket (ceiling)
`refill_amount`	`int`	Numerator of refill rate
`refill_period_seconds`	`int`	Denominator of refill rate
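
The benefit of keeping the rate as a fraction can be shown with a minimal, self-contained sketch. `LimitSketch` and `tokens_refilled` are hypothetical names used only for illustration and are not part of the library; the sketch assumes nothing beyond the four fields listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitSketch:
    """Hypothetical stand-in for Limit; field names follow the table above."""

    name: str
    capacity: int
    refill_amount: int
    refill_period_seconds: int

    @property
    def refill_rate(self) -> float:
        # Float form of the fraction, suitable for display/debugging only.
        return self.refill_amount / self.refill_period_seconds


def tokens_refilled(limit: LimitSketch, elapsed_seconds: int) -> int:
    """Whole tokens earned after elapsed_seconds, using integer math only."""
    return elapsed_seconds * limit.refill_amount // limit.refill_period_seconds


rpm = LimitSketch("rpm", capacity=100, refill_amount=100, refill_period_seconds=60)
print(f"{rpm.refill_rate:.4f}")   # 1.6667 (rounded for display)
print(tokens_refilled(rpm, 60))   # 100
print(tokens_refilled(rpm, 90))   # 150
```

Because the numerator and denominator stay integers, a rate such as 100 tokens per 60 seconds never has to be stored as the repeating decimal 1.666...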

refill_rate `property` ¶

refill_rate

Tokens per second (for display/debugging).

per_second `classmethod` ¶

per_second(name, rate, burst=None)

Create a limit that refills `rate` tokens per second.

Parameters:

Name	Type	Description	Default
`name`	`str`	Limit name (e.g., "rps")	required
`rate`	`int`	Sustained tokens per second (also the refill amount)	required
`burst`	`int \| None`	Optional burst ceiling. When set, `capacity` is `burst` and `refill_amount` is `rate`, allowing temporary spikes above the sustained rate.	`None`

per_minute `classmethod` ¶

per_minute(name, rate, burst=None)

Create a limit that refills `rate` tokens per minute.

Parameters:

Name	Type	Description	Default
`name`	`str`	Limit name (e.g., "rpm", "tpm")	required
`rate`	`int`	Sustained tokens per minute (also the refill amount)	required
`burst`	`int \| None`	Optional burst ceiling. When set, `capacity` is `burst` and `refill_amount` is `rate`, allowing temporary spikes above the sustained rate.	`None`

per_hour `classmethod` ¶

per_hour(name, rate, burst=None)

Create a limit that refills `rate` tokens per hour.

Parameters:

Name	Type	Description	Default
`name`	`str`	Limit name (e.g., "rph")	required
`rate`	`int`	Sustained tokens per hour (also the refill amount)	required
`burst`	`int \| None`	Optional burst ceiling. When set, `capacity` is `burst` and `refill_amount` is `rate`, allowing temporary spikes above the sustained rate.	`None`

per_day `classmethod` ¶

per_day(name, rate, burst=None)

Create a limit that refills `rate` tokens per day.

Parameters:

Name	Type	Description	Default
`name`	`str`	Limit name (e.g., "rpd")	required
`rate`	`int`	Sustained tokens per day (also the refill amount)	required
`burst`	`int \| None`	Optional burst ceiling. When set, `capacity` is `burst` and `refill_amount` is `rate`, allowing temporary spikes above the sustained rate.	`None`
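
The four period-based factories above differ only in the refill period. The following is a rough sketch of how their arguments could map onto the underlying fields. The helper `limit_fields` and the `PERIOD_SECONDS` table are hypothetical, and the fallback of `capacity = rate` when `burst` is `None` is an assumption, not documented behavior.

```python
from typing import Optional

# Seconds in each supported period (hypothetical lookup table).
PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def limit_fields(name: str, rate: int, period: str, burst: Optional[int] = None) -> dict:
    """Return the field values a period-based factory might produce."""
    return {
        "name": name,
        # Documented: capacity is burst when set. Assumed: rate otherwise.
        "capacity": burst if burst is not None else rate,
        "refill_amount": rate,
        "refill_period_seconds": PERIOD_SECONDS[period],
    }


# 100 requests per minute sustained, with bursts of up to 500.
print(limit_fields("rpm", 100, "minute", burst=500))
```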

custom `classmethod` ¶

custom(name, capacity, refill_amount, refill_period_seconds)

Create a custom limit with explicit refill rate.

```python
# Allow ceiling of 1000 tokens, sustained at 100/sec
Limit.custom("requests", capacity=1000, refill_amount=100, refill_period_seconds=1)
```

to_dict ¶

to_dict()

Serialize to dictionary for storage.

from_dict `classmethod` ¶

from_dict(data)

Deserialize from dictionary.
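
A serialization round trip might look like the sketch below. `SerializableLimit` is a hypothetical stand-in, and the dictionary keys are assumed to mirror the field names; the library's actual storage keys are not documented here.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SerializableLimit:
    """Hypothetical stand-in for Limit with dict round-tripping."""

    name: str
    capacity: int
    refill_amount: int
    refill_period_seconds: int

    def to_dict(self) -> dict:
        # Assumption: keys are simply the field names.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "SerializableLimit":
        # Coerce numeric fields in case the store returned another numeric type.
        return cls(
            name=data["name"],
            capacity=int(data["capacity"]),
            refill_amount=int(data["refill_amount"]),
            refill_period_seconds=int(data["refill_period_seconds"]),
        )


original = SerializableLimit("tpm", 10000, 10000, 60)
restored = SerializableLimit.from_dict(original.to_dict())
print(restored == original)
```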

from_bucket_state `classmethod` ¶

from_bucket_state(state)

Reconstruct a Limit from BucketState fields.

Entity¶

Entity `dataclass` ¶

Entity(id, name=None, parent_id=None, cascade=False, metadata=dict(), created_at=None)

An entity that can have rate limits applied.

Entities can be parents (projects) or children (API keys). Children have a parent_id reference.

Note: This model does not validate in `__post_init__`, both to support DynamoDB deserialization and to avoid performance overhead. Validation is performed in `Repository.create_entity()` at the API boundary.

is_parent `property` ¶

is_parent

True if this entity has no parent (is a root/project).

is_child `property` ¶

is_child

True if this entity has a parent.
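
The parent/child relationship can be sketched with a small stand-in dataclass. `EntitySketch` is hypothetical; the field names follow the signature above, while the field types (for example `created_at` as a string) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntitySketch:
    """Hypothetical stand-in for Entity."""

    id: str
    name: Optional[str] = None
    parent_id: Optional[str] = None
    cascade: bool = False
    metadata: dict = field(default_factory=dict)  # fresh dict per instance
    created_at: Optional[str] = None  # type is an assumption

    @property
    def is_parent(self) -> bool:
        # A root entity (e.g., a project) has no parent reference.
        return self.parent_id is None

    @property
    def is_child(self) -> bool:
        # A child (e.g., an API key) points at its parent.
        return self.parent_id is not None


project = EntitySketch(id="proj-1", name="Project One")
api_key = EntitySketch(id="key-1", parent_id="proj-1", cascade=True)
print(project.is_parent, api_key.is_child)
```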

LimitStatus¶

LimitStatus `dataclass` ¶

LimitStatus(entity_id, resource, limit_name, limit, available, requested, exceeded, retry_after_seconds)

Status of a specific limit check.

Returned in RateLimitExceeded to provide full visibility into all limits that were checked.

Note: This is an internal model created by the limiter from validated inputs. No validation is performed here to avoid performance overhead.

deficit `property` ¶

deficit

How many tokens short we are (0 if not exceeded).
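
One plausible reading of `deficit` is sketched below. The formula is an assumption (the shortfall between `requested` and `available`, floored at zero), and `deficit_sketch` is a hypothetical helper rather than the library's implementation.

```python
def deficit_sketch(requested: int, available: int) -> int:
    """Tokens short of the request; 0 when the request fits."""
    return max(0, requested - available)


print(deficit_sketch(requested=50, available=20))
print(deficit_sketch(requested=10, available=20))
```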

BucketState¶

BucketState `dataclass` ¶

BucketState(entity_id, resource, limit_name, tokens_milli, last_refill_ms, capacity_milli, refill_amount_milli, refill_period_ms, total_consumed_milli=None)

Internal state of a token bucket.

All token values are stored in millitokens (x1000) for precision.

Note: This is an internal model. Validation of user-provided inputs is performed in `from_limit()` rather than in `__post_init__`, to support DynamoDB deserialization and avoid performance overhead on frequent operations.

tokens `property` ¶

tokens

Current tokens (not millitokens).

capacity `property` ¶

capacity

Capacity / ceiling (not millitokens).

from_limit `classmethod` ¶

from_limit(entity_id, resource, limit, now_ms)

Create a new bucket at full capacity from a Limit.

Note: This is an internal factory method. Validation of entity_id and resource is performed at the API boundary (RateLimiter public methods) before calling this method.

Parameters:

Name	Type	Description	Default
`entity_id`	`str`	Entity identifier (pre-validated by caller)	required
`resource`	`str`	Resource name (pre-validated by caller)	required
`limit`	`Limit`	Limit configuration (validated via `__post_init__`)	required
`now_ms`	`int`	Current time in milliseconds	required
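
The millitoken convention and the "full capacity" starting state can be sketched as follows. `BucketSketch` and `new_full_bucket` are hypothetical names; the x1000 scaling comes from the description above, while converting seconds to milliseconds for `refill_period_ms` and using integer division in the properties are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BucketSketch:
    """Hypothetical stand-in for BucketState (optional fields omitted)."""

    entity_id: str
    resource: str
    limit_name: str
    tokens_milli: int
    last_refill_ms: int
    capacity_milli: int
    refill_amount_milli: int
    refill_period_ms: int

    @property
    def tokens(self) -> int:
        # Whole tokens; integer division is an assumption.
        return self.tokens_milli // 1000

    @property
    def capacity(self) -> int:
        return self.capacity_milli // 1000


def new_full_bucket(entity_id: str, resource: str, limit_name: str, capacity: int,
                    refill_amount: int, refill_period_seconds: int,
                    now_ms: int) -> BucketSketch:
    """Create a bucket that starts with tokens equal to its capacity."""
    return BucketSketch(
        entity_id=entity_id,
        resource=resource,
        limit_name=limit_name,
        tokens_milli=capacity * 1000,         # starts full
        last_refill_ms=now_ms,
        capacity_milli=capacity * 1000,
        refill_amount_milli=refill_amount * 1000,
        refill_period_ms=refill_period_seconds * 1000,
    )


bucket = new_full_bucket("user-123", "gpt-4", "rpm", 100, 100, 60, now_ms=1_700_000_000_000)
print(bucket.tokens, bucket.capacity, bucket.refill_period_ms)
```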

AuditEvent¶

AuditEvent `dataclass` ¶

AuditEvent(event_id, timestamp, action, entity_id, principal=None, resource=None, details=dict())

Security audit event for tracking modifications.

Audit events are logged for security-sensitive operations:

- Entity creation and deletion
- Limit configuration changes

Attributes:

Name	Type	Description
`event_id`	`str`	Unique identifier for the event (timestamp-based)
`timestamp`	`str`	ISO timestamp when the event occurred
`action`	`str`	Type of action (see AuditAction constants)
`entity_id`	`str`	ID of the entity affected
`principal`	`str \| None`	Caller identity who performed the action (optional)
`resource`	`str \| None`	Resource name for limit-related actions (optional)
`details`	`dict[str, Any]`	Additional action-specific details

to_dict ¶

to_dict()

Serialize to dictionary for storage.

from_dict `classmethod` ¶

from_dict(data)

Deserialize from dictionary.

AuditAction¶

AuditAction ¶

Audit action type constants.

UsageSnapshot¶

UsageSnapshot `dataclass` ¶

UsageSnapshot(entity_id, resource, window_start, window_end, window_type, counters, total_events)

Aggregated usage for a time window.

Created by the aggregator Lambda from DynamoDB stream events. Tracks token consumption per limit type within a time window.

Attributes:

Name	Type	Description
`entity_id`	`str`	Entity that consumed tokens
`resource`	`str`	Resource being rate-limited (e.g., "gpt-4")
`window_start`	`str`	ISO timestamp of window start (e.g., "2024-01-01T14:00:00Z")
`window_end`	`str`	ISO timestamp of window end
`window_type`	`str`	Window granularity ("hourly", "daily")
`counters`	`dict[str, int]`	Consumption by limit name (e.g., {"tpm": 5000, "rpm": 10})
`total_events`	`int`	Number of consumption events in this window

UsageSummary¶

UsageSummary `dataclass` ¶

UsageSummary(snapshot_count, total, average, min_window_start, max_window_start)

Aggregated usage summary across multiple snapshots.

Returned by RateLimiter.get_usage_summary() to provide total and average consumption statistics over a time range.

Attributes:

Name	Type	Description
`snapshot_count`	`int`	Number of snapshots aggregated
`total`	`dict[str, int]`	Sum of consumption by limit name (e.g., {"tpm": 50000, "rpm": 100})
`average`	`dict[str, float]`	Average consumption per snapshot by limit name
`min_window_start`	`str \| None`	Earliest snapshot window start (ISO timestamp)
`max_window_start`	`str \| None`	Latest snapshot window start (ISO timestamp)

Example

```python
summary = await limiter.get_usage_summary(
    entity_id="user-123",
    resource="gpt-4",
    window_type="hourly",
)
print(f"Total tokens: {summary.total.get('tpm', 0)}")
print(f"Average per hour: {summary.average.get('tpm', 0.0):.1f}")
```
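
To make the `total` and `average` fields concrete, here is a minimal sketch of how a summary could be derived from a list of snapshots. The function `summarize` and the plain-dict snapshot shape are hypothetical; treating a limit name that is missing from a snapshot as zero consumption is also an assumption.

```python
from collections import Counter


def summarize(snapshots: list) -> dict:
    """Aggregate per-limit counters across snapshot dicts."""
    total = Counter()
    for snap in snapshots:
        total.update(snap["counters"])  # adds counts per limit name

    count = len(snapshots)
    average = {name: value / count for name, value in total.items()} if count else {}
    # ISO timestamps in one consistent format sort correctly as strings.
    starts = [snap["window_start"] for snap in snapshots]
    return {
        "snapshot_count": count,
        "total": dict(total),
        "average": average,
        "min_window_start": min(starts) if starts else None,
        "max_window_start": max(starts) if starts else None,
    }


snapshots = [
    {"window_start": "2024-01-01T14:00:00Z", "counters": {"tpm": 5000, "rpm": 10}},
    {"window_start": "2024-01-01T15:00:00Z", "counters": {"tpm": 3000, "rpm": 6}},
]
result = summarize(snapshots)
print(result["total"], result["average"])
```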

LimiterInfo¶

LimiterInfo `dataclass` ¶

LimiterInfo(stack_name, user_name, region, stack_status, creation_time, last_updated_time=None, version=None, lambda_version=None, schema_version=None, stack_type=None)

Information about a deployed rate limiter instance.

Represents a CloudFormation stack discovered in a region via `RateLimiter.list_deployed()` or the `zae-limiter list` CLI command. This is a read-only model describing observed infrastructure state.

Example

```python
# Discover all limiters in us-east-1
limiters = await RateLimiter.list_deployed(region="us-east-1")
for limiter in limiters:
    if limiter.is_failed:
        print(f"⚠️ {limiter.user_name}: {limiter.stack_status}")
```

Attributes:

Name	Type	Description
`stack_name`	`str`	Full CloudFormation stack name (e.g., "my-app")
`user_name`	`str`	User-friendly name (e.g., "my-app")
`region`	`str`	AWS region where the stack is deployed
`stack_status`	`str`	CloudFormation stack status (e.g., "CREATE_COMPLETE")
`creation_time`	`str`	ISO 8601 timestamp of stack creation
`last_updated_time`	`str \| None`	ISO 8601 timestamp of last update (None if never updated)
`version`	`str \| None`	Value of zae-limiter:version tag (client version at deployment)
`lambda_version`	`str \| None`	Value of zae-limiter:lambda-version tag
`schema_version`	`str \| None`	Value of zae-limiter:schema-version tag

is_healthy `property` ¶

is_healthy

Stack is in a stable, operational state.

is_in_progress `property` ¶

is_in_progress

Stack operation is in progress.

is_failed `property` ¶

is_failed

Stack is in a failed or rollback state.
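
The three status properties partition CloudFormation status strings into broad groups. The exact sets the library uses are not documented here, so the sketch below is only one plausible classification; `classify_stack_status` and its matching rules are assumptions.

```python
def classify_stack_status(status: str) -> str:
    """Map a CloudFormation stack status to a coarse health bucket."""
    # Check in-progress first so UPDATE_ROLLBACK_IN_PROGRESS is not
    # reported as a settled failure.
    if status.endswith("_IN_PROGRESS"):
        return "in_progress"
    if status.endswith("_FAILED") or "ROLLBACK" in status:
        return "failed"
    if status in {"CREATE_COMPLETE", "UPDATE_COMPLETE"}:
        return "healthy"
    return "other"


for status in ("CREATE_COMPLETE", "UPDATE_IN_PROGRESS", "ROLLBACK_COMPLETE"):
    print(status, "->", classify_stack_status(status))
```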

StackOptions¶

StackOptions `dataclass` ¶

StackOptions(snapshot_windows='hourly,daily', usage_retention_days=90, audit_retention_days=90, enable_aggregator=True, pitr_recovery_days=None, log_retention_days=30, lambda_timeout=60, lambda_memory=256, enable_alarms=True, alarm_sns_topic=None, lambda_duration_threshold_pct=80, permission_boundary=None, role_name_format=None, policy_name_format=None, enable_audit_archival=True, audit_archive_glacier_days=90, enable_tracing=False, create_iam_roles=False, create_iam=True, aggregator_role_arn=None, enable_deletion_protection=False, tags=None)

Configuration options for CloudFormation stack creation and updates.

When passed to the `RateLimiter` constructor, these options trigger automatic stack creation. When `None` is passed (the default), no stack creation is attempted.

Attributes:

Name	Type	Description
`snapshot_windows`	`str`	Comma-separated list of snapshot windows (e.g., "hourly,daily")
`usage_retention_days`	`int`	Number of days to retain usage snapshots
`audit_retention_days`	`int`	Number of days to retain audit records in DynamoDB
`enable_aggregator`	`bool`	Deploy Lambda aggregator for usage snapshots
`pitr_recovery_days`	`int \| None`	Point-in-Time Recovery period (1-35, None for AWS default)
`log_retention_days`	`int`	CloudWatch log retention period in days (must be valid CloudWatch value)
`lambda_timeout`	`int`	Lambda timeout in seconds (1-900)
`lambda_memory`	`int`	Lambda memory size in MB (128-3008)
`enable_alarms`	`bool`	Deploy CloudWatch alarms for monitoring
`alarm_sns_topic`	`str \| None`	SNS topic ARN for alarm notifications
`lambda_duration_threshold_pct`	`int`	Duration alarm threshold as percentage of timeout (1-100)
`permission_boundary`	`str \| None`	IAM permission boundary (policy name or full ARN)
`role_name_format`	`str \| None`	Format template for role name, {} = default role name
`policy_name_format`	`str \| None`	Format template for managed policy name, {} = default policy name
`enable_audit_archival`	`bool`	Archive expired audit events to S3 via TTL
`audit_archive_glacier_days`	`int`	Days before transitioning archives to Glacier IR (1-3650)
`enable_tracing`	`bool`	Enable AWS X-Ray tracing for Lambda aggregator
`create_iam_roles`	`bool`	Create App/Admin/ReadOnly IAM roles (default: False). Managed policies are always created unless create_iam=False.
`create_iam`	`bool`	Create IAM resources (policies and roles). Set to False for restricted IAM environments (e.g., PowerUserAccess). When False, aggregator is disabled unless aggregator_role_arn is provided.
`aggregator_role_arn`	`str \| None`	ARN of an existing IAM role for the Lambda aggregator. Use this when deploying without iam:CreateRole permissions.
`enable_deletion_protection`	`bool`	Enable DynamoDB table deletion protection
`tags`	`dict[str, str] \| None`	User-defined tags to apply to the CloudFormation stack. Dict of key-value pairs. AWS tag constraints apply (max 50 total including managed tags, key 1-128 chars, value 0-256 chars). The `aws:` prefix is reserved.

__post_init__ ¶

__post_init__()

Validate options and emit deprecation warning.
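
The numeric ranges documented in the attributes table lend themselves to a simple table-driven check. The sketch below is illustrative only: `RANGES` and `validate_ranges` are hypothetical, the real validation may cover more fields, and it may raise an exception instead of collecting messages.

```python
# Inclusive bounds taken from the attribute descriptions above.
RANGES = {
    "pitr_recovery_days": (1, 35),
    "lambda_timeout": (1, 900),
    "lambda_memory": (128, 3008),
    "lambda_duration_threshold_pct": (1, 100),
    "audit_archive_glacier_days": (1, 3650),
}


def validate_ranges(options: dict) -> list:
    """Return a list of error messages for out-of-range values."""
    errors = []
    for key, (low, high) in RANGES.items():
        value = options.get(key)
        if value is None:
            continue  # unset or None (e.g., pitr_recovery_days) is allowed
        if not low <= value <= high:
            errors.append(f"{key} must be between {low} and {high}, got {value}")
    return errors


print(validate_ranges({"lambda_timeout": 60, "lambda_memory": 256}))
print(validate_ranges({"lambda_timeout": 1200, "pitr_recovery_days": None}))
```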

get_role_name ¶

get_role_name(stack_name, component)

Get the final role name for a given stack name and component.

Parameters:

Name	Type	Description	Default
`stack_name`	`str`	Stack name	required
`component`	`str`	Role component (aggr, app, admin, read)	required

Returns:

Type	Description
`str \| None`	Final role name, or None if role_name_format not set

Raises:

Type	Description
`ValidationError`	If resulting name exceeds 64 characters

get_policy_name ¶

get_policy_name(stack_name, component)

Get the final policy name for a given stack name and component.

Parameters:

Name	Type	Description	Default
`stack_name`	`str`	Stack name	required
`component`	`str`	Policy component (app, admin, read)	required

Returns:

Type	Description
`str \| None`	Final policy name, or None if policy_name_format not set

Raises:

Type	Description
`ValidationError`	If resulting name exceeds 128 characters
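
Both `get_role_name` and `get_policy_name` follow the same pattern: substitute a default name into the `{}` placeholder, return `None` when no format is set, and reject results that are too long. A minimal sketch of that pattern follows. `apply_name_format` is hypothetical, the default name `"my-app-app"` is made up for the example, and `ValueError` stands in for the library's `ValidationError`.

```python
from typing import Optional


def apply_name_format(name_format: Optional[str], default_name: str,
                      max_length: int) -> Optional[str]:
    """Substitute default_name into the template and enforce a length cap."""
    if name_format is None:
        return None  # no custom format configured
    final_name = name_format.replace("{}", default_name)
    if len(final_name) > max_length:
        raise ValueError(
            f"name {final_name!r} is {len(final_name)} characters; limit is {max_length}"
        )
    return final_name


# Role names are capped at 64 characters, policy names at 128.
print(apply_name_format("corp-{}", "my-app-app", max_length=64))
print(apply_name_format(None, "my-app-app", max_length=128))
```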

to_parameters ¶

to_parameters(stack_name=None)

Convert to stack parameters dict for StackManager.

Parameters:

Name	Type	Description	Default
`stack_name`	`str \| None`	Stack name for role_name_format substitution	`None`

Returns:

Type	Description
`dict[str, str]`	Dict with snake_case keys matching stack_manager parameter mapping.

Status¶

Status `dataclass` ¶

Status(available, latency_ms, stack_status, table_status, aggregator_enabled, name, region, schema_version, lambda_version, client_version, table_item_count, table_size_bytes, app_role_arn=None, admin_role_arn=None, readonly_role_arn=None)

Comprehensive status of a rate limiter instance.

Consolidates connectivity, infrastructure, identity, versions, and table metrics into a single status object. Used by the CLI status command.

Attributes:

Name	Type	Description
`available`	`bool`	Whether DynamoDB is reachable and responding
`latency_ms`	`float \| None`	Round-trip latency in milliseconds (None if unavailable)
`stack_status`	`str \| None`	CloudFormation stack status (e.g., 'CREATE_COMPLETE')
`table_status`	`str \| None`	DynamoDB table status (e.g., 'ACTIVE')
`aggregator_enabled`	`bool`	Whether Lambda aggregator is deployed
`name`	`str`	Resource name
`region`	`str \| None`	AWS region (None if using default)
`schema_version`	`str \| None`	Deployed schema version
`lambda_version`	`str \| None`	Deployed Lambda version
`client_version`	`str`	Current client library version
`table_item_count`	`int \| None`	Approximate item count in table
`table_size_bytes`	`int \| None`	Approximate table size in bytes
`app_role_arn`	`str \| None`	IAM role ARN for applications (None if roles disabled)
`admin_role_arn`	`str \| None`	IAM role ARN for administrators (None if roles disabled)
`readonly_role_arn`	`str \| None`	IAM role ARN for read-only access (None if roles disabled)

BackendCapabilities¶

BackendCapabilities `dataclass` ¶

BackendCapabilities(supports_audit_logging=False, supports_usage_snapshots=False, supports_infrastructure_management=False, supports_change_streams=False, supports_batch_operations=False)

Declares which extended features a backend supports.

Used by RateLimiter to gracefully degrade when features are unavailable. Backend implementations should return an instance from their capabilities property.

See ADR-109 for the capability matrix across backends.

supports_audit_logging `class-attribute` `instance-attribute` ¶

supports_audit_logging = False

Whether the backend supports audit event storage and retrieval.

supports_usage_snapshots `class-attribute` `instance-attribute` ¶

supports_usage_snapshots = False

Whether the backend supports usage snapshot aggregation.

supports_infrastructure_management `class-attribute` `instance-attribute` ¶

supports_infrastructure_management = False

Whether the backend supports declarative infrastructure (e.g., CloudFormation).

supports_change_streams `class-attribute` `instance-attribute` ¶

supports_change_streams = False

Whether the backend supports real-time change notifications.

supports_batch_operations `class-attribute` `instance-attribute` ¶

supports_batch_operations = False

Whether the backend supports batch_get_buckets() for optimized reads.
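
Graceful degradation based on capability flags might look like the sketch below. `CapabilitiesSketch` mirrors the documented flags, but `fetch_audit_events` and its choice to return an empty list when the feature is unavailable are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CapabilitiesSketch:
    """Hypothetical stand-in for BackendCapabilities; all flags default to False."""

    supports_audit_logging: bool = False
    supports_usage_snapshots: bool = False
    supports_infrastructure_management: bool = False
    supports_change_streams: bool = False
    supports_batch_operations: bool = False


def fetch_audit_events(capabilities: CapabilitiesSketch,
                       load_events: Callable[[], list]) -> list:
    """Only touch the backend when it declares audit support."""
    if not capabilities.supports_audit_logging:
        return []  # degrade quietly instead of failing
    return load_events()


minimal_backend = CapabilitiesSketch()
full_backend = CapabilitiesSketch(supports_audit_logging=True)
print(fetch_audit_events(minimal_backend, lambda: ["entity_created"]))
print(fetch_audit_events(full_backend, lambda: ["entity_created"]))
```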

ResourceCapacity¶

ResourceCapacity `dataclass` ¶

ResourceCapacity(resource, limit_name, total_capacity, total_available, utilization_pct, entities)

Aggregated capacity info for a resource across entities.

EntityCapacity¶

EntityCapacity `dataclass` ¶

EntityCapacity(entity_id, capacity, available, utilization_pct)

Capacity info for a single entity.
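
The relationship between per-entity and aggregated capacity can be sketched as below. The `utilization_pct` formula (consumed share of capacity, as a percentage) is an assumption, as is returning 0.0 for a zero-capacity bucket; the tuples are made-up sample data.

```python
def utilization_pct(capacity: int, available: int) -> float:
    """Percentage of capacity currently consumed."""
    if capacity <= 0:
        return 0.0  # avoid division by zero; assumed convention
    return (capacity - available) / capacity * 100


# (entity_id, capacity, available) for two entities on the same resource.
entities = [("key-1", 100, 25), ("key-2", 100, 75)]

total_capacity = sum(capacity for _, capacity, _ in entities)
total_available = sum(available for _, _, available in entities)

for entity_id, capacity, available in entities:
    print(entity_id, utilization_pct(capacity, available))
print("resource", utilization_pct(total_capacity, total_available))
```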

SpeculativeResult¶

SpeculativeResult `dataclass` ¶

SpeculativeResult(success, buckets=list(), cascade=False, parent_id=None, old_buckets=None, parent_result=None, shard_id=0, shard_count=1, failure_reason=None)

Result of a speculative UpdateItem attempt.

Attributes:

Name	Type	Description
`success`	`bool`	True if the speculative write succeeded.
`buckets`	`list[BucketState]`	On success, deserialized BucketStates from ALL_NEW response. Includes the `wcu` infrastructure limit bucket state.
`cascade`	`bool`	On success, whether the entity has cascade enabled.
`parent_id`	`str \| None`	On success, the entity's parent_id (if any).
`old_buckets`	`list[BucketState] \| None`	On failure, deserialized BucketStates from ALL_OLD response. None if the bucket doesn't exist (first acquire).
`parent_result`	`SpeculativeResult \| None`	On cache hit + cascade, nested parent SpeculativeResult from parallel UpdateItem. None for cache miss, non-cascade, or failure.
`shard_id`	`int`	The shard index targeted by this speculative write.
`shard_count`	`int`	The total shard count read from the bucket item. Used by the limiter to decide whether to retry on another shard or double shards when the `wcu` limit is exhausted.

CacheStats¶

CacheStats `dataclass` ¶

CacheStats(hits=0, misses=0, size=0, ttl_seconds=0)

Statistics for cache performance monitoring.

as_dict ¶

as_dict()

Return stats as a dictionary.
