Models

Data models for rate limit configuration and status.

Limit

Limit dataclass

Limit(name, capacity, refill_amount, refill_period_seconds)

Token bucket rate limit configuration.

Refill rate is stored as a fraction (refill_amount / refill_period_seconds) to avoid floating point precision issues.

Attributes:

Name Type Description
name str

Unique identifier for this limit type (e.g., "rpm", "tpm")

capacity int

Max tokens in the bucket (ceiling)

refill_amount int

Numerator of refill rate

refill_period_seconds int

Denominator of refill rate

refill_rate property

refill_rate

Tokens per second (for display/debugging).
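The fraction-based refill rate described above can be illustrated with a minimal sketch. LimitSketch is a hypothetical stand-in, not the library's class; it mirrors the four documented fields and assumes that refill amounts are computed with integer arithmetic, while the float refill_rate is kept for display only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitSketch:
    """Hypothetical stand-in mirroring the documented Limit fields."""

    name: str
    capacity: int
    refill_amount: int
    refill_period_seconds: int

    @property
    def refill_rate(self) -> float:
        # Float view of the rate, intended for display/debugging only.
        return self.refill_amount / self.refill_period_seconds

    def tokens_refilled(self, elapsed_seconds: int) -> int:
        # Assumed approach: multiply by the numerator first, then
        # floor-divide by the denominator, so no float rounding is involved.
        return (elapsed_seconds * self.refill_amount) // self.refill_period_seconds


limit = LimitSketch("rpm", capacity=100, refill_amount=100, refill_period_seconds=60)
refilled_after_90s = limit.tokens_refilled(90)  # 90 * 100 // 60
```

Keeping numerator and denominator separate means a rate such as 100 tokens per 60 seconds never has to be represented as the repeating decimal 1.666...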

per_second classmethod

per_second(name, rate, burst=None)

Create a limit that refills rate tokens per second.

Parameters:

Name Type Description Default
name str

Limit name (e.g., "rps")

required
rate int

Sustained tokens per second (also the refill amount)

required
burst int | None

Optional burst ceiling. When set, capacity is burst and refill_amount is rate, allowing temporary spikes above the sustained rate.

None

per_minute classmethod

per_minute(name, rate, burst=None)

Create a limit that refills rate tokens per minute.

Parameters:

Name Type Description Default
name str

Limit name (e.g., "rpm", "tpm")

required
rate int

Sustained tokens per minute (also the refill amount)

required
burst int | None

Optional burst ceiling. When set, capacity is burst and refill_amount is rate, allowing temporary spikes above the sustained rate.

None

per_hour classmethod

per_hour(name, rate, burst=None)

Create a limit that refills rate tokens per hour.

Parameters:

Name Type Description Default
name str

Limit name (e.g., "rph")

required
rate int

Sustained tokens per hour (also the refill amount)

required
burst int | None

Optional burst ceiling. When set, capacity is burst and refill_amount is rate, allowing temporary spikes above the sustained rate.

None

per_day classmethod

per_day(name, rate, burst=None)

Create a limit that refills rate tokens per day.

Parameters:

Name Type Description Default
name str

Limit name (e.g., "rpd")

required
rate int

Sustained tokens per day (also the refill amount)

required
burst int | None

Optional burst ceiling. When set, capacity is burst and refill_amount is rate, allowing temporary spikes above the sustained rate.

None
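The four period helpers above differ only in the refill period. A minimal sketch of that mapping follows; LimitSketch and per_period are hypothetical names, the period lengths (1, 60, 3600, 86400 seconds) are the presumed values, and the fallback of capacity to rate when burst is omitted is an assumption not stated in the parameter tables.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LimitSketch:
    """Hypothetical stand-in mirroring the documented Limit fields."""

    name: str
    capacity: int
    refill_amount: int
    refill_period_seconds: int


def per_period(
    name: str, rate: int, period_seconds: int, burst: Optional[int] = None
) -> LimitSketch:
    # Documented: with burst set, capacity is burst and refill_amount is rate.
    # Assumed: without burst, capacity falls back to rate.
    capacity = burst if burst is not None else rate
    return LimitSketch(name, capacity, rate, period_seconds)


rps = per_period("rps", 10, 1)
rpm = per_period("rpm", 100, 60, burst=150)
rph = per_period("rph", 1_000, 3_600)
rpd = per_period("rpd", 10_000, 86_400)
```

With burst=150 and rate=100, a caller can spend up to 150 tokens at once, but the bucket still refills at only 100 tokens per minute.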

custom classmethod

custom(name, capacity, refill_amount, refill_period_seconds)

Create a custom limit with explicit refill rate.

Example

    # Allow ceiling of 1000 tokens, sustained at 100/sec
    Limit.custom("requests", capacity=1000, refill_amount=100, refill_period_seconds=1)

to_dict

to_dict()

Serialize to dictionary for storage.

from_dict classmethod

from_dict(data)

Deserialize from dictionary.

from_bucket_state classmethod

from_bucket_state(state)

Reconstruct a Limit from BucketState fields.

Entity

Entity dataclass

Entity(id, name=None, parent_id=None, cascade=False, metadata=dict(), created_at=None)

An entity that can have rate limits applied.

Entities can be parents (projects) or children (API keys). Children have a parent_id reference.

Note: This model does not validate in __post_init__, both to support DynamoDB deserialization and to avoid performance overhead. Validation is performed in Repository.create_entity() at the API boundary.

is_parent property

is_parent

True if this entity has no parent (is a root/project).

is_child property

is_child

True if this entity has a parent.
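The parent/child distinction above comes down to whether parent_id is set. A minimal sketch, using a hypothetical EntitySketch class with a subset of the documented fields and made-up identifiers:

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class EntitySketch:
    """Hypothetical stand-in with a subset of the documented Entity fields."""

    id: str
    name: Optional[str] = None
    parent_id: Optional[str] = None
    cascade: bool = False
    metadata: dict[str, Any] = field(default_factory=dict)

    @property
    def is_parent(self) -> bool:
        # A root entity (e.g. a project) has no parent reference.
        return self.parent_id is None

    @property
    def is_child(self) -> bool:
        # A child (e.g. an API key) points at its parent by id.
        return self.parent_id is not None


project = EntitySketch(id="project-1", name="Project One")
api_key = EntitySketch(id="key-abc", parent_id="project-1", cascade=True)
```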

LimitStatus

LimitStatus dataclass

LimitStatus(entity_id, resource, limit_name, limit, available, requested, exceeded, retry_after_seconds)

Status of a specific limit check.

Returned in RateLimitExceeded to provide full visibility into all limits that were checked.

Note: This is an internal model created by the limiter from validated inputs. No validation is performed here to avoid performance overhead.

deficit property

deficit

How many tokens short we are (0 if not exceeded).
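A minimal sketch of how such a deficit could be derived. The formula (requested minus available, floored at zero) is an assumption, and LimitStatusSketch is a hypothetical class carrying only the fields the calculation needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitStatusSketch:
    """Hypothetical stand-in with the fields needed for the deficit."""

    limit_name: str
    available: int
    requested: int
    exceeded: bool

    @property
    def deficit(self) -> int:
        # Assumed formula: the shortfall between requested and available
        # tokens, reported as 0 when the limit was not exceeded.
        if not self.exceeded:
            return 0
        return max(0, self.requested - self.available)


ok = LimitStatusSketch("rpm", available=40, requested=10, exceeded=False)
short = LimitStatusSketch("tpm", available=200, requested=500, exceeded=True)
```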

BucketState

BucketState dataclass

BucketState(entity_id, resource, limit_name, tokens_milli, last_refill_ms, capacity_milli, refill_amount_milli, refill_period_ms, total_consumed_milli=None)

Internal state of a token bucket.

All token values are stored in millitokens (x1000) for precision.

Note: This is an internal model. Validation of user-provided inputs is performed in from_limit() rather than in __post_init__, both to support DynamoDB deserialization and to avoid performance overhead on frequent operations.

tokens property

tokens

Current tokens (not millitokens).

capacity property

capacity

Capacity / ceiling (not millitokens).

from_limit classmethod

from_limit(entity_id, resource, limit, now_ms)

Create a new bucket at full capacity from a Limit.

Note: This is an internal factory method. Validation of entity_id and resource is performed at the API boundary (RateLimiter public methods) before calling this method.

Parameters:

Name Type Description Default
entity_id str

Entity identifier (pre-validated by caller)

required
resource str

Resource name (pre-validated by caller)

required
limit Limit

Limit configuration (validated via __post_init__)

required
now_ms int

Current time in milliseconds

required
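The millitoken scaling and the "full capacity" starting state can be sketched as follows. BucketSketch and bucket_from_limit are hypothetical names; the x1000 factor comes from the description above, while the floor division used when converting back to whole tokens is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BucketSketch:
    """Hypothetical stand-in for the millitoken fields of BucketState."""

    tokens_milli: int
    last_refill_ms: int
    capacity_milli: int
    refill_amount_milli: int
    refill_period_ms: int

    @property
    def tokens(self) -> int:
        # Whole tokens; floor division is an assumption of this sketch.
        return self.tokens_milli // 1000

    @property
    def capacity(self) -> int:
        return self.capacity_milli // 1000


def bucket_from_limit(
    capacity: int, refill_amount: int, refill_period_seconds: int, now_ms: int
) -> BucketSketch:
    # A new bucket starts full: tokens equal capacity, both scaled by 1000.
    return BucketSketch(
        tokens_milli=capacity * 1000,
        last_refill_ms=now_ms,
        capacity_milli=capacity * 1000,
        refill_amount_milli=refill_amount * 1000,
        refill_period_ms=refill_period_seconds * 1000,
    )


bucket = bucket_from_limit(100, 100, 60, now_ms=1_700_000_000_000)
```

Storing integers at 1/1000-token resolution lets partial refills accumulate between requests without resorting to floats.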

AuditEvent

AuditEvent dataclass

AuditEvent(event_id, timestamp, action, entity_id, principal=None, resource=None, details=dict())

Security audit event for tracking modifications.

Audit events are logged for security-sensitive operations:

- Entity creation and deletion
- Limit configuration changes

Attributes:

Name Type Description
event_id str

Unique identifier for the event (timestamp-based)

timestamp str

ISO timestamp when the event occurred

action str

Type of action (see AuditAction constants)

entity_id str

ID of the entity affected

principal str | None

Caller identity who performed the action (optional)

resource str | None

Resource name for limit-related actions (optional)

details dict[str, Any]

Additional action-specific details

to_dict

to_dict()

Serialize to dictionary for storage.

from_dict classmethod

from_dict(data)

Deserialize from dictionary.
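A round trip through to_dict() and from_dict() might look like the sketch below. AuditEventSketch is a hypothetical stand-in built on dataclasses.asdict; the library's actual storage format is not shown in this reference, and the action string is made up for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class AuditEventSketch:
    """Hypothetical stand-in mirroring the documented AuditEvent fields."""

    event_id: str
    timestamp: str
    action: str
    entity_id: str
    principal: Optional[str] = None
    resource: Optional[str] = None
    details: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        # asdict() copies nested containers, so the result is safe to mutate.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "AuditEventSketch":
        return cls(**data)


event = AuditEventSketch(
    event_id="evt-0001",
    timestamp="2024-01-01T14:00:00Z",
    action="entity_created",  # hypothetical action value
    entity_id="project-1",
    details={"name": "Project One"},
)
restored = AuditEventSketch.from_dict(event.to_dict())
```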

AuditAction

AuditAction

Audit action type constants.

UsageSnapshot

UsageSnapshot dataclass

UsageSnapshot(entity_id, resource, window_start, window_end, window_type, counters, total_events)

Aggregated usage for a time window.

Created by the aggregator Lambda from DynamoDB stream events. Tracks token consumption per limit type within a time window.

Attributes:

Name Type Description
entity_id str

Entity that consumed tokens

resource str

Resource being rate-limited (e.g., "gpt-4")

window_start str

ISO timestamp of window start (e.g., "2024-01-01T14:00:00Z")

window_end str

ISO timestamp of window end

window_type str

Window granularity ("hourly", "daily")

counters dict[str, int]

Consumption by limit name (e.g., {"tpm": 5000, "rpm": 10})

total_events int

Number of consumption events in this window

UsageSummary

UsageSummary dataclass

UsageSummary(snapshot_count, total, average, min_window_start, max_window_start)

Aggregated usage summary across multiple snapshots.

Returned by RateLimiter.get_usage_summary() to provide total and average consumption statistics over a time range.

Attributes:

Name Type Description
snapshot_count int

Number of snapshots aggregated

total dict[str, int]

Sum of consumption by limit name (e.g., {"tpm": 50000, "rpm": 100})

average dict[str, float]

Average consumption per snapshot by limit name

min_window_start str | None

Earliest snapshot window start (ISO timestamp)

max_window_start str | None

Latest snapshot window start (ISO timestamp)

Example

    summary = await limiter.get_usage_summary(
        entity_id="user-123",
        resource="gpt-4",
        window_type="hourly",
    )
    print(f"Total tokens: {summary.total.get('tpm', 0)}")
    print(f"Average per hour: {summary.average.get('tpm', 0.0):.1f}")
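How total and average relate to the underlying snapshot counters can be sketched with plain dictionaries. The summarize helper is hypothetical, and dividing by the snapshot count even when a snapshot lacks a given limit name is an assumption about how the average is defined.

```python
def summarize(
    snapshot_counters: list[dict[str, int]],
) -> tuple[dict[str, int], dict[str, float]]:
    """Sum counters per limit name and average them over all snapshots."""
    total: dict[str, int] = {}
    for counters in snapshot_counters:
        for name, value in counters.items():
            total[name] = total.get(name, 0) + value
    count = len(snapshot_counters)
    # Assumed: the divisor is the snapshot count, not the number of
    # snapshots that happen to contain the limit name.
    average = {name: value / count for name, value in total.items()} if count else {}
    return total, average


hourly = [{"tpm": 5000, "rpm": 10}, {"tpm": 3000, "rpm": 6}, {"tpm": 4000}]
total, average = summarize(hourly)
```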

LimiterInfo

LimiterInfo dataclass

LimiterInfo(stack_name, user_name, region, stack_status, creation_time, last_updated_time=None, version=None, lambda_version=None, schema_version=None, stack_type=None)

Information about a deployed rate limiter instance.

Represents a CloudFormation stack discovered in a region via RateLimiter.list_deployed() or the zae-limiter list CLI command. This is a READ-ONLY model describing observed infrastructure state.

Example

    # Discover all limiters in us-east-1
    limiters = await RateLimiter.list_deployed(region="us-east-1")
    for limiter in limiters:
        if limiter.is_failed:
            print(f"⚠️ {limiter.user_name}: {limiter.stack_status}")

Attributes:

Name Type Description
stack_name str

Full CloudFormation stack name (e.g., "my-app")

user_name str

User-friendly name (e.g., "my-app")

region str

AWS region where the stack is deployed

stack_status str

CloudFormation stack status (e.g., "CREATE_COMPLETE")

creation_time str

ISO 8601 timestamp of stack creation

last_updated_time str | None

ISO 8601 timestamp of last update (None if never updated)

version str | None

Value of zae-limiter:version tag (client version at deployment)

lambda_version str | None

Value of zae-limiter:lambda-version tag

schema_version str | None

Value of zae-limiter:schema-version tag

is_healthy property

is_healthy

Stack is in a stable, operational state.

is_in_progress property

is_in_progress

Stack operation is in progress.

is_failed property

is_failed

Stack is in a failed or rollback state.
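One plausible way to derive these three flags from a CloudFormation status string is sketched below. The status names are standard CloudFormation values, but the grouping is an assumption; the library's actual classification may differ, for example in how it treats ROLLBACK_IN_PROGRESS.

```python
def is_in_progress(stack_status: str) -> bool:
    # CloudFormation marks running operations with an _IN_PROGRESS suffix.
    return stack_status.endswith("_IN_PROGRESS")


def is_failed(stack_status: str) -> bool:
    # Assumed grouping: any FAILED or ROLLBACK status counts as failed.
    return "FAILED" in stack_status or "ROLLBACK" in stack_status


def is_healthy(stack_status: str) -> bool:
    # Assumed grouping: only cleanly completed create/update operations.
    return stack_status in {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


statuses = ["CREATE_COMPLETE", "UPDATE_IN_PROGRESS", "ROLLBACK_COMPLETE"]
report = {s: (is_healthy(s), is_in_progress(s), is_failed(s)) for s in statuses}
```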

StackOptions

StackOptions dataclass

StackOptions(snapshot_windows='hourly,daily', usage_retention_days=90, audit_retention_days=90, enable_aggregator=True, pitr_recovery_days=None, log_retention_days=30, lambda_timeout=60, lambda_memory=256, enable_alarms=True, alarm_sns_topic=None, lambda_duration_threshold_pct=80, permission_boundary=None, role_name_format=None, policy_name_format=None, enable_audit_archival=True, audit_archive_glacier_days=90, enable_tracing=False, create_iam_roles=False, create_iam=True, aggregator_role_arn=None, enable_deletion_protection=False, tags=None)

Configuration options for CloudFormation stack creation and updates.

When passed to the RateLimiter constructor, triggers automatic stack creation. When None is passed (the default), no stack creation is attempted.

Attributes:

Name Type Description
snapshot_windows str

Comma-separated list of snapshot windows (e.g., "hourly,daily")

usage_retention_days int

Number of days to retain usage snapshots

audit_retention_days int

Number of days to retain audit records in DynamoDB

enable_aggregator bool

Deploy Lambda aggregator for usage snapshots

pitr_recovery_days int | None

Point-in-Time Recovery period (1-35, None for AWS default)

log_retention_days int

CloudWatch log retention period in days (must be valid CloudWatch value)

lambda_timeout int

Lambda timeout in seconds (1-900)

lambda_memory int

Lambda memory size in MB (128-3008)

enable_alarms bool

Deploy CloudWatch alarms for monitoring

alarm_sns_topic str | None

SNS topic ARN for alarm notifications

lambda_duration_threshold_pct int

Duration alarm threshold as percentage of timeout (1-100)

permission_boundary str | None

IAM permission boundary (policy name or full ARN)

role_name_format str | None

Format template for role name, {} = default role name

policy_name_format str | None

Format template for managed policy name, {} = default policy name

enable_audit_archival bool

Archive expired audit events to S3 via TTL

audit_archive_glacier_days int

Days before transitioning archives to Glacier IR (1-3650)

enable_tracing bool

Enable AWS X-Ray tracing for Lambda aggregator

create_iam_roles bool

Create App/Admin/ReadOnly IAM roles (default: False). Managed policies are always created unless create_iam=False.

create_iam bool

Create IAM resources (policies and roles). Set to False for restricted IAM environments (e.g., PowerUserAccess). When False, aggregator is disabled unless aggregator_role_arn is provided.

aggregator_role_arn str | None

ARN of an existing IAM role for the Lambda aggregator. Use this when deploying without iam:CreateRole permissions.

enable_deletion_protection bool

Enable DynamoDB table deletion protection

tags dict[str, str] | None

User-defined tags to apply to the CloudFormation stack. Dict of key-value pairs. AWS tag constraints apply (max 50 total including managed tags, key 1-128 chars, value 0-256 chars). The aws: prefix is reserved.

__post_init__

__post_init__()

Validate options and emit deprecation warning.
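The numeric ranges listed in the attribute descriptions lend themselves to a table-driven check. The sketch below is a hypothetical helper, not the library's validation code; it collects messages instead of raising, and treats None as "use the default".

```python
from typing import Optional

# Ranges copied from the StackOptions attribute descriptions.
RANGES = {
    "lambda_timeout": (1, 900),
    "lambda_memory": (128, 3008),
    "pitr_recovery_days": (1, 35),
    "lambda_duration_threshold_pct": (1, 100),
    "audit_archive_glacier_days": (1, 3650),
}


def find_range_errors(options: dict[str, Optional[int]]) -> list[str]:
    """Return one message per option that falls outside its documented range."""
    errors = []
    for key, (low, high) in RANGES.items():
        value = options.get(key)
        if value is None:  # unset or None: nothing to check
            continue
        if not low <= value <= high:
            errors.append(f"{key} must be between {low} and {high}, got {value}")
    return errors


good = find_range_errors(
    {"lambda_timeout": 60, "lambda_memory": 256, "pitr_recovery_days": None}
)
bad = find_range_errors({"lambda_timeout": 1200, "lambda_memory": 64})
```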

get_role_name

get_role_name(stack_name, component)

Get the final role name for a given stack name and component.

Parameters:

Name Type Description Default
stack_name str

Stack name

required
component str

Role component (aggr, app, admin, read)

required

Returns:

Type Description
str | None

Final role name, or None if role_name_format not set

Raises:

Type Description
ValidationError

If resulting name exceeds 64 characters

get_policy_name

get_policy_name(stack_name, component)

Get the final policy name for a given stack name and component.

Parameters:

Name Type Description Default
stack_name str

Stack name

required
component str

Policy component (app, admin, read)

required

Returns:

Type Description
str | None

Final policy name, or None if policy_name_format not set

Raises:

Type Description
ValidationError

If resulting name exceeds 128 characters
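The naming rules for roles and policies can be sketched with a single helper. format_name is hypothetical: it substitutes the default name for {} in the template, returns None when no template is set, and enforces the documented length limit (64 for roles, 128 for policies). The default name "my-app-aggr" is made up, since how the library composes default names is not shown here, and ValueError stands in for the library's ValidationError.

```python
from typing import Optional


def format_name(
    name_format: Optional[str], default_name: str, max_length: int
) -> Optional[str]:
    """Apply a '{}' template to a default name and enforce a length limit."""
    if name_format is None:
        return None
    # Plain replacement avoids str.format errors on other braces (assumption).
    final = name_format.replace("{}", default_name)
    if len(final) > max_length:
        raise ValueError(f"name {final!r} exceeds {max_length} characters")
    return final


role = format_name("corp-{}-prod", "my-app-aggr", max_length=64)
policy = format_name("corp-{}", "my-app-read", max_length=128)
unset = format_name(None, "my-app-aggr", max_length=64)
```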

to_parameters

to_parameters(stack_name=None)

Convert to stack parameters dict for StackManager.

Parameters:

Name Type Description Default
stack_name str | None

Stack name for role_name_format substitution

None

Returns:

Type Description
dict[str, str]

Dict with snake_case keys matching stack_manager parameter mapping.

Status

Status dataclass

Status(available, latency_ms, stack_status, table_status, aggregator_enabled, name, region, schema_version, lambda_version, client_version, table_item_count, table_size_bytes, app_role_arn=None, admin_role_arn=None, readonly_role_arn=None)

Comprehensive status of a rate limiter instance.

Consolidates connectivity, infrastructure, identity, versions, and table metrics into a single status object. Used by the CLI status command.

Attributes:

Name Type Description
available bool

Whether DynamoDB is reachable and responding

latency_ms float | None

Round-trip latency in milliseconds (None if unavailable)

stack_status str | None

CloudFormation stack status (e.g., 'CREATE_COMPLETE')

table_status str | None

DynamoDB table status (e.g., 'ACTIVE')

aggregator_enabled bool

Whether Lambda aggregator is deployed

name str

Resource name

region str | None

AWS region (None if using default)

schema_version str | None

Deployed schema version

lambda_version str | None

Deployed Lambda version

client_version str

Current client library version

table_item_count int | None

Approximate item count in table

table_size_bytes int | None

Approximate table size in bytes

app_role_arn str | None

IAM role ARN for applications (None if roles disabled)

admin_role_arn str | None

IAM role ARN for administrators (None if roles disabled)

readonly_role_arn str | None

IAM role ARN for read-only access (None if roles disabled)

BackendCapabilities

BackendCapabilities dataclass

BackendCapabilities(supports_audit_logging=False, supports_usage_snapshots=False, supports_infrastructure_management=False, supports_change_streams=False, supports_batch_operations=False)

Declares which extended features a backend supports.

Used by RateLimiter to gracefully degrade when features are unavailable. Backend implementations should return an instance from their capabilities property.

See ADR-109 for the capability matrix across backends.

supports_audit_logging class-attribute instance-attribute

supports_audit_logging = False

Whether the backend supports audit event storage and retrieval.

supports_usage_snapshots class-attribute instance-attribute

supports_usage_snapshots = False

Whether the backend supports usage snapshot aggregation.

supports_infrastructure_management class-attribute instance-attribute

supports_infrastructure_management = False

Whether the backend supports declarative infrastructure (e.g., CloudFormation).

supports_change_streams class-attribute instance-attribute

supports_change_streams = False

Whether the backend supports real-time change notifications.

supports_batch_operations class-attribute instance-attribute

supports_batch_operations = False

Whether the backend supports batch_get_buckets() for optimized reads.
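Graceful degradation based on these flags might look like the following sketch. CapabilitiesSketch mirrors the documented defaults; read_strategy is a hypothetical helper showing how a caller could branch on a flag rather than probing the backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilitiesSketch:
    """Hypothetical stand-in mirroring the documented capability flags."""

    supports_audit_logging: bool = False
    supports_usage_snapshots: bool = False
    supports_infrastructure_management: bool = False
    supports_change_streams: bool = False
    supports_batch_operations: bool = False


def read_strategy(caps: CapabilitiesSketch) -> str:
    # Prefer a batched read when offered; otherwise fall back to per-item reads.
    return "batch" if caps.supports_batch_operations else "per-item"


minimal = CapabilitiesSketch()
full = CapabilitiesSketch(supports_batch_operations=True, supports_audit_logging=True)
```

Because every flag defaults to False, a new backend starts with no extended features and opts in to each one explicitly.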

ResourceCapacity

ResourceCapacity dataclass

ResourceCapacity(resource, limit_name, total_capacity, total_available, utilization_pct, entities)

Aggregated capacity info for a resource across entities.

EntityCapacity

EntityCapacity dataclass

EntityCapacity(entity_id, capacity, available, utilization_pct)

Capacity info for a single entity.
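The relationship between per-entity and per-resource figures can be sketched as below. The definition of utilization_pct as the consumed share of capacity is an assumption, and EntityCapacitySketch and aggregate are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityCapacitySketch:
    """Hypothetical stand-in for the documented EntityCapacity fields."""

    entity_id: str
    capacity: int
    available: int

    @property
    def utilization_pct(self) -> float:
        # Assumed definition: share of capacity currently consumed.
        if self.capacity == 0:
            return 0.0
        return (self.capacity - self.available) / self.capacity * 100


def aggregate(entities: list[EntityCapacitySketch]) -> tuple[int, int, float]:
    """Roll entity figures up into resource-level totals."""
    total_capacity = sum(e.capacity for e in entities)
    total_available = sum(e.available for e in entities)
    used = total_capacity - total_available
    pct = used / total_capacity * 100 if total_capacity else 0.0
    return total_capacity, total_available, pct


entities = [
    EntityCapacitySketch("key-a", capacity=100, available=25),
    EntityCapacitySketch("key-b", capacity=300, available=275),
]
total_capacity, total_available, utilization = aggregate(entities)
```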

SpeculativeResult

SpeculativeResult dataclass

SpeculativeResult(success, buckets=list(), cascade=False, parent_id=None, old_buckets=None, parent_result=None, shard_id=0, shard_count=1, failure_reason=None)

Result of a speculative UpdateItem attempt.

Attributes:

Name Type Description
success bool

True if the speculative write succeeded.

buckets list[BucketState]

On success, deserialized BucketStates from ALL_NEW response. Includes the wcu infrastructure limit bucket state.

cascade bool

On success, whether the entity has cascade enabled.

parent_id str | None

On success, the entity's parent_id (if any).

old_buckets list[BucketState] | None

On failure, deserialized BucketStates from ALL_OLD response. None if the bucket doesn't exist (first acquire).

parent_result SpeculativeResult | None

On cache hit + cascade, nested parent SpeculativeResult from parallel UpdateItem. None for cache miss, non-cascade, or failure.

shard_id int

The shard index targeted by this speculative write.

shard_count int

The total shard count read from the bucket item. Used by the limiter to decide whether to retry on another shard or double shards when the wcu limit is exhausted.

CacheStats

CacheStats dataclass

CacheStats(hits=0, misses=0, size=0, ttl_seconds=0)

Statistics for cache performance monitoring.

as_dict

as_dict()

Return stats as a dictionary.
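A minimal sketch of how these counters could be exported and interpreted. CacheStatsSketch is a hypothetical stand-in, and the hit ratio is a derived figure computed here for illustration, not a documented attribute.

```python
from dataclasses import asdict, dataclass


@dataclass
class CacheStatsSketch:
    """Hypothetical stand-in mirroring the documented CacheStats fields."""

    hits: int = 0
    misses: int = 0
    size: int = 0
    ttl_seconds: int = 0

    def as_dict(self) -> dict[str, int]:
        return asdict(self)


stats = CacheStatsSketch(hits=90, misses=10, size=42, ttl_seconds=60)
lookups = stats.hits + stats.misses
# Guard against division by zero before any lookup has happened.
hit_ratio = stats.hits / lookups if lookups else 0.0
```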