1125 quotas for Amazon Bedrock. 410 can be increased.
| Quota | Default | Status |
|---|---|---|
Minimum number of records per batch inference job for GLM 4.7 Flash Minimum number of records per batch inference job for GLM 4.7 Flash general | 100 count | Fixed |
On-demand model inference tokens per minute for Amazon Titan Image Generator G1 On-demand model inference tokens per minute for Amazon Titan Image Generator G1 general | 2,000 count | Fixed |
Records per batch inference job for Claude Opus 4.5 Records per batch inference job for Claude Opus 4.5 general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Mistral Large 3 On-demand model inference requests per minute for Mistral Large 3 throughput | 10,000 count | Fixed |
(Model customization) Custom models per account (Model customization) Custom models per account general | 100 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude Opus 4.6 V1 Cross-region model inference tokens per minute for Anthropic Claude Opus 4.6 V1 general | 3,000,000 count | Adjustable |
On-demand model inference tokens per minute for Mistral Large 3 On-demand model inference tokens per minute for Mistral Large 3 general | 100,000,000 count | Fixed |
On-demand model inference tokens per minute for Cohere Command R Plus On-demand model inference tokens per minute for Cohere Command R Plus general | 300,000 count | Fixed |
On-demand model inference requests per minute for Meta Llama 2 Chat 70B On-demand model inference requests per minute for Meta Llama 2 Chat 70B throughput | 400 count | Fixed |
Model invocation max tokens per day for Nemotron Nano 3 30B (doubled for cross-region calls) Model invocation max tokens per day for Nemotron Nano 3 30B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
(Model customization) Sum of training and validation records for a Claude 3-5-Haiku v1 Fine-tuning job (Model customization) Sum of training and validation records for a Claude 3-5-Haiku v1 Fine-tuning job general | 10,000 count | Adjustable |
Batch inference job size (in GB) for Gemma 3 12B Batch inference job size (in GB) for Gemma 3 12B storage | 5 count | Fixed |
Model invocation max tokens per day for Z.ai GLM-4.7 (doubled for cross-region calls) Model invocation max tokens per day for Z.ai GLM-4.7 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference job size (in GB) for Claude Sonnet 4.5 Batch inference job size (in GB) for Claude Sonnet 4.5 storage | 5 count | Fixed |
Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4.6 Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4.6 throughput | 10,000 count | Adjustable |
(Model customization) Total number of custom model deployments (Model customization) Total number of custom model deployments general | 10 count | Adjustable |
ListAgentVersions requests per second ListAgentVersions requests per second throughput | 10 count | Fixed |
Records per batch inference job for Llama 3.1 405B Instruct Records per batch inference job for Llama 3.1 405B Instruct general | 100,000 count | Adjustable |
Cross-region model inference requests per minute for Anthropic Claude Haiku 4.5 Cross-region model inference requests per minute for Anthropic Claude Haiku 4.5 throughput | 10,000 count | Adjustable |
On-demand model inference tokens per minute for Amazon Titan Text Express On-demand model inference tokens per minute for Amazon Titan Text Express general | 300,000 count | Fixed |
Cross-region model inference requests per minute for Amazon Nova Lite Cross-region model inference requests per minute for Amazon Nova Lite throughput | 4,000 count | Fixed |
Model units per provisioned model for Anthropic Claude 3 Haiku 200K Model units per provisioned model for Anthropic Claude 3 Haiku 200K general | 0 count | Adjustable |
Model invocation max tokens per day for Gemma 3 4B (doubled for cross-region calls) Model invocation max tokens per day for Gemma 3 4B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.7 Sonnet Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.7 Sonnet general | 100 count | Adjustable |
DisassociateAgentKnowledgeBase requests per second DisassociateAgentKnowledgeBase requests per second throughput | 4 count | Fixed |
Records per batch inference job for Llama 3.2 1B Instruct Records per batch inference job for Llama 3.2 1B Instruct general | 100,000 count | Adjustable |
(Flows) Conditions per condition node (Flows) Conditions per condition node capacity | 5 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude 3.7 Sonnet V1 Cross-region model inference requests per minute for Anthropic Claude 3.7 Sonnet V1 throughput | 250 count | Fixed |
On-demand model inference requests per minute for AI21 Labs Jamba Instruct On-demand model inference requests per minute for AI21 Labs Jamba Instruct throughput | 100 count | Fixed |
Batch inference input file size (in GB) for OpenAI GPT OSS 20b Batch inference input file size (in GB) for OpenAI GPT OSS 20b storage | 1 count | Fixed |
Model units no-commitment Provisioned Throughputs across base models Model units no-commitment Provisioned Throughputs across base models general | 0 count | Adjustable |
Batch inference job size (in GB) for OpenAI GPT OSS 120b Batch inference job size (in GB) for OpenAI GPT OSS 120b storage | 5 count | Fixed |
Records per input file per batch inference job for Nova 2 Lite Records per input file per batch inference job for Nova 2 Lite storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Magistral Small 2509 Sum of in-progress and submitted batch inference jobs using a base model for Magistral Small 2509 general | 100 count | Adjustable |
Global cross-region model inference tokens per day for Amazon Nova 2 Pro Preview Global cross-region model inference tokens per day for Amazon Nova 2 Pro Preview general | 1,440,000,000 count | Fixed |
(Flows) DeleteFlowVersion requests per second (Flows) DeleteFlowVersion requests per second throughput | 2 count | Fixed |
(Advanced Prompt Optimization) Active jobs per account (Advanced Prompt Optimization) Active jobs per account general | 20 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.1 405B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.1 405B Instruct general | 100 count | Adjustable |
Batch inference input file size (in GB) for Claude 3.5 Sonnet v2 Batch inference input file size (in GB) for Claude 3.5 Sonnet v2 storage | 1 count | Fixed |
On-demand model inference requests per minute for NVIDIA Nemotron Nano 2 On-demand model inference requests per minute for NVIDIA Nemotron Nano 2 throughput | 10,000 count | Fixed |
Batch inference input file size (in GB) for Claude Sonnet 4 Batch inference input file size (in GB) for Claude Sonnet 4 storage | 1 count | Adjustable |
Records per input file per batch inference job for Llama 4 Maverick Records per input file per batch inference job for Llama 4 Maverick storage | 100,000 count | Adjustable |
(Data Automation) Maximum number of Blueprints per Start Inference request (Audios) (Data Automation) Maximum number of Blueprints per Start Inference request (Audios) throughput | 1 count | Fixed |
Cross-region model inference tokens per minute for Meta Llama 4 Maverick V1 Cross-region model inference tokens per minute for Meta Llama 4 Maverick V1 general | 600,000 count | Adjustable |
Minimum number of records per batch inference job for Claude 3 Opus Minimum number of records per batch inference job for Claude 3 Opus general | 100 count | Fixed |
Throttle rate limit for GetDataAutomationProject Throttle rate limit for GetDataAutomationProject throughput | 5 count | Fixed |
Batch inference input file size (in GB) for Llama 3.1 8B Instruct Batch inference input file size (in GB) for Llama 3.1 8B Instruct storage | 1 count | Fixed |
(Model customization) Sum of training and validation records for a Claude 3 Haiku v1 Fine-tuning job (Model customization) Sum of training and validation records for a Claude 3 Haiku v1 Fine-tuning job general | 10,000 count | Adjustable |
Global cross-region model inference requests per minute for Amazon Nova 2 Lite Global cross-region model inference requests per minute for Amazon Nova 2 Lite throughput | 2,000 count | Adjustable |
Batch inference job size (in GB) for OpenAI GPT OSS Safeguard 20b Batch inference job size (in GB) for OpenAI GPT OSS Safeguard 20b storage | 5 count | Fixed |
On-demand model inference tokens per minute for Amazon Titan Text Premier On-demand model inference tokens per minute for Amazon Titan Text Premier general | 300,000 count | Fixed |
Batch inference job size (in GB) for Llama 4 Maverick Batch inference job size (in GB) for Llama 4 Maverick storage | 5 count | Fixed |
Model invocation max tokens per day for Minimax M2 (doubled for cross-region calls) Model invocation max tokens per day for Minimax M2 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude Opus 4.7 Cross-region model inference tokens per minute for Anthropic Claude Opus 4.7 general | 15,000,000 count | Adjustable |
On-demand model inference requests per minute for GPT OSS Safeguard 20B On-demand model inference requests per minute for GPT OSS Safeguard 20B throughput | 10,000 count | Fixed |
Batch inference job size (in GB) for Llama 3.2 3B Instruct Batch inference job size (in GB) for Llama 3.2 3B Instruct storage | 5 count | Fixed |
Records per input file per batch inference job for Qwen3 Coder Next Records per input file per batch inference job for Qwen3 Coder Next storage | 100,000 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 3 8B Instruct On-demand model inference requests per minute for Meta Llama 3 8B Instruct throughput | 800 count | Fixed |
On-demand model inference requests per minute for Minimax M2 On-demand model inference requests per minute for Minimax M2 throughput | 10,000 count | Fixed |
On-demand model inference requests per minute for DeepSeek V3.2 On-demand model inference requests per minute for DeepSeek V3.2 throughput | 10,000 count | Fixed |
(Knowledge Bases) RetrieveAndGenerate requests per second (Knowledge Bases) RetrieveAndGenerate requests per second throughput | 20 count | Fixed |
Records per input file per batch inference job for Llama 3.2 90B Instruct Records per input file per batch inference job for Llama 3.2 90B Instruct storage | 100,000 count | Adjustable |
Batch inference job size (in GB) for Llama 3.1 8B Instruct Batch inference job size (in GB) for Llama 3.1 8B Instruct storage | 5 count | Fixed |
On-demand model inference requests per minute for AI21 Labs Jamba 1.5 Large On-demand model inference requests per minute for AI21 Labs Jamba 1.5 Large throughput | 100 count | Fixed |
Model invocation max tokens per day for Ministral 3B 3.0 (doubled for cross-region calls) Model invocation max tokens per day for Ministral 3B 3.0 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
(Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova Micro (Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova Micro throughput | 2,000 count | Fixed |
Characters in Agent instructions Characters in Agent instructions general | 20,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for DeepSeek V3.2 Sum of in-progress and submitted batch inference jobs using a base model for DeepSeek V3.2 general | 100 count | Adjustable |
Cross-region model inference requests per minute for Anthropic Claude Opus 4 V1 Cross-region model inference requests per minute for Anthropic Claude Opus 4 V1 throughput | 200 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length general | 1,000,000 count | Adjustable |
Minimum number of records per batch inference job for Ministral 3B Minimum number of records per batch inference job for Ministral 3B general | 100 count | Fixed |
Model units per provisioned model for Amazon Titan Text Premier V1 32K Model units per provisioned model for Amazon Titan Text Premier V1 32K general | 0 count | Adjustable |
GetAgentActionGroup requests per second GetAgentActionGroup requests per second throughput | 20 count | Fixed |
Model invocation max tokens per day for Anthropic Claude Sonnet 4.5 V1 1M Context Length (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Sonnet 4.5 V1 1M Context Length (doubled for cross-region calls) general | 720,000,000 count | Fixed |
Global cross-region model inference tokens per day for Anthropic Claude Haiku 4.5 Global cross-region model inference tokens per day for Anthropic Claude Haiku 4.5 general | 7,200,000,000 count | Fixed |
Custom models with a creating status per account Custom models with a creating status per account general | 2 count | Adjustable |
Minimum number of records per batch inference job for Claude Sonnet 4.6 Minimum number of records per batch inference job for Claude Sonnet 4.6 general | 100 count | Fixed |
Model invocation max tokens per day for Voxtral Mini 1.0 (doubled for cross-region calls) Model invocation max tokens per day for Voxtral Mini 1.0 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference job size (in GB) for Ministral 3 8B Batch inference job size (in GB) for Ministral 3 8B storage | 5 count | Fixed |
Model invocation max tokens per day for Amazon Nova Pro (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova Pro (doubled for cross-region calls) general | 1,440,000,000 count | Fixed |
On-demand model inference tokens per minute for Cohere Embed English On-demand model inference tokens per minute for Cohere Embed English general | 300,000 count | Fixed |
Model invocation max tokens per day for Anthropic Claude Opus 4.5 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Opus 4.5 (doubled for cross-region calls) general | 1,440,000,000 count | Fixed |
Global cross-region model inference tokens per day for Anthropic Claude Opus 4.6 V1 Global cross-region model inference tokens per day for Anthropic Claude Opus 4.6 V1 general | 4,320,000,000 count | Fixed |
On-demand InvokeModel concurrent requests for Amazon Nova Reel1.0 On-demand InvokeModel concurrent requests for Amazon Nova Reel1.0 compute | 10 count | Fixed |
Model units per provisioned model for Amazon Titan Text Embeddings V2 Model units per provisioned model for Amazon Titan Text Embeddings V2 general | 0 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 3.2 3B Instruct On-demand model inference tokens per minute for Meta Llama 3.2 3B Instruct general | 300,000 count | Fixed |
Model units per provisioned model for the 300k context length variant for Amazon Nova Lite Model units per provisioned model for the 300k context length variant for Amazon Nova Lite general | 0 count | Adjustable |
Batch inference input file size (in GB) for Claude Opus 4.5 Batch inference input file size (in GB) for Claude Opus 4.5 storage | 1 count | Fixed |
On-demand model inference tokens per minute for Z.ai GLM-4.7 On-demand model inference tokens per minute for Z.ai GLM-4.7 general | 100,000,000 count | Fixed |
Batch inference input file size (in GB) for Titan Text Embeddings V2 Batch inference input file size (in GB) for Titan Text Embeddings V2 storage | 1 count | Fixed |
(Data Automation) InvokeDataAutomationAsync - Audio - Max number of concurrent jobs (Data Automation) InvokeDataAutomationAsync - Audio - Max number of concurrent jobs compute | 20 count | Adjustable |
Batch inference job size (in GB) for Nova Pro V1 Batch inference job size (in GB) for Nova Pro V1 storage | 100 count | Fixed |
Batch inference input file size (in GB) for Claude 3 Opus Batch inference input file size (in GB) for Claude 3 Opus storage | 1 count | Fixed |
Batch inference input file size (in GB) for Mistral Large 2 (24.07) Batch inference input file size (in GB) for Mistral Large 2 (24.07) storage | 1 count | Fixed |
(Knowledge Bases) ListKnowledgeBases requests per second (Knowledge Bases) ListKnowledgeBases requests per second throughput | 10 count | Fixed |
(Model customization) Minimum number of prompts for distillation customization jobs (Model customization) Minimum number of prompts for distillation customization jobs general | 100 count | Fixed |
(Automated Reasoning) ListAutomatedReasoningPolicyBuildWorkflows requests per second (Automated Reasoning) ListAutomatedReasoningPolicyBuildWorkflows requests per second throughput | 5 count | Adjustable |
Model invocation max tokens per day for Z.ai GLM-4.7 Flash (doubled for cross-region calls) Model invocation max tokens per day for Z.ai GLM-4.7 Flash (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Records per input file per batch inference job for OpenAI GPT OSS Safeguard 120b Records per input file per batch inference job for OpenAI GPT OSS Safeguard 120b storage | 100,000 count | Adjustable |
Records per batch inference job for Nova Lite V1 Records per batch inference job for Nova Lite V1 general | 100,000 count | Adjustable |
Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4 V1 Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4 V1 general | 288,000,000 count | Fixed |
Records per input file per batch inference job for Claude Sonnet 4 Records per input file per batch inference job for Claude Sonnet 4 storage | 100,000 count | Adjustable |
On-demand model inference requests per minute for Anthropic Claude 3 Opus On-demand model inference requests per minute for Anthropic Claude 3 Opus throughput | 50 count | Fixed |
On-demand model inference requests per minute for Anthropic Claude 3.5 Sonnet On-demand model inference requests per minute for Anthropic Claude 3.5 Sonnet throughput | 50 count | Fixed |
(Knowledge Bases) DeleteKnowledgeBase requests per second (Knowledge Bases) DeleteKnowledgeBase requests per second throughput | 2 count | Fixed |
Cross-region model inference tokens per minute for Amazon Nova Micro Cross-region model inference tokens per minute for Amazon Nova Micro general | 8,000,000 count | Adjustable |
(Evaluation) Number of prompts in a custom prompt dataset (Evaluation) Number of prompts in a custom prompt dataset general | 1,000 count | Fixed |
On-demand model inference requests per minute for Amazon Titan Text Lite On-demand model inference requests per minute for Amazon Titan Text Lite throughput | 800 count | Fixed |
Records per batch inference job for Qwen3 Next 80B Records per batch inference job for Qwen3 Next 80B general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Stable Image Creative Upscale On-demand model inference requests per minute for Stable Image Creative Upscale throughput | 2 count | Fixed |
Batch inference input file size (in GB) for Ministral 3 8B Batch inference input file size (in GB) for Ministral 3 8B storage | 1 count | Fixed |
On-demand model inference requests per minute for NVIDIA Nemotron 3 Super 120B A12B On-demand model inference requests per minute for NVIDIA Nemotron 3 Super 120B A12B throughput | 10,000 count | Fixed |
(Flows) GetFlow requests per second (Flows) GetFlow requests per second throughput | 10 count | Fixed |
Batch inference job size (in GB) for Amazon Nova Premier Batch inference job size (in GB) for Amazon Nova Premier storage | 5 count | Fixed |
Batch inference job size (in GB) for Llama 3.2 11B Instruct Batch inference job size (in GB) for Llama 3.2 11B Instruct storage | 5 count | Fixed |
Model invocation max tokens per day for GPT OSS Safeguard 20B (doubled for cross-region calls) Model invocation max tokens per day for GPT OSS Safeguard 20B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 3B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 3B Instruct general | 100 count | Adjustable |
Batch inference job size (in GB) for GLM 4.7 Flash Batch inference job size (in GB) for GLM 4.7 Flash storage | 5 count | Fixed |
(Model customization) Sum of training and validation records for a Titan Multimodal Embeddings G1 v1 Fine-tuning job (Model customization) Sum of training and validation records for a Titan Multimodal Embeddings G1 v1 Fine-tuning job general | 50,000 count | Adjustable |
Enabled action groups per agent Enabled action groups per agent general | 15 count | Adjustable |
Records per batch inference job for Writer Palmyra Vision 7B Records per batch inference job for Writer Palmyra Vision 7B general | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 4 Scout Sum of in-progress and submitted batch inference jobs using a base model for Llama 4 Scout general | 100 count | Adjustable |
(Evaluation) Number of models in a model evaluation job that uses human workers (Evaluation) Number of models in a model evaluation job that uses human workers general | 2 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length Cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length throughput | 1,000 count | Adjustable |
Model units per provisioned model for Meta Llama 3 70B Instruct Model units per provisioned model for Meta Llama 3 70B Instruct general | 0 count | Adjustable |
(Flows) DeleteFlow requests per second (Flows) DeleteFlow requests per second throughput | 2 count | Fixed |
Records per input file per batch inference job for GLM 5 Records per input file per batch inference job for GLM 5 storage | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Z.ai GLM 5 On-demand model inference tokens per minute for Z.ai GLM 5 general | 100,000,000 count | Fixed |
(Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova Lite (Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova Lite throughput | 2,000 count | Fixed |
Minimum number of records per batch inference job for OpenAI GPT OSS Safeguard 120b Minimum number of records per batch inference job for OpenAI GPT OSS Safeguard 120b general | 100 count | Fixed |
Batch inference input file size (in GB) for NVIDIA Nemotron Nano 12B Batch inference input file size (in GB) for NVIDIA Nemotron Nano 12B storage | 1 count | Fixed |
Batch inference job size (in GB) for Writer Palmyra Vision 7B Batch inference job size (in GB) for Writer Palmyra Vision 7B storage | 5 count | Fixed |
(Knowledge Bases) Maximum number of files for Foundation Models as a parser (Knowledge Bases) Maximum number of files for Foundation Models as a parser storage | 1,000 count | Fixed |
(Knowledge Bases) Concurrent IngestKnowledgeBaseDocuments and DeleteKnowledgeBaseDocuments requests per account (Knowledge Bases) Concurrent IngestKnowledgeBaseDocuments and DeleteKnowledgeBaseDocuments requests per account compute | 10 count | Fixed |
On-demand model inference tokens per minute for AI21 Labs Jurassic-2 Ultra On-demand model inference tokens per minute for AI21 Labs Jurassic-2 Ultra general | 300,000 count | Fixed |
Model invocation max tokens per day for NVIDIA Nemotron Nano 2 (doubled for cross-region calls) Model invocation max tokens per day for NVIDIA Nemotron Nano 2 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Cross-region model inference tokens per minute for Meta Llama 3.1 70B Instruct Cross-region model inference tokens per minute for Meta Llama 3.1 70B Instruct general | 600,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for GLM 4.7 Flash Sum of in-progress and submitted batch inference jobs using a base model for GLM 4.7 Flash general | 100 count | Adjustable |
Batch inference job size (in GB) for Mistral Large 3 Batch inference job size (in GB) for Mistral Large 3 storage | 5 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 4 Maverick Sum of in-progress and submitted batch inference jobs using a base model for Llama 4 Maverick general | 100 count | Adjustable |
Model invocation max tokens per day for Mistral AI Mistral Small (doubled for cross-region calls) Model invocation max tokens per day for Mistral AI Mistral Small (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a custom model for Titan Multimodal Embeddings G1 Sum of in-progress and submitted batch inference jobs using a custom model for Titan Multimodal Embeddings G1 general | 3 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Opus 4.5 Cross-region model inference requests per minute for Anthropic Claude Opus 4.5 throughput | 10,000 count | Adjustable |
Cross-region model inference requests per minute for Mistral Pixtral Large 25.02 V1 Cross-region model inference requests per minute for Mistral Pixtral Large 25.02 V1 throughput | 10 count | Fixed |
Batch inference job size (in GB) for NVIDIA Nemotron Nano 3 30B Batch inference job size (in GB) for NVIDIA Nemotron Nano 3 30B storage | 5 count | Fixed |
Model units per provisioned model for Anthropic Claude 3.5 Sonnet 200K Model units per provisioned model for Anthropic Claude 3.5 Sonnet 200K general | 0 count | Adjustable |
Model units per provisioned model for Amazon Nova 2 Lite V1.0 256K Model units per provisioned model for Amazon Nova 2 Lite V1.0 256K general | 0 count | Adjustable |
Model units per provisioned model for Anthropic Claude V2 100K Model units per provisioned model for Anthropic Claude V2 100K general | 0 count | Adjustable |
(Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova Micro (Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova Micro general | 4,000,000 count | Fixed |
(Automated Reasoning) GetAutomatedReasoningPolicyTestResult requests per second (Automated Reasoning) GetAutomatedReasoningPolicyTestResult requests per second throughput | 10 count | Adjustable |
On-demand model inference requests per minute for Anthropic Claude 3.5 Haiku On-demand model inference requests per minute for Anthropic Claude 3.5 Haiku throughput | 1,000 count | Fixed |
Cross-region model inference tokens per minute for Meta Llama 3.2 1B Instruct Cross-region model inference tokens per minute for Meta Llama 3.2 1B Instruct general | 600,000 count | Adjustable |
Records per input file per batch inference job for GLM 4.7 Records per input file per batch inference job for GLM 4.7 storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Voxtral Small 24B 2507 Sum of in-progress and submitted batch inference jobs using a base model for Voxtral Small 24B 2507 general | 100 count | Adjustable |
On-demand model inference requests per minute for Amazon Titan Text Embeddings On-demand model inference requests per minute for Amazon Titan Text Embeddings throughput | 2,000 count | Fixed |
(Flows) Agent nodes per flow (Flows) Agent nodes per flow capacity | 20 count | Fixed |
(Knowledge Bases) Data sources per knowledge base (Knowledge Bases) Data sources per knowledge base general | 5 count | Fixed |
On-demand model inference requests per minute for Meta Llama 3.1 70B Instruct On-demand model inference requests per minute for Meta Llama 3.1 70B Instruct throughput | 400 count | Fixed |
(Automated Reasoning) GetAutomatedReasoningPolicy requests per second (Automated Reasoning) GetAutomatedReasoningPolicy requests per second throughput | 10 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3.5 Haiku 16K Model units per provisioned model for Anthropic Claude 3.5 Haiku 16K general | 0 count | Adjustable |
(Automated Reasoning) ExportAutomatedReasoningPolicyVersion requests per second (Automated Reasoning) ExportAutomatedReasoningPolicyVersion requests per second throughput | 5 count | Adjustable |
Model invocation max tokens per day for NVIDIA Nemotron Nano 2 VL (doubled for cross-region calls) Model invocation max tokens per day for NVIDIA Nemotron Nano 2 VL (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude Opus 4.5 Cross-region model inference tokens per minute for Anthropic Claude Opus 4.5 general | 2,000,000 count | Adjustable |
Global cross-region model inference tokens per day for Cohere Embed V4 Global cross-region model inference tokens per day for Cohere Embed V4 general | 432,000,000 count | Fixed |
Records per batch inference job for GLM 5 Records per batch inference job for GLM 5 general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Ministral 3B 3.0 On-demand model inference requests per minute for Ministral 3B 3.0 throughput | 10,000 count | Fixed |
Cross-Region model inference tokens per minute for Anthropic Claude 3.5 Sonnet V2 Cross-Region model inference tokens per minute for Anthropic Claude 3.5 Sonnet V2 general | 800,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Nova Lite V1 Sum of in-progress and submitted batch inference jobs using a base model for Nova Lite V1 general | 100 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Fast Upscale Cross-region model inference requests per minute for Stable Image Fast Upscale throughput | 20 count | Fixed |
Batch inference input file size (in GB) for Magistral Small 2509 Batch inference input file size (in GB) for Magistral Small 2509 storage | 1 count | Fixed |
On-demand model inference requests per minute for Stable Image Control Sketch On-demand model inference requests per minute for Stable Image Control Sketch throughput | 10 count | Fixed |
Records per batch inference job for Claude 3.5 Sonnet Records per batch inference job for Claude 3.5 Sonnet general | 100,000 count | Adjustable |
Global cross-region model inference tokens per minute for Amazon Nova 2 Omni Global cross-region model inference tokens per minute for Amazon Nova 2 Omni general | 8,000,000 count | Adjustable |
On-demand model inference tokens per minute for Amazon Nova Micro On-demand model inference tokens per minute for Amazon Nova Micro general | 4,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 Next 80B Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 Next 80B general | 100 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron Nano 9B Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron Nano 9B general | 100 count | Adjustable |
On-demand InvokeModel concurrent requests for Amazon Nova 2 Sonic On-demand InvokeModel concurrent requests for Amazon Nova 2 Sonic compute | 20 count | Fixed |
On-demand model inference tokens per minute for AI21 Labs Jamba Instruct On-demand model inference tokens per minute for AI21 Labs Jamba Instruct general | 300,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 Coder Next Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 Coder Next general | 100 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Search and Recolor Cross-region model inference requests per minute for Stable Image Search and Recolor throughput | 20 count | Fixed |
On-demand model inference requests per minute for Amazon Nova Canvas On-demand model inference requests per minute for Amazon Nova Canvas throughput | 100 count | Fixed |
Model invocation max tokens per day for Amazon Nova 2 Pro Preview (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova 2 Pro Preview (doubled for cross-region calls) general | 720,000,000 count | Fixed |
On-demand model inference requests per minute for Amazon Titan Text Premier On-demand model inference requests per minute for Amazon Titan Text Premier throughput | 100 count | Fixed |
Minimum number of records per batch inference job for Llama 3.3 70B Instruct Minimum number of records per batch inference job for Llama 3.3 70B Instruct general | 100 count | Fixed |
Records per input file per batch inference job for Nova Pro V1 Records per input file per batch inference job for Nova Pro V1 storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Mistral Small Sum of in-progress and submitted batch inference jobs using a base model for Mistral Small general | 100 count | Adjustable |
(Knowledge Bases) GetKnowledgeBase requests per second (Knowledge Bases) GetKnowledgeBase requests per second throughput | 10 count | Fixed |
Records per input file per batch inference job for Qwen3 Coder 30B Records per input file per batch inference job for Qwen3 Coder 30B storage | 100,000 count | Adjustable |
Batch inference input file size (in GB) for Amazon Nova 2 Multimodal Embeddings V1 Batch inference input file size (in GB) for Amazon Nova 2 Multimodal Embeddings V1 storage | 1 count | Fixed |
Cross-region model inference tokens per minute for Amazon Nova 2 Pro Preview Cross-region model inference tokens per minute for Amazon Nova 2 Pro Preview general | 1,000,000 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3.5 Sonnet 18K Model units per provisioned model for Anthropic Claude 3.5 Sonnet 18K general | 0 count | Adjustable |
Records per batch inference job for Gemma 3 4B Records per batch inference job for Gemma 3 4B general | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Nova 2 Lite Minimum number of records per batch inference job for Nova 2 Lite general | 100 count | Fixed |
Minimum number of records per batch inference job for GLM 4.7 Minimum number of records per batch inference job for GLM 4.7 general | 100 count | Fixed |
On-demand model inference tokens per minute for Minimax M2.1 On-demand model inference tokens per minute for Minimax M2.1 general | 100,000,000 count | Fixed |
Records per input file per batch inference job for Llama 3.1 405B Instruct Records per input file per batch inference job for Llama 3.1 405B Instruct storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Titan Text Embeddings V2 Sum of in-progress and submitted batch inference jobs using a base model for Titan Text Embeddings V2 general | 100 count | Adjustable |
(Guardrails) On-demand ApplyGuardrail Content filter policy text units per second (Guardrails) On-demand ApplyGuardrail Content filter policy text units per second identity | 200 count | Adjustable |
Model invocation max tokens per day for Meta Llama 3.2 1B Instruct (doubled for cross-region calls) Model invocation max tokens per day for Meta Llama 3.2 1B Instruct (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Records per input file per batch inference job for Mistral Large 2 (24.07) Records per input file per batch inference job for Mistral Large 2 (24.07) storage | 100,000 count | Adjustable |
Batch inference input file size (in GB) for Gemma 3 27B Batch inference input file size (in GB) for Gemma 3 27B storage | 1 count | Fixed |
Records per input file per batch inference job for Voxtral Small 24B 2507 Records per input file per batch inference job for Voxtral Small 24B 2507 storage | 100,000 count | Adjustable |
Records per input file per batch inference job for Ministral 3 14B Records per input file per batch inference job for Ministral 3 14B storage | 100,000 count | Adjustable |
(Knowledge Bases) GetIngestionJob requests per second (Knowledge Bases) GetIngestionJob requests per second throughput | 10 count | Fixed |
Model invocation max tokens per day for Meta Llama 4 Maverick V1 (doubled for cross-region calls) Model invocation max tokens per day for Meta Llama 4 Maverick V1 (doubled for cross-region calls) general | 432,000,000 count | Fixed |
(Prompt management) ListPrompts requests per second (Prompt management) ListPrompts requests per second throughput | 10 count | Fixed |
On-demand model inference requests per minute for OpenAI GPT OSS 20B On-demand model inference requests per minute for OpenAI GPT OSS 20B throughput | 10,000 count | Fixed |
Batch inference job size (in GB) for Mistral Small Batch inference job size (in GB) for Mistral Small storage | 5 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a custom model for Titan Text Embeddings V2 Sum of in-progress and submitted batch inference jobs using a custom model for Titan Text Embeddings V2 general | 3 count | Fixed |
Batch inference input file size (in GB) for Voxtral Mini 3B 2507 Batch inference input file size (in GB) for Voxtral Mini 3B 2507 storage | 1 count | Fixed |
On-demand model inference requests per minute for Minimax M2.1 On-demand model inference requests per minute for Minimax M2.1 throughput | 10,000 count | Fixed |
(Model customization) Sum of training and validation records for a Titan Text G1 - Lite v1 Continued Pre-Training job (Model customization) Sum of training and validation records for a Titan Text G1 - Lite v1 Continued Pre-Training job general | 100,000 count | Adjustable |
Batch inference job size (in GB) for Ministral 3B Batch inference job size (in GB) for Ministral 3B storage | 5 count | Fixed |
Cross-region model inference requests per minute for Twelve Labs Pegasus Cross-region model inference requests per minute for Twelve Labs Pegasus throughput | 120 count | Adjustable |
On-demand model inference requests per minute for Kimi K2 Thinking On-demand model inference requests per minute for Kimi K2 Thinking throughput | 10,000 count | Fixed |
(Model customization) Maximum student model fine tuning context length for Amazon Nova V1 distillation customization jobs (Model customization) Maximum student model fine tuning context length for Amazon Nova V1 distillation customization jobs general | 32,000 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude Haiku 4.5 Cross-region model inference tokens per minute for Anthropic Claude Haiku 4.5 general | 5,000,000 count | Adjustable |
Records per input file per batch inference job for OpenAI GPT OSS Safeguard 20b Records per input file per batch inference job for OpenAI GPT OSS Safeguard 20b storage | 100,000 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude Opus 4.6 V1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Opus 4.6 V1 (doubled for cross-region calls) general | 2,160,000,000 count | Fixed |
(Data Automation) InvokeDataAutomationAsync - Document - Max number of concurrent jobs (Data Automation) InvokeDataAutomationAsync - Document - Max number of concurrent jobs compute | 25 count | Adjustable |
(Data Automation) Maximum number of Blueprints per Start Inference request (Documents) (Data Automation) Maximum number of Blueprints per Start Inference request (Documents) throughput | 10 count | Fixed |
On-demand model inference requests per minute for Magistral Small 1.2 On-demand model inference requests per minute for Magistral Small 1.2 throughput | 10,000 count | Fixed |
Batch inference input file size (in GB) for Qwen3 Coder 30B Batch inference input file size (in GB) for Qwen3 Coder 30B storage | 1 count | Fixed |
Records per batch inference job for MiniMax M2.5 Records per batch inference job for MiniMax M2.5 general | 100,000 count | Adjustable |
(Automated Reasoning) Annotations in policy (Automated Reasoning) Annotations in policy identity | 10 count | Fixed |
Minimum number of records per batch inference job for Llama 3.2 3B Instruct Minimum number of records per batch inference job for Llama 3.2 3B Instruct general | 100 count | Fixed |
Records per batch inference job for Qwen3 32B Records per batch inference job for Qwen3 32B general | 100,000 count | Adjustable |
(Flows) ListFlowAliases requests per second (Flows) ListFlowAliases requests per second throughput | 10 count | Fixed |
Cross-region model inference tokens per minute for Amazon Nova Premier V1 Cross-region model inference tokens per minute for Amazon Nova Premier V1 general | 2,000,000 count | Adjustable |
(Guardrails) Word length in characters (Guardrails) Word length in characters general | 100 count | Fixed |
Records per input file per batch inference job for Kimi K2 Thinking Records per input file per batch inference job for Kimi K2 Thinking storage | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Amazon Titan Text Embeddings On-demand model inference tokens per minute for Amazon Titan Text Embeddings general | 300,000 count | Fixed |
Records per batch inference job for MiniMax M2 Records per batch inference job for MiniMax M2 general | 100,000 count | Adjustable |
Global cross-region model inference tokens per minute for Anthropic Claude Haiku 4.5 Global cross-region model inference tokens per minute for Anthropic Claude Haiku 4.5 general | 5,000,000 count | Adjustable |
Batch inference input file size (in GB) for Ministral 3 14B Batch inference input file size (in GB) for Ministral 3 14B storage | 1 count | Fixed |
Minimum number of records per batch inference job for Ministral 3 8B Minimum number of records per batch inference job for Ministral 3 8B general | 100 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude 3.5 Sonnet Cross-region model inference requests per minute for Anthropic Claude 3.5 Sonnet throughput | 100 count | Fixed |
(Flows) ListFlowVersions requests per second (Flows) ListFlowVersions requests per second throughput | 10 count | Fixed |
Model units per provisioned model for Anthropic Claude V2.1 200K Model units per provisioned model for Anthropic Claude V2.1 200K general | 0 count | Adjustable |
(Knowledge Bases) GetKnowledgeBaseDocuments requests per second (Knowledge Bases) GetKnowledgeBaseDocuments requests per second throughput | 5 count | Fixed |
Batch inference input file size (in GB) for Claude 3 Sonnet Batch inference input file size (in GB) for Claude 3 Sonnet storage | 1 count | Fixed |
Records per input file per batch inference job for Claude 3.5 Sonnet v2 Records per input file per batch inference job for Claude 3.5 Sonnet v2 storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Voxtral Mini 3B 2507 Sum of in-progress and submitted batch inference jobs using a base model for Voxtral Mini 3B 2507 general | 100 count | Adjustable |
Cross-region model inference requests per minute for Anthropic Claude 3 Opus Cross-region model inference requests per minute for Anthropic Claude 3 Opus throughput | 100 count | Fixed |
Throttle rate limit for UpdateBlueprint Throttle rate limit for UpdateBlueprint throughput | 5 count | Fixed |
On-Demand, latency-optimized model inference tokens per minute for Amazon Nova Pro V1 On-Demand, latency-optimized model inference tokens per minute for Amazon Nova Pro V1 general | 40,000 count | Fixed |
On-demand model inference tokens per minute for Qwen3 Next 80B A3B On-demand model inference tokens per minute for Qwen3 Next 80B A3B general | 100,000,000 count | Fixed |
Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length general | 1,000,000 count | Adjustable |
Batch inference job size (in GB) for Claude 3 Opus Batch inference job size (in GB) for Claude 3 Opus storage | 5 count | Fixed |
(Automated Reasoning) Source document size (MB) (Automated Reasoning) Source document size (MB) storage | 5 count | Fixed |
Records per input file per batch inference job for Kimi K2.5 Records per input file per batch inference job for Kimi K2.5 storage | 100,000 count | Adjustable |
Records per input file per batch inference job for Magistral Small 2509 Records per input file per batch inference job for Magistral Small 2509 storage | 100,000 count | Adjustable |
(Data Automation) Maximum audio length (Minutes) (Data Automation) Maximum audio length (Minutes) general | 240 count | Fixed |
(Data Automation) InvokeBlueprintOptimizationAsync - Max number of blueprint optimization jobs per day (Data Automation) InvokeBlueprintOptimizationAsync - Max number of blueprint optimization jobs per day general | 30 count | Fixed |
On-demand model inference tokens per minute for Cohere Command R On-demand model inference tokens per minute for Cohere Command R general | 300,000 count | Fixed |
On-demand model inference tokens per minute for Writer Palmyra Vision 7B On-demand model inference tokens per minute for Writer Palmyra Vision 7B general | 100,000,000 count | Fixed |
Model invocation max tokens per day for Amazon Nova 2 Lite (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova 2 Lite (doubled for cross-region calls) general | 5,760,000,000 count | Fixed |
Records per batch inference job for MiniMax M2.1 Records per batch inference job for MiniMax M2.1 general | 100,000 count | Adjustable |
Records per batch inference job for Nova Micro V1 Records per batch inference job for Nova Micro V1 general | 100,000 count | Adjustable |
Records per batch inference job for Llama 3.1 8B Instruct Records per batch inference job for Llama 3.1 8B Instruct general | 100,000 count | Adjustable |
(Evaluation) Number of concurrent automatic model evaluation jobs (Evaluation) Number of concurrent automatic model evaluation jobs compute | 20 count | Fixed |
On-demand model inference requests per minute for Mistral AI Mistral Small On-demand model inference requests per minute for Mistral AI Mistral Small throughput | 400 count | Fixed |
(Prompt management) CreatePromptVersion requests per second (Prompt management) CreatePromptVersion requests per second throughput | 2 count | Fixed |
(Prompt management) Versions per prompt (Prompt management) Versions per prompt general | 10 count | Fixed |
Concurrent model import jobs Concurrent model import jobs compute | 1 count | Fixed |
Global cross-region model inference requests per minute for Cohere Embed V4 Global cross-region model inference requests per minute for Cohere Embed V4 throughput | 2,000 count | Adjustable |
Records per batch inference job for GLM 4.7 Flash Records per batch inference job for GLM 4.7 Flash general | 100,000 count | Adjustable |
(Model customization) Scheduled customization jobs (Model customization) Scheduled customization jobs general | 10 count | Fixed |
Model invocation max tokens per day for MiniMax M2.5 (doubled for cross-region calls) Model invocation max tokens per day for MiniMax M2.5 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
(Knowledge Bases) UpdateDataSource requests per second (Knowledge Bases) UpdateDataSource requests per second throughput | 2 count | Fixed |
(Model customization) Maximum number of prompts for distillation customization jobs (Model customization) Maximum number of prompts for distillation customization jobs general | 15,000 count | Fixed |
Cross-region model inference requests per minute for Stable Image Outpaint Cross-region model inference requests per minute for Stable Image Outpaint throughput | 4 count | Fixed |
Records per batch inference job for Claude 3.7 Sonnet Records per batch inference job for Claude 3.7 Sonnet general | 100,000 count | Adjustable |
Number of custom prompt routers per account Number of custom prompt routers per account general | 500 count | Fixed |
Batch inference input file size (in GB) for Claude Haiku 4.5 Batch inference input file size (in GB) for Claude Haiku 4.5 storage | 1 count | Fixed |
On-demand model inference tokens per minute for Meta Llama 3.1 70B Instruct On-demand model inference tokens per minute for Meta Llama 3.1 70B Instruct general | 300,000 count | Fixed |
On-demand model inference requests per minute for Stable Image Search and Recolor On-demand model inference requests per minute for Stable Image Search and Recolor throughput | 10 count | Fixed |
On-demand model inference tokens per minute for Z.ai GLM-4.7 Flash On-demand model inference tokens per minute for Z.ai GLM-4.7 Flash general | 100,000,000 count | Fixed |
Minimum number of records per batch inference job for Llama 3.2 11B Instruct Minimum number of records per batch inference job for Llama 3.2 11B Instruct general | 100 count | Fixed |
(Model customization) Sum of training and validation records for a Amazon Nova Micro Fine-tuning job (Model customization) Sum of training and validation records for a Amazon Nova Micro Fine-tuning job general | 20,000 count | Adjustable |
Batch inference input file size (in GB) for Gemma 3 4B Batch inference input file size (in GB) for Gemma 3 4B storage | 1 count | Fixed |
(Data Automation) CreateBlueprintVersion - Max number of Blueprint versions per Blueprint (Data Automation) CreateBlueprintVersion - Max number of Blueprint versions per Blueprint general | 10 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3 Haiku Sum of in-progress and submitted batch inference jobs using a base model for Claude 3 Haiku general | 100 count | Adjustable |
Records per batch inference job for NVIDIA Nemotron Nano 9B Records per batch inference job for NVIDIA Nemotron Nano 9B general | 100,000 count | Adjustable |
Model invocation max tokens per day for Mistral Devstral 2 123b (doubled for cross-region calls) Model invocation max tokens per day for Mistral Devstral 2 123b (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference input file size (in GB) for Devstral 2 123B Batch inference input file size (in GB) for Devstral 2 123B storage | 1 count | Fixed |
Records per batch inference job for Qwen3 VL 235B Records per batch inference job for Qwen3 VL 235B general | 100,000 count | Adjustable |
Cross-region model inference tokens per minute for Mistral Pixtral Large 25.02 V1 Cross-region model inference tokens per minute for Mistral Pixtral Large 25.02 V1 general | 80,000 count | Adjustable |
(Data Automation) Maximum instruction field length for Audio Blueprint - (Characters) (Data Automation) Maximum instruction field length for Audio Blueprint - (Characters) general | 500 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Kimi K2 Thinking Sum of in-progress and submitted batch inference jobs using a base model for Kimi K2 Thinking general | 100 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Style Transfer Cross-region model inference requests per minute for Stable Image Style Transfer throughput | 20 count | Fixed |
Cross-Region model inference requests per minute for Anthropic Claude 3.5 Sonnet V2 Cross-Region model inference requests per minute for Anthropic Claude 3.5 Sonnet V2 throughput | 100 count | Fixed |
Records per batch inference job for Claude Sonnet 4.5 Records per batch inference job for Claude Sonnet 4.5 general | 100,000 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude 3.5 Haiku (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude 3.5 Haiku (doubled for cross-region calls) general | 2,880,000,000 count | Fixed |
Batch inference input file size (in GB) for Qwen3 VL 235B Batch inference input file size (in GB) for Qwen3 VL 235B storage | 1 count | Fixed |
Batch inference input file size (in GB) for Nova 2 Lite Batch inference input file size (in GB) for Nova 2 Lite storage | 1 count | Fixed |
Batch inference input file size (in GB) for Voxtral Small 24B 2507 Batch inference input file size (in GB) for Voxtral Small 24B 2507 storage | 1 count | Fixed |
(Knowledge Bases) Knowledge bases per account (Knowledge Bases) Knowledge bases per account general | 100 count | Fixed |
(Automated Reasoning) GetAutomatedReasoningPolicyNextScenario requests per second (Automated Reasoning) GetAutomatedReasoningPolicyNextScenario requests per second throughput | 10 count | Adjustable |
ListAgentAliases requests per second ListAgentAliases requests per second throughput | 10 count | Fixed |
Minimum number of records per batch inference job for Nova Pro V1 Minimum number of records per batch inference job for Nova Pro V1 general | 100 count | Fixed |
Records per input file per batch inference job for NVIDIA Nemotron Nano 3 30B Records per input file per batch inference job for NVIDIA Nemotron Nano 3 30B storage | 100,000 count | Adjustable |
Batch inference job size (in GB) for Llama 3.3 70B Instruct Batch inference job size (in GB) for Llama 3.3 70B Instruct storage | 5 count | Fixed |
Records per input file per batch inference job for Llama 3.2 1B Instruct Records per input file per batch inference job for Llama 3.2 1B Instruct storage | 100,000 count | Adjustable |
Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4.5 V1 Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4.5 V1 general | 7,200,000,000 count | Fixed |
Records per input file per batch inference job for Claude 3.5 Sonnet Records per input file per batch inference job for Claude 3.5 Sonnet storage | 100,000 count | Adjustable |
Cross-region model inference requests per minute for Anthropic Claude Sonnet 4 V1 Cross-region model inference requests per minute for Anthropic Claude Sonnet 4 V1 throughput | 200 count | Adjustable |
Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4 V1 Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4 V1 throughput | 200 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude Opus 4.1 Cross-region model inference tokens per minute for Anthropic Claude Opus 4.1 general | 500,000 count | Adjustable |
Minimum number of records per batch inference job for Llama 4 Maverick Minimum number of records per batch inference job for Llama 4 Maverick general | 100 count | Fixed |
Records per input file per batch inference job for Claude 3.5 Haiku Records per input file per batch inference job for Claude 3.5 Haiku storage | 100,000 count | Adjustable |
(Knowledge Bases) Concurrent ingestion jobs per account (Knowledge Bases) Concurrent ingestion jobs per account compute | 5 count | Fixed |
(Guardrails) Words per word policy (Guardrails) Words per word policy identity | 10,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Nova Pro V1 Sum of in-progress and submitted batch inference jobs using a base model for Nova Pro V1 general | 100 count | Adjustable |
Model units per provisioned model for Amazon Titan Image Generator G2 Model units per provisioned model for Amazon Titan Image Generator G2 general | 0 count | Adjustable |
Batch inference job size (in GB) for Mistral Large 2 (24.07) Batch inference job size (in GB) for Mistral Large 2 (24.07) storage | 5 count | Fixed |
Model units per provisioned model for AI21 Labs Jurassic-2 Ultra Model units per provisioned model for AI21 Labs Jurassic-2 Ultra general | 0 count | Adjustable |
Model invocation max tokens per day for Mistral AI Mistral 7B Instruct (doubled for cross-region calls) Model invocation max tokens per day for Mistral AI Mistral 7B Instruct (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Records per batch inference job for Claude 3 Haiku Records per batch inference job for Claude 3 Haiku general | 100,000 count | Adjustable |
(Model customization) Maximum input file size for distillation customization jobs (Model customization) Maximum input file size for distillation customization jobs storage | 2 Gigabytes | Fixed |
(Evaluation) Number of evaluation jobs (Evaluation) Number of evaluation jobs general | 5,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.5 Sonnet v2 Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.5 Sonnet v2 general | 100 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 2 Chat 70B On-demand model inference tokens per minute for Meta Llama 2 Chat 70B general | 300,000 count | Fixed |
Records per input file per batch inference job for Titan Multimodal Embeddings G1 Records per input file per batch inference job for Titan Multimodal Embeddings G1 storage | 100,000 count | Adjustable |
PrepareAgent requests per second PrepareAgent requests per second throughput | 2 count | Fixed |
Cross-region model inference requests per minute for Meta Llama 4 Maverick V1 Cross-region model inference requests per minute for Meta Llama 4 Maverick V1 throughput | 800 count | Fixed |
On-demand model inference tokens per minute for Anthropic Claude 3.5 Sonnet On-demand model inference tokens per minute for Anthropic Claude 3.5 Sonnet general | 400,000 count | Fixed |
On-demand model inference requests per minute for Moonshot AI Kimi K2.5 On-demand model inference requests per minute for Moonshot AI Kimi K2.5 throughput | 10,000 count | Fixed |
Throttle rate limit for CreateDataAutomationProject Throttle rate limit for CreateDataAutomationProject throughput | 5 count | Fixed |
(Data Automation) InvokeDataAutomationAsync - Max number of open jobs (Data Automation) InvokeDataAutomationAsync - Max number of open jobs general | 1,800 count | Fixed |
Model units, with commitment, for Provisioned Throughout created for Meta Maverick 4 Scout 17B Instruct 1M Model units, with commitment, for Provisioned Throughout created for Meta Maverick 4 Scout 17B Instruct 1M general | 0 count | Adjustable |
Records per batch inference job for Kimi K2.5 Records per batch inference job for Kimi K2.5 general | 100,000 count | Adjustable |
Cross-region model inference requests per minute for Amazon Nova Premier V1 Cross-region model inference requests per minute for Amazon Nova Premier V1 throughput | 500 count | Fixed |
Model units per provisioned model for Meta Llama 3 8B Instruct Model units per provisioned model for Meta Llama 3 8B Instruct general | 0 count | Adjustable |
Records per batch inference job for Llama 3.2 11B Instruct Records per batch inference job for Llama 3.2 11B Instruct general | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 3 8B Instruct On-demand model inference tokens per minute for Meta Llama 3 8B Instruct general | 300,000 count | Fixed |
Batch inference job size (in GB) for Llama 3.1 405B Instruct Batch inference job size (in GB) for Llama 3.1 405B Instruct storage | 5 count | Fixed |
Minimum number of records per batch inference job for Gemma 3 4B Minimum number of records per batch inference job for Gemma 3 4B general | 100 count | Fixed |
Throttle rate limit for ListDataAutomationProjects Throttle rate limit for ListDataAutomationProjects throughput | 5 count | Fixed |
(Flows) UpdateFlowAlias requests per second (Flows) UpdateFlowAlias requests per second throughput | 2 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4 V1 1M Context Length Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4 V1 1M Context Length general | 1,000,000 count | Adjustable |
(Model customization) Maximum number of training records for an Amazon Nova Canvas Fine-tuning job (Model customization) Maximum number of training records for an Amazon Nova Canvas Fine-tuning job general | 10,000 count | Adjustable |
(Model customization) Sum of training and validation records for a Meta Llama 2 70B v1 Fine-tuning job (Model customization) Sum of training and validation records for a Meta Llama 2 70B v1 Fine-tuning job general | 10,000 count | Adjustable |
On-demand model inference tokens per minute for MiniMax M2.5 On-demand model inference tokens per minute for MiniMax M2.5 general | 100,000,000 count | Fixed |
On-demand model inference requests per minute for Stable Image Conservative Upscale On-demand model inference requests per minute for Stable Image Conservative Upscale throughput | 2 count | Fixed |
Batch inference job size (in GB) for NVIDIA Nemotron Nano 12B Batch inference job size (in GB) for NVIDIA Nemotron Nano 12B storage | 5 count | Fixed |
Records per input file per batch inference job for Qwen3 VL 235B Records per input file per batch inference job for Qwen3 VL 235B storage | 100,000 count | Adjustable |
Records per batch inference job for Nova 2 Lite Records per batch inference job for Nova 2 Lite general | 100,000 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4 V1 Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4 V1 general | 200,000 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3.5 Haiku 64K Model units per provisioned model for Anthropic Claude 3.5 Haiku 64K general | 0 count | Adjustable |
Minimum number of records per batch inference job for Claude Opus 4.6 Minimum number of records per batch inference job for Claude Opus 4.6 general | 100 count | Fixed |
GetAgent requests per second GetAgent requests per second throughput | 15 count | Fixed |
Records per batch inference job for DeepSeek V3.2 Records per batch inference job for DeepSeek V3.2 general | 100,000 count | Adjustable |
ListAgents requests per second ListAgents requests per second throughput | 10 count | Fixed |
Batch inference input file size (in GB) for Llama 3.2 3B Instruct Batch inference input file size (in GB) for Llama 3.2 3B Instruct storage | 1 count | Fixed |
(Data Automation) Minimum Audio Sample Rate (Hz) (Data Automation) Minimum Audio Sample Rate (Hz) throughput | 8,000 count | Fixed |
Model invocation max tokens per day for Amazon Nova Premier V1 (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova Premier V1 (doubled for cross-region calls) general | 1,440,000,000 count | Fixed |
On-demand latency-optimized model inference requests per minute for Anthropic Claude 3.5 Haiku On-demand latency-optimized model inference requests per minute for Anthropic Claude 3.5 Haiku throughput | 100 count | Fixed |
Model units per provisioned model for Anthropic Claude V2 18K Model units per provisioned model for Anthropic Claude V2 18K general | 0 count | Adjustable |
Records per batch inference job for Qwen3 Coder Next Records per batch inference job for Qwen3 Coder Next general | 100,000 count | Adjustable |
Records per batch inference job for Kimi K2 Thinking Records per batch inference job for Kimi K2 Thinking general | 100,000 count | Adjustable |
(Model customization) Sum of training and validation records for a Meta Llama 2 13B v1 Fine-tuning job (Model customization) Sum of training and validation records for a Meta Llama 2 13B v1 Fine-tuning job general | 10,000 count | Adjustable |
Model invocation max tokens per day for AI21 Labs Jamba 1.5 Mini (doubled for cross-region calls) Model invocation max tokens per day for AI21 Labs Jamba 1.5 Mini (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Batch inference input file size (in GB) for Claude 3.5 Sonnet Batch inference input file size (in GB) for Claude 3.5 Sonnet storage | 1 count | Fixed |
Batch inference job size (in GB) for MiniMax M2.1 Batch inference job size (in GB) for MiniMax M2.1 storage | 5 count | Fixed |
Cross-region model inference requests per minute for Meta Llama 4 Scout V1 Cross-region model inference requests per minute for Meta Llama 4 Scout V1 throughput | 800 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Opus 4.1 Cross-region model inference requests per minute for Anthropic Claude Opus 4.1 throughput | 50 count | Fixed |
(Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova Pro (Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova Pro throughput | 200 count | Fixed |
Minimum number of records per batch inference job for GLM 5 Minimum number of records per batch inference job for GLM 5 general | 100 count | Fixed |
On-demand latency-optimized model inference tokens per minute for Anthropic Claude 3.5 Haiku On-demand latency-optimized model inference tokens per minute for Anthropic Claude 3.5 Haiku general | 500,000 count | Fixed |
On-demand InvokeModel concurrent requests for Twelve Labs Marengo On-demand InvokeModel concurrent requests for Twelve Labs Marengo compute | 30 count | Fixed |
Batch inference job size (in GB) for Qwen3 Next 80B Batch inference job size (in GB) for Qwen3 Next 80B storage | 5 count | Fixed |
Batch inference input file size (in GB) for Mistral Small Batch inference input file size (in GB) for Mistral Small storage | 1 count | Fixed |
On-demand model inference requests per minute for Amazon Nova Pro On-demand model inference requests per minute for Amazon Nova Pro throughput | 250 count | Fixed |
Records per batch inference job for Llama 3.3 70B Instruct Records per batch inference job for Llama 3.3 70B Instruct general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Anthropic Claude 3.5 Sonnet V2 On-demand model inference requests per minute for Anthropic Claude 3.5 Sonnet V2 throughput | 50 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Amazon Nova 2 Multimodal Embeddings V1 Sum of in-progress and submitted batch inference jobs using a base model for Amazon Nova 2 Multimodal Embeddings V1 general | 100 count | Adjustable |
Records per input file per batch inference job for Mistral Large 3 Records per input file per batch inference job for Mistral Large 3 storage | 100,000 count | Adjustable |
Records per input file per batch inference job for DeepSeek V3.2 Records per input file per batch inference job for DeepSeek V3.2 storage | 100,000 count | Adjustable |
Model invocation max tokens per day for Mistral AI Mixtral 8X7B Instruct (doubled for cross-region calls) Model invocation max tokens per day for Mistral AI Mixtral 8X7B Instruct (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Batch inference job size (in GB) for Claude 3.5 Sonnet Batch inference job size (in GB) for Claude 3.5 Sonnet storage | 5 count | Fixed |
Records per input file per batch inference job for Amazon Nova Premier Records per input file per batch inference job for Amazon Nova Premier storage | 100,000 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude Haiku 4.5 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Haiku 4.5 (doubled for cross-region calls) general | 3,600,000,000 count | Fixed |
(Guardrails) Automated Reasoning policies per guardrail (Guardrails) Automated Reasoning policies per guardrail general | 2 count | Fixed |
Global cross-region model inference tokens per minute for Anthropic Claude Opus 4.6 V1 Global cross-region model inference tokens per minute for Anthropic Claude Opus 4.6 V1 general | 3,000,000 count | Adjustable |
Minimum number of records per batch inference job for Devstral 2 123B Minimum number of records per batch inference job for Devstral 2 123B general | 100 count | Fixed |
Records per input file per batch inference job for Claude 3 Sonnet Records per input file per batch inference job for Claude 3 Sonnet storage | 100,000 count | Adjustable |
UpdateAgent requests per second UpdateAgent requests per second throughput | 4 count | Fixed |
(Guardrails) On-demand ApplyGuardrail Denied topic policy text units per second (standard) (Guardrails) On-demand ApplyGuardrail Denied topic policy text units per second (standard) identity | 200 count | Adjustable |
(Data Automation) InvokeDataAutomation(Sync) - Image - Max number of requests (Data Automation) InvokeDataAutomation(Sync) - Image - Max number of requests throughput | 200 count | Adjustable |
(Flows) ListFlows requests per second (Flows) ListFlows requests per second throughput | 10 count | Fixed |
Model invocation max tokens per day for Meta Llama 3.2 90B Instruct (doubled for cross-region calls) Model invocation max tokens per day for Meta Llama 3.2 90B Instruct (doubled for cross-region calls) general | 432,000,000 count | Fixed |
On-demand model inference requests per minute for AI21 Labs Jamba 1.5 Mini On-demand model inference requests per minute for AI21 Labs Jamba 1.5 Mini throughput | 100 count | Fixed |
Cross-region model inference requests per minute for Meta Llama 3.3 70B Instruct Cross-region model inference requests per minute for Meta Llama 3.3 70B Instruct throughput | 800 count | Fixed |
Batch inference input file size (in GB) for Titan Multimodal Embeddings G1 Batch inference input file size (in GB) for Titan Multimodal Embeddings G1 storage | 1 count | Fixed |
(Advanced Prompt Optimization) Inactive jobs per account (Advanced Prompt Optimization) Inactive jobs per account general | 5,000 count | Fixed |
(Automated Reasoning) UpdateAutomatedReasoningPolicyAnnotations requests per second (Automated Reasoning) UpdateAutomatedReasoningPolicyAnnotations requests per second throughput | 5 count | Adjustable |
Associated knowledge bases per Agent Associated knowledge bases per Agent general | 2 count | Adjustable |
(Automated Reasoning) StartAutomatedReasoningPolicyTestWorkflow requests per second (Automated Reasoning) StartAutomatedReasoningPolicyTestWorkflow requests per second throughput | 1 count | Adjustable |
(Guardrails) Contextual grounding response length in text units (Guardrails) Contextual grounding response length in text units general | 5 count | Fixed |
Minimum number of records per batch inference job for Nova Micro V1 Minimum number of records per batch inference job for Nova Micro V1 general | 100 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude 3.7 Sonnet V1 Cross-region model inference tokens per minute for Anthropic Claude 3.7 Sonnet V1 general | 1,000,000 count | Adjustable |
Batch inference job size (in GB) for NVIDIA Nemotron 3 Super 120B A12B Batch inference job size (in GB) for NVIDIA Nemotron 3 Super 120B A12B storage | 5 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 Cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 throughput | 10,000 count | Adjustable |
On-demand model inference requests per minute for Amazon Titan Text Embeddings V2 On-demand model inference requests per minute for Amazon Titan Text Embeddings V2 throughput | 6,000 count | Fixed |
(Knowledge Bases) GetDataSource requests per second (Knowledge Bases) GetDataSource requests per second throughput | 10 count | Fixed |
(Model customization) Sum of training and validation records for a Titan Image Generator G1 V2 Fine-tuning job (Model customization) Sum of training and validation records for a Titan Image Generator G1 V2 Fine-tuning job general | 10,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Nova 2 Lite Sum of in-progress and submitted batch inference jobs using a base model for Nova 2 Lite general | 100 count | Adjustable |
(Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova 2 Lite (Model customization) Sum of on demand custom model deployment requests per minute for Amazon Nova 2 Lite throughput | 2,000 count | Fixed |
Minimum number of records per batch inference job for Qwen3 32B Minimum number of records per batch inference job for Qwen3 32B general | 100 count | Fixed |
(Model customization) Sum of training and validation records for a Amazon Nova Pro Fine-tuning job (Model customization) Sum of training and validation records for a Amazon Nova Pro Fine-tuning job general | 20,000 count | Adjustable |
Records per batch inference job for Claude Sonnet 4 Records per batch inference job for Claude Sonnet 4 general | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3 Opus Sum of in-progress and submitted batch inference jobs using a base model for Claude 3 Opus general | 100 count | Adjustable |
(Automated Reasoning) GetAutomatedReasoningPolicyAnnotations requests per second (Automated Reasoning) GetAutomatedReasoningPolicyAnnotations requests per second throughput | 10 count | Adjustable |
On-demand model inference tokens per minute for OpenAI GPT OSS 120B On-demand model inference tokens per minute for OpenAI GPT OSS 120B general | 100,000,000 count | Fixed |
Batch inference input file size (in GB) for DeepSeek V3.2 Batch inference input file size (in GB) for DeepSeek V3.2 storage | 1 count | Fixed |
Model invocation max tokens per day for Gemma 3 27B (doubled for cross-region calls) Model invocation max tokens per day for Gemma 3 27B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Minimum number of records per batch inference job for Claude 3 Sonnet Minimum number of records per batch inference job for Claude 3 Sonnet general | 100 count | Fixed |
Cross-region model inference requests per minute for Stable Image Creative Upscale Cross-region model inference requests per minute for Stable Image Creative Upscale throughput | 4 count | Fixed |
Minimum number of records per batch inference job for NVIDIA Nemotron Nano 3 30B Minimum number of records per batch inference job for NVIDIA Nemotron Nano 3 30B general | 100 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron 3 Super 120B A12B Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron 3 Super 120B A12B general | 100 count | Adjustable |
Model units per provisioned model for the 24k context length variant for Amazon Nova Pro Model units per provisioned model for the 24k context length variant for Amazon Nova Pro general | 0 count | Adjustable |
(Automated Reasoning) Concurrent policy builds per account (Automated Reasoning) Concurrent policy builds per account compute | 5 count | Fixed |
Minimum number of records per batch inference job for Claude 3.7 Sonnet Minimum number of records per batch inference job for Claude 3.7 Sonnet general | 100 count | Adjustable |
(Guardrails) Regex length in characters (Guardrails) Regex length in characters general | 500 count | Fixed |
Model invocation max tokens per day for Ministral 8B 3.0 (doubled for cross-region calls) Model invocation max tokens per day for Ministral 8B 3.0 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference input file size (in GB) for Claude Sonnet 4.6 Batch inference input file size (in GB) for Claude Sonnet 4.6 storage | 1 count | Fixed |
Records per input file per batch inference job for NVIDIA Nemotron 3 Super 120B A12B Records per input file per batch inference job for NVIDIA Nemotron 3 Super 120B A12B storage | 100,000 count | Adjustable |
On-demand model inference requests per minute for Gemma 3 27B On-demand model inference requests per minute for Gemma 3 27B throughput | 10,000 count | Fixed |
(Flows) Inline code nodes per flow (Flows) Inline code nodes per flow capacity | 5 count | Fixed |
Batch inference job size (in GB) for Qwen3 32B Batch inference job size (in GB) for Qwen3 32B storage | 5 count | Fixed |
Batch inference input file size (in GB) for Qwen3 Next 80B Batch inference input file size (in GB) for Qwen3 Next 80B storage | 1 count | Fixed |
(Flows) S3 storage nodes per flow (Flows) S3 storage nodes per flow storage | 10 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Amazon Nova Premier Sum of in-progress and submitted batch inference jobs using a base model for Amazon Nova Premier general | 100 count | Adjustable |
Batch inference job size (in GB) for Llama 3.2 90B Instruct Batch inference job size (in GB) for Llama 3.2 90B Instruct storage | 5 count | Fixed |
(Model customization) Sum of training and validation records for a Titan Text G1 - Express v1 Continued Pre-Training job (Model customization) Sum of training and validation records for a Titan Text G1 - Express v1 Continued Pre-Training job general | 100,000 count | Adjustable |
Records per batch inference job for NVIDIA Nemotron Nano 3 30B Records per batch inference job for NVIDIA Nemotron Nano 3 30B general | 100,000 count | Adjustable |
Records per batch inference job for NVIDIA Nemotron Nano 12B Records per batch inference job for NVIDIA Nemotron Nano 12B general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 3.2 90B Instruct On-demand model inference requests per minute for Meta Llama 3.2 90B Instruct throughput | 400 count | Fixed |
Minimum number of records per batch inference job for Claude Sonnet 4.5. Minimum number of records per batch inference job for Claude Sonnet 4.5. general | 100 count | Fixed |
Batch inference job size (in GB) for Kimi K2 Thinking Batch inference job size (in GB) for Kimi K2 Thinking storage | 5 count | Fixed |
Model invocation max tokens per day for AI21 Labs Jamba 1.5 Large (doubled for cross-region calls) Model invocation max tokens per day for AI21 Labs Jamba 1.5 Large (doubled for cross-region calls) general | 432,000,000 count | Fixed |
On-demand model inference tokens per minute for Ministral 8B 3.0 On-demand model inference tokens per minute for Ministral 8B 3.0 general | 100,000,000 count | Fixed |
Batch inference input file size (in GB) for Claude 3.5 Haiku Batch inference input file size (in GB) for Claude 3.5 Haiku storage | 1 count | Fixed |
Minimum number of records per batch inference job for Nova Lite V1 Minimum number of records per batch inference job for Nova Lite V1 general | 100 count | Fixed |
On-demand model inference requests per minute for Qwen3 32B V1 On-demand model inference requests per minute for Qwen3 32B V1 throughput | 10,000 count | Fixed |
Minimum number of records per batch inference job for Qwen3 Coder Next Minimum number of records per batch inference job for Qwen3 Coder Next general | 100 count | Fixed |
(Automated Reasoning) CreateAutomatedReasoningPolicy requests per second (Automated Reasoning) CreateAutomatedReasoningPolicy requests per second throughput | 5 count | Adjustable |
On-demand model inference tokens per minute for AI21 Labs Jamba 1.5 Large On-demand model inference tokens per minute for AI21 Labs Jamba 1.5 Large general | 300,000 count | Fixed |
Records per batch inference job for Qwen3 Coder 30B Records per batch inference job for Qwen3 Coder 30B general | 100,000 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3 Sonnet 28K Model units per provisioned model for Anthropic Claude 3 Sonnet 28K general | 0 count | Adjustable |
(Knowledge Bases) DeleteKnowledgeBaseDocuments requests per second (Knowledge Bases) DeleteKnowledgeBaseDocuments requests per second throughput | 5 count | Fixed |
Model units, with commitment, for Provisioned Throughout created for Meta Maverick 4 Scout 17B Instruct 128K Model units, with commitment, for Provisioned Throughout created for Meta Maverick 4 Scout 17B Instruct 128K general | 0 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude Sonnet 4 V1 1M Context Length (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Sonnet 4 V1 1M Context Length (doubled for cross-region calls) general | 720,000,000 count | Fixed |
Throttle rate limit for Bedrock Data Automation: ListTagsForResource Throttle rate limit for Bedrock Data Automation: ListTagsForResource throughput | 25 count | Fixed |
Model units per provisioned model for Meta Llama 2 Chat 70B Model units per provisioned model for Meta Llama 2 Chat 70B general | 0 count | Adjustable |
Throttle rate limit for Bedrock Data Automation: UntagResource Throttle rate limit for Bedrock Data Automation: UntagResource throughput | 25 count | Fixed |
Records per input file per batch inference job for Claude Haiku 4.5 Records per input file per batch inference job for Claude Haiku 4.5 storage | 100,000 count | Adjustable |
(Flows) Flow versions per flow (Flows) Flow versions per flow general | 10 count | Fixed |
On-demand model inference requests per minute for Nemotron Nano 3 30B On-demand model inference requests per minute for Nemotron Nano 3 30B throughput | 10,000 count | Fixed |
On-demand model inference tokens per minute for Anthropic Claude 3 Sonnet On-demand model inference tokens per minute for Anthropic Claude 3 Sonnet general | 1,000,000 count | Fixed |
(Data Automation) Maximum Blueprints per Project (Videos) (Data Automation) Maximum Blueprints per Project (Videos) general | 1 count | Fixed |
On-demand model inference tokens per minute for Amazon Nova Lite On-demand model inference tokens per minute for Amazon Nova Lite general | 4,000,000 count | Fixed |
(Data Automation) Maximum JSON Blueprint Size (Characters) (Data Automation) Maximum JSON Blueprint Size (Characters) storage | 100,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS 20b Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS 20b general | 100 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 3 70B Instruct On-demand model inference requests per minute for Meta Llama 3 70B Instruct throughput | 400 count | Fixed |
Records per batch inference job for Gemma 3 27B Records per batch inference job for Gemma 3 27B general | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Devstral 2 123B Sum of in-progress and submitted batch inference jobs using a base model for Devstral 2 123B general | 100 count | Adjustable |
Minimum number of records per batch inference job for MiniMax M2 Minimum number of records per batch inference job for MiniMax M2 general | 100 count | Fixed |
Minimum number of records per batch inference job for OpenAI GPT OSS 120b Minimum number of records per batch inference job for OpenAI GPT OSS 120b general | 100 count | Fixed |
(Model customization) In-progress custom model deployments (Model customization) In-progress custom model deployments general | 2 count | Adjustable |
Model units per provisioned model for Stability.ai Stable Diffusion XL 0.8 Model units per provisioned model for Stability.ai Stable Diffusion XL 0.8 general | 0 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 3 70B Instruct On-demand model inference tokens per minute for Meta Llama 3 70B Instruct general | 300,000 count | Fixed |
Global cross-region model inference tokens per minute for Anthropic Claude Opus 4.7 Global cross-region model inference tokens per minute for Anthropic Claude Opus 4.7 general | 15,000,000 count | Adjustable |
(Knowledge Bases) ListKnowledgeBaseDocuments requests per second (Knowledge Bases) ListKnowledgeBaseDocuments requests per second throughput | 5 count | Fixed |
Batch inference job size (in GB) for Gemma 3 27B Batch inference job size (in GB) for Gemma 3 27B storage | 5 count | Fixed |
(Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova 2 Lite (Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova 2 Lite general | 5,760,000,000 count | Fixed |
(Guardrails) On-demand ApplyGuardrail requests per second (Guardrails) On-demand ApplyGuardrail requests per second throughput | 50 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for MiniMax M2.1 Sum of in-progress and submitted batch inference jobs using a base model for MiniMax M2.1 general | 100 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude 3 Opus Cross-region model inference tokens per minute for Anthropic Claude 3 Opus general | 800,000 count | Adjustable |
Records per batch inference job for Claude Haiku 4.5 Records per batch inference job for Claude Haiku 4.5 general | 100,000 count | Adjustable |
(Model customization) Sum of training and validation records for a Titan Text G1 - Lite v1 Fine-tuning job (Model customization) Sum of training and validation records for a Titan Text G1 - Lite v1 Fine-tuning job general | 10,000 count | Adjustable |
(Automated Reasoning) Versions per policy (Automated Reasoning) Versions per policy identity | 1,000 count | Fixed |
Cross-region model inference requests per minute for Meta Llama 3.2 1B Instruct Cross-region model inference requests per minute for Meta Llama 3.2 1B Instruct throughput | 1,600 count | Fixed |
Model invocation max tokens per day for Z.ai GLM 5 (doubled for cross-region calls) Model invocation max tokens per day for Z.ai GLM 5 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference input file size (in GB) for GLM 5 Batch inference input file size (in GB) for GLM 5 storage | 1 count | Fixed |
Minimum number of records per batch inference job for NVIDIA Nemotron 3 Super 120B A12B Minimum number of records per batch inference job for NVIDIA Nemotron 3 Super 120B A12B general | 100 count | Fixed |
On-demand model inference tokens per minute for Anthropic Claude 3.5 Sonnet V2 On-demand model inference tokens per minute for Anthropic Claude 3.5 Sonnet V2 general | 400,000 count | Fixed |
Model units per provisioned model for the 128k context length variant for Amazon Nova Micro Model units per provisioned model for the 128k context length variant for Amazon Nova Micro general | 0 count | Adjustable |
Model units per provisioned model for Stability.ai Stable Diffusion XL 1.0 Model units per provisioned model for Stability.ai Stable Diffusion XL 1.0 general | 0 count | Adjustable |
On-Demand, latency-optimized model inference requests per minute for Meta Llama 3.1 70B Instruct On-Demand, latency-optimized model inference requests per minute for Meta Llama 3.1 70B Instruct throughput | 100 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Nova Micro V1 Sum of in-progress and submitted batch inference jobs using a base model for Nova Micro V1 general | 100 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.6 Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.6 general | 6,000,000 count | Adjustable |
Throttle rate limit for CreateBlueprint Throttle rate limit for CreateBlueprint throughput | 5 count | Fixed |
(Knowledge Bases) Ingestion job file size (Knowledge Bases) Ingestion job file size storage | 50 count | Fixed |
(Automated Reasoning) StartAutomatedReasoningPolicyBuildWorkflow requests per second (Automated Reasoning) StartAutomatedReasoningPolicyBuildWorkflow requests per second throughput | 1 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 2 Chat 13B On-demand model inference requests per minute for Meta Llama 2 Chat 13B throughput | 800 count | Fixed |
Records per input file per batch inference job for Ministral 3B Records per input file per batch inference job for Ministral 3B storage | 100,000 count | Adjustable |
(Automated Reasoning) Source document tokens (Automated Reasoning) Source document tokens general | 122,880 count | Fixed |
Global cross-region model inference tokens per minute for Amazon Nova 2 Lite Global cross-region model inference tokens per minute for Amazon Nova 2 Lite general | 8,000,000 count | Adjustable |
On-demand model inference tokens per minute for Amazon Titan Multimodal Embeddings G1 On-demand model inference tokens per minute for Amazon Titan Multimodal Embeddings G1 general | 300,000 count | Fixed |
On-demand model inference requests per minute for OpenAI GPT OSS 120B On-demand model inference requests per minute for OpenAI GPT OSS 120B throughput | 10,000 count | Fixed |
On-demand model inference requests per minute for Stable Image Search and Replace On-demand model inference requests per minute for Stable Image Search and Replace throughput | 10 count | Fixed |
On-demand model inference requests per minute for Qwen3 Next 80B A3B On-demand model inference requests per minute for Qwen3 Next 80B A3B throughput | 10,000 count | Fixed |
Batch inference job size (in GB) for Claude Haiku 4.5 Batch inference job size (in GB) for Claude Haiku 4.5 storage | 5 count | Fixed |
Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4 V1 Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4 V1 general | 200,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Ministral 3B Sum of in-progress and submitted batch inference jobs using a base model for Ministral 3B general | 100 count | Adjustable |
Records per input file per batch inference job for Nova Lite V1 Records per input file per batch inference job for Nova Lite V1 storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Ministral 3 8B Sum of in-progress and submitted batch inference jobs using a base model for Ministral 3 8B general | 100 count | Adjustable |
Cross-region model inference requests per minute for Meta Llama 3.1 8B Instruct Cross-region model inference requests per minute for Meta Llama 3.1 8B Instruct throughput | 1,600 count | Fixed |
Model invocation max tokens per day for Meta Llama 3.2 11B Instruct (doubled for cross-region calls) Model invocation max tokens per day for Meta Llama 3.2 11B Instruct (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Batch inference input file size (in GB) for Llama 3.2 1B Instruct Batch inference input file size (in GB) for Llama 3.2 1B Instruct storage | 1 count | Fixed |
(Guardrails) Versions per guardrail (Guardrails) Versions per guardrail general | 20 count | Fixed |
On-demand model inference requests per minute for Qwen3 VL 235B A22B On-demand model inference requests per minute for Qwen3 VL 235B A22B throughput | 10,000 count | Fixed |
Cross-region model inference requests per minute for Meta Llama 3.2 3B Instruct Cross-region model inference requests per minute for Meta Llama 3.2 3B Instruct throughput | 1,600 count | Fixed |
(Evaluation) Number of custom metrics (Evaluation) Number of custom metrics general | 10 count | Fixed |
Records per input file per batch inference job for Qwen3 Next 80B Records per input file per batch inference job for Qwen3 Next 80B storage | 100,000 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Conservative Upscale Cross-region model inference requests per minute for Stable Image Conservative Upscale throughput | 4 count | Fixed |
UpdateAgentKnowledgeBase requests per second UpdateAgentKnowledgeBase requests per second throughput | 4 count | Fixed |
(Model customization) Maximum line length for distillation customization jobs (Model customization) Maximum line length for distillation customization jobs general | 16 Kilobytes | Fixed |
Cross-region model inference requests per minute for Anthropic Claude 3 Haiku Cross-region model inference requests per minute for Anthropic Claude 3 Haiku throughput | 2,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude Sonnet 4.6 Sum of in-progress and submitted batch inference jobs using a base model for Claude Sonnet 4.6 general | 100 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 Coder 30B Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 Coder 30B general | 100 count | Adjustable |
Records per batch inference job for Claude 3.5 Sonnet v2 Records per batch inference job for Claude 3.5 Sonnet v2 general | 100,000 count | Adjustable |
(Knowledge Bases) Concurrent ingestion jobs per knowledge base (Knowledge Bases) Concurrent ingestion jobs per knowledge base compute | 1 count | Fixed |
Cross-region model inference requests per minute for Cohere Embed V4 Cross-region model inference requests per minute for Cohere Embed V4 throughput | 2,000 count | Fixed |
Batch inference job size (in GB) for Claude 3.7 Sonnet Batch inference job size (in GB) for Claude 3.7 Sonnet storage | 5 count | Adjustable |
Global cross-region model inference tokens per minute for Cohere Embed V4 Global cross-region model inference tokens per minute for Cohere Embed V4 general | 300,000 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 3.2 11B Instruct On-demand model inference tokens per minute for Meta Llama 3.2 11B Instruct general | 300,000 count | Fixed |
On-demand model inference requests per minute for Mistral 7B Instruct On-demand model inference requests per minute for Mistral 7B Instruct throughput | 800 count | Fixed |
Records per input file per batch inference job for Voxtral Mini 3B 2507 Records per input file per batch inference job for Voxtral Mini 3B 2507 storage | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Claude 3.5 Haiku Minimum number of records per batch inference job for Claude 3.5 Haiku general | 100 count | Fixed |
On-demand model inference requests per minute for AI21 Labs Jurassic-2 Mid On-demand model inference requests per minute for AI21 Labs Jurassic-2 Mid throughput | 400 count | Fixed |
On-demand model inference requests per minute for Anthropic Claude 3 Sonnet On-demand model inference requests per minute for Anthropic Claude 3 Sonnet throughput | 500 count | Fixed |
(Flows) DeleteFlowAlias requests per second (Flows) DeleteFlowAlias requests per second throughput | 2 count | Fixed |
(Flows) Flow aliases per flow (Flows) Flow aliases per flow general | 10 count | Fixed |
(Knowledge Bases) Files to add or update per ingestion job (Knowledge Bases) Files to add or update per ingestion job storage | 5,000,000 count | Fixed |
Records per input file per batch inference job for OpenAI GPT OSS 20b Records per input file per batch inference job for OpenAI GPT OSS 20b storage | 100,000 count | Adjustable |
Model invocation max tokens per day for Ministral 14B 3.0 (doubled for cross-region calls) Model invocation max tokens per day for Ministral 14B 3.0 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
(Knowledge Bases) IngestKnowledgeBaseDocuments requests per second (Knowledge Bases) IngestKnowledgeBaseDocuments requests per second throughput | 5 count | Fixed |
Model invocation max tokens per day for Minimax M2.1 (doubled for cross-region calls) Model invocation max tokens per day for Minimax M2.1 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference input file size (in GB) for Kimi K2.5 Batch inference input file size (in GB) for Kimi K2.5 storage | 1 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude Opus 4.5 Sum of in-progress and submitted batch inference jobs using a base model for Claude Opus 4.5 general | 100 count | Adjustable |
ListAgentKnowledgeBases requests per second ListAgentKnowledgeBases requests per second throughput | 10 count | Fixed |
(Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova Micro (Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova Micro general | 5,760,000,000 count | Fixed |
Records per batch inference job for Titan Text Embeddings V2 Records per batch inference job for Titan Text Embeddings V2 general | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Mistral Devstral 2 123b On-demand model inference tokens per minute for Mistral Devstral 2 123b general | 100,000,000 count | Fixed |
(Flows) ValidateFlowDefinition requests per second (Flows) ValidateFlowDefinition requests per second throughput | 2 count | Fixed |
Inference profiles per account Inference profiles per account storage | 1,000 count | Adjustable |
On-demand model inference requests per minute for Twelve Labs Marengo On-demand model inference requests per minute for Twelve Labs Marengo throughput | 100 count | Fixed |
CreateAgent requests per second CreateAgent requests per second throughput | 6 count | Fixed |
(Knowledge Bases) Files to ingest per IngestKnowledgeBaseDocuments job. (Knowledge Bases) Files to ingest per IngestKnowledgeBaseDocuments job. storage | 25 count | Fixed |
Cross-region model inference requests per minute for Amazon Nova 2 Lite Cross-region model inference requests per minute for Amazon Nova 2 Lite throughput | 2,000 count | Fixed |
Minimum number of records per batch inference job for Llama 3.1 8B Instruct Minimum number of records per batch inference job for Llama 3.1 8B Instruct general | 100 count | Fixed |
Minimum number of records per batch inference job for Titan Text Embeddings V2 Minimum number of records per batch inference job for Titan Text Embeddings V2 general | 100 count | Fixed |
Records per batch inference job for Llama 3.1 70B Instruct Records per batch inference job for Llama 3.1 70B Instruct general | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Titan Multimodal Embeddings G1 Sum of in-progress and submitted batch inference jobs using a base model for Titan Multimodal Embeddings G1 general | 100 count | Adjustable |
Batch inference job size (in GB) for Nova Lite V1 Batch inference job size (in GB) for Nova Lite V1 storage | 100 count | Fixed |
Model units per provisioned model for Amazon Titan Embeddings G1 - Text Model units per provisioned model for Amazon Titan Embeddings G1 - Text general | 0 count | Adjustable |
On-demand InvokeModel concurrent requests for Amazon Nova Sonic On-demand InvokeModel concurrent requests for Amazon Nova Sonic compute | 20 count | Fixed |
Model units per provisioned model for Cohere Embed English Model units per provisioned model for Cohere Embed English general | 0 count | Adjustable |
(Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova 2 Lite (Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova 2 Lite general | 4,000,000 count | Fixed |
(Knowledge Bases) User query size (Knowledge Bases) User query size storage | 1,000 count | Fixed |
(Flows) GetFlowVersion requests per second (Flows) GetFlowVersion requests per second throughput | 10 count | Fixed |
On-demand model inference tokens per minute for Amazon Nova Pro On-demand model inference tokens per minute for Amazon Nova Pro general | 1,000,000 count | Fixed |
Records per batch inference job for Claude 3.5 Haiku Records per batch inference job for Claude 3.5 Haiku general | 100,000 count | Adjustable |
No-commitment model units for Provisioned Throughput created for base model Amazon Nova 2 Lite V1.0 256K No-commitment model units for Provisioned Throughput created for base model Amazon Nova 2 Lite V1.0 256K general | 0 count | Fixed |
On-demand model inference requests per minute for MiniMax M2.5 On-demand model inference requests per minute for MiniMax M2.5 throughput | 10,000 count | Fixed |
Model units per provisioned model for Mistral Small Model units per provisioned model for Mistral Small general | 0 count | Adjustable |
(Data Automation) InvokeBlueprintOptimizationAsync - Max number of blueprint optimization concurrent jobs (Data Automation) InvokeBlueprintOptimizationAsync - Max number of blueprint optimization concurrent jobs compute | 3 count | Adjustable |
On-demand model inference requests per minute for Z.ai GLM-4.7 Flash On-demand model inference requests per minute for Z.ai GLM-4.7 Flash throughput | 10,000 count | Fixed |
On-demand model inference tokens per minute for NVIDIA Nemotron Nano 2 VL On-demand model inference tokens per minute for NVIDIA Nemotron Nano 2 VL general | 100,000,000 count | Fixed |
(Knowledge Bases) RetrieveAndGenerateStream requests per second (Knowledge Bases) RetrieveAndGenerateStream requests per second throughput | 20 count | Fixed |
Records per batch inference job for Claude Sonnet 4.6 Records per batch inference job for Claude Sonnet 4.6 general | 100,000 count | Adjustable |
(Knowledge Bases) Files to delete per ingestion job (Knowledge Bases) Files to delete per ingestion job storage | 5,000,000 count | Fixed |
(Automated Reasoning) ListAutomatedReasoningPolicies requests per second (Automated Reasoning) ListAutomatedReasoningPolicies requests per second throughput | 5 count | Adjustable |
(Data Automation) Maximum number of Blueprints per Start Inference request (Images) (Data Automation) Maximum number of Blueprints per Start Inference request (Images) throughput | 1 count | Fixed |
Batch inference job size (in GB) for NVIDIA Nemotron Nano 9B Batch inference job size (in GB) for NVIDIA Nemotron Nano 9B storage | 5 count | Fixed |
Model invocation max tokens per day for DeepSeek V3.2 (doubled for cross-region calls) Model invocation max tokens per day for DeepSeek V3.2 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference job size (in GB) for Nova Micro V1 Batch inference job size (in GB) for Nova Micro V1 storage | 5 count | Fixed |
(Flows) GetFlowAlias requests per second (Flows) GetFlowAlias requests per second throughput | 10 count | Fixed |
On-demand model inference requests per minute for Amazon Titan Image Generator G1 On-demand model inference requests per minute for Amazon Titan Image Generator G1 throughput | 60 count | Fixed |
On-demand model inference tokens per minute for Ministral 14B 3.0 On-demand model inference tokens per minute for Ministral 14B 3.0 general | 100,000,000 count | Fixed |
Records per input file per batch inference job for Gemma 3 27B Records per input file per batch inference job for Gemma 3 27B storage | 100,000 count | Adjustable |
Batch inference input file size (in GB) for OpenAI GPT OSS Safeguard 20b Batch inference input file size (in GB) for OpenAI GPT OSS Safeguard 20b storage | 1 count | Fixed |
Batch inference job size (in GB) for Claude 3 Haiku Batch inference job size (in GB) for Claude 3 Haiku storage | 5 count | Fixed |
(Automated Reasoning) DeleteAutomatedReasoningPolicyBuildWorkflow requests per second (Automated Reasoning) DeleteAutomatedReasoningPolicyBuildWorkflow requests per second throughput | 5 count | Adjustable |
(Automated Reasoning) Types per policy (Automated Reasoning) Types per policy identity | 50 count | Fixed |
(Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova Pro (Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova Pro general | 1,152,000,000 count | Fixed |
Records per batch inference job for Voxtral Mini 3B 2507 Records per batch inference job for Voxtral Mini 3B 2507 general | 100,000 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude Opus 4.1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Opus 4.1 (doubled for cross-region calls) general | 360,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Mistral Large 2 (24.07) Sum of in-progress and submitted batch inference jobs using a base model for Mistral Large 2 (24.07) general | 100 count | Adjustable |
Cross-region model inference tokens per minute for Meta Llama 3.2 3B Instruct Cross-region model inference tokens per minute for Meta Llama 3.2 3B Instruct general | 600,000 count | Adjustable |
On-demand model inference requests per minute for Writer Palmyra Vision 7B On-demand model inference requests per minute for Writer Palmyra Vision 7B throughput | 10,000 count | Fixed |
(Automated Reasoning) Variables in policy (Automated Reasoning) Variables in policy identity | 200 count | Fixed |
Model invocation max tokens per day for Qwen3 Coder Next (doubled for cross-region calls) Model invocation max tokens per day for Qwen3 Coder Next (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference job size (in GB) for Claude Opus 4.6 Batch inference job size (in GB) for Claude Opus 4.6 storage | 5 count | Fixed |
(Automated Reasoning) ListAutomatedReasoningPolicyTestResults requests per second (Automated Reasoning) ListAutomatedReasoningPolicyTestResults requests per second throughput | 5 count | Adjustable |
Batch inference job size (in GB) for Claude 3.5 Haiku Batch inference job size (in GB) for Claude 3.5 Haiku storage | 5 count | Fixed |
On-demand model inference requests per minute for Stable Image Inpaint On-demand model inference requests per minute for Stable Image Inpaint throughput | 10 count | Fixed |
(Flows) Flow executions per account (Flows) Flow executions per account compute | 1,000 count | Adjustable |
Cross-region model inference requests per minute for Twelve Labs Marengo Cross-region model inference requests per minute for Twelve Labs Marengo throughput | 200 count | Fixed |
(Prompt management) UpdatePrompt requests per second (Prompt management) UpdatePrompt requests per second throughput | 2 count | Fixed |
Model invocation max tokens per day for DeepSeek R1 V1 (doubled for cross-region calls) Model invocation max tokens per day for DeepSeek R1 V1 (doubled for cross-region calls) general | 144,000,000 count | Fixed |
AssociateAgentKnowledgeBase requests per second AssociateAgentKnowledgeBase requests per second throughput | 6 count | Fixed |
Global cross-region model inference requests per minute for Amazon Nova 2 Omni Global cross-region model inference requests per minute for Amazon Nova 2 Omni throughput | 2,000 count | Adjustable |
(Flows) Prompt nodes per flow (Flows) Prompt nodes per flow capacity | 20 count | Adjustable |
Global cross-region model inference tokens per minute for Amazon Nova 2 Pro Preview Global cross-region model inference tokens per minute for Amazon Nova 2 Pro Preview general | 1,000,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Gemma 3 27B Sum of in-progress and submitted batch inference jobs using a base model for Gemma 3 27B general | 100 count | Adjustable |
(Data Automation) Maximum video file size (MB) (Data Automation) Maximum video file size (MB) storage | 10,240 count | Fixed |
Throttle rate limit for UpdateDataAutomationProject Throttle rate limit for UpdateDataAutomationProject throughput | 5 count | Fixed |
(Data Automation) Minimum audio length (Miliseconds) (Data Automation) Minimum audio length (Miliseconds) general | 500 count | Fixed |
(Flows) Collector nodes per flow (Flows) Collector nodes per flow capacity | 1 count | Fixed |
Batch inference job size (in GB) for Llama 3.1 70B Instruct Batch inference job size (in GB) for Llama 3.1 70B Instruct storage | 5 count | Fixed |
Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4.6 Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4.6 general | 8,640,000,000 count | Fixed |
Batch inference input file size (in GB) for MiniMax M2 Batch inference input file size (in GB) for MiniMax M2 storage | 1 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Gemma 3 12B Sum of in-progress and submitted batch inference jobs using a base model for Gemma 3 12B general | 100 count | Adjustable |
Model invocation max tokens per day for Amazon Nova 2 Omni (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova 2 Omni (doubled for cross-region calls) general | 5,760,000,000 count | Fixed |
On-demand model inference tokens per minute for Qwen3 Coder Next On-demand model inference tokens per minute for Qwen3 Coder Next general | 100,000,000 count | Fixed |
Model invocation max tokens per day for Amazon Nova Lite (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova Lite (doubled for cross-region calls) general | 5,760,000,000 count | Fixed |
(Prompt management) GetPrompt requests per second (Prompt management) GetPrompt requests per second throughput | 10 count | Fixed |
(Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova Lite (Model customization) Sum of on demand custom model deployment tokens per day for Amazon Nova Lite general | 5,760,000,000 count | Fixed |
(Data Automation) Maximum Levels of Field Hierarchy (Data Automation) Maximum Levels of Field Hierarchy general | 1 count | Fixed |
Batch inference input file size (in GB) for Claude 3 Haiku Batch inference input file size (in GB) for Claude 3 Haiku storage | 1 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude Opus 4 V1 Cross-region model inference tokens per minute for Anthropic Claude Opus 4 V1 general | 200,000 count | Adjustable |
Minimum number of records per batch inference job for Mistral Large 2 (24.07) Minimum number of records per batch inference job for Mistral Large 2 (24.07) general | 100 count | Fixed |
Model units per provisioned model for Meta Llama 2 70B Model units per provisioned model for Meta Llama 2 70B general | 0 count | Adjustable |
Minimum number of records per batch inference job for Claude 3 Haiku Minimum number of records per batch inference job for Claude 3 Haiku general | 100 count | Fixed |
Minimum number of records per batch inference job for Claude 3.5 Sonnet Minimum number of records per batch inference job for Claude 3.5 Sonnet general | 100 count | Fixed |
Minimum number of records per batch inference job for Amazon Nova 2 Multimodal Embeddings V1 Minimum number of records per batch inference job for Amazon Nova 2 Multimodal Embeddings V1 general | 100 count | Fixed |
Global cross-region model inference requests per minute for Amazon Nova 2 Pro Preview Global cross-region model inference requests per minute for Amazon Nova 2 Pro Preview throughput | 100 count | Adjustable |
Model units per provisioned model for Cohere Command R Model units per provisioned model for Cohere Command R general | 0 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Gemma 3 4B Sum of in-progress and submitted batch inference jobs using a base model for Gemma 3 4B general | 100 count | Adjustable |
On-demand model inference requests per minute for Twelve Labs Pegasus On-demand model inference requests per minute for Twelve Labs Pegasus throughput | 60 count | Adjustable |
Batch inference job size (in GB) for GLM 4.7 Batch inference job size (in GB) for GLM 4.7 storage | 5 count | Fixed |
On-demand model inference tokens per minute for Mistral AI Mistral Small On-demand model inference tokens per minute for Mistral AI Mistral Small general | 300,000 count | Fixed |
(Model customization) Maximum student model fine tuning context length for Amazon Nova Micro V1 distillation customization jobs (Model customization) Maximum student model fine tuning context length for Amazon Nova Micro V1 distillation customization jobs general | 32,000 count | Fixed |
On-demand model inference tokens per minute for GPT OSS Safeguard 120B On-demand model inference tokens per minute for GPT OSS Safeguard 120B general | 100,000,000 count | Fixed |
Cross-region model inference requests per minute for TwelveLabs Marengo Embed 3.0 Cross-region model inference requests per minute for TwelveLabs Marengo Embed 3.0 throughput | 1,000 count | Adjustable |
Model units per provisioned model for Meta Llama 2 13B Model units per provisioned model for Meta Llama 2 13B general | 0 count | Adjustable |
On-demand model inference tokens per minute for DeepSeek V3.2 On-demand model inference tokens per minute for DeepSeek V3.2 general | 100,000,000 count | Fixed |
On-demand model inference tokens per minute for Meta Llama 3.1 8B Instruct On-demand model inference tokens per minute for Meta Llama 3.1 8B Instruct general | 300,000 count | Fixed |
Batch inference input file size (in GB) for Llama 3.1 405B Instruct Batch inference input file size (in GB) for Llama 3.1 405B Instruct storage | 1 count | Fixed |
On-demand model inference requests per minute for Stable Image Fast Upscale On-demand model inference requests per minute for Stable Image Fast Upscale throughput | 10 count | Fixed |
On-demand model inference requests per minute for Stability.ai Stable Diffusion XL 0.8 On-demand model inference requests per minute for Stability.ai Stable Diffusion XL 0.8 throughput | 60 count | Fixed |
UpdateAgentAlias requests per second UpdateAgentAlias requests per second throughput | 2 count | Fixed |
(Guardrails) Regex entities in Sensitive Information Filter (Guardrails) Regex entities in Sensitive Information Filter general | 30 count | Fixed |
Cross-region model inference requests per minute for Amazon Nova Pro Cross-region model inference requests per minute for Amazon Nova Pro throughput | 500 count | Fixed |
On-demand model inference tokens per minute for Gemma 3 12B On-demand model inference tokens per minute for Gemma 3 12B general | 100,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude Sonnet 4 Sum of in-progress and submitted batch inference jobs using a base model for Claude Sonnet 4 general | 100 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude 3.5 Sonnet V1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude 3.5 Sonnet V1 (doubled for cross-region calls) general | 2,880,000,000 count | Fixed |
Records per input file per batch inference job for Llama 4 Scout Records per input file per batch inference job for Llama 4 Scout storage | 100,000 count | Adjustable |
Model invocation max tokens per day for Meta Llama 4 Scout V1 (doubled for cross-region calls) Model invocation max tokens per day for Meta Llama 4 Scout V1 (doubled for cross-region calls) general | 432,000,000 count | Fixed |
(Data Automation) Maximum document file size (MB) (Data Automation) Maximum document file size (MB) storage | 500 count | Fixed |
(Automated Reasoning) Policies per account (Automated Reasoning) Policies per account general | 100 count | Fixed |
Cross-region model inference tokens per minute for Amazon Nova Pro Cross-region model inference tokens per minute for Amazon Nova Pro general | 2,000,000 count | Adjustable |
Model invocation max tokens per day for Qwen3 Next 80B A3B (doubled for cross-region calls) Model invocation max tokens per day for Qwen3 Next 80B A3B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Cross-region model inference requests per minute for Amazon Nova 2 Omni Cross-region model inference requests per minute for Amazon Nova 2 Omni throughput | 2,000 count | Fixed |
(Flows) Condition nodes per flow (Flows) Condition nodes per flow capacity | 5 count | Fixed |
(Automated Reasoning) CreateAutomatedReasoningPolicyTestCase requests per second (Automated Reasoning) CreateAutomatedReasoningPolicyTestCase requests per second throughput | 5 count | Adjustable |
(Flows) PrepareFlow requests per second (Flows) PrepareFlow requests per second throughput | 2 count | Fixed |
Batch inference job size (in GB) for Ministral 3 14B Batch inference job size (in GB) for Ministral 3 14B storage | 5 count | Fixed |
Throttle rate limit for Bedrock Data Automation Runtime: ListTagsForResource Throttle rate limit for Bedrock Data Automation Runtime: ListTagsForResource throughput | 25 count | Fixed |
(Automated Reasoning) ListAutomatedReasoningPolicyTestCases requests per second (Automated Reasoning) ListAutomatedReasoningPolicyTestCases requests per second throughput | 5 count | Adjustable |
(Guardrails) On-demand ApplyGuardrail Denied topic policy text units per second (Guardrails) On-demand ApplyGuardrail Denied topic policy text units per second identity | 50 count | Adjustable |
(Data Automation) Maximum image file size (MB) (Data Automation) Maximum image file size (MB) storage | 5 count | Fixed |
Records per input file per batch inference job for Claude 3.7 Sonnet Records per input file per batch inference job for Claude 3.7 Sonnet storage | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Mistral Small Minimum number of records per batch inference job for Mistral Small general | 100 count | Fixed |
Batch inference job size (in GB) for Llama 3.2 1B Instruct Batch inference job size (in GB) for Llama 3.2 1B Instruct storage | 5 count | Fixed |
Records per input file per batch inference job for Nova Micro V1 Records per input file per batch inference job for Nova Micro V1 storage | 100,000 count | Adjustable |
On-demand model inference tokens per minute for OpenAI GPT OSS 20B On-demand model inference tokens per minute for OpenAI GPT OSS 20B general | 100,000,000 count | Fixed |
On-demand model inference requests per minute for Ministral 14B 3.0 On-demand model inference requests per minute for Ministral 14B 3.0 throughput | 10,000 count | Fixed |
(Evaluation) Number of datasets per job (Evaluation) Number of datasets per job general | 5 count | Fixed |
Cross-region model inference tokens per minute for Anthropic Claude 3 Haiku Cross-region model inference tokens per minute for Anthropic Claude 3 Haiku general | 4,000,000 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude Opus 4 V1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Opus 4 V1 (doubled for cross-region calls) general | 144,000,000 count | Fixed |
Records per batch inference job for OpenAI GPT OSS Safeguard 20b Records per batch inference job for OpenAI GPT OSS Safeguard 20b general | 100,000 count | Adjustable |
On-demand InvokeModel concurrent requests for Amazon Nova Reel1.1 On-demand InvokeModel concurrent requests for Amazon Nova Reel1.1 compute | 3 count | Fixed |
Model invocation max tokens per day for GPT OSS Safeguard 120B (doubled for cross-region calls) Model invocation max tokens per day for GPT OSS Safeguard 120B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
On-demand model inference requests per minute for Meta Llama 2 70B On-demand model inference requests per minute for Meta Llama 2 70B throughput | 400 count | Fixed |
Batch inference input file size (in GB) for GLM 4.7 Batch inference input file size (in GB) for GLM 4.7 storage | 1 count | Fixed |
Batch inference input file size (in GB) for Nova Micro V1 Batch inference input file size (in GB) for Nova Micro V1 storage | 1 count | Fixed |
Cross-region model inference requests per minute for Stable Image Inpaint Cross-region model inference requests per minute for Stable Image Inpaint throughput | 20 count | Fixed |
Model invocation max tokens per day for Mistral Large 3 (doubled for cross-region calls) Model invocation max tokens per day for Mistral Large 3 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Records per batch inference job for Claude 3 Sonnet Records per batch inference job for Claude 3 Sonnet general | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Qwen3 VL 235B A22B On-demand model inference tokens per minute for Qwen3 VL 235B A22B general | 100,000,000 count | Fixed |
Model units per provisioned model for Amazon Titan Text G1 - Express 8K Model units per provisioned model for Amazon Titan Text G1 - Express 8K general | 0 count | Adjustable |
Batch inference job size (in GB) for Titan Multimodal Embeddings G1 Batch inference job size (in GB) for Titan Multimodal Embeddings G1 storage | 5 count | Fixed |
(Evaluation) Number of metrics per dataset (Evaluation) Number of metrics per dataset general | 3 count | Fixed |
On-demand model inference requests per minute for Cohere Embed V4 On-demand model inference requests per minute for Cohere Embed V4 throughput | 1,000 count | Fixed |
Batch inference job size (in GB) for Claude Sonnet 4 Batch inference job size (in GB) for Claude Sonnet 4 storage | 5 count | Adjustable |
DeleteAgentActionGroup requests per second DeleteAgentActionGroup requests per second throughput | 2 count | Fixed |
(Knowledge Bases) Maximum number of files for BDA parser (Knowledge Bases) Maximum number of files for BDA parser storage | 1,000 count | Fixed |
(Knowledge Bases) ListDataSources requests per second (Knowledge Bases) ListDataSources requests per second throughput | 10 count | Fixed |
(Knowledge Bases) CreateKnowledgeBase requests per second (Knowledge Bases) CreateKnowledgeBase requests per second throughput | 2 count | Fixed |
Model invocation max tokens per day for OpenAI GPT OSS 20B (doubled for cross-region calls) Model invocation max tokens per day for OpenAI GPT OSS 20B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference job size (in GB) for Claude 3.5 Sonnet v2 Batch inference job size (in GB) for Claude 3.5 Sonnet v2 storage | 5 count | Fixed |
Model invocation max tokens per day for Moonshot AI Kimi K2.5 (doubled for cross-region calls) Model invocation max tokens per day for Moonshot AI Kimi K2.5 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Batch inference input file size (in GB) for Llama 4 Scout Batch inference input file size (in GB) for Llama 4 Scout storage | 1 count | Fixed |
Model invocation max tokens per day for Meta Llama 3.2 3B Instruct (doubled for cross-region calls) Model invocation max tokens per day for Meta Llama 3.2 3B Instruct (doubled for cross-region calls) general | 432,000,000 count | Fixed |
Batch inference input file size (in GB) for Claude 3.7 Sonnet Batch inference input file size (in GB) for Claude 3.7 Sonnet storage | 1 count | Adjustable |
(Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova Lite (Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova Lite general | 4,000,000 count | Fixed |
Records per input file per batch inference job for Ministral 3 8B Records per input file per batch inference job for Ministral 3 8B storage | 100,000 count | Adjustable |
Records per batch inference job for Magistral Small 2509 Records per batch inference job for Magistral Small 2509 general | 100,000 count | Adjustable |
Batch inference input file size (in GB) for OpenAI GPT OSS 120b Batch inference input file size (in GB) for OpenAI GPT OSS 120b storage | 1 count | Fixed |
Batch inference job size (in GB) for Llama 4 Scout Batch inference job size (in GB) for Llama 4 Scout storage | 5 count | Fixed |
Cross-Region model inference requests per minute for Anthropic Claude 3.5 Haiku Cross-Region model inference requests per minute for Anthropic Claude 3.5 Haiku throughput | 2,000 count | Fixed |
On-demand model inference tokens per minute for Moonshot AI Kimi K2.5 On-demand model inference tokens per minute for Moonshot AI Kimi K2.5 general | 100,000,000 count | Fixed |
On-Demand, latency-optimized model inference requests per minute for Amazon Nova Pro V1 On-Demand, latency-optimized model inference requests per minute for Amazon Nova Pro V1 throughput | 10 count | Fixed |
Throttle rate limit for Bedrock Data Automation Runtime: TagResource Throttle rate limit for Bedrock Data Automation Runtime: TagResource throughput | 25 count | Fixed |
Model units per provisioned model for Meta Llama 2 Chat 13B Model units per provisioned model for Meta Llama 2 Chat 13B general | 0 count | Adjustable |
Model units per provisioned model for Amazon Titan Image Generator G1 Model units per provisioned model for Amazon Titan Image Generator G1 general | 0 count | Adjustable |
On-demand model inference tokens per minute for Anthropic Claude 3.5 Haiku On-demand model inference tokens per minute for Anthropic Claude 3.5 Haiku general | 2,000,000 count | Fixed |
Records per batch inference job for Ministral 3 14B Records per batch inference job for Ministral 3 14B general | 100,000 count | Adjustable |
Records per batch inference job for OpenAI GPT OSS Safeguard 120b Records per batch inference job for OpenAI GPT OSS Safeguard 120b general | 100,000 count | Adjustable |
(Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova Pro (Model customization) Sum of on demand custom model deployment tokens per minute for Amazon Nova Pro general | 800,000 count | Fixed |
Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 general | 5,000,000 count | Adjustable |
(Guardrails) On-demand ApplyGuardrail Sensitive information filter policy text units per second (Guardrails) On-demand ApplyGuardrail Sensitive information filter policy text units per second identity | 200 count | Adjustable |
Minimum number of records per batch inference job for NVIDIA Nemotron Nano 12B Minimum number of records per batch inference job for NVIDIA Nemotron Nano 12B general | 100 count | Fixed |
On-demand model inference requests per minute for Amazon Nova 2 Multimodal Embeddings V1 On-demand model inference requests per minute for Amazon Nova 2 Multimodal Embeddings V1 throughput | 2,000 count | Fixed |
Minimum number of records per batch inference job for Voxtral Small 24B 2507 Minimum number of records per batch inference job for Voxtral Small 24B 2507 general | 100 count | Fixed |
(Data Automation) Maximum audio file size (MB) (Data Automation) Maximum audio file size (MB) storage | 2,048 count | Fixed |
(Knowledge Bases) UpdateKnowledgeBase requests per second (Knowledge Bases) UpdateKnowledgeBase requests per second throughput | 2 count | Fixed |
(Automated Reasoning) CancelAutomatedReasoningPolicyBuildWorkflow requests per second (Automated Reasoning) CancelAutomatedReasoningPolicyBuildWorkflow requests per second throughput | 5 count | Adjustable |
On-demand InvokeModel concurrent requests for Twelve Labs Pegasus On-demand InvokeModel concurrent requests for Twelve Labs Pegasus compute | 30 count | Adjustable |
Throttle rate limit for ListBlueprints Throttle rate limit for ListBlueprints throughput | 5 count | Fixed |
Minimum number of records per batch inference job for Gemma 3 27B Minimum number of records per batch inference job for Gemma 3 27B general | 100 count | Fixed |
Model units per provisioned model for the 24k context length variant for Amazon Nova Lite Model units per provisioned model for the 24k context length variant for Amazon Nova Lite general | 0 count | Adjustable |
Model invocation max tokens per day for Magistral Small 1.2 (doubled for cross-region calls) Model invocation max tokens per day for Magistral Small 1.2 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
On-demand model inference requests per minute for Voxtral Mini 1.0 On-demand model inference requests per minute for Voxtral Mini 1.0 throughput | 10,000 count | Fixed |
Parameters per function Parameters per function general | 5 count | Adjustable |
Minimum number of records per batch inference job for Kimi K2 Thinking Minimum number of records per batch inference job for Kimi K2 Thinking general | 100 count | Fixed |
On-demand model inference tokens per minute for Magistral Small 1.2 On-demand model inference tokens per minute for Magistral Small 1.2 general | 100,000,000 count | Fixed |
Throttle rate limit for Bedrock Data Automation: TagResource Throttle rate limit for Bedrock Data Automation: TagResource throughput | 25 count | Fixed |
On-demand model inference tokens per minute for Anthropic Claude 3 Opus On-demand model inference tokens per minute for Anthropic Claude 3 Opus general | 400,000 count | Fixed |
On-demand model inference tokens per minute for Mistral AI Mistral Large On-demand model inference tokens per minute for Mistral AI Mistral Large general | 300,000 count | Fixed |
On-demand model inference tokens per minute for Qwen3 Coder 30B a3b V1 On-demand model inference tokens per minute for Qwen3 Coder 30B a3b V1 general | 100,000,000 count | Fixed |
GetAgentVersion requests per second GetAgentVersion requests per second throughput | 10 count | Fixed |
Cross-region model inference requests per minute for DeepSeek R1 V1 Cross-region model inference requests per minute for DeepSeek R1 V1 throughput | 200 count | Fixed |
(Data Automation) Maximum Blueprints per Project (Audios) (Data Automation) Maximum Blueprints per Project (Audios) general | 1 count | Fixed |
Batch inference input file size (in GB) for Claude Opus 4.6 Batch inference input file size (in GB) for Claude Opus 4.6 storage | 1 count | Fixed |
Model invocation max tokens per day for NVIDIA Nemotron 3 Super 120B A12B (doubled for cross-region calls) Model invocation max tokens per day for NVIDIA Nemotron 3 Super 120B A12B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
(Data Automation) Maximum Number of pages per document (Data Automation) Maximum Number of pages per document general | 3,000 count | Fixed |
(Automated Reasoning) Values per type in policy (Automated Reasoning) Values per type in policy identity | 50 count | Fixed |
(Knowledge Bases) Concurrent ingestion jobs per data source (Knowledge Bases) Concurrent ingestion jobs per data source compute | 1 count | Fixed |
Batch inference job size (in GB) for Claude 3 Sonnet Batch inference job size (in GB) for Claude 3 Sonnet storage | 5 count | Fixed |
Cross-region model inference requests per minute for Stable Image Style Guide Cross-region model inference requests per minute for Stable Image Style Guide throughput | 20 count | Fixed |
(Knowledge Bases) Rerank requests per second (Knowledge Bases) Rerank requests per second throughput | 10 count | Fixed |
On-demand model inference requests per minute for Stable Image Style Guide On-demand model inference requests per minute for Stable Image Style Guide throughput | 10 count | Fixed |
Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4.5 V1 1M Context Length Global cross-region model inference tokens per day for Anthropic Claude Sonnet 4.5 V1 1M Context Length general | 1,440,000,000 count | Fixed |
Minimum number of records per batch inference job for Claude Opus 4.5 Minimum number of records per batch inference job for Claude Opus 4.5 general | 100 count | Fixed |
Model units per provisioned model for Anthropic Claude 3.5 Sonnet V2 200K Model units per provisioned model for Anthropic Claude 3.5 Sonnet V2 200K general | 0 count | Adjustable |
Cross-region model inference tokens per minute for Amazon Nova 2 Lite Cross-region model inference tokens per minute for Amazon Nova 2 Lite general | 8,000,000 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 3.2 1B Instruct On-demand model inference tokens per minute for Meta Llama 3.2 1B Instruct general | 300,000 count | Fixed |
Records per input file per batch inference job for Llama 3.3 70B Instruct Records per input file per batch inference job for Llama 3.3 70B Instruct storage | 100,000 count | Adjustable |
CreateAgentAlias requests per second CreateAgentAlias requests per second throughput | 2 count | Fixed |
DeleteAgentAlias requests per second DeleteAgentAlias requests per second throughput | 2 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for MiniMax M2 Sum of in-progress and submitted batch inference jobs using a base model for MiniMax M2 general | 100 count | Adjustable |
Batch inference job size (in GB) for Devstral 2 123B Batch inference job size (in GB) for Devstral 2 123B storage | 5 count | Fixed |
Batch inference input file size (in GB) for Claude Sonnet 4.5 Batch inference input file size (in GB) for Claude Sonnet 4.5 storage | 1 count | Fixed |
(Guardrails) On-demand ApplyGuardrail Word filter policy text units per second (Guardrails) On-demand ApplyGuardrail Word filter policy text units per second identity | 200 count | Adjustable |
On-demand model inference requests per minute for Z.ai GLM-4.7 On-demand model inference requests per minute for Z.ai GLM-4.7 throughput | 10,000 count | Fixed |
Minimum number of records per batch inference job for NVIDIA Nemotron Nano 9B Minimum number of records per batch inference job for NVIDIA Nemotron Nano 9B general | 100 count | Fixed |
Minimum number of records per batch inference job for Llama 3.1 70B Instruct Minimum number of records per batch inference job for Llama 3.1 70B Instruct general | 100 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron Nano 12B Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron Nano 12B general | 100 count | Adjustable |
Minimum number of records per batch inference job for OpenAI GPT OSS Safeguard 20b Minimum number of records per batch inference job for OpenAI GPT OSS Safeguard 20b general | 100 count | Fixed |
Minimum number of records per batch inference job for Voxtral Mini 3B 2507 Minimum number of records per batch inference job for Voxtral Mini 3B 2507 general | 100 count | Fixed |
Records per batch inference job for Ministral 3 8B Records per batch inference job for Ministral 3 8B general | 100,000 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude 3 Sonnet Cross-region model inference tokens per minute for Anthropic Claude 3 Sonnet general | 2,000,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3 Sonnet Sum of in-progress and submitted batch inference jobs using a base model for Claude 3 Sonnet general | 100 count | Adjustable |
Batch inference input file size (in GB) for MiniMax M2.1 Batch inference input file size (in GB) for MiniMax M2.1 storage | 1 count | Fixed |
Records per input file per batch inference job for OpenAI GPT OSS 120b Records per input file per batch inference job for OpenAI GPT OSS 120b storage | 100,000 count | Adjustable |
(Guardrails) On-demand ApplyGuardrail contextual grounding policy text units per second (Guardrails) On-demand ApplyGuardrail contextual grounding policy text units per second identity | 106 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 2 13B On-demand model inference requests per minute for Meta Llama 2 13B throughput | 800 count | Fixed |
Cross-region model inference requests per minute for Stable Image Remove Background Cross-region model inference requests per minute for Stable Image Remove Background throughput | 20 count | Fixed |
Model invocation max latency-optimized tokens per day for Amazon Nova Pro V1 Model invocation max latency-optimized tokens per day for Amazon Nova Pro V1 general | 57,600,000 count | Fixed |
Cross-region model inference tokens per minute for Meta Llama 3.3 70B Instruct Cross-region model inference tokens per minute for Meta Llama 3.3 70B Instruct general | 600,000 count | Adjustable |
(Data Automation) InvokeDataAutomationAsync - Image - Max number of concurrent jobs (Data Automation) InvokeDataAutomationAsync - Image - Max number of concurrent jobs compute | 20 count | Adjustable |
On-demand model inference requests per minute for Ministral 8B 3.0 On-demand model inference requests per minute for Ministral 8B 3.0 throughput | 10,000 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Opus 4.6 V1 Cross-region model inference requests per minute for Anthropic Claude Opus 4.6 V1 throughput | 10,000 count | Adjustable |
On-demand model inference tokens per minute for Kimi K2 Thinking On-demand model inference tokens per minute for Kimi K2 Thinking general | 100,000,000 count | Fixed |
On-demand model inference requests per minute for Stable Image Control Structure On-demand model inference requests per minute for Stable Image Control Structure throughput | 10 count | Fixed |
Model units per provisioned model for Cohere Command R Plus Model units per provisioned model for Cohere Command R Plus general | 0 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 3.2 11B Instruct On-demand model inference requests per minute for Meta Llama 3.2 11B Instruct throughput | 400 count | Fixed |
On-demand model inference tokens per minute for Qwen3 32B V1 On-demand model inference tokens per minute for Qwen3 32B V1 general | 100,000,000 count | Fixed |
Records per batch inference job for Llama 4 Maverick Records per batch inference job for Llama 4 Maverick general | 100,000 count | Adjustable |
(Flows) Lex nodes per flow (Flows) Lex nodes per flow capacity | 5 count | Fixed |
Minimum number of records per batch inference job for Gemma 3 12B Minimum number of records per batch inference job for Gemma 3 12B general | 100 count | Fixed |
Throttle rate limit for Bedrock Data Automation Runtime: UntagResource Throttle rate limit for Bedrock Data Automation Runtime: UntagResource throughput | 25 count | Fixed |
(Data Automation) Maximum number of Blueprints per Start Inference request (Videos) (Data Automation) Maximum number of Blueprints per Start Inference request (Videos) throughput | 1 count | Fixed |
(Automated Reasoning) CreateAutomatedReasoningPolicyVersion requests per second (Automated Reasoning) CreateAutomatedReasoningPolicyVersion requests per second throughput | 5 count | Adjustable |
Cross-region model inference requests per minute for Anthropic Claude 3 Sonnet Cross-region model inference requests per minute for Anthropic Claude 3 Sonnet throughput | 1,000 count | Fixed |
APIs per Agent APIs per Agent general | 11 count | Adjustable |
(Prompt management) DeletePrompt requests per second (Prompt management) DeletePrompt requests per second throughput | 2 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Sonnet 4.6 Cross-region model inference requests per minute for Anthropic Claude Sonnet 4.6 throughput | 10,000 count | Adjustable |
On-demand model inference tokens per minute for Cohere Embed V4 On-demand model inference tokens per minute for Cohere Embed V4 general | 150,000 count | Fixed |
Model invocation max tokens per day for Mistral AI Mistral Large (doubled for cross-region calls) Model invocation max tokens per day for Mistral AI Mistral Large (doubled for cross-region calls) general | 432,000,000 count | Fixed |
(Knowledge Bases) StartIngestionJob requests per second (Knowledge Bases) StartIngestionJob requests per second throughput | 0.1 count | Fixed |
(Data Automation) Maximum Resolution (Data Automation) Maximum Resolution general | 8,000 count | Fixed |
(Automated Reasoning) DeleteAutomatedReasoningPolicy requests per second (Automated Reasoning) DeleteAutomatedReasoningPolicy requests per second throughput | 5 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude Sonnet 4.5 V1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Sonnet 4.5 V1 (doubled for cross-region calls) general | 3,600,000,000 count | Fixed |
Records per batch inference job for Amazon Nova 2 Multimodal Embeddings V1 Records per batch inference job for Amazon Nova 2 Multimodal Embeddings V1 general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Stable Image Outpaint On-demand model inference requests per minute for Stable Image Outpaint throughput | 2 count | Fixed |
Minimum number of records per batch inference job for Claude Sonnet 4 Minimum number of records per batch inference job for Claude Sonnet 4 general | 100 count | Adjustable |
Imported models per account Imported models per account general | 3 count | Adjustable |
(Guardrails) Contextual grounding query length in text units (Guardrails) Contextual grounding query length in text units general | 1 count | Fixed |
On-demand model inference tokens per minute for Voxtral Small 1.0 On-demand model inference tokens per minute for Voxtral Small 1.0 general | 100,000,000 count | Fixed |
Cross-region model inference requests per minute for Stable Image Control Structure Cross-region model inference requests per minute for Stable Image Control Structure throughput | 20 count | Fixed |
On-demand model inference tokens per minute for Amazon Titan Text Lite On-demand model inference tokens per minute for Amazon Titan Text Lite general | 300,000 count | Fixed |
Batch inference job size (in GB) for Claude Opus 4.5 Batch inference job size (in GB) for Claude Opus 4.5 storage | 5 count | Fixed |
Cross-region model inference tokens per minute for Amazon Nova Lite Cross-region model inference tokens per minute for Amazon Nova Lite general | 8,000,000 count | Adjustable |
On-demand model inference requests per minute for Amazon Titan Text Express On-demand model inference requests per minute for Amazon Titan Text Express throughput | 400 count | Fixed |
Batch inference input file size (in GB) for Nova Lite V1 Batch inference input file size (in GB) for Nova Lite V1 storage | 1 count | Fixed |
Model units per provisioned model for Amazon Titan Lite V1 4K Model units per provisioned model for Amazon Titan Lite V1 4K general | 0 count | Adjustable |
(Data Automation) InvokeDataAutomation(Sync) - Document - Max number of requests (Data Automation) InvokeDataAutomation(Sync) - Document - Max number of requests throughput | 60 count | Adjustable |
(Flows) CreateFlow requests per second (Flows) CreateFlow requests per second throughput | 2 count | Fixed |
Batch inference input file size (in GB) for Qwen3 32B Batch inference input file size (in GB) for Qwen3 32B storage | 1 count | Fixed |
Model invocation max tokens per day for Voxtral Small 1.0 (doubled for cross-region calls) Model invocation max tokens per day for Voxtral Small 1.0 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Cross-region model inference requests per minute for Amazon Nova Micro Cross-region model inference requests per minute for Amazon Nova Micro throughput | 4,000 count | Fixed |
On-demand model inference requests per minute for Cohere Rerank 3.5 On-demand model inference requests per minute for Cohere Rerank 3.5 throughput | 250 count | Fixed |
No-commitment model units for Provisioned Throughput created for custom model Amazon Nova 2 Lite V1.0 256K No-commitment model units for Provisioned Throughput created for custom model Amazon Nova 2 Lite V1.0 256K general | 0 count | Fixed |
On-demand model inference tokens per minute for AI21 Labs Jurassic-2 Mid On-demand model inference tokens per minute for AI21 Labs Jurassic-2 Mid general | 300,000 count | Fixed |
Minimum number of records per batch inference job for DeepSeek V3.2 Minimum number of records per batch inference job for DeepSeek V3.2 general | 100 count | Fixed |
Records per input file per batch inference job for Claude Opus 4.5 Records per input file per batch inference job for Claude Opus 4.5 storage | 100,000 count | Adjustable |
(Model customization) Sum of training and validation records for a Titan Text G1 - Premier v1 Fine-tuning job (Model customization) Sum of training and validation records for a Titan Text G1 - Premier v1 Fine-tuning job general | 20,000 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Erase Object Cross-region model inference requests per minute for Stable Image Erase Object storage | 20 count | Fixed |
Throttle rate limit for InvokeDataAutomationAsync Throttle rate limit for InvokeDataAutomationAsync throughput | 10 count | Fixed |
(Automated Reasoning) Concurrent builds per policy (Automated Reasoning) Concurrent builds per policy compute | 2 count | Fixed |
(Flows) Iterator nodes per flow (Flows) Iterator nodes per flow capacity | 1 count | Fixed |
Records per batch inference job for Amazon Nova Premier Records per batch inference job for Amazon Nova Premier general | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Kimi K2.5 Sum of in-progress and submitted batch inference jobs using a base model for Kimi K2.5 general | 100 count | Adjustable |
Agents per account Agents per account general | 1,000 count | Adjustable |
Minimum number of records per batch inference job for Qwen3 VL 235B Minimum number of records per batch inference job for Qwen3 VL 235B general | 100 count | Fixed |
On-demand model inference tokens per minute for Voxtral Mini 1.0 On-demand model inference tokens per minute for Voxtral Mini 1.0 general | 100,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 VL 235B Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 VL 235B general | 100 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3.5 Sonnet 51K Model units per provisioned model for Anthropic Claude 3.5 Sonnet 51K general | 0 count | Adjustable |
Batch inference job size (in GB) for Gemma 3 4B Batch inference job size (in GB) for Gemma 3 4B storage | 5 count | Fixed |
Batch inference input file size (in GB) for GLM 4.7 Flash Batch inference input file size (in GB) for GLM 4.7 Flash storage | 1 count | Fixed |
Records per input file per batch inference job for Titan Text Embeddings V2 Records per input file per batch inference job for Titan Text Embeddings V2 storage | 100,000 count | Adjustable |
Global cross-region model inference tokens per day for Anthropic Claude Opus 4.5 Global cross-region model inference tokens per day for Anthropic Claude Opus 4.5 general | 2,880,000,000 count | Fixed |
Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 1M Context Length throughput | 1,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for MiniMax M2.5 Sum of in-progress and submitted batch inference jobs using a base model for MiniMax M2.5 general | 100 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3.5 Haiku 200K Model units per provisioned model for Anthropic Claude 3.5 Haiku 200K general | 0 count | Adjustable |
Batch inference input file size (in GB) for OpenAI GPT OSS Safeguard 120b Batch inference input file size (in GB) for OpenAI GPT OSS Safeguard 120b storage | 1 count | Fixed |
Cross-region model inference requests per minute for Anthropic Claude Sonnet 4 V1 1M Context Length Cross-region model inference requests per minute for Anthropic Claude Sonnet 4 V1 1M Context Length throughput | 5 count | Adjustable |
Minimum number of records per batch inference job for Llama 4 Scout Minimum number of records per batch inference job for Llama 4 Scout general | 100 count | Fixed |
Batch inference job size (in GB) for OpenAI GPT OSS 20b Batch inference job size (in GB) for OpenAI GPT OSS 20b storage | 5 count | Fixed |
Batch inference job size (in GB) for GLM 5 Batch inference job size (in GB) for GLM 5 storage | 5 count | Fixed |
On-demand model inference tokens per minute for Ministral 3B 3.0 On-demand model inference tokens per minute for Ministral 3B 3.0 general | 100,000,000 count | Fixed |
(Model customization) Sum of training and validation records for a Amazon Nova Lite Fine-tuning job (Model customization) Sum of training and validation records for a Amazon Nova Lite Fine-tuning job general | 20,000 count | Adjustable |
Batch inference job size (in GB) for Amazon Nova 2 Multimodal Embeddings V1 Batch inference job size (in GB) for Amazon Nova 2 Multimodal Embeddings V1 storage | 100 count | Fixed |
Records per input file per batch inference job for MiniMax M2 Records per input file per batch inference job for MiniMax M2 storage | 100,000 count | Adjustable |
Batch inference input file size (in GB) for Nova Pro V1 Batch inference input file size (in GB) for Nova Pro V1 storage | 1 count | Fixed |
Records per input file per batch inference job for Claude Sonnet 4.5 Records per input file per batch inference job for Claude Sonnet 4.5 storage | 100,000 count | Adjustable |
(Automated Reasoning) DeleteAutomatedReasoningPolicyTestCase requests per second (Automated Reasoning) DeleteAutomatedReasoningPolicyTestCase requests per second throughput | 5 count | Adjustable |
Batch inference job size (in GB) for Magistral Small 2509 Batch inference job size (in GB) for Magistral Small 2509 storage | 5 count | Fixed |
Cross-region model inference tokens per minute for Meta Llama 3.1 8B Instruct Cross-region model inference tokens per minute for Meta Llama 3.1 8B Instruct general | 600,000 count | Adjustable |
On-demand model inference tokens per minute for Amazon Titan Image Generator G1 V2 On-demand model inference tokens per minute for Amazon Titan Image Generator G1 V2 general | 2,000 count | Fixed |
(Data Automation) Maximum Blueprints per Project (Documents) (Data Automation) Maximum Blueprints per Project (Documents) general | 40 count | Fixed |
Minimum number of records per batch inference job for Titan Multimodal Embeddings G1 Minimum number of records per batch inference job for Titan Multimodal Embeddings G1 general | 100 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.1 8B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.1 8B Instruct general | 100 count | Adjustable |
Batch inference input file size (in GB) for Writer Palmyra Vision 7B Batch inference input file size (in GB) for Writer Palmyra Vision 7B storage | 1 count | Fixed |
Batch inference job size (in GB) for Voxtral Mini 3B 2507 Batch inference job size (in GB) for Voxtral Mini 3B 2507 storage | 5 count | Fixed |
Global cross-region model inference requests per minute for Anthropic Claude Haiku 4.5 Global cross-region model inference requests per minute for Anthropic Claude Haiku 4.5 throughput | 10,000 count | Adjustable |
(Guardrails) Example phrases per Topic (Guardrails) Example phrases per Topic general | 5 count | Fixed |
Batch inference input file size (in GB) for Gemma 3 12B Batch inference input file size (in GB) for Gemma 3 12B storage | 1 count | Fixed |
DeleteAgent requests per second DeleteAgent requests per second throughput | 2 count | Fixed |
(Knowledge Bases) GenerateQuery requests per second (Knowledge Bases) GenerateQuery requests per second throughput | 2 count | Fixed |
Minimum number of records per batch inference job for Llama 3.2 1B Instruct Minimum number of records per batch inference job for Llama 3.2 1B Instruct general | 100 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.3 70B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.3 70B Instruct general | 100 count | Adjustable |
Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.6 Global cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.6 general | 6,000,000 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 Cross-region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1 general | 5,000,000 count | Adjustable |
Batch inference input file size (in GB) for Llama 3.2 11B Instruct Batch inference input file size (in GB) for Llama 3.2 11B Instruct storage | 1 count | Fixed |
Records per batch inference job for GLM 4.7 Records per batch inference job for GLM 4.7 general | 100,000 count | Adjustable |
On-demand InvokeModel async concurrent requests for Amazon Nova 2 Multimodal Embeddings V1 On-demand InvokeModel async concurrent requests for Amazon Nova 2 Multimodal Embeddings V1 compute | 30 count | Fixed |
(Automated Reasoning) UpdateAutomatedReasoningPolicyTestCase requests per second (Automated Reasoning) UpdateAutomatedReasoningPolicyTestCase requests per second throughput | 5 count | Adjustable |
Records per input file per batch inference job for Gemma 3 4B Records per input file per batch inference job for Gemma 3 4B storage | 100,000 count | Adjustable |
Cross-region model inference tokens per minute for Meta Llama 4 Scout V1 Cross-region model inference tokens per minute for Meta Llama 4 Scout V1 general | 600,000 count | Adjustable |
On-demand model inference requests per minute for Gemma 3 4B On-demand model inference requests per minute for Gemma 3 4B throughput | 10,000 count | Fixed |
Batch inference job size (in GB) for Claude Sonnet 4.6 Batch inference job size (in GB) for Claude Sonnet 4.6 storage | 5 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Writer Palmyra Vision 7B Sum of in-progress and submitted batch inference jobs using a base model for Writer Palmyra Vision 7B general | 100 count | Adjustable |
(Guardrails) Topics per guardrail (Guardrails) Topics per guardrail general | 30 count | Fixed |
Batch inference job size (in GB) for DeepSeek V3.2 Batch inference job size (in GB) for DeepSeek V3.2 storage | 5 count | Fixed |
On-demand model inference requests per minute for Cohere Command R Plus On-demand model inference requests per minute for Cohere Command R Plus throughput | 400 count | Fixed |
(Data Automation) Maximum video length (Minutes) (Data Automation) Maximum video length (Minutes) general | 240 count | Fixed |
Cross-region model inference requests per minute for Meta Llama 3.1 70B Instruct Cross-region model inference requests per minute for Meta Llama 3.1 70B Instruct throughput | 800 count | Fixed |
Records per batch inference job for Voxtral Small 24B 2507 Records per batch inference job for Voxtral Small 24B 2507 general | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Mistral Large 3 Sum of in-progress and submitted batch inference jobs using a base model for Mistral Large 3 general | 100 count | Adjustable |
(Guardrails) Guardrails per account (Guardrails) Guardrails per account general | 100 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude Opus 4.6 Sum of in-progress and submitted batch inference jobs using a base model for Claude Opus 4.6 general | 100 count | Adjustable |
Records per input file per batch inference job for Claude Opus 4.6 Records per input file per batch inference job for Claude Opus 4.6 storage | 100,000 count | Adjustable |
Global cross-region model inference requests per minute for Anthropic Claude Opus 4.5 Global cross-region model inference requests per minute for Anthropic Claude Opus 4.5 throughput | 10,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 1B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 1B Instruct general | 100 count | Adjustable |
Records per batch inference job for Titan Multimodal Embeddings G1 Records per batch inference job for Titan Multimodal Embeddings G1 general | 100,000 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Control Sketch Cross-region model inference requests per minute for Stable Image Control Sketch throughput | 20 count | Fixed |
On-demand model inference requests per minute for Voxtral Small 1.0 On-demand model inference requests per minute for Voxtral Small 1.0 throughput | 10,000 count | Fixed |
(Knowledge Bases) DeleteDataSource requests per second (Knowledge Bases) DeleteDataSource requests per second throughput | 2 count | Fixed |
Model invocation max tokens per day for Qwen3 32B V1 (doubled for cross-region calls) Model invocation max tokens per day for Qwen3 32B V1 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
(Knowledge Bases) ListIngestionJobs requests per second (Knowledge Bases) ListIngestionJobs requests per second throughput | 10 count | Fixed |
Records per input file per batch inference job for Llama 3.2 11B Instruct Records per input file per batch inference job for Llama 3.2 11B Instruct storage | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Qwen3 Coder 30B Minimum number of records per batch inference job for Qwen3 Coder 30B general | 100 count | Fixed |
Records per input file per batch inference job for MiniMax M2.1 Records per input file per batch inference job for MiniMax M2.1 storage | 100,000 count | Adjustable |
GetAgentAlias requests per second GetAgentAlias requests per second throughput | 10 count | Fixed |
Cross-region model inference tokens per minute for Cohere Embed V4 Cross-region model inference tokens per minute for Cohere Embed V4 general | 300,000 count | Adjustable |
Records per batch inference job for Mistral Large 2 (24.07) Records per batch inference job for Mistral Large 2 (24.07) general | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Meta Llama 2 70B On-demand model inference tokens per minute for Meta Llama 2 70B general | 300,000 count | Fixed |
(Flows) Input nodes per flow (Flows) Input nodes per flow capacity | 1 count | Fixed |
On-demand model inference requests per minute for Gemma 3 12B On-demand model inference requests per minute for Gemma 3 12B throughput | 10,000 count | Fixed |
Minimum number of records per batch inference job for OpenAI GPT OSS 20b Minimum number of records per batch inference job for OpenAI GPT OSS 20b general | 100 count | Fixed |
On-demand model inference requests per minute for Cohere Command R On-demand model inference requests per minute for Cohere Command R throughput | 400 count | Fixed |
Records per batch inference job for Ministral 3B Records per batch inference job for Ministral 3B general | 100,000 count | Adjustable |
(Evaluation) Number of models in automated model evaluation job (Evaluation) Number of models in automated model evaluation job general | 1 count | Fixed |
On-demand model inference requests per minute for NVIDIA Nemotron Nano 2 VL On-demand model inference requests per minute for NVIDIA Nemotron Nano 2 VL throughput | 10,000 count | Fixed |
Records per input file per batch inference job for Claude 3 Haiku Records per input file per batch inference job for Claude 3 Haiku storage | 100,000 count | Adjustable |
Batch inference input file size (in GB) for Llama 3.2 90B Instruct Batch inference input file size (in GB) for Llama 3.2 90B Instruct storage | 1 count | Fixed |
On-demand model inference requests per minute for Mistral Mixtral 8x7b Instruct On-demand model inference requests per minute for Mistral Mixtral 8x7b Instruct throughput | 400 count | Fixed |
(Flows) Knowledge base nodes per flow (Flows) Knowledge base nodes per flow capacity | 20 count | Fixed |
Records per input file per batch inference job for Writer Palmyra Vision 7B Records per input file per batch inference job for Writer Palmyra Vision 7B storage | 100,000 count | Adjustable |
Model invocation max tokens per day for Qwen3 VL 235B A22B (doubled for cross-region calls) Model invocation max tokens per day for Qwen3 VL 235B A22B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Model invocation max tokens per day for Anthropic Claude 3.5 Sonnet V2 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude 3.5 Sonnet V2 (doubled for cross-region calls) general | 2,880,000,000 count | Fixed |
(Knowledge Bases) IngestKnowledgeBaseDocuments total payload size (Knowledge Bases) IngestKnowledgeBaseDocuments total payload size storage | 6 count | Fixed |
On-demand model inference tokens per minute for Mistral AI Mistral 7B Instruct On-demand model inference tokens per minute for Mistral AI Mistral 7B Instruct general | 300,000 count | Fixed |
Records per input file per batch inference job for Mistral Small Records per input file per batch inference job for Mistral Small storage | 100,000 count | Adjustable |
Batch inference input file size (in GB) for Qwen3 Coder Next Batch inference input file size (in GB) for Qwen3 Coder Next storage | 1 count | Fixed |
Batch inference input file size (in GB) for Ministral 3B Batch inference input file size (in GB) for Ministral 3B storage | 1 count | Fixed |
Associated aliases per Agent Associated aliases per Agent general | 10 count | Fixed |
Records per batch inference job for OpenAI GPT OSS 20b Records per batch inference job for OpenAI GPT OSS 20b general | 100,000 count | Adjustable |
(Flows) Lambda function nodes per flow (Flows) Lambda function nodes per flow capacity | 20 count | Fixed |
(Prompt management) CreatePrompt requests per second (Prompt management) CreatePrompt requests per second throughput | 2 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron Nano 3 30B Sum of in-progress and submitted batch inference jobs using a base model for NVIDIA Nemotron Nano 3 30B general | 100 count | Adjustable |
Records per input file per batch inference job for Claude Sonnet 4.6 Records per input file per batch inference job for Claude Sonnet 4.6 storage | 100,000 count | Adjustable |
On-demand model inference requests per minute for Amazon Nova Lite On-demand model inference requests per minute for Amazon Nova Lite throughput | 2,000 count | Fixed |
On-demand model inference tokens per minute for Gemma 3 27B On-demand model inference tokens per minute for Gemma 3 27B general | 100,000,000 count | Fixed |
Model invocation max tokens per day for Anthropic Claude Sonnet 4.6 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Sonnet 4.6 (doubled for cross-region calls) general | 4,320,000,000 count | Fixed |
(Knowledge Bases) Retrieve requests per second (Knowledge Bases) Retrieve requests per second throughput | 20 count | Fixed |
On-demand model inference requests per minute for Stable Image Remove Background On-demand model inference requests per minute for Stable Image Remove Background throughput | 10 count | Fixed |
Model units per provisioned model for Cohere Embed Multilingual Model units per provisioned model for Cohere Embed Multilingual general | 0 count | Adjustable |
Records per batch inference job for Claude Opus 4.6 Records per batch inference job for Claude Opus 4.6 general | 100,000 count | Adjustable |
On-demand model inference requests per minute for Meta Llama 3.1 8B Instruct On-demand model inference requests per minute for Meta Llama 3.1 8B Instruct throughput | 800 count | Fixed |
On-demand model inference requests per minute for Cohere Embed Multilingual On-demand model inference requests per minute for Cohere Embed Multilingual throughput | 2,000 count | Fixed |
(Evaluation) Number of concurrent model evaluation jobs that use human workers (Evaluation) Number of concurrent model evaluation jobs that use human workers compute | 10 count | Fixed |
On-demand model inference tokens per minute for Amazon Titan Text Embeddings V2 On-demand model inference tokens per minute for Amazon Titan Text Embeddings V2 general | 300,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS 120b Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS 120b general | 100 count | Adjustable |
(Flows) UpdateFlow requests per second (Flows) UpdateFlow requests per second throughput | 2 count | Fixed |
On-demand model inference requests per minute for Stable Image Erase Object On-demand model inference requests per minute for Stable Image Erase Object storage | 10 count | Fixed |
(Data Automation) Maximum number of list fields per Blueprint (Data Automation) Maximum number of list fields per Blueprint general | 15 count | Fixed |
Model units no-commitment Provisioned Throughputs across custom models Model units no-commitment Provisioned Throughputs across custom models general | 0 count | Adjustable |
Records per batch inference job for Nova Pro V1 Records per batch inference job for Nova Pro V1 general | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Claude 3.5 Sonnet v2 Minimum number of records per batch inference job for Claude 3.5 Sonnet v2 general | 100 count | Fixed |
(Evaluation) Size of prompt (Evaluation) Size of prompt storage | 4 count | Fixed |
Global cross-region model inference requests per minute for Anthropic Claude Opus 4.6 V1 Global cross-region model inference requests per minute for Anthropic Claude Opus 4.6 V1 throughput | 10,000 count | Adjustable |
(Evaluation) Task time for workers (Evaluation) Task time for workers general | 30 count | Fixed |
Throttle rate limit for GetBlueprint Throttle rate limit for GetBlueprint throughput | 5 count | Fixed |
Batch inference input file size (in GB) for Llama 3.1 70B Instruct Batch inference input file size (in GB) for Llama 3.1 70B Instruct storage | 1 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for GLM 5 Sum of in-progress and submitted batch inference jobs using a base model for GLM 5 general | 100 count | Adjustable |
On-demand model inference requests per minute for Amazon Nova Micro On-demand model inference requests per minute for Amazon Nova Micro throughput | 2,000 count | Fixed |
On-demand model inference requests per minute for Amazon Titan Multimodal Embeddings G1 On-demand model inference requests per minute for Amazon Titan Multimodal Embeddings G1 throughput | 2,000 count | Fixed |
Model invocation max tokens per day for Anthropic Claude 3.7 Sonnet V1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude 3.7 Sonnet V1 (doubled for cross-region calls) general | 720,000,000 count | Fixed |
Records per batch inference job for Mistral Large 3 Records per batch inference job for Mistral Large 3 general | 100,000 count | Adjustable |
Batch inference job size (in GB) for Kimi K2.5 Batch inference job size (in GB) for Kimi K2.5 storage | 5 count | Fixed |
Endpoints per inference profile Endpoints per inference profile storage | 5 count | Fixed |
Action groups per Agent Action groups per Agent general | 20 count | Adjustable |
Model units per provisioned model for Anthropic Claude V2.1 18K Model units per provisioned model for Anthropic Claude V2.1 18K general | 0 count | Adjustable |
Records per batch inference job for Gemma 3 12B Records per batch inference job for Gemma 3 12B general | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Writer Palmyra Vision 7B Minimum number of records per batch inference job for Writer Palmyra Vision 7B general | 100 count | Fixed |
Cross-region model inference requests per minute for Amazon Nova 2 Pro Preview Cross-region model inference requests per minute for Amazon Nova 2 Pro Preview throughput | 100 count | Fixed |
On-demand model inference tokens per minute for Meta Llama 2 Chat 13B On-demand model inference tokens per minute for Meta Llama 2 Chat 13B general | 300,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.1 70B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.1 70B Instruct general | 100 count | Adjustable |
Records per input file per batch inference job for Amazon Nova 2 Multimodal Embeddings V1 Records per input file per batch inference job for Amazon Nova 2 Multimodal Embeddings V1 storage | 100,000 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3 Sonnet 200K Model units per provisioned model for Anthropic Claude 3 Sonnet 200K general | 0 count | Adjustable |
Model invocation max tokens per day for Cohere Embed V4 (doubled for cross-region calls) Model invocation max tokens per day for Cohere Embed V4 (doubled for cross-region calls) general | 216,000,000 count | Fixed |
(Data Automation) Description length for fields (Characters) (Data Automation) Description length for fields (Characters) general | 300 count | Fixed |
On-demand model inference tokens per minute for NVIDIA Nemotron Nano 2 On-demand model inference tokens per minute for NVIDIA Nemotron Nano 2 general | 100,000,000 count | Fixed |
Throttle rate limit for GetDataAutomationStatus Throttle rate limit for GetDataAutomationStatus throughput | 10 count | Fixed |
(Data Automation) InvokeDataAutomationAsync - Video - Max number of concurrent jobs (Data Automation) InvokeDataAutomationAsync - Video - Max number of concurrent jobs compute | 20 count | Adjustable |
Records per input file per batch inference job for Qwen3 32B Records per input file per batch inference job for Qwen3 32B storage | 100,000 count | Adjustable |
On-demand model inference requests per minute for Z.ai GLM 5 On-demand model inference requests per minute for Z.ai GLM 5 throughput | 10,000 count | Fixed |
Records per batch inference job for Llama 3.2 3B Instruct Records per batch inference job for Llama 3.2 3B Instruct general | 100,000 count | Adjustable |
Model units per provisioned model for Anthropic Claude Instant V1 100K Model units per provisioned model for Anthropic Claude Instant V1 100K general | 0 count | Adjustable |
On-demand model inference requests per minute for Cohere Embed English On-demand model inference requests per minute for Cohere Embed English throughput | 2,000 count | Fixed |
Records per batch inference job for Mistral Small Records per batch inference job for Mistral Small general | 100,000 count | Adjustable |
(Knowledge Bases) Ingestion job size (Knowledge Bases) Ingestion job size storage | 100 count | Fixed |
On-demand model inference requests per minute for Meta Llama 3.2 1B Instruct On-demand model inference requests per minute for Meta Llama 3.2 1B Instruct throughput | 800 count | Fixed |
Records per input file per batch inference job for Llama 3.2 3B Instruct Records per input file per batch inference job for Llama 3.2 3B Instruct storage | 100,000 count | Adjustable |
Records per batch inference job for Claude 3 Opus Records per batch inference job for Claude 3 Opus general | 100,000 count | Adjustable |
Minimum number of records per batch inference job for MiniMax M2.5 Minimum number of records per batch inference job for MiniMax M2.5 general | 100 count | Fixed |
Global cross-region model inference tokens per day for Amazon Nova 2 Lite Global cross-region model inference tokens per day for Amazon Nova 2 Lite general | 11,520,000,000 count | Fixed |
(Knowledge Bases) CreateDataSource requests per second (Knowledge Bases) CreateDataSource requests per second throughput | 2 count | Fixed |
On-demand model inference tokens per minute for Meta Llama 3.2 90B Instruct On-demand model inference tokens per minute for Meta Llama 3.2 90B Instruct general | 300,000 count | Fixed |
On-demand model inference requests per minute for Meta Llama 3.2 3B Instruct On-demand model inference requests per minute for Meta Llama 3.2 3B Instruct throughput | 800 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS Safeguard 20b Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS Safeguard 20b general | 100 count | Adjustable |
Batch inference job size (in GB) for Qwen3 Coder Next Batch inference job size (in GB) for Qwen3 Coder Next storage | 5 count | Fixed |
On-demand model inference tokens per minute for NVIDIA Nemotron 3 Super 120B A12B On-demand model inference tokens per minute for NVIDIA Nemotron 3 Super 120B A12B general | 100,000,000 count | Fixed |
(Data Automation) CreateBlueprint - Max number of blueprints per account (Data Automation) CreateBlueprint - Max number of blueprints per account general | 350 count | Adjustable |
Cross-region model inference requests per minute for Stable Image Search and Replace Cross-region model inference requests per minute for Stable Image Search and Replace throughput | 20 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.5 Haiku Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.5 Haiku general | 100 count | Adjustable |
Batch inference job size (in GB) for OpenAI GPT OSS Safeguard 120b Batch inference job size (in GB) for OpenAI GPT OSS Safeguard 120b storage | 5 count | Fixed |
Batch inference job size (in GB) for Titan Text Embeddings V2 Batch inference job size (in GB) for Titan Text Embeddings V2 storage | 5 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude Haiku 4.5 Sum of in-progress and submitted batch inference jobs using a base model for Claude Haiku 4.5 general | 100 count | Adjustable |
Cross-region model inference tokens per minute for Anthropic Claude 3.5 Sonnet Cross-region model inference tokens per minute for Anthropic Claude 3.5 Sonnet general | 800,000 count | Adjustable |
Batch inference input file size (in GB) for NVIDIA Nemotron Nano 3 30B Batch inference input file size (in GB) for NVIDIA Nemotron Nano 3 30B storage | 1 count | Fixed |
Global cross-region model inference tokens per minute for Anthropic Claude Opus 4.5 Global cross-region model inference tokens per minute for Anthropic Claude Opus 4.5 general | 2,000,000 count | Adjustable |
On-demand model inference requests per minute for Mistral Large On-demand model inference requests per minute for Mistral Large throughput | 400 count | Fixed |
Throttle rate limit for DeleteDataAutomationProject Throttle rate limit for DeleteDataAutomationProject throughput | 5 count | Fixed |
Batch inference job size (in GB) for MiniMax M2.5 Batch inference job size (in GB) for MiniMax M2.5 storage | 5 count | Fixed |
Batch inference input file size (in GB) for Amazon Nova Premier Batch inference input file size (in GB) for Amazon Nova Premier storage | 1 count | Fixed |
(Flows) Output nodes per flow (Flows) Output nodes per flow capacity | 20 count | Fixed |
Model units, with commitment, for Provisioned Throughout created for Meta Llama 4 Scout 17B Instruct 10M Model units, with commitment, for Provisioned Throughout created for Meta Llama 4 Scout 17B Instruct 10M general | 0 count | Adjustable |
Batch inference job size (in GB) for Qwen3 Coder 30B Batch inference job size (in GB) for Qwen3 Coder 30B storage | 5 count | Fixed |
Batch inference input file size (in GB) for Llama 3.3 70B Instruct Batch inference input file size (in GB) for Llama 3.3 70B Instruct storage | 1 count | Fixed |
Batch inference job size (in GB) for Voxtral Small 24B 2507 Batch inference job size (in GB) for Voxtral Small 24B 2507 storage | 5 count | Fixed |
(Model customization) Sum of training and validation records for a Titan Image Generator G1 V1 Fine-tuning job (Model customization) Sum of training and validation records for a Titan Image Generator G1 V1 Fine-tuning job general | 10,000 count | Adjustable |
(Automated Reasoning) Rules in policy (Automated Reasoning) Rules in policy identity | 500 count | Fixed |
Minimum number of records per batch inference job for Amazon Nova Premier Minimum number of records per batch inference job for Amazon Nova Premier general | 100 count | Fixed |
On-demand model inference requests per minute for Amazon Rerank 1.0 On-demand model inference requests per minute for Amazon Rerank 1.0 throughput | 200 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 32B Sum of in-progress and submitted batch inference jobs using a base model for Qwen3 32B general | 100 count | Adjustable |
Model invocation max tokens per day for Gemma 3 12B (doubled for cross-region calls) Model invocation max tokens per day for Gemma 3 12B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
DeleteAgentVersion requests per second DeleteAgentVersion requests per second throughput | 2 count | Fixed |
Batch inference input file size (in GB) for MiniMax M2.5 Batch inference input file size (in GB) for MiniMax M2.5 storage | 1 count | Fixed |
On-demand model inference tokens per minute for Gemma 3 4B On-demand model inference tokens per minute for Gemma 3 4B general | 100,000,000 count | Fixed |
Minimum number of records per batch inference job for Llama 3.2 90B Instruct Minimum number of records per batch inference job for Llama 3.2 90B Instruct general | 100 count | Fixed |
Records per batch inference job for OpenAI GPT OSS 120b Records per batch inference job for OpenAI GPT OSS 120b general | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Ministral 3 14B Minimum number of records per batch inference job for Ministral 3 14B general | 100 count | Fixed |
(Prompt management) Prompts per account (Prompt management) Prompts per account general | 500 count | Adjustable |
(Data Automation) Maximum Audio Sample Rate (Hz) (Data Automation) Maximum Audio Sample Rate (Hz) throughput | 48,000 count | Fixed |
(Automated Reasoning) GetAutomatedReasoningPolicyBuildWorkflow requests per second (Automated Reasoning) GetAutomatedReasoningPolicyBuildWorkflow requests per second throughput | 10 count | Adjustable |
On-demand model inference tokens per minute for Cohere Embed Multilingual On-demand model inference tokens per minute for Cohere Embed Multilingual general | 300,000 count | Fixed |
Cross-region model inference tokens per minute for DeepSeek R1 V1 Cross-region model inference tokens per minute for DeepSeek R1 V1 general | 200,000 count | Adjustable |
On-demand model inference requests per minute for Stability.ai Stable Diffusion XL 1.0 On-demand model inference requests per minute for Stability.ai Stable Diffusion XL 1.0 throughput | 60 count | Fixed |
On-demand model inference requests per minute for TwelveLabs Marengo Embed 3.0 On-demand model inference requests per minute for TwelveLabs Marengo Embed 3.0 throughput | 500 count | Adjustable |
Records per input file per batch inference job for GLM 4.7 Flash Records per input file per batch inference job for GLM 4.7 Flash storage | 100,000 count | Adjustable |
On-demand model inference requests per minute for AI21 Labs Jurassic-2 Ultra On-demand model inference requests per minute for AI21 Labs Jurassic-2 Ultra throughput | 100 count | Fixed |
Records per batch inference job for Devstral 2 123B Records per batch inference job for Devstral 2 123B general | 100,000 count | Adjustable |
On-demand InvokeModel async concurrent requests for TwelveLabs Marengo Embed 3.0 On-demand InvokeModel async concurrent requests for TwelveLabs Marengo Embed 3.0 compute | 10 count | Adjustable |
(Model customization) Sum of training and validation records for a Titan Text G1 - Express v1 Fine-tuning job (Model customization) Sum of training and validation records for a Titan Text G1 - Express v1 Fine-tuning job general | 10,000 count | Adjustable |
Minimum number of records per batch inference job for Llama 3.1 405B Instruct Minimum number of records per batch inference job for Llama 3.1 405B Instruct general | 100 count | Fixed |
Records per batch inference job for Llama 4 Scout Records per batch inference job for Llama 4 Scout general | 100,000 count | Adjustable |
(Guardrails) Contextual grounding source length in text units (Guardrails) Contextual grounding source length in text units general | 100 count | Fixed |
On-demand model inference tokens per minute for AI21 Labs Jamba 1.5 Mini On-demand model inference tokens per minute for AI21 Labs Jamba 1.5 Mini general | 300,000 count | Fixed |
Records per input file per batch inference job for NVIDIA Nemotron Nano 12B Records per input file per batch inference job for NVIDIA Nemotron Nano 12B storage | 100,000 count | Adjustable |
Model units per provisioned model for the 24k context length variant for Amazon Nova Micro Model units per provisioned model for the 24k context length variant for Amazon Nova Micro general | 0 count | Adjustable |
Cross-Region model inference tokens per minute for Anthropic Claude 3.5 Haiku Cross-Region model inference tokens per minute for Anthropic Claude 3.5 Haiku general | 4,000,000 count | Adjustable |
Records per input file per batch inference job for Claude 3 Opus Records per input file per batch inference job for Claude 3 Opus storage | 100,000 count | Adjustable |
Batch inference job size (in GB) for Qwen3 VL 235B Batch inference job size (in GB) for Qwen3 VL 235B storage | 5 count | Fixed |
Records per input file per batch inference job for Gemma 3 12B Records per input file per batch inference job for Gemma 3 12B storage | 100,000 count | Adjustable |
On-demand model inference tokens per minute for Nemotron Nano 3 30B On-demand model inference tokens per minute for Nemotron Nano 3 30B general | 100,000,000 count | Fixed |
Minimum number of records per batch inference job for Mistral Large 3 Minimum number of records per batch inference job for Mistral Large 3 general | 100 count | Fixed |
Minimum number of records per batch inference job for Claude Haiku 4.5 Minimum number of records per batch inference job for Claude Haiku 4.5 general | 100 count | Fixed |
On-demand model inference requests per minute for Stable Image Style Transfer On-demand model inference requests per minute for Stable Image Style Transfer throughput | 10 count | Fixed |
Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 Global cross-region model inference requests per minute for Anthropic Claude Sonnet 4.5 V1 throughput | 10,000 count | Adjustable |
(Data Automation) (Console) Maximum document file size (MB) (Data Automation) (Console) Maximum document file size (MB) storage | 200 count | Fixed |
Model units per provisioned model for the 300k context length variant for Amazon Nova Pro Model units per provisioned model for the 300k context length variant for Amazon Nova Pro general | 0 count | Adjustable |
On-demand model inference requests per minute for Qwen3 Coder 30B a3b V1 On-demand model inference requests per minute for Qwen3 Coder 30B a3b V1 throughput | 10,000 count | Fixed |
On-demand model inference requests per minute for Anthropic Claude 3 Haiku On-demand model inference requests per minute for Anthropic Claude 3 Haiku throughput | 1,000 count | Fixed |
Records per input file per batch inference job for MiniMax M2.5 Records per input file per batch inference job for MiniMax M2.5 storage | 100,000 count | Adjustable |
(Model customization) Sum of training and validation records for a Amazon Nova 2 Lite Fine-tuning job (Model customization) Sum of training and validation records for a Amazon Nova 2 Lite Fine-tuning job general | 20,000 count | Adjustable |
GetAgentKnowledgeBase requests per second GetAgentKnowledgeBase requests per second throughput | 15 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS Safeguard 120b Sum of in-progress and submitted batch inference jobs using a base model for OpenAI GPT OSS Safeguard 120b general | 100 count | Adjustable |
(Data Automation) (Console) Maximum number of pages per document file (Data Automation) (Console) Maximum number of pages per document file storage | 20 count | Fixed |
Records per batch inference job for Llama 3.2 90B Instruct Records per batch inference job for Llama 3.2 90B Instruct general | 100,000 count | Adjustable |
Model invocation max tokens per day for Amazon Nova Micro (doubled for cross-region calls) Model invocation max tokens per day for Amazon Nova Micro (doubled for cross-region calls) general | 5,760,000,000 count | Fixed |
Throttle rate limit for CreateBlueprintVersion Throttle rate limit for CreateBlueprintVersion throughput | 5 count | Fixed |
Model invocation max tokens per day for OpenAI GPT OSS 120B (doubled for cross-region calls) Model invocation max tokens per day for OpenAI GPT OSS 120B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Model units per provisioned model for Anthropic Claude 3.5 Sonnet V2 18K Model units per provisioned model for Anthropic Claude 3.5 Sonnet V2 18K general | 0 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Claude Sonnet 4.5. Sum of in-progress and submitted batch inference jobs using a base model for Claude Sonnet 4.5. general | 100 count | Adjustable |
Minimum number of records per batch inference job for Kimi K2.5 Minimum number of records per batch inference job for Kimi K2.5 general | 100 count | Fixed |
(Flows) CreateFlowAlias requests per second (Flows) CreateFlowAlias requests per second throughput | 2 count | Fixed |
Model units per provisioned model for AI21 Labs Jurassic-2 Mid Model units per provisioned model for AI21 Labs Jurassic-2 Mid general | 0 count | Adjustable |
Batch inference input file size (in GB) for NVIDIA Nemotron Nano 9B Batch inference input file size (in GB) for NVIDIA Nemotron Nano 9B storage | 1 count | Fixed |
Records per input file per batch inference job for Devstral 2 123B Records per input file per batch inference job for Devstral 2 123B storage | 100,000 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 11B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 11B Instruct general | 100 count | Adjustable |
On-demand model inference tokens per minute for Anthropic Claude 3 Haiku On-demand model inference tokens per minute for Anthropic Claude 3 Haiku general | 2,000,000 count | Fixed |
(Data Automation) Maximum Blueprints per Project (Images) (Data Automation) Maximum Blueprints per Project (Images) general | 1 count | Fixed |
Global cross-region model inference tokens per day for Amazon Nova 2 Omni Global cross-region model inference tokens per day for Amazon Nova 2 Omni general | 11,520,000,000 count | Fixed |
Records per input file per batch inference job for NVIDIA Nemotron Nano 9B Records per input file per batch inference job for NVIDIA Nemotron Nano 9B storage | 100,000 count | Adjustable |
Agent Collaborators per Agent Agent Collaborators per Agent general | 1,000 count | Adjustable |
UpdateAgentActionGroup requests per second UpdateAgentActionGroup requests per second throughput | 6 count | Fixed |
(Automated Reasoning) GetAutomatedReasoningPolicyBuildWorkflowResultAssets requests per second (Automated Reasoning) GetAutomatedReasoningPolicyBuildWorkflowResultAssets requests per second throughput | 10 count | Adjustable |
Model units, with commitment, for Provisioned Throughout created for Meta Llama 4 Scout 17B Instruct 128K Model units, with commitment, for Provisioned Throughout created for Meta Llama 4 Scout 17B Instruct 128K general | 0 count | Adjustable |
Sum of in-progress and submitted batch inference jobs using a base model for GLM 4.7 Sum of in-progress and submitted batch inference jobs using a base model for GLM 4.7 general | 100 count | Adjustable |
(Automated Reasoning) GetAutomatedReasoningPolicyTestCase requests per second (Automated Reasoning) GetAutomatedReasoningPolicyTestCase requests per second throughput | 10 count | Adjustable |
Batch inference job size (in GB) for MiniMax M2 Batch inference job size (in GB) for MiniMax M2 storage | 5 count | Fixed |
Model units per provisioned model for Amazon Titan Multimodal Embeddings G1 Model units per provisioned model for Amazon Titan Multimodal Embeddings G1 general | 0 count | Adjustable |
(Flows) Total nodes per flow (Flows) Total nodes per flow capacity | 40 count | Fixed |
(Automated Reasoning) UpdateAutomatedReasoningPolicy requests per second (Automated Reasoning) UpdateAutomatedReasoningPolicy requests per second throughput | 5 count | Adjustable |
Model invocation max tokens per day for Mistral Pixtral Large 25.02 V1 (doubled for cross-region calls) Model invocation max tokens per day for Mistral Pixtral Large 25.02 V1 (doubled for cross-region calls) general | 57,600,000 count | Fixed |
(Guardrails) On-demand ApplyGuardrail Content filter policy text units per second (standard) (Guardrails) On-demand ApplyGuardrail Content filter policy text units per second (standard) identity | 200 count | Adjustable |
Records per batch inference job for NVIDIA Nemotron 3 Super 120B A12B Records per batch inference job for NVIDIA Nemotron 3 Super 120B A12B general | 100,000 count | Adjustable |
On-Demand, latency-optimized model inference tokens per minute for Meta Llama 3.1 70B Instruct On-Demand, latency-optimized model inference tokens per minute for Meta Llama 3.1 70B Instruct general | 40,000 count | Fixed |
On-demand model inference tokens per minute for Meta Llama 2 13B On-demand model inference tokens per minute for Meta Llama 2 13B general | 300,000 count | Fixed |
On-demand model inference requests per minute for GPT OSS Safeguard 120B On-demand model inference requests per minute for GPT OSS Safeguard 120B throughput | 10,000 count | Fixed |
(Flows) S3 retrieval nodes per flow (Flows) S3 retrieval nodes per flow capacity | 10 count | Fixed |
On-demand model inference requests per minute for Qwen3 Coder Next On-demand model inference requests per minute for Qwen3 Coder Next throughput | 10,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 90B Instruct Sum of in-progress and submitted batch inference jobs using a base model for Llama 3.2 90B Instruct general | 100 count | Adjustable |
Model invocation max tokens per day for Anthropic Claude 3 Haiku (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude 3 Haiku (doubled for cross-region calls) general | 2,880,000,000 count | Fixed |
Model units per provisioned model for Amazon Nova Canvas Model units per provisioned model for Amazon Nova Canvas general | 0 count | Adjustable |
Model invocation max tokens per day for Kimi K2 Thinking (doubled for cross-region calls) Model invocation max tokens per day for Kimi K2 Thinking (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
On-demand model inference tokens per minute for Mistral AI Mixtral 8X7BB Instruct On-demand model inference tokens per minute for Mistral AI Mixtral 8X7BB Instruct general | 300,000 count | Fixed |
Minimum number of records per batch inference job for MiniMax M2.1 Minimum number of records per batch inference job for MiniMax M2.1 general | 100 count | Fixed |
On-demand model inference requests per minute for Mistral Devstral 2 123b On-demand model inference requests per minute for Mistral Devstral 2 123b throughput | 10,000 count | Fixed |
ListAgentActionGroups requests per second ListAgentActionGroups requests per second throughput | 10 count | Fixed |
Batch inference input file size (in GB) for NVIDIA Nemotron 3 Super 120B A12B Batch inference input file size (in GB) for NVIDIA Nemotron 3 Super 120B A12B storage | 1 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Ministral 3 14B Sum of in-progress and submitted batch inference jobs using a base model for Ministral 3 14B general | 100 count | Adjustable |
Batch inference input file size (in GB) for Kimi K2 Thinking Batch inference input file size (in GB) for Kimi K2 Thinking storage | 1 count | Fixed |
Batch inference input file size (in GB) for Mistral Large 3 Batch inference input file size (in GB) for Mistral Large 3 storage | 1 count | Fixed |
Model invocation max tokens per day for Anthropic Claude Sonnet 4 V1 (doubled for cross-region calls) Model invocation max tokens per day for Anthropic Claude Sonnet 4 V1 (doubled for cross-region calls) general | 144,000,000 count | Fixed |
On-demand model inference tokens per minute for GPT OSS Safeguard 20B On-demand model inference tokens per minute for GPT OSS Safeguard 20B general | 100,000,000 count | Fixed |
Model invocation max tokens per day for Qwen3 Coder 30B a3b V1 (doubled for cross-region calls) Model invocation max tokens per day for Qwen3 Coder 30B a3b V1 (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.5 Sonnet Sum of in-progress and submitted batch inference jobs using a base model for Claude 3.5 Sonnet general | 100 count | Adjustable |
CreateAgentActionGroup requests per second CreateAgentActionGroup requests per second throughput | 12 count | Fixed |
(Flows) CreateFlowVersion requests per second (Flows) CreateFlowVersion requests per second throughput | 2 count | Fixed |
(Automated Reasoning) Tests per policy (Automated Reasoning) Tests per policy identity | 100 count | Fixed |
Cross-region model inference tokens per minute for Amazon Nova 2 Omni Cross-region model inference tokens per minute for Amazon Nova 2 Omni general | 8,000,000 count | Adjustable |
Minimum number of records per batch inference job for Qwen3 Next 80B Minimum number of records per batch inference job for Qwen3 Next 80B general | 100 count | Fixed |
Records per input file per batch inference job for Llama 3.1 70B Instruct Records per input file per batch inference job for Llama 3.1 70B Instruct storage | 100,000 count | Adjustable |
Minimum number of records per batch inference job for Magistral Small 2509 Minimum number of records per batch inference job for Magistral Small 2509 general | 100 count | Fixed |
On-demand model inference requests per minute for Amazon Titan Image Generator G1 V2 On-demand model inference requests per minute for Amazon Titan Image Generator G1 V2 throughput | 60 count | Fixed |
(Flows) Flows per account (Flows) Flows per account general | 100 count | Adjustable |
Throttle rate limit for DeleteBlueprint Throttle rate limit for DeleteBlueprint throughput | 5 count | Fixed |
Model units per provisioned model for Anthropic Claude 3 Haiku 48K Model units per provisioned model for Anthropic Claude 3 Haiku 48K general | 0 count | Adjustable |
(Evaluation) Number of custom prompt datasets in a human-based model evaluation job (Evaluation) Number of custom prompt datasets in a human-based model evaluation job general | 1 count | Fixed |
Records per input file per batch inference job for Llama 3.1 8B Instruct Records per input file per batch inference job for Llama 3.1 8B Instruct storage | 100,000 count | Adjustable |
Model units per provisioned model for Anthropic Claude 3.5 Sonnet V2 51K Model units per provisioned model for Anthropic Claude 3.5 Sonnet V2 51K general | 0 count | Adjustable |
Batch inference input file size (in GB) for Llama 4 Maverick Batch inference input file size (in GB) for Llama 4 Maverick storage | 1 count | Fixed |
On-demand model inference tokens per minute for Minimax M2 On-demand model inference tokens per minute for Minimax M2 general | 100,000,000 count | Fixed |
Model invocation max tokens per day for Writer Palmyra Vision 7B (doubled for cross-region calls) Model invocation max tokens per day for Writer Palmyra Vision 7B (doubled for cross-region calls) general | 144,000,000,000 count | Fixed |
Our bi-weekly newsletter teaches hands-on AWS fundamentals. No certification fluff - just practical knowledge.
Subscribe to NewsletterOur bi-weekly newsletter teaches hands-on AWS fundamentals. No certification fluff - just practical knowledge.
Subscribe to Newsletter