
DynamoDB Rate Limiting: Strategies for Single Table Design at Scale

Practical strategies to prevent and handle DynamoDB throttling in Single Table Design applications. Covers partition key design, write sharding, capacity modes, DAX caching, retry patterns, and CloudWatch monitoring for high-throughput systems.

When working with DynamoDB at scale, throttling becomes an inevitable challenge. The ProvisionedThroughputExceededException error often appears despite having adequate table-level capacity, and understanding why requires diving into DynamoDB's internal mechanics.

This guide covers proven patterns for preventing and handling throttling in Single Table Design applications, from partition key strategies to monitoring configurations that catch issues before they impact users.

Understanding DynamoDB's Throttling Mechanism

DynamoDB uses a token bucket algorithm for rate limiting. Each partition maintains its own bucket of read and write tokens that refill at a rate matching provisioned capacity. When tokens are depleted, requests get throttled.

The critical limits to remember:

| Resource | Limit |
| --- | --- |
| Read capacity per partition | 3,000 RCU |
| Write capacity per partition | 1,000 WCU |
| Storage per partition | 10 GB |
| Item size | 400 KB (hard limit) |

Here's what makes this tricky: provisioned capacity is distributed across partitions. A table with 100 RCU and 3 partitions means each partition gets roughly 33 RCU. If one partition receives 80% of traffic, it will throttle even though the table has headroom.

```typescript
// Conceptual model: how capacity gets distributed
interface PartitionCapacity {
  // Table-level settings
  tableRCU: 100;
  tableWCU: 50;
  partitionCount: 3;

  // Per-partition reality
  perPartitionRCU: 33;  // ~100/3
  perPartitionWCU: 17;  // ~50/3

  // The problem: uneven traffic
  actualTraffic: {
    partition1: { rcu: 80 };  // 80 > 33 = THROTTLED
    partition2: { rcu: 10 };  // Underutilized
    partition3: { rcu: 10 };  // Underutilized
  };
}
```

Partition Key Design: The Foundation

Hot partitions cause most throttling issues. Getting partition key design right prevents problems that no amount of capacity can solve.

Anti-Patterns to Avoid

```typescript
// ANTI-PATTERN 1: Low-cardinality partition key
interface BadDesign1 {
  PK: 'STATUS#active' | 'STATUS#inactive';  // Only 2 values
  SK: `USER#${string}`;
}
// Result: all active users land in one partition.
// With 100,000 active users: immediate throttling.

// ANTI-PATTERN 2: Time-based partition key
interface BadDesign2 {
  PK: `DATE#${string}`;  // e.g., "DATE#2024-01-15"
  SK: `EVENT#${string}`;
}
// Result: all of today's events hit one partition;
// peak hours create a hot partition.

// ANTI-PATTERN 3: Celebrity/viral content problem
interface BadDesign3 {
  PK: `POST#${string}`;  // Viral post ID
  SK: `LIKE#${string}`;
}
// Result: a viral post with millions of likes;
// a single partition cannot handle the load.

// ANTI-PATTERN 4: Large-tenant dominance
interface BadDesign4 {
  PK: `TENANT#${string}`;
  SK: `ORDER#${string}`;
}
// Result: an enterprise tenant generates 80% of orders;
// their partition is always hot.
```

High-Cardinality Patterns That Work

```typescript
// PATTERN 1: User-scoped partition keys
interface GoodDesign1 {
  PK: `USER#${string}`;   // Unique per user ID
  SK: `ORDER#${string}`;  // ORDER#<timestamp>#<orderId>
}
// Result: millions of unique partition keys;
// traffic is naturally distributed.

// PATTERN 2: Composite keys for multi-tenant
interface GoodDesign2 {
  PK: `TENANT#${string}#USER#${string}`;
  SK: string;
}
// Result: even distribution within and across tenants;
// a large tenant's users are still spread across partitions.

// PATTERN 3: Hierarchical with high cardinality at the PK level
interface GoodDesign3 {
  PK: `REGION#${string}#STORE#${string}`;
  SK: `PRODUCT#${string}#${string}`;  // PRODUCT#<category>#<productId>
}
// Result: queries scoped to the store level;
// each store has its own partition space.

// PATTERN 4: GSI for low-cardinality queries
interface GoodDesign4 {
  PK: `USER#${string}`;
  SK: 'METADATA';
  status: 'active' | 'inactive';
  GSI1PK: `STATUS#${string}#SHARD#${number}`;  // Sharded!
  GSI1SK: `USER#${string}`;
}
// Base table: high cardinality (users)
// GSI: handles status queries with sharding
```

Write Sharding: Distributing Hot Keys

When business requirements force low-cardinality access patterns, write sharding distributes load across multiple partitions.

Random Suffix Sharding

Best for write-heavy patterns where read aggregation is acceptable:

```typescript
import { DynamoDBDocumentClient, PutCommand, QueryCommand } from '@aws-sdk/lib-dynamodb';

const SHARD_COUNT = 10;

const getRandomShard = (): number => {
  return Math.floor(Math.random() * SHARD_COUNT);
};

// Writing with a random shard distributes writes evenly
const writeToShardedPartition = async (
  client: DynamoDBDocumentClient,
  status: string,
  userId: string,
  userData: Record<string, unknown>
): Promise<void> => {
  const shardId = getRandomShard();

  await client.send(new PutCommand({
    TableName: 'MainTable',
    Item: {
      PK: `STATUS#${status}#SHARD#${shardId}`,
      SK: `USER#${userId}`,
      ...userData
    }
  }));
};

// Reading requires scatter-gather across all shards
const readFromAllShards = async (
  client: DynamoDBDocumentClient,
  status: string
): Promise<Record<string, unknown>[]> => {
  const promises = Array.from({ length: SHARD_COUNT }, (_, i) =>
    client.send(new QueryCommand({
      TableName: 'MainTable',
      KeyConditionExpression: 'PK = :pk',
      ExpressionAttributeValues: {
        ':pk': `STATUS#${status}#SHARD#${i}`
      }
    }))
  );

  const results = await Promise.all(promises);
  return results.flatMap(r => r.Items ?? []);
};
```

Deterministic Sharding

When you need to read specific items without scatter-gather:

```typescript
import { createHash } from 'crypto';
import { DynamoDBDocumentClient, GetCommand, PutCommand } from '@aws-sdk/lib-dynamodb';

const getDeterministicShard = (entityId: string): number => {
  const hash = createHash('md5').update(entityId).digest('hex');
  return parseInt(hash.substring(0, 8), 16) % SHARD_COUNT;
};

// Write with a consistent shard derived from the order ID
const writeOrderWithShard = async (
  client: DynamoDBDocumentClient,
  date: string,
  orderId: string,
  orderData: Record<string, unknown>
): Promise<void> => {
  const shardId = getDeterministicShard(orderId);

  await client.send(new PutCommand({
    TableName: 'MainTable',
    Item: {
      PK: `ORDERS#DATE#${date}#SHARD#${shardId}`,
      SK: `ORDER#${orderId}`,
      ...orderData
    }
  }));
};

// Read a specific order: recompute the shard, single query
const readOrder = async (
  client: DynamoDBDocumentClient,
  date: string,
  orderId: string
): Promise<Record<string, unknown> | undefined> => {
  const shardId = getDeterministicShard(orderId);

  const result = await client.send(new GetCommand({
    TableName: 'MainTable',
    Key: {
      PK: `ORDERS#DATE#${date}#SHARD#${shardId}`,
      SK: `ORDER#${orderId}`
    }
  }));

  return result.Item;
};
```

GSI Write Sharding

Apply the same pattern to Global Secondary Indexes to prevent GSI throttling from blocking base table writes:

```typescript
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

// CDK definition with a sharded GSI
const table = new dynamodb.Table(this, 'MainTable', {
  partitionKey: { name: 'PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'SK', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
});

table.addGlobalSecondaryIndex({
  indexName: 'GSI1',
  partitionKey: { name: 'GSI1PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'GSI1SK', type: dynamodb.AttributeType.STRING },
  projectionType: dynamodb.ProjectionType.ALL,
});

// Writing with GSI sharding
const writeOrderWithGSIShard = async (
  client: DynamoDBDocumentClient,
  userId: string,
  orderId: string,
  orderDate: string
): Promise<void> => {
  const shardId = getRandomShard();

  await client.send(new PutCommand({
    TableName: 'MainTable',
    Item: {
      PK: `USER#${userId}`,
      SK: `ORDER#${orderDate}#${orderId}`,
      EntityType: 'Order',
      // Sharded GSI keys
      GSI1PK: `ORDERS#DATE#${orderDate}#SHARD#${shardId}`,
      GSI1SK: `USER#${userId}#ORDER#${orderId}`
    }
  }));
};
```

Warning: GSI throttling applies backpressure to the base table. If a GSI cannot keep up with the base table's write velocity, base table writes get throttled as well. Always match GSI capacity to base table needs.
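For provisioned tables, that capacity matching can be expressed directly in CDK. A minimal sketch, assuming a provisioned base table; the names and numbers are illustrative:

```typescript
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

// Hypothetical provisioned table; capacities are illustrative.
const provisionedTable = new dynamodb.Table(this, 'ProvisionedTable', {
  partitionKey: { name: 'PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'SK', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PROVISIONED,
  readCapacity: 100,
  writeCapacity: 50,
});

// Give the GSI at least the base table's write capacity: every base table
// write that touches a projected attribute becomes a GSI write.
provisionedTable.addGlobalSecondaryIndex({
  indexName: 'GSI1',
  partitionKey: { name: 'GSI1PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'GSI1SK', type: dynamodb.AttributeType.STRING },
  readCapacity: 100,
  writeCapacity: 50,  // >= base table writeCapacity
});
```

On-demand tables avoid manual matching, though a hot GSI partition key can still throttle.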

Capacity Mode Selection

On-Demand Mode: Understanding the Limits

On-demand capacity has scaling constraints that catch teams off guard:

typescript
interface OnDemandBehavior {  // Initial capacity for new tables  initialCapacity: {    rcu: 12000;  // 4 partitions * 3,000 RCU    wcu: 4000;  // 4 partitions * 1,000 WCU  };
  scaling: {    // Instant scale to previous peak    previousPeak: 'instant';
    // Beyond previous peak: limited growth    beyondPeak: {      rate: 'Double every 30 minutes';      limit: 'Cannot exceed 2x within 30-min window';    };  };
  // Account-level limits  accountLimits: {    defaultPerTable: 40000;  // RCU and WCU    requestIncrease: true;  };}

For traffic spikes, this 2x limit matters. A flash sale with 10x normal traffic cannot be handled immediately by on-demand. The table needs to "warm up" gradually or use pre-provisioned capacity.
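If a spike is known in advance, one option is DynamoDB's warm throughput setting (also noted in the pitfalls section below), which raises the throughput a table can serve instantly. A minimal sketch, assuming the SDK v3 `UpdateTableCommand` with its `WarmThroughput` parameter; the target values are illustrative, not a recommendation:

```typescript
import { DynamoDBClient, UpdateTableCommand } from '@aws-sdk/client-dynamodb';

// Sketch: raise warm throughput ahead of a planned event. Warm throughput
// sets the minimum throughput the table can serve without ramp-up.
const preWarmTable = async (
  client: DynamoDBClient,
  tableName: string
): Promise<void> => {
  await client.send(new UpdateTableCommand({
    TableName: tableName,
    WarmThroughput: {
      ReadUnitsPerSecond: 50000,   // illustrative targets
      WriteUnitsPerSecond: 20000,
    },
  }));
};
```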

Provisioned with Auto-Scaling

For predictable workloads with cost sensitivity:

```typescript
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as appautoscaling from 'aws-cdk-lib/aws-applicationautoscaling';

const table = new dynamodb.Table(this, 'MainTable', {
  tableName: 'ProductionTable',
  partitionKey: { name: 'PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'SK', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PROVISIONED,
  readCapacity: 100,
  writeCapacity: 50,
});

// Auto-scaling for reads
const readScaling = table.autoScaleReadCapacity({
  minCapacity: 10,
  maxCapacity: 1000,
});

readScaling.scaleOnUtilization({
  targetUtilizationPercent: 70,  // Scale up before hitting limits
});

// Auto-scaling for writes
const writeScaling = table.autoScaleWriteCapacity({
  minCapacity: 5,
  maxCapacity: 500,
});

writeScaling.scaleOnUtilization({
  targetUtilizationPercent: 70,
});

// Scheduled scaling for predictable patterns
writeScaling.scaleOnSchedule('ScaleUpMorning', {
  schedule: appautoscaling.Schedule.cron({ hour: '8', minute: '0' }),
  minCapacity: 100,
  maxCapacity: 500,
});

writeScaling.scaleOnSchedule('ScaleDownNight', {
  schedule: appautoscaling.Schedule.cron({ hour: '22', minute: '0' }),
  minCapacity: 5,
  maxCapacity: 100,
});
```

Decision Framework

| Factor | On-Demand | Provisioned + Auto-Scaling |
| --- | --- | --- |
| Traffic predictability | Unpredictable/spiky | Steady with gradual changes |
| Scaling speed needed | Instant (within 2x) | 1-2 minute delay acceptable |
| Cost sensitivity | Lower priority | Higher priority |
| Peak-to-average ratio | > 4:1 | < 4:1 |
| Development/testing | Recommended | Not recommended |
| Utilization rate | < 30% average | > 30% average |
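As a rough aid, these heuristics can be encoded in a small helper. A sketch only; the thresholds mirror the rows above and are rules of thumb, not official guidance:

```typescript
// Hypothetical helper encoding the decision table above.
interface WorkloadProfile {
  predictable: boolean;
  peakToAverageRatio: number;  // e.g., 6 means peaks are 6x the average
  averageUtilization: number;  // 0..1, fraction of provisioned capacity used
  isDevEnvironment: boolean;
}

const recommendCapacityMode = (w: WorkloadProfile): 'on-demand' | 'provisioned' => {
  if (w.isDevEnvironment) return 'on-demand';                     // dev/testing
  if (!w.predictable || w.peakToAverageRatio > 4) return 'on-demand';
  if (w.averageUtilization < 0.3) return 'on-demand';             // paying for idle
  return 'provisioned';                                           // steady + cost-sensitive
};
```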

Burst and Adaptive Capacity

DynamoDB provides two automatic mechanisms that help with uneven traffic patterns.

Burst Capacity

Unused capacity accumulates for up to 5 minutes and can be consumed during traffic spikes:

typescript
interface BurstCapacity {  accumulation: {    source: 'Unused provisioned capacity';    maxRetention: '5 minutes (300 seconds)';    refillRate: '1 token per unused RCU/WCU per second';  };
  consumption: {    trigger: 'Traffic exceeds provisioned capacity';    speed: 'Can consume faster than provisioned rate';    limit: 'Until burst bucket depleted';  };
  // Important limitations  warnings: [    'Temporary safeguard, not capacity planning substitute',    'DynamoDB may use for background maintenance',    'No guarantee of availability',    'Cannot be monitored or relied upon'  ];}
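A quick worked example of the burst math, assuming the 300-second retention rule above; the workload numbers are illustrative:

```typescript
// Worked example: table provisioned at 100 WCU, steady traffic uses 40 WCU.
const provisionedWCU = 100;
const consumedWCU = 40;

// Unused capacity accrues into the burst bucket at the unused rate...
const accrualPerSecond = provisionedWCU - consumedWCU;  // 60 tokens/s

// ...but the bucket retains at most 300 seconds of provisioned capacity.
const maxBurstTokens = 300 * provisionedWCU;  // 30,000 WCU

// A spike of 400 WCU (300 over provisioned) could drain a full bucket in
// maxBurstTokens / 300 = 100 seconds, if DynamoDB makes the burst available.
```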

Adaptive Capacity and Split-for-Heat

DynamoDB automatically rebalances capacity toward hot partitions and can split them when needed:

typescript
interface AdaptiveCapacity {  behavior: {    detection: 'Monitors traffic patterns per partition';    action: 'Reallocates throughput from cold to hot partitions';    limit: 'Cannot exceed partition maximum (3,000 RCU, 1,000 WCU)';  };
  splitForHeat: {    trigger: 'Sustained high throughput on single partition';    action: 'Automatically splits partition into two';    result: 'Doubles available capacity for that key range';    timing: 'Takes several minutes';  };
  // When it helps  scenarios: [    'Temporary traffic spikes',    'Gradual hot partition development',    'Uneven but distributed access patterns'  ];
  // When it does NOT help  limitations: [    'Single hot key (celebrity problem)',    'All writes to same partition key value',    'Low-cardinality partition keys',    'Item collections with LSI cannot split'  ];}

Note: Adaptive capacity rebalancing is instant (since May 2019), but split-for-heat (partition splitting) takes several minutes. For flash sale scenarios or viral content, a single hot partition key cannot be helped by either mechanism. Design partition keys properly rather than relying on adaptive capacity.

DAX for Read-Heavy Workloads

DynamoDB Accelerator (DAX) offloads read traffic from DynamoDB, reducing both latency and capacity consumption.

Note: The DAX SDK for JavaScript v3 (@amazon-dax-sdk/lib-dax) was released in March 2025. It uses aggregated methods (.get(), .query()) instead of the .send() pattern used by the standard DynamoDB SDK v3.

typescript
import { DaxDocument } from '@amazon-dax-sdk/lib-dax';import { DynamoDBDocumentClient, UpdateCommand } from '@aws-sdk/lib-dynamodb';
// DAX client setup (AWS SDK v3 compatible)const createDaxClient = (endpoints: string[]): DaxDocument => {  return new DaxDocument({    endpoints,    region: process.env.AWS_REGION ?? 'us-east-1',  });};
// Client factory for choosing based on operation typeinterface ClientFactory {  daxClient: DaxDocument;  // For cacheable reads  dynamoClient: DynamoDBDocumentClient;  // For writes, strong consistency}
// Usage pattern: reads through DAX, writes directlyconst productService = {  // Read through DAX (microsecond latency, offloads DynamoDB)  // Note: DaxDocument uses aggregated methods, not .send()  getProduct: async (    factory: ClientFactory,    productId: string  ): Promise<Record<string, unknown> | undefined> => {    const result = await factory.daxClient.get({      TableName: 'Products',      Key: { PK: `PRODUCT#${productId}`, SK: 'METADATA' }    });    return result.Item;  },
  // Query through DAX (cached result sets)  getProductsByCategory: async (    factory: ClientFactory,    category: string  ): Promise<Record<string, unknown>[]> => {    const result = await factory.daxClient.query({      TableName: 'Products',      IndexName: 'GSI1',      KeyConditionExpression: 'GSI1PK = :category',      ExpressionAttributeValues: { ':category': `CATEGORY#${category}` }    });    return result.Items ?? [];  },
  // Write directly to DynamoDB  // IMPORTANT: DAX only auto-invalidates cache for writes made THROUGH DAX.  // Writes directly to DynamoDB (bypassing DAX) are NOT reflected in DAX  // cache until TTL expires. For write-through caching, use daxClient.put().  updateProduct: async (    factory: ClientFactory,    productId: string,    updates: Record<string, unknown>  ): Promise<void> => {    await factory.dynamoClient.send(new UpdateCommand({      TableName: 'Products',      Key: { PK: `PRODUCT#${productId}`, SK: 'METADATA' },      UpdateExpression: 'SET #name = :name, #price = :price',      ExpressionAttributeNames: { '#name': 'name', '#price': 'price' },      ExpressionAttributeValues: updates    }));  }};

When DAX Makes Sense

| Use Case | DAX Value |
| --- | --- |
| Product catalogs (high read, low write) | High |
| User sessions (read-mostly) | High |
| Configuration data (rarely changes) | High |
| Flash sale product pages | Very High |
| Write-heavy workloads | Low |
| Strong consistency requirements | None |
| Low traffic (< 200 req/sec) | Negative (cost overhead) |
| Random access patterns (< 80% hit rate) | Low |

TTL Strategy by Data Type

```typescript
const daxTTLStrategy = {
  staticData: {
    ttl: 3600000,  // 1 hour
    examples: ['Product catalog', 'Category list', 'Configuration']
  },
  semiStatic: {
    ttl: 300000,  // 5 minutes (default)
    examples: ['User profiles', 'Settings', 'Preferences']
  },
  dynamic: {
    ttl: 60000,  // 1 minute
    examples: ['Inventory counts', 'Availability', 'Pricing']
  }
};
```
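Note that DAX applies item and query TTLs cluster-wide through a parameter group, so a per-type strategy like the above implies separate clusters per data class (or application-level cache versioning). A hedged CDK sketch setting cluster-wide TTLs, assuming the L1 `CfnParameterGroup` construct and the `record-ttl-millis` / `query-ttl-millis` parameter names from the DAX docs:

```typescript
import * as dax from 'aws-cdk-lib/aws-dax';

// Sketch: one item-cache TTL and one query-cache TTL per cluster.
// Attach this group to the cluster via CfnCluster.parameterGroupName.
const daxParams = new dax.CfnParameterGroup(this, 'DaxParams', {
  parameterGroupName: 'semi-static-ttl',  // illustrative name
  parameterNameValues: {
    'record-ttl-millis': '300000',  // item cache: 5 minutes
    'query-ttl-millis': '300000',   // query cache: 5 minutes
  },
});
```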

Retry Strategies and Circuit Breakers

Handling throttling gracefully requires proper retry logic. The AWS SDK provides built-in retries, but batch operations need additional handling.

SDK Configuration

typescript
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';
const createClientWithRetry = (): DynamoDBDocumentClient => {  const client = new DynamoDBClient({    maxAttempts: 10,    retryMode: 'adaptive',  // Recommended for DynamoDB    // Adaptive mode tracks throttling per resource    // and reduces throughput for throttled tables  });
  return DynamoDBDocumentClient.from(client);};

Batch Operations: Handling Unprocessed Items

The SDK does NOT automatically retry unprocessed items from batch operations:

```typescript
import { DynamoDBDocumentClient, BatchWriteCommand } from '@aws-sdk/lib-dynamodb';

const batchWriteWithRetry = async (
  client: DynamoDBDocumentClient,
  tableName: string,
  items: Record<string, unknown>[],
  maxRetries: number = 5
): Promise<void> => {
  const chunks = chunkArray(items, 25);  // BatchWrite limit

  for (const chunk of chunks) {
    let unprocessed: Record<string, unknown>[] | undefined = chunk;
    let attempts = 0;

    while (unprocessed && unprocessed.length > 0 && attempts < maxRetries) {
      const result = await client.send(new BatchWriteCommand({
        RequestItems: {
          [tableName]: unprocessed.map(item => ({
            PutRequest: { Item: item }
          }))
        }
      }));

      const unprocessedItems = result.UnprocessedItems?.[tableName];

      if (unprocessedItems && unprocessedItems.length > 0) {
        unprocessed = unprocessedItems
          .map(req => req.PutRequest?.Item as Record<string, unknown>)
          .filter(Boolean);

        // Exponential backoff with jitter
        const delay = Math.min(100 * Math.pow(2, attempts), 5000);
        const jitter = delay * 0.2 * Math.random();
        await sleep(delay + jitter);

        attempts++;
      } else {
        unprocessed = undefined;
      }
    }

    if (unprocessed && unprocessed.length > 0) {
      throw new Error(
        `Failed to write ${unprocessed.length} items after ${maxRetries} retries`
      );
    }
  }
};

const chunkArray = <T>(array: T[], size: number): T[][] => {
  const chunks: T[][] = [];
  for (let i = 0; i < array.length; i += size) {
    chunks.push(array.slice(i, i + size));
  }
  return chunks;
};

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));
```

Circuit Breaker for Sustained Throttling

When throttling persists, a circuit breaker prevents retry storms:

```typescript
import {
  ProvisionedThroughputExceededException,
  ThrottlingException
} from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, PutCommand } from '@aws-sdk/lib-dynamodb';

interface CircuitBreakerConfig {
  failureThreshold: number;  // Failures before opening
  resetTimeout: number;      // Time before trying again (ms)
}

class DynamoDBCircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: 'closed' | 'open' | 'half-open' = 'closed';

  constructor(private config: CircuitBreakerConfig) {}

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailure > this.config.resetTimeout) {
        this.state = 'half-open';
      } else {
        throw new Error('Circuit breaker is open - request rejected');
      }
    }

    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure(error);
      throw error;
    }
  }

  private onSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  private onFailure(error: unknown): void {
    if (
      error instanceof ProvisionedThroughputExceededException ||
      error instanceof ThrottlingException
    ) {
      this.failures++;
      this.lastFailure = Date.now();

      if (this.failures >= this.config.failureThreshold) {
        this.state = 'open';
      }
    }
  }
}

// Usage
const circuitBreaker = new DynamoDBCircuitBreaker({
  failureThreshold: 5,
  resetTimeout: 30000,  // 30 seconds
});

const writeWithProtection = async (
  client: DynamoDBDocumentClient,
  item: Record<string, unknown>
): Promise<void> => {
  await circuitBreaker.execute(async () => {
    await client.send(new PutCommand({
      TableName: 'MainTable',
      Item: item
    }));
  });
};
```

Client-Side Rate Limiting

Proactively limiting request rates prevents throttling from occurring:

typescript
class TokenBucket {  private tokens: number;  private lastRefill: number;
  constructor(    private maxTokens: number,    private refillRate: number  // tokens per second  ) {    this.tokens = maxTokens;    this.lastRefill = Date.now();  }
  async acquire(count: number = 1): Promise<boolean> {    this.refill();
    if (this.tokens >= count) {      this.tokens -= count;      return true;    }
    // Wait for tokens to be available    const waitTime = ((count - this.tokens) / this.refillRate) * 1000;    await sleep(waitTime);    this.refill();    this.tokens -= count;    return true;  }
  private refill(): void {    const now = Date.now();    const elapsed = (now - this.lastRefill) / 1000;    this.tokens = Math.min(      this.maxTokens,      this.tokens + elapsed * this.refillRate    );    this.lastRefill = now;  }}
// Rate-limited DynamoDB wrapperclass RateLimitedDynamoDB {  private readBucket: TokenBucket;  private writeBucket: TokenBucket;
  constructor(    private client: DynamoDBDocumentClient,    readCapacity: number,    writeCapacity: number  ) {    // Use 80% of capacity to leave headroom    this.readBucket = new TokenBucket(readCapacity * 0.8, readCapacity * 0.8);    this.writeBucket = new TokenBucket(writeCapacity * 0.8, writeCapacity * 0.8);  }
  async get(    tableName: string,    key: Record<string, unknown>  ): Promise<Record<string, unknown> | undefined> {    await this.readBucket.acquire(1);  // 1 RCU for <4KB item
    const result = await this.client.send(new GetCommand({      TableName: tableName,      Key: key    }));
    return result.Item;  }
  async put(    tableName: string,    item: Record<string, unknown>  ): Promise<void> {    const itemSize = JSON.stringify(item).length;    const wcuNeeded = Math.ceil(itemSize / 1024);  // 1 WCU per KB
    await this.writeBucket.acquire(wcuNeeded);
    await this.client.send(new PutCommand({      TableName: tableName,      Item: item    }));  }}
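A brief usage sketch of the wrapper, reusing `createClientWithRetry` from the SDK configuration above; the table name, capacities, and items are illustrative:

```typescript
// Cap client-side traffic at 80% of an assumed 100 RCU / 50 WCU.
const limitedDb = new RateLimitedDynamoDB(createClientWithRetry(), 100, 50);

await limitedDb.put('MainTable', { PK: 'USER#123', SK: 'METADATA', name: 'Ada' });
const profile = await limitedDb.get('MainTable', { PK: 'USER#123', SK: 'METADATA' });
```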

CloudWatch Monitoring and Alerting

Proper monitoring catches throttling before it impacts users.

Key Metrics

typescript
const throttlingMetrics = {  primary: [    {      name: 'ThrottledRequests',      description: 'Any request that was throttled',      alarm: 'Sum > 0 for 1 minute',      action: 'Investigate immediately'    },    {      name: 'ReadThrottleEvents',      description: 'Individual read throttle events',      alarm: 'Sum > 10 per minute',      action: 'Check partition key design or increase capacity'    },    {      name: 'WriteThrottleEvents',      description: 'Individual write throttle events',      alarm: 'Sum > 10 per minute',      action: 'Implement write sharding'    }  ],
  utilization: [    {      name: 'ConsumedReadCapacityUnits',      alarm: 'Average > 80% of provisioned for 5 minutes',      action: 'Scale up or enable auto-scaling'    },    {      name: 'ConsumedWriteCapacityUnits',      alarm: 'Average > 80% of provisioned for 5 minutes',      action: 'Scale up or enable auto-scaling'    }  ],
  gsi: [    {      name: 'OnlineIndexThrottleEvents',      description: 'GSI throttling (causes backpressure)',      alarm: 'Any occurrence',      action: 'Increase GSI capacity'    }  ],
  // Granular throttle metrics (useful for diagnosing specific issues)  advanced: [    { name: 'ReadMaxOnDemandThroughputThrottleEvents', description: 'On-demand max throughput exceeded' },    { name: 'WriteMaxOnDemandThroughputThrottleEvents', description: 'On-demand max throughput exceeded' },    { name: 'ReadAccountLimitThrottleEvents', description: 'Account-level limit hit' },    { name: 'WriteAccountLimitThrottleEvents', description: 'Account-level limit hit' },    { name: 'ReadKeyRangeThroughputThrottleEvents', description: 'Partition-level limit hit' },    { name: 'WriteKeyRangeThroughputThrottleEvents', description: 'Partition-level limit hit' }  ]};

CDK Alarm Configuration

typescript
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';import * as cloudwatch_actions from 'aws-cdk-lib/aws-cloudwatch-actions';import * as sns from 'aws-cdk-lib/aws-sns';import { Duration } from 'aws-cdk-lib';
const createThrottlingAlarms = (  table: dynamodb.Table,  alertTopic: sns.Topic): cloudwatch.Alarm[] => {  const alarms: cloudwatch.Alarm[] = [];
  // Throttled requests alarm - immediate attention  alarms.push(new cloudwatch.Alarm(table, 'ThrottlingAlarm', {    alarmName: `${table.tableName}-Throttling`,    metric: table.metricThrottledRequestsForOperations({      operations: [        dynamodb.Operation.GET_ITEM,        dynamodb.Operation.PUT_ITEM,        dynamodb.Operation.QUERY,        dynamodb.Operation.SCAN      ],      period: Duration.minutes(1)    }),    threshold: 1,    evaluationPeriods: 1,    comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_OR_EQUAL_TO_THRESHOLD,    treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,  }));
  // High read utilization - early warning  alarms.push(new cloudwatch.Alarm(table, 'HighReadUtilization', {    alarmName: `${table.tableName}-HighReadUtilization`,    metric: new cloudwatch.MathExpression({      expression: 'm1 / m2 * 100',      usingMetrics: {        m1: table.metricConsumedReadCapacityUnits({ period: Duration.minutes(5) }),        m2: table.metricProvisionedReadCapacityUnits({ period: Duration.minutes(5) })      }    }),    threshold: 80,    evaluationPeriods: 3,    comparisonOperator: cloudwatch.ComparisonOperator.GREATER_THAN_THRESHOLD,  }));
  // Add SNS actions  alarms.forEach(alarm => {    alarm.addAlarmAction(new cloudwatch_actions.SnsAction(alertTopic));  });
  return alarms;};

Contributor Insights for Hot Key Detection

Enable Contributor Insights to identify which partition keys are causing throttling:

typescript
import { DynamoDBClient, UpdateContributorInsightsCommand } from '@aws-sdk/client-dynamodb';
// Mode options:// - ACCESSED_AND_THROTTLED_KEYS: All accessed keys + throttled keys (default, higher cost)// - THROTTLED_KEYS: Only throttled keys (cost-effective for throttle debugging)
const enableContributorInsights = async (  client: DynamoDBClient,  tableName: string): Promise<void> => {  await client.send(new UpdateContributorInsightsCommand({    TableName: tableName,    ContributorInsightsAction: 'ENABLE',  }));};
// Contributor Insights reveals:// - Top partition keys by consumed capacity// - Throttled partition keys// - Access patterns over time// Essential for debugging Single Table Design throttling// Tip: Use THROTTLED_KEYS mode if you only need to debug throttling (lower cost)
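If the table is defined in CDK, Contributor Insights can also be enabled declaratively with the `contributorInsightsEnabled` table property; a minimal sketch with illustrative names:

```typescript
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

// Declarative alternative: enable Contributor Insights at table creation.
const monitoredTable = new dynamodb.Table(this, 'MonitoredTable', {
  partitionKey: { name: 'PK', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'SK', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  contributorInsightsEnabled: true,
});
```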

Common Pitfalls and Solutions

Pitfall 1: Relying on Adaptive Capacity

```typescript
// WRONG: assuming DynamoDB handles hot partitions automatically
// Reality: adaptive rebalancing is instant, but split-for-heat takes minutes.
// Neither helps with a single hot partition key (celebrity problem).
// Flash sales or viral content on one key = throttling regardless.

// RIGHT: design for even distribution from the start.
// Use write sharding for known low-cardinality patterns.
```

Pitfall 2: Ignoring GSI Capacity

```typescript
// WRONG: setting GSI capacity lower than the base table
// Assumption: "the GSI has less traffic"
// Result: GSI throttling blocks ALL base table writes

// RIGHT: GSI capacity >= base table write capacity,
// or use on-demand for automatic scaling
```

Pitfall 3: On-Demand Scaling Assumptions

```typescript
// WRONG: "on-demand scales instantly to any level"
// Reality: 2x scaling limit within 30-minute windows;
// 50k req/sec to 250k req/sec takes ~1 hour

// RIGHT: pre-warm before expected spikes,
// or use provisioned with high capacity for planned events.
// Tip: consider DynamoDB's warm throughput feature for configuring
// higher initial throughput values on new or restored tables
```

Pitfall 4: Missing Batch Retry Logic

```typescript
// WRONG: assume BatchWriteItem processes all items
const result = await client.send(new BatchWriteCommand({ ... }));
// Some items may have failed!

// RIGHT: always check and retry unprocessed items
if (result.UnprocessedItems &&
    Object.keys(result.UnprocessedItems).length > 0) {
  // Implement exponential backoff retry
}
```

Pitfall 5: Not Monitoring Per-Partition Metrics

```typescript
// WRONG: only monitor table-level capacity
// "The table has 500 WCU available, why throttling?"

// RIGHT: enable Contributor Insights.
// It reveals: one partition key consuming its 1,000 WCU limit.
// Table-level headroom doesn't help partition-level throttling.
```

Key Takeaways

  1. Design Partition Keys First: Hot partitions cause 90% of throttling issues
  2. Understand Per-Partition Limits: 3,000 RCU / 1,000 WCU per partition is the real constraint
  3. Write Sharding Works: 10 shards = 10x write throughput for same access pattern
  4. Adaptive Capacity Has Limits: Rebalancing is instant, but split-for-heat takes minutes; neither helps single hot keys
  5. On-Demand Has Limits: 2x scaling within 30 minutes, not unlimited
  6. GSI Throttling Blocks Writes: Capacity matching is essential
  7. DAX Needs High Hit Rate: Below 80% cache hit rate, ROI is negative
  8. Monitor Contributor Insights: Only way to identify hot keys in Single Table Design
  9. Retry Unprocessed Items: SDK does not auto-retry batch operation failures
  10. Pre-warm for Events: Both provisioned and on-demand need preparation for traffic spikes

Building throttle-resistant DynamoDB applications requires understanding these mechanics and implementing appropriate patterns at each layer. Start with partition key design, add sharding where needed, implement proper retries, and monitor aggressively. The result is a system that scales predictably without unexpected throttling incidents.
