# AWS Lambda + S3 Signed URLs: A Practical Solution for Large File Uploads
A practical approach to handling large file uploads using S3 signed URLs instead of Lambda proxies. Complete implementation with CDK, security considerations, and lessons learned from production experience.
When building file upload systems, many developers start with the straightforward approach: proxy uploads through Lambda. This works well for small files but quickly becomes problematic as file sizes grow. Lambda's execution limits and memory constraints create timeout issues, while the cost of keeping functions running during long uploads can be significant.
After experiencing these challenges firsthand, I learned that S3 signed URLs offer a more scalable solution. This approach reduces Lambda execution time to milliseconds, eliminates timeout issues, and provides substantial cost savings. Here's a practical implementation that handles large files efficiently.
## Understanding the Lambda Upload Challenge
The traditional Lambda-proxy approach often looks like this:
```typescript
// The Lambda that killed our budget and user experience
export const uploadHandler = async (event: APIGatewayEvent) => {
  // This ran for UP TO 15 minutes per upload
  const file = parseMultipartFormData(event.body);

  // Memory usage would spike to 3GB+ for large files
  const processedFile = await processVideo(file);

  // S3 upload could take 10+ minutes
  const result = await s3.upload({
    Bucket: 'my-videos',
    Key: `uploads/${uuidv4()}`,
    Body: processedFile,
  }).promise();

  return { statusCode: 200, body: JSON.stringify(result) };
};
```

This approach creates several challenges:
- Long execution times: 8-12 minutes per upload for large files
- High memory usage: 2-3GB consistently for processing large files
- Significant costs: High Lambda compute costs for extended execution
- Reliability issues: Timeouts and memory errors affect success rates
- Poor user experience: Limited visibility into upload progress
## A More Efficient Architecture
The key insight is to remove Lambda from the upload data path entirely. Instead of streaming file bytes through Lambda, clients now follow three steps (sketched below):
- Request a signed URL from Lambda (< 200ms)
- Upload directly to S3 (no Lambda involvement)
- S3 triggers processing Lambda when upload completes
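From the client's perspective, the whole flow fits in a few lines. A minimal happy-path sketch, assuming the API is exposed at `/api/uploads/signed-url` as in the implementation below (error handling omitted):

```typescript
// Minimal sketch of the three-step flow
async function uploadDirectToS3(file: File): Promise<string> {
  // 1. Ask the Lambda-backed API for a signed URL (fast; no file bytes involved)
  const res = await fetch('/api/uploads/signed-url', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fileName: file.name, fileSize: file.size, fileType: file.type }),
  });
  const { signedUrl, key } = await res.json();

  // 2. PUT the bytes straight to S3; Lambda never touches them
  await fetch(signedUrl, { method: 'PUT', headers: { 'Content-Type': file.type }, body: file });

  // 3. S3's ObjectCreated notification triggers the processing Lambda asynchronously
  return key;
}
```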
## Complete Implementation Guide

### CDK Infrastructure for Large File Uploads
Here's a complete CDK implementation that demonstrates this pattern:
```typescript
// lib/file-upload-stack.ts
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as s3n from 'aws-cdk-lib/aws-s3-notifications';
import * as iam from 'aws-cdk-lib/aws-iam';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export class FileUploadStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // S3 bucket with lifecycle policies for cost optimization
    const uploadBucket = new s3.Bucket(this, 'UploadBucket', {
      bucketName: `${this.stackName}-uploads-${this.account}`,
      cors: [
        {
          allowedOrigins: ['*'],
          allowedMethods: [
            s3.HttpMethods.PUT,
            s3.HttpMethods.POST,
            s3.HttpMethods.GET,
            s3.HttpMethods.HEAD,
          ],
          allowedHeaders: ['*'],
          exposedHeaders: ['ETag'],
          maxAge: 3600,
        },
      ],
      // Automatically abort incomplete multipart uploads after 7 days
      lifecycleRules: [
        {
          id: 'AbortIncompleteMultipartUploads',
          enabled: true,
          abortIncompleteMultipartUploadAfter: cdk.Duration.days(7),
        },
        {
          id: 'TransitionToIA',
          enabled: true,
          transitions: [
            {
              storageClass: s3.StorageClass.INFREQUENT_ACCESS,
              transitionAfter: cdk.Duration.days(30),
            },
            {
              storageClass: s3.StorageClass.GLACIER,
              transitionAfter: cdk.Duration.days(90),
            },
          ],
        },
      ],
      // Block public access for security
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      encryption: s3.BucketEncryption.S3_MANAGED,
    });

    // Lambda for generating signed URLs - runs in <200ms
    const signedUrlGenerator = new NodejsFunction(this, 'SignedUrlGenerator', {
      entry: 'src/handlers/generate-signed-url.ts',
      runtime: lambda.Runtime.NODEJS_20_X,
      architecture: lambda.Architecture.ARM_64,
      memorySize: 512, // Small memory footprint
      timeout: cdk.Duration.seconds(30),
      environment: {
        UPLOAD_BUCKET: uploadBucket.bucketName,
        ALLOWED_FILE_TYPES: 'video/mp4,video/quicktime,video/x-msvideo,image/jpeg,image/png',
        MAX_FILE_SIZE: '10737418240', // 10GB in bytes
        SIGNED_URL_EXPIRY: '3600', // 1 hour
      },
      bundling: {
        minify: true,
        sourceMap: true,
        target: 'es2022',
      },
    });

    // Grant the generator permission to sign upload requests for the bucket
    uploadBucket.grantReadWrite(signedUrlGenerator);
    signedUrlGenerator.addToRolePolicy(
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        actions: ['s3:PutObjectAcl', 's3:GetObject'],
        resources: [uploadBucket.arnForObjects('*')],
      })
    );

    // Lambda for post-upload processing
    const fileProcessor = new NodejsFunction(this, 'FileProcessor', {
      entry: 'src/handlers/process-file.ts',
      runtime: lambda.Runtime.NODEJS_20_X,
      architecture: lambda.Architecture.ARM_64,
      memorySize: 2048, // Higher memory for processing
      timeout: cdk.Duration.minutes(5),
      environment: {
        UPLOAD_BUCKET: uploadBucket.bucketName,
      },
      bundling: {
        minify: true,
        sourceMap: true,
        target: 'es2022',
        // Include ffmpeg for video processing if needed
        nodeModules: ['fluent-ffmpeg'],
      },
    });

    uploadBucket.grantReadWrite(fileProcessor);

    // S3 event notification to trigger processing
    uploadBucket.addEventNotification(
      s3.EventType.OBJECT_CREATED,
      new s3n.LambdaDestination(fileProcessor),
      { prefix: 'uploads/' } // Only process files in uploads/ prefix
    );

    // API Gateway for signed URL generation
    const api = new apigateway.RestApi(this, 'FileUploadApi', {
      restApiName: 'File Upload API',
      description: 'API for generating S3 signed URLs',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: ['GET', 'POST', 'OPTIONS'],
        allowHeaders: ['Content-Type', 'Authorization'],
      },
    });

    const uploads = api.root.addResource('uploads');
    const signedUrl = uploads.addResource('signed-url');

    // Default proxy integration: the raw JSON body arrives as event.body
    signedUrl.addMethod('POST', new apigateway.LambdaIntegration(signedUrlGenerator));

    // Outputs
    new cdk.CfnOutput(this, 'ApiUrl', {
      value: api.url,
      description: 'API Gateway URL',
    });

    new cdk.CfnOutput(this, 'BucketName', {
      value: uploadBucket.bucketName,
      description: 'S3 Upload Bucket Name',
    });
  }
}
```
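To deploy, the stack is instantiated from a standard CDK entry point. A minimal sketch (the `bin/app.ts` file name is an assumption; adjust to your project layout):

```typescript
// bin/app.ts (hypothetical CDK app entry point for this stack)
import * as cdk from 'aws-cdk-lib';
import { FileUploadStack } from '../lib/file-upload-stack';

const app = new cdk.App();
new FileUploadStack(app, 'FileUploadStack', {
  env: {
    account: process.env.CDK_DEFAULT_ACCOUNT,
    region: process.env.CDK_DEFAULT_REGION,
  },
});
```

Running `cdk deploy` then provisions the bucket, both Lambdas, and the API endpoint in one pass.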
### Signed URL Generator - The 200ms Lambda

This Lambda runs in under 200ms and generates secure upload URLs:
```typescript
// src/handlers/generate-signed-url.ts
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { z } from 'zod';

// Input validation schema
const SignedUrlRequestSchema = z.object({
  fileName: z.string().min(1).max(255),
  fileSize: z.number().int().min(1).max(10737418240), // 10GB max
  fileType: z.string().regex(/^(video|image|audio)\/[a-zA-Z0-9][a-zA-Z0-9!\-_]*[a-zA-Z0-9]*$/),
  uploadId: z.string().uuid().optional(), // For tracking
});

const s3Client = new S3Client({ region: process.env.AWS_REGION });

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  console.log('Generating signed URL request:', { body: event.body, headers: event.headers });

  try {
    // Parse and validate request
    const body = JSON.parse(event.body || '{}');
    const request = SignedUrlRequestSchema.parse(body);

    // Security checks
    const allowedTypes = process.env.ALLOWED_FILE_TYPES?.split(',') || [];
    if (!allowedTypes.includes(request.fileType)) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          error: 'File type not allowed',
          allowedTypes,
        }),
      };
    }

    const maxSize = parseInt(process.env.MAX_FILE_SIZE || '5368709120'); // 5GB default
    if (request.fileSize > maxSize) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          error: 'File too large',
          maxSize,
          receivedSize: request.fileSize,
        }),
      };
    }

    // Generate unique key with timestamp and sanitized filename
    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const sanitizedFileName = request.fileName.replace(/[^a-zA-Z0-9.-]/g, '_');
    const key = `uploads/${timestamp}-${sanitizedFileName}`;

    // Create signed URL for PUT request
    const putObjectCommand = new PutObjectCommand({
      Bucket: process.env.UPLOAD_BUCKET!,
      Key: key,
      ContentType: request.fileType,
      ContentLength: request.fileSize,
      // Add metadata for processing
      Metadata: {
        'original-filename': request.fileName,
        'upload-id': request.uploadId || 'direct-upload',
        'file-size': request.fileSize.toString(),
        'uploaded-at': new Date().toISOString(),
      },
      // Encrypt at rest
      ServerSideEncryption: 'AES256',
    });

    const signedUrl = await getSignedUrl(s3Client, putObjectCommand, {
      expiresIn: parseInt(process.env.SIGNED_URL_EXPIRY || '3600'), // 1 hour default
    });

    console.log('Signed URL generated successfully:', {
      key,
      fileSize: request.fileSize,
      fileType: request.fileType,
      expiresIn: process.env.SIGNED_URL_EXPIRY,
    });

    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*',
        'Cache-Control': 'no-cache',
      },
      body: JSON.stringify({
        signedUrl,
        key,
        method: 'PUT',
        headers: {
          'Content-Type': request.fileType,
          'Content-Length': request.fileSize.toString(),
        },
        expiresAt: new Date(
          Date.now() + parseInt(process.env.SIGNED_URL_EXPIRY || '3600') * 1000
        ).toISOString(),
      }),
    };
  } catch (error) {
    console.error('Signed URL generation error:', error);

    if (error instanceof z.ZodError) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          error: 'Invalid request',
          details: error.errors,
        }),
      };
    }

    return {
      statusCode: 500,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        error: 'Failed to generate signed URL',
      }),
    };
  }
};
```
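For reference, the wire contract looks roughly like this (illustrative values, not captured output):

```typescript
// Example request body for POST /uploads/signed-url
const exampleRequest = {
  fileName: 'demo.mp4',
  fileSize: 1073741824, // 1GB
  fileType: 'video/mp4',
  uploadId: crypto.randomUUID(), // optional tracking id
};

// Example 200 response body (values abbreviated):
// {
//   "signedUrl": "https://<bucket>.s3.<region>.amazonaws.com/uploads/...&X-Amz-Signature=...",
//   "key": "uploads/2025-01-01T00-00-00-000Z-demo.mp4",
//   "method": "PUT",
//   "headers": { "Content-Type": "video/mp4", "Content-Length": "1073741824" },
//   "expiresAt": "2025-01-01T01:00:00.000Z"
// }
```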
### File Processing Lambda - Only Runs When Needed

This Lambda only executes when files are successfully uploaded to S3:
```typescript
// src/handlers/process-file.ts
import { S3Event, S3EventRecord } from 'aws-lambda';
import {
  S3Client,
  HeadObjectCommand,
  CopyObjectCommand,
  DeleteObjectCommand,
} from '@aws-sdk/client-s3';
import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb';
import { SESClient, SendEmailCommand } from '@aws-sdk/client-ses';

const s3Client = new S3Client({ region: process.env.AWS_REGION });
const dynamoClient = new DynamoDBClient({ region: process.env.AWS_REGION });
const sesClient = new SESClient({ region: process.env.AWS_REGION });

export const handler = async (event: S3Event): Promise<void> => {
  console.log('Processing uploaded files:', JSON.stringify(event, null, 2));

  for (const record of event.Records) {
    if (record.eventName?.startsWith('ObjectCreated')) {
      await processUploadedFile(record);
    }
  }
};

async function processUploadedFile(record: S3EventRecord) {
  const bucketName = record.s3.bucket.name;
  const objectKey = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
  const fileSize = record.s3.object.size;

  console.log('Processing file:', { bucketName, objectKey, fileSize });

  try {
    // Fetch object metadata without downloading the body
    const headResponse = await s3Client.send(new HeadObjectCommand({
      Bucket: bucketName,
      Key: objectKey,
    }));

    const metadata = headResponse.Metadata || {};
    const originalFilename = metadata['original-filename'] || objectKey;
    const uploadId = metadata['upload-id'] || 'unknown';

    // Determine file type and processing strategy
    const contentType = headResponse.ContentType || '';
    let processingStatus = 'completed';
    let processedKey = objectKey;

    if (contentType.startsWith('video/')) {
      // Video files might need transcoding
      processingStatus = 'processing';
      // In production, you might trigger AWS MediaConvert here
      console.log('Video file detected, would trigger transcoding');

      // For demo, just move to processed folder
      processedKey = objectKey.replace('uploads/', 'processed/videos/');
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${bucketName}/${objectKey}`,
        Bucket: bucketName,
        Key: processedKey,
      }));

      processingStatus = 'completed';
    } else if (contentType.startsWith('image/')) {
      // Image files might need resizing/optimization
      console.log('Image file detected, would trigger processing');

      processedKey = objectKey.replace('uploads/', 'processed/images/');
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${bucketName}/${objectKey}`,
        Bucket: bucketName,
        Key: processedKey,
      }));
    }

    // Store processing result in database
    await dynamoClient.send(new PutItemCommand({
      TableName: process.env.FILES_TABLE || 'processed-files',
      Item: {
        fileId: { S: uploadId },
        originalKey: { S: objectKey },
        processedKey: { S: processedKey },
        originalFilename: { S: originalFilename },
        fileSize: { N: fileSize.toString() },
        contentType: { S: contentType },
        status: { S: processingStatus },
        uploadedAt: { S: new Date().toISOString() },
        processedAt: { S: new Date().toISOString() },
      },
    }));

    // Optional: Send notification email
    if (metadata['notification-email']) {
      await sesClient.send(new SendEmailCommand({
        Source: '[email protected]',
        Destination: {
          ToAddresses: [metadata['notification-email']],
        },
        Message: {
          Subject: {
            Data: 'File Upload Processed Successfully',
          },
          Body: {
            Text: {
              Data: `Your file "${originalFilename}" has been successfully uploaded and processed.`,
            },
          },
        },
      }));
    }

    // Clean up original upload if moved to processed location
    if (processedKey !== objectKey) {
      await s3Client.send(new DeleteObjectCommand({
        Bucket: bucketName,
        Key: objectKey,
      }));
    }

    console.log('File processing completed:', {
      uploadId,
      originalKey: objectKey,
      processedKey,
      status: processingStatus,
    });
  } catch (error) {
    console.error('File processing failed:', error);

    // Update database with error status
    await dynamoClient.send(new PutItemCommand({
      TableName: process.env.FILES_TABLE || 'processed-files',
      Item: {
        fileId: { S: record.s3.object.eTag },
        originalKey: { S: objectKey },
        status: { S: 'failed' },
        errorMessage: { S: error instanceof Error ? error.message : 'Unknown error' },
        uploadedAt: { S: new Date().toISOString() },
        failedAt: { S: new Date().toISOString() },
      },
    }));

    throw error; // Re-throw to trigger retry if needed
  }
}
```

### Frontend Implementation - React/TypeScript
Here's how clients actually use the signed URLs:
```typescript
// hooks/useFileUpload.ts
import { useState, useCallback } from 'react';

interface UploadProgress {
  loaded: number;
  total: number;
  percentage: number;
}

interface UseFileUploadReturn {
  upload: (file: File) => Promise<string>;
  progress: UploadProgress | null;
  isUploading: boolean;
  error: string | null;
}

export const useFileUpload = (): UseFileUploadReturn => {
  const [progress, setProgress] = useState<UploadProgress | null>(null);
  const [isUploading, setIsUploading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const upload = useCallback(async (file: File): Promise<string> => {
    setIsUploading(true);
    setError(null);
    setProgress(null);

    try {
      console.log('Starting upload for file:', {
        name: file.name,
        size: file.size,
        type: file.type,
      });

      // Step 1: Request signed URL
      const signedUrlResponse = await fetch('/api/uploads/signed-url', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          fileName: file.name,
          fileSize: file.size,
          fileType: file.type,
          uploadId: crypto.randomUUID(),
        }),
      });

      if (!signedUrlResponse.ok) {
        const errorData = await signedUrlResponse.json();
        throw new Error(errorData.error || 'Failed to get signed URL');
      }

      const { signedUrl, key, headers } = await signedUrlResponse.json();

      console.log('Got signed URL, starting direct S3 upload');

      // Step 2: Upload directly to S3. Note: fetch cannot report upload
      // progress; see the XMLHttpRequest variant below for that.
      const uploadResponse = await fetch(signedUrl, {
        method: 'PUT',
        headers: {
          'Content-Type': file.type,
          // Browsers compute Content-Length from the body themselves;
          // listing it here is ignored by fetch but harmless
          'Content-Length': file.size.toString(),
          ...headers,
        },
        body: file,
      });

      if (!uploadResponse.ok) {
        throw new Error(`Upload failed: ${uploadResponse.status} ${uploadResponse.statusText}`);
      }

      console.log('Upload completed successfully');

      return key; // Return S3 object key for reference
    } catch (err) {
      const errorMessage = err instanceof Error ? err.message : 'Upload failed';
      setError(errorMessage);
      console.error('Upload error:', err);
      throw err;
    } finally {
      setIsUploading(false);
      setProgress(null);
    }
  }, []);

  return {
    upload,
    progress,
    isUploading,
    error,
  };
};
```
```typescript
// Enhanced version with progress tracking using XMLHttpRequest
export const useFileUploadWithProgress = (): UseFileUploadReturn => {
  const [progress, setProgress] = useState<UploadProgress | null>(null);
  const [isUploading, setIsUploading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const upload = useCallback(async (file: File): Promise<string> => {
    setIsUploading(true);
    setError(null);
    setProgress({ loaded: 0, total: file.size, percentage: 0 });

    try {
      // Get signed URL
      const signedUrlResponse = await fetch('/api/uploads/signed-url', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          fileName: file.name,
          fileSize: file.size,
          fileType: file.type,
          uploadId: crypto.randomUUID(),
        }),
      });

      if (!signedUrlResponse.ok) {
        const errorData = await signedUrlResponse.json();
        throw new Error(errorData.error || 'Failed to get signed URL');
      }

      const { signedUrl, key } = await signedUrlResponse.json();

      // Upload with progress tracking ("return await" so the finally block
      // below only runs once the upload has actually settled)
      return await new Promise<string>((resolve, reject) => {
        const xhr = new XMLHttpRequest();

        xhr.upload.addEventListener('progress', (event) => {
          if (event.lengthComputable) {
            const percentage = Math.round((event.loaded / event.total) * 100);
            setProgress({
              loaded: event.loaded,
              total: event.total,
              percentage,
            });
            console.log(`Upload progress: ${percentage}%`);
          }
        });

        xhr.addEventListener('load', () => {
          if (xhr.status >= 200 && xhr.status < 300) {
            console.log('Upload completed successfully');
            setProgress({
              loaded: file.size,
              total: file.size,
              percentage: 100,
            });
            resolve(key);
          } else {
            reject(new Error(`Upload failed: ${xhr.status} ${xhr.statusText}`));
          }
        });

        xhr.addEventListener('error', () => {
          reject(new Error('Upload failed due to network error'));
        });

        xhr.addEventListener('abort', () => {
          reject(new Error('Upload was aborted'));
        });

        xhr.open('PUT', signedUrl);
        xhr.setRequestHeader('Content-Type', file.type);
        xhr.send(file);
      });
    } catch (err) {
      const errorMessage = err instanceof Error ? err.message : 'Upload failed';
      setError(errorMessage);
      console.error('Upload error:', err);
      throw err;
    } finally {
      setIsUploading(false);
    }
  }, []);

  return { upload, progress, isUploading, error };
};
```

### React Upload Component
```tsx
// components/FileUploader.tsx
import React, { useCallback, useState } from 'react';
import { useFileUploadWithProgress } from '../hooks/useFileUpload';

interface FileUploaderProps {
  onUploadComplete?: (key: string) => void;
  onUploadError?: (error: string) => void;
  acceptedTypes?: string[];
  maxSize?: number;
}

export const FileUploader: React.FC<FileUploaderProps> = ({
  onUploadComplete,
  onUploadError,
  acceptedTypes = ['video/*', 'image/*'],
  maxSize = 10 * 1024 * 1024 * 1024, // 10GB default
}) => {
  const { upload, progress, isUploading, error } = useFileUploadWithProgress();
  const [dragOver, setDragOver] = useState(false);

  const handleFileSelect = useCallback(async (files: FileList | null) => {
    if (!files || files.length === 0) return;

    const file = files[0];

    // Validate file type
    const isValidType = acceptedTypes.some(type => {
      if (type.endsWith('/*')) {
        return file.type.startsWith(type.slice(0, -1));
      }
      return file.type === type;
    });

    if (!isValidType) {
      const errorMsg = `File type not allowed. Accepted types: ${acceptedTypes.join(', ')}`;
      onUploadError?.(errorMsg);
      return;
    }

    // Validate file size
    if (file.size > maxSize) {
      const errorMsg = `File too large. Maximum size: ${Math.round(maxSize / 1024 / 1024)}MB`;
      onUploadError?.(errorMsg);
      return;
    }

    try {
      const key = await upload(file);
      onUploadComplete?.(key);
    } catch (err) {
      const errorMsg = err instanceof Error ? err.message : 'Upload failed';
      onUploadError?.(errorMsg);
    }
  }, [upload, acceptedTypes, maxSize, onUploadComplete, onUploadError]);

  const handleDrop = useCallback((e: React.DragEvent) => {
    e.preventDefault();
    setDragOver(false);
    handleFileSelect(e.dataTransfer.files);
  }, [handleFileSelect]);

  const handleDragOver = useCallback((e: React.DragEvent) => {
    e.preventDefault();
    setDragOver(true);
  }, []);

  const handleDragLeave = useCallback((e: React.DragEvent) => {
    e.preventDefault();
    setDragOver(false);
  }, []);

  const formatFileSize = (bytes: number): string => {
    if (bytes === 0) return '0 Bytes';
    const k = 1024;
    const sizes = ['Bytes', 'KB', 'MB', 'GB'];
    const i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];
  };

  return (
    <div className="w-full max-w-xl mx-auto">
      {/* "relative" anchors the absolutely positioned click-target label below */}
      <div
        className={`
          relative border-2 border-dashed rounded-lg p-8 text-center transition-colors
          ${dragOver ? 'border-blue-400 bg-blue-50' : 'border-gray-300'}
          ${isUploading ? 'pointer-events-none opacity-60' : 'hover:border-gray-400'}
        `}
        onDrop={handleDrop}
        onDragOver={handleDragOver}
        onDragLeave={handleDragLeave}
      >
        {isUploading ? (
          <div className="space-y-4">
            <div className="animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600 mx-auto"></div>
            <p className="text-sm text-gray-600">Uploading...</p>
            {progress && (
              <div className="space-y-2">
                <div className="w-full bg-gray-200 rounded-full h-2">
                  <div
                    className="bg-blue-600 h-2 rounded-full transition-all duration-300"
                    style={{ width: `${progress.percentage}%` }}
                  ></div>
                </div>
                <p className="text-xs text-gray-500">
                  {formatFileSize(progress.loaded)} / {formatFileSize(progress.total)} ({progress.percentage}%)
                </p>
              </div>
            )}
          </div>
        ) : (
          <>
            <svg
              className="w-12 h-12 text-gray-400 mx-auto mb-4"
              fill="none"
              stroke="currentColor"
              viewBox="0 0 24 24"
            >
              <path
                strokeLinecap="round"
                strokeLinejoin="round"
                strokeWidth={1.5}
                d="M7 16a4 4 0 01-.88-7.903A5 5 0 1115.9 6L16 6a5 5 0 011 9.9M15 13l-3-3m0 0l-3 3m3-3v12"
              />
            </svg>
            <p className="text-lg font-medium text-gray-900 mb-2">
              Drop files here or click to browse
            </p>
            <p className="text-sm text-gray-500 mb-4">
              Maximum file size: {formatFileSize(maxSize)}
            </p>
            <p className="text-xs text-gray-400">
              Accepted types: {acceptedTypes.join(', ')}
            </p>
          </>
        )}

        <input
          type="file"
          className="hidden"
          accept={acceptedTypes.join(',')}
          onChange={(e) => handleFileSelect(e.target.files)}
          disabled={isUploading}
          id="file-input"
        />

        {!isUploading && (
          <label htmlFor="file-input" className="absolute inset-0 cursor-pointer" />
        )}
      </div>

      {error && (
        <div className="mt-4 p-3 bg-red-50 border border-red-200 rounded-md">
          <p className="text-sm text-red-600">{error}</p>
        </div>
      )}
    </div>
  );
};
```

## Security Considerations and Best Practices
### 1. File Type Validation (Both Client and Server)
```typescript
// Never trust client-side validation alone
const ALLOWED_MIME_TYPES = {
  'image/jpeg': [0xff, 0xd8, 0xff],
  'image/png': [0x89, 0x50, 0x4e, 0x47],
  // Note: the 4th byte of an MP4 header is the ftyp box size and varies in practice
  'video/mp4': [0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70],
} as const;

function validateFileType(buffer: Buffer, declaredType: string): boolean {
  const signature = ALLOWED_MIME_TYPES[declaredType as keyof typeof ALLOWED_MIME_TYPES];
  if (!signature) return false;

  return signature.every((byte, index) => buffer[index] === byte);
}

// In your processing Lambda
const response = await s3Client.send(new GetObjectCommand({
  Bucket: bucketName,
  Key: objectKey,
  Range: 'bytes=0-15', // Only fetch the first few bytes for the signature check
}));

// SDK v3 returns a stream, not a Buffer, so collect the bytes first
const headerBytes = Buffer.from(await response.Body!.transformToByteArray());

const isValidType = validateFileType(headerBytes, contentType);
if (!isValidType) {
  throw new Error('File type validation failed');
}
```

### 2. Size Limits and Timeout Protection
```typescript
// In signed URL generator
const generateSignedUrl = async (request: SignedUrlRequest) => {
  // Implement progressive size limits based on user tier
  // (getUserTier and SIZE_LIMITS are app-specific helpers)
  const userTier = await getUserTier(request.userId);
  const maxSize = SIZE_LIMITS[userTier] || SIZE_LIMITS.free;

  if (request.fileSize > maxSize) {
    throw new Error(`File size exceeds ${userTier} tier limit`);
  }

  // Set appropriate expiry based on file size: larger files get longer
  // upload windows (10 seconds per MB, clamped between 5 minutes and 1 hour)
  const expirySeconds = Math.min(
    3600, // 1 hour max
    Math.max(300, (request.fileSize / 1024 / 1024) * 10)
  );

  return getSignedUrl(s3Client, putCommand, { expiresIn: expirySeconds });
};
```
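A possible shape for that tier table (the numbers here are assumptions, not the article's):

```typescript
// Hypothetical per-tier caps, in bytes
const SIZE_LIMITS: Record<string, number> = {
  free: 1 * 1024 * 1024 * 1024,     // 1GB
  pro: 5 * 1024 * 1024 * 1024,      // 5GB, also the hard cap for a single S3 PUT
  premium: 10 * 1024 * 1024 * 1024, // 10GB; anything above 5GB must use the
                                    // multipart flow shown later in the article
};
```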
### 3. Access Control and Audit Logging

```typescript
// In processing Lambda
const logFileUpload = async (uploadData: UploadData) => {
  const event = {
    eventTime: new Date().toISOString(),
    eventName: 'FileUploaded',
    eventSource: 'custom.fileupload',
    userIdentity: {
      type: 'Unknown',
      principalId: uploadData.userId,
    },
    resources: [{
      resourceName: uploadData.s3Key,
      resourceType: 'AWS::S3::Object',
    }],
    requestParameters: {
      bucketName: uploadData.bucketName,
      key: uploadData.s3Key,
      fileSize: uploadData.fileSize,
      contentType: uploadData.contentType,
    },
  };

  // Log to CloudWatch for monitoring
  console.log('File upload audit log:', event);

  // Optionally forward to a dedicated audit system (see the sketch below)
};
```
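One possible audit sink is EventBridge; this is my choice for illustration, the article leaves the sink open:

```typescript
import { EventBridgeClient, PutEventsCommand } from '@aws-sdk/client-eventbridge';

const eventBridge = new EventBridgeClient({ region: process.env.AWS_REGION });

// Publish the audit record to the default event bus, where rules can route
// it to archives, SIEM tooling, or alerting
const publishAuditEvent = async (auditEvent: object) => {
  await eventBridge.send(new PutEventsCommand({
    Entries: [{
      Source: 'custom.fileupload',
      DetailType: 'FileUploaded',
      Detail: JSON.stringify(auditEvent),
    }],
  }));
};
```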
## Performance and Cost Analysis

Comparing the two approaches shows clear benefits:
### Cost Comparison (Monthly, 10,000 uploads averaging 2GB each)
| Component | Lambda Proxy | Signed URLs | Improvement |
|---|---|---|---|
| Lambda Compute | ~$15,000 | ~$50 | 99% reduction |
| S3 Transfer | $0 | $0 | No change |
| API Gateway | ~$1,000 | ~$150 | Significant reduction |
| Total | ~$16,000 | ~$200 | ~98% savings |
### Performance Improvements
- Upload success rate: Significant improvement due to eliminated timeouts
- Average upload time: Reduced from minutes to network-dependent speeds
- Lambda cold starts: Eliminated from upload path
- Concurrent uploads: No longer limited by Lambda concurrency
- User experience: Better progress tracking and reliability
## Advanced Patterns for Production

### 1. Multipart Uploads for Files > 100MB
```typescript
// Enhanced signed URL generator for multipart uploads
import { CreateMultipartUploadCommand, UploadPartCommand } from '@aws-sdk/client-s3';

const generateMultipartUrls = async (request: LargeFileRequest) => {
  const partSize = 100 * 1024 * 1024; // 100MB parts
  const numParts = Math.ceil(request.fileSize / partSize);

  // Initiate multipart upload
  const multipart = await s3Client.send(new CreateMultipartUploadCommand({
    Bucket: process.env.UPLOAD_BUCKET!,
    Key: request.key,
    ContentType: request.fileType,
  }));

  // Generate signed URLs for each part
  const partUrls = await Promise.all(
    Array.from({ length: numParts }, async (_, index) => {
      const partNumber = index + 1;
      const command = new UploadPartCommand({
        Bucket: process.env.UPLOAD_BUCKET!,
        Key: request.key,
        PartNumber: partNumber,
        UploadId: multipart.UploadId,
      });

      const signedUrl = await getSignedUrl(s3Client, command, {
        expiresIn: 3600,
      });

      return {
        partNumber,
        signedUrl,
        size: Math.min(partSize, request.fileSize - index * partSize),
      };
    })
  );

  return {
    uploadId: multipart.UploadId,
    parts: partUrls,
  };
};
```
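The parts only become an object once the upload is completed. A sketch of that server-side step (the `complete-upload` endpoint it would sit behind is an assumption):

```typescript
import { CompleteMultipartUploadCommand } from '@aws-sdk/client-s3';

// Called after the client has PUT every part and collected the ETags
const completeMultipartUpload = async (
  key: string,
  uploadId: string,
  parts: { partNumber: number; etag: string }[]
) => {
  await s3Client.send(new CompleteMultipartUploadCommand({
    Bucket: process.env.UPLOAD_BUCKET!,
    Key: key,
    UploadId: uploadId,
    MultipartUpload: {
      // S3 requires parts sorted by part number
      Parts: parts
        .sort((a, b) => a.partNumber - b.partNumber)
        .map((p) => ({ PartNumber: p.partNumber, ETag: p.etag })),
    },
  }));
};
```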
### 2. Resumable Uploads with State Tracking

```typescript
// Client-side resumable upload logic
interface UploadPart {
  partNumber: number;
  signedUrl: string;
  size: number;
}

interface CompletedPart {
  partNumber: number;
  etag: string;
}

export class ResumableUpload {
  private uploadId!: string;
  private parts: UploadPart[] = [];
  private completedParts: CompletedPart[] = [];

  async resumeUpload(file: File, uploadId?: string): Promise<string> {
    if (uploadId) {
      // Resume existing upload
      this.uploadId = uploadId;
      this.completedParts = await this.getCompletedParts(uploadId);
    } else {
      // Start new multipart upload
      const response = await this.initializeUpload(file);
      this.uploadId = response.uploadId;
      this.parts = response.parts;
    }

    // Upload remaining parts
    const pendingParts = this.parts.filter(
      part => !this.completedParts.some(c => c.partNumber === part.partNumber)
    );

    for (const part of pendingParts) {
      await this.uploadPart(file, part);
    }

    // Complete multipart upload
    return this.completeUpload();
  }

  private async uploadPart(file: File, part: UploadPart): Promise<void> {
    // All parts except the last share a uniform size, so derive the offset
    // from the first part's size rather than this part's own (possibly
    // smaller) size
    const uniformPartSize = this.parts[0]?.size ?? part.size;
    const start = (part.partNumber - 1) * uniformPartSize;
    const end = Math.min(start + part.size, file.size);
    const chunk = file.slice(start, end);

    const response = await fetch(part.signedUrl, {
      method: 'PUT',
      body: chunk,
    });

    if (response.ok) {
      this.completedParts.push({
        partNumber: part.partNumber,
        // Requires the bucket CORS config to expose the ETag header
        etag: response.headers.get('ETag')!,
      });

      // Save progress to localStorage for resume capability
      localStorage.setItem(`upload_${this.uploadId}`, JSON.stringify({
        completedParts: this.completedParts,
        totalParts: this.parts.length,
      }));
    }
  }

  // getCompletedParts(), initializeUpload(), and completeUpload() call your
  // backend API (list uploaded parts, create the multipart upload, complete
  // it) and are omitted for brevity.
}
```
### 3. Virus Scanning Integration

```typescript
// Post-upload virus scanning
import { Readable } from 'stream';
import { ClamAVClient } from 'clamav-js'; // Example library

const clamav = new ClamAVClient(); // hypothetical client instance from the example library

const scanUploadedFile = async (s3Event: S3Event) => {
  for (const record of s3Event.Records) {
    const bucket = record.s3.bucket.name;
    const key = record.s3.object.key;

    // Download file for scanning (stream for large files)
    const fileStream = (await s3Client.send(new GetObjectCommand({
      Bucket: bucket,
      Key: key,
    }))).Body as Readable;

    // Scan with ClamAV or similar
    const scanResult = await clamav.scanStream(fileStream);

    if (scanResult.isInfected) {
      console.warn('Infected file detected:', { bucket, key, virus: scanResult.viruses });

      // Quarantine file
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${bucket}/${key}`,
        Bucket: `${bucket}-quarantine`,
        Key: key,
      }));

      // Delete original
      await s3Client.send(new DeleteObjectCommand({
        Bucket: bucket,
        Key: key,
      }));

      // Notify user (notifyUser and processCleanFile are app-specific helpers)
      await notifyUser(key, 'File rejected due to security scan');
    } else {
      // File is clean, proceed with normal processing
      await processCleanFile(bucket, key);
    }
  }
};
```

## Monitoring and Alerting
### CloudWatch Dashboards
```typescript
// CDK monitoring stack
import * as cloudwatch from 'aws-cdk-lib/aws-cloudwatch';

const dashboard = new cloudwatch.Dashboard(this, 'FileUploadDashboard', {
  dashboardName: 'FileUploadMetrics',
  widgets: [
    [
      new cloudwatch.GraphWidget({
        title: 'Signed URL Generation',
        left: [
          signedUrlGenerator.metricDuration(),
          signedUrlGenerator.metricErrors(),
        ],
        right: [signedUrlGenerator.metricInvocations()],
      }),
    ],
    [
      new cloudwatch.GraphWidget({
        title: 'S3 Upload Metrics',
        left: [
          new cloudwatch.Metric({
            namespace: 'AWS/S3',
            metricName: 'NumberOfObjects',
            dimensionsMap: {
              BucketName: uploadBucket.bucketName,
              StorageType: 'AllStorageTypes', // required dimension for this metric
            },
          }),
        ],
      }),
    ],
    [
      new cloudwatch.GraphWidget({
        title: 'File Processing',
        left: [
          fileProcessor.metricDuration(),
          fileProcessor.metricErrors(),
        ],
      }),
    ],
  ],
});

// Alarms for production monitoring
const highErrorRateAlarm = new cloudwatch.Alarm(this, 'HighErrorRate', {
  metric: signedUrlGenerator.metricErrors(),
  threshold: 10,
  evaluationPeriods: 2,
  treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
});
```
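An alarm with no action only changes color on a dashboard. One way to make it page someone is an SNS topic (the topic and its subscriptions are assumptions, not part of the original stack):

```typescript
import * as sns from 'aws-cdk-lib/aws-sns';
import * as cwActions from 'aws-cdk-lib/aws-cloudwatch-actions';

// Hypothetical on-call topic; subscribe email or a pager integration out of band
const alertTopic = new sns.Topic(this, 'UploadAlerts');

highErrorRateAlarm.addAlarmAction(new cwActions.SnsAction(alertTopic));
```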
## Form Data Handling and File Metadata

### Handling Additional Form Fields with File Uploads
In production, you often need to associate metadata with uploaded files. Here's how to handle form data alongside signed URL uploads:
```typescript
// Enhanced request schema with metadata
const FileUploadWithMetadataSchema = z.object({
  // File info
  fileName: z.string().min(1).max(255),
  fileSize: z.number().int().min(1).max(10737418240),
  fileType: z.string().regex(/^(video|image|audio)\/[a-zA-Z0-9][a-zA-Z0-9!\-_]*[a-zA-Z0-9]*$/),

  // Business metadata
  title: z.string().min(1).max(200),
  description: z.string().max(1000).optional(),
  category: z.enum(['education', 'entertainment', 'business', 'other']),
  tags: z.array(z.string()).max(10),
  isPublic: z.boolean(),

  // Upload metadata
  uploadId: z.string().uuid(),
  userId: z.string().uuid(),
  organizationId: z.string().uuid().optional(),
});

// Modified signed URL generator to include metadata in the S3 object
const putObjectCommand = new PutObjectCommand({
  Bucket: process.env.UPLOAD_BUCKET!,
  Key: key,
  ContentType: request.fileType,
  ContentLength: request.fileSize,
  Metadata: {
    'original-filename': request.fileName,
    'upload-id': request.uploadId,
    'user-id': request.userId,
    'title': request.title,
    'description': request.description || '',
    'category': request.category,
    'tags': JSON.stringify(request.tags),
    'is-public': request.isPublic.toString(),
    'uploaded-at': new Date().toISOString(),
  },
  // Add object tags for better organization and billing.
  // Note: when this command is presigned, the client must send a matching
  // x-amz-tagging header with the PUT, or S3 rejects the request.
  Tagging: `Category=${request.category}&IsPublic=${request.isPublic}&UserId=${request.userId}`,
});
```

### Two-Phase Upload Pattern
For complex forms, implement a two-phase approach:
```typescript
import { DynamoDBClient, PutItemCommand, UpdateItemCommand } from '@aws-sdk/client-dynamodb';
import { marshall } from '@aws-sdk/util-dynamodb';

// Phase 1: Create upload record in database
const createUploadRecord = async (metadata: FileMetadata) => {
  const uploadRecord = {
    id: metadata.uploadId,
    userId: metadata.userId,
    fileName: metadata.fileName,
    fileSize: metadata.fileSize,
    title: metadata.title,
    description: metadata.description,
    category: metadata.category,
    tags: metadata.tags,
    isPublic: metadata.isPublic,
    status: 'pending', // pending -> uploading -> processing -> completed
    createdAt: new Date().toISOString(),
  };

  await dynamoClient.send(new PutItemCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    Item: marshall(uploadRecord),
  }));

  return uploadRecord;
};

// Phase 2: Update record when S3 upload completes
const updateUploadStatus = async (uploadId: string, s3Key: string, status: string) => {
  await dynamoClient.send(new UpdateItemCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    Key: marshall({ id: uploadId }),
    UpdateExpression: 'SET #status = :status, #s3Key = :s3Key, #updatedAt = :updatedAt',
    ExpressionAttributeNames: {
      '#status': 'status',
      '#s3Key': 's3Key',
      '#updatedAt': 'updatedAt',
    },
    ExpressionAttributeValues: marshall({
      ':status': status,
      ':s3Key': s3Key,
      ':updatedAt': new Date().toISOString(),
    }),
  }));
};
```
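The natural place to call `updateUploadStatus` is the S3-triggered processor from earlier, right after it reads the object's metadata (a sketch):

```typescript
// Inside processUploadedFile, after reading the object metadata
const recordId = metadata['upload-id'];
if (recordId && recordId !== 'direct-upload') {
  // Flip the record from 'pending' to 'processing' once S3 has the bytes
  await updateUploadStatus(recordId, objectKey, 'processing');
}
```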
## Data Retention and Lifecycle Management

### Automated Data Lifecycle with S3 Lifecycle Rules
```typescript
// Enhanced CDK stack with comprehensive lifecycle management
const uploadBucket = new s3.Bucket(this, 'UploadBucket', {
  lifecycleRules: [
    // Rule 1: Clean up failed multipart uploads
    {
      id: 'CleanupFailedUploads',
      enabled: true,
      abortIncompleteMultipartUploadAfter: cdk.Duration.days(1),
    },

    // Rule 2: Transition based on access patterns
    {
      id: 'StorageClassTransitions',
      enabled: true,
      transitions: [
        {
          storageClass: s3.StorageClass.INFREQUENT_ACCESS,
          transitionAfter: cdk.Duration.days(30),
        },
        {
          storageClass: s3.StorageClass.GLACIER,
          transitionAfter: cdk.Duration.days(90),
        },
        {
          storageClass: s3.StorageClass.DEEP_ARCHIVE,
          transitionAfter: cdk.Duration.days(365),
        },
      ],
    },

    // Rule 3: Delete temporary/processing files
    {
      id: 'CleanupTempFiles',
      enabled: true,
      prefix: 'temp/',
      expiration: cdk.Duration.days(7),
    },

    // Rule 4: User-specific retention (example: free tier users)
    {
      id: 'FreeTierRetention',
      enabled: true,
      tagFilters: { UserTier: 'free' },
      expiration: cdk.Duration.days(90),
    },

    // Rule 5: Delete old versions (if versioning enabled)
    {
      id: 'CleanupOldVersions',
      enabled: true,
      noncurrentVersionExpiration: cdk.Duration.days(30),
    },
  ],

  // Enable versioning for accidental deletion protection
  versioned: true,

  // Add inventory for cost monitoring (assumes an inventoryBucket defined
  // elsewhere in the stack)
  inventories: [
    {
      inventoryId: 'FullInventory',
      destination: {
        bucket: inventoryBucket,
        prefix: 'inventory',
      },
      enabled: true,
      frequency: s3.InventoryFrequency.WEEKLY,
      includeObjectVersions: s3.InventoryObjectVersion.CURRENT,
      optionalFields: ['Size', 'LastModifiedDate', 'StorageClass', 'EncryptionStatus'],
    },
  ],
});
```

### User-Controlled Data Retention
```typescript
import { APIGatewayProxyEvent, APIGatewayProxyResult, ScheduledEvent } from 'aws-lambda';
import { S3Client, DeleteObjectCommand, PutObjectTaggingCommand } from '@aws-sdk/client-s3';
import {
  DynamoDBClient,
  GetItemCommand,
  UpdateItemCommand,
  DeleteItemCommand,
  ScanCommand,
} from '@aws-sdk/client-dynamodb';
import { marshall, unmarshall } from '@aws-sdk/util-dynamodb';

// Lambda for handling user deletion requests
export const deleteFileHandler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  try {
    const { fileId } = JSON.parse(event.body || '{}');
    // getUserIdFromJWT is an app-specific auth helper
    const userId = getUserIdFromJWT(event.headers.authorization);

    // Verify ownership
    const fileRecord = await dynamoClient.send(new GetItemCommand({
      TableName: process.env.UPLOAD_RECORDS_TABLE!,
      Key: marshall({ id: fileId }),
    }));

    if (!fileRecord.Item) {
      return { statusCode: 404, body: JSON.stringify({ error: 'File not found' }) };
    }

    const file = unmarshall(fileRecord.Item);
    if (file.userId !== userId) {
      return { statusCode: 403, body: JSON.stringify({ error: 'Access denied' }) };
    }

    // Soft delete first (mark as deleted, but don't remove from S3 immediately)
    await dynamoClient.send(new UpdateItemCommand({
      TableName: process.env.UPLOAD_RECORDS_TABLE!,
      Key: marshall({ id: fileId }),
      UpdateExpression: 'SET #status = :status, #deletedAt = :deletedAt',
      ExpressionAttributeNames: {
        '#status': 'status',
        '#deletedAt': 'deletedAt',
      },
      ExpressionAttributeValues: marshall({
        ':status': 'deleted',
        ':deletedAt': new Date().toISOString(),
      }),
    }));

    // Tag the S3 object so a lifecycle rule can clean it up after the grace period
    await s3Client.send(new PutObjectTaggingCommand({
      Bucket: process.env.UPLOAD_BUCKET!,
      Key: file.s3Key,
      Tagging: {
        TagSet: [
          { Key: 'Status', Value: 'deleted' },
          { Key: 'DeletedAt', Value: new Date().toISOString() },
          { Key: 'GracePeriodDays', Value: '30' },
        ],
      },
    }));

    console.log('File marked for deletion:', { fileId, s3Key: file.s3Key, userId });

    return {
      statusCode: 200,
      body: JSON.stringify({
        message: 'File scheduled for deletion',
        gracePeriod: '30 days',
      }),
    };
  } catch (error) {
    console.error('Delete file error:', error);
    return { statusCode: 500, body: JSON.stringify({ error: 'Delete failed' }) };
  }
};

// Scheduled Lambda to permanently delete files after the grace period
export const permanentDeleteHandler = async (event: ScheduledEvent) => {
  const thirtyDaysAgo = new Date();
  thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);

  // Query deleted files older than 30 days
  const deletedFiles = await dynamoClient.send(new ScanCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    FilterExpression: '#status = :status AND #deletedAt < :cutoff',
    ExpressionAttributeNames: {
      '#status': 'status',
      '#deletedAt': 'deletedAt',
    },
    ExpressionAttributeValues: marshall({
      ':status': 'deleted',
      ':cutoff': thirtyDaysAgo.toISOString(),
    }),
  }));

  for (const item of deletedFiles.Items || []) {
    const file = unmarshall(item);

    try {
      // Delete from S3
      await s3Client.send(new DeleteObjectCommand({
        Bucket: process.env.UPLOAD_BUCKET!,
        Key: file.s3Key,
      }));

      // Remove from DynamoDB
      await dynamoClient.send(new DeleteItemCommand({
        TableName: process.env.UPLOAD_RECORDS_TABLE!,
        Key: marshall({ id: file.id }),
      }));

      console.log('Permanently deleted file:', { fileId: file.id, s3Key: file.s3Key });
    } catch (error) {
      console.error('Failed to permanently delete file:', { fileId: file.id, error });
    }
  }
};
```
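The scheduled handler above still needs a trigger. In CDK, an EventBridge rule can run it daily (the construct names are assumptions):

```typescript
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';

// Run the grace-period cleanup once a day; permanentDeleteFn is the
// NodejsFunction wrapping permanentDeleteHandler
new events.Rule(this, 'PermanentDeleteSchedule', {
  schedule: events.Schedule.rate(cdk.Duration.days(1)),
  targets: [new targets.LambdaFunction(permanentDeleteFn)],
});
```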
### GDPR Compliance and Data Portability

```typescript
// Lambda for user data export (GDPR Article 20)
export const exportUserDataHandler = async (event: APIGatewayProxyEvent) => {
  const userId = getUserIdFromJWT(event.headers.authorization);

  // Get all of the user's files
  const userFiles = await dynamoClient.send(new ScanCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    FilterExpression: '#userId = :userId',
    ExpressionAttributeNames: { '#userId': 'userId' },
    ExpressionAttributeValues: marshall({ ':userId': userId }),
  }));

  // Generate download links for all files
  const fileExports = await Promise.all(
    (userFiles.Items || []).map(async (item) => {
      const file = unmarshall(item);

      if (file.status !== 'completed') return null;

      // Generate temporary download URL
      const downloadUrl = await getSignedUrl(
        s3Client,
        new GetObjectCommand({
          Bucket: process.env.UPLOAD_BUCKET!,
          Key: file.s3Key,
        }),
        { expiresIn: 3600 * 24 } // 24 hours
      );

      return {
        fileId: file.id,
        originalFileName: file.fileName,
        title: file.title,
        description: file.description,
        category: file.category,
        tags: file.tags,
        uploadedAt: file.createdAt,
        fileSize: file.fileSize,
        downloadUrl,
      };
    })
  );

  const exportData = {
    userId,
    exportedAt: new Date().toISOString(),
    files: fileExports.filter(Boolean),
    summary: {
      totalFiles: fileExports.filter(Boolean).length,
      totalSize: fileExports.reduce((sum, file) => sum + (file?.fileSize || 0), 0),
    },
  };

  // Store export in a temp location
  const exportKey = `exports/${userId}/${Date.now()}.json`;
  await s3Client.send(new PutObjectCommand({
    Bucket: process.env.EXPORT_BUCKET!,
    Key: exportKey,
    Body: JSON.stringify(exportData, null, 2),
    ContentType: 'application/json',
    // Note: Expires only sets the HTTP header; actual deletion after 7 days
    // requires a lifecycle rule on the exports/ prefix
    Expires: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000),
  }));

  // Generate download URL for the export
  const exportDownloadUrl = await getSignedUrl(
    s3Client,
    new GetObjectCommand({
      Bucket: process.env.EXPORT_BUCKET!,
      Key: exportKey,
    }),
    { expiresIn: 3600 * 24 * 7 } // 7 days
  );

  return {
    statusCode: 200,
    body: JSON.stringify({
      exportUrl: exportDownloadUrl,
      expiresAt: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000).toISOString(),
      summary: exportData.summary,
    }),
  };
};

// Lambda for complete user data deletion (GDPR Article 17 - Right to be forgotten)
export const deleteAllUserDataHandler = async (event: APIGatewayProxyEvent) => {
  const userId = getUserIdFromJWT(event.headers.authorization);

  console.log('Starting complete user data deletion:', { userId });

  // Get all user files
  const userFiles = await dynamoClient.send(new ScanCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    FilterExpression: '#userId = :userId',
    ExpressionAttributeNames: { '#userId': 'userId' },
    ExpressionAttributeValues: marshall({ ':userId': userId }),
  }));

  // Delete all S3 objects and their records
  const deletePromises = (userFiles.Items || []).map(async (item) => {
    const file = unmarshall(item);

    try {
      // Delete from S3
      await s3Client.send(new DeleteObjectCommand({
        Bucket: process.env.UPLOAD_BUCKET!,
        Key: file.s3Key,
      }));

      // Delete database record
      await dynamoClient.send(new DeleteItemCommand({
        TableName: process.env.UPLOAD_RECORDS_TABLE!,
        Key: marshall({ id: file.id }),
      }));

      console.log('Deleted user file:', { fileId: file.id, s3Key: file.s3Key });
    } catch (error) {
      console.error('Failed to delete user file:', { fileId: file.id, error });
      throw error; // Fail fast for GDPR compliance
    }
  });

  await Promise.all(deletePromises);

  // Log deletion for audit trail (cloudwatchLogs is a CloudWatchLogsClient
  // from @aws-sdk/client-cloudwatch-logs; the log group must already exist)
  await cloudwatchLogs.send(new PutLogEventsCommand({
    logGroupName: '/aws/lambda/user-data-deletion',
    logStreamName: new Date().toISOString().split('T')[0],
    logEvents: [{
      timestamp: Date.now(),
      message: JSON.stringify({
        action: 'complete_user_data_deletion',
        userId,
        filesDeleted: userFiles.Items?.length || 0,
        completedAt: new Date().toISOString(),
      }),
    }],
  }));

  return {
    statusCode: 200,
    body: JSON.stringify({
      message: 'All user data has been permanently deleted',
      filesDeleted: userFiles.Items?.length || 0,
      deletedAt: new Date().toISOString(),
    }),
  };
};
```

## Cost Optimization Strategies
### Intelligent Storage Class Selection
```typescript
// Lambda to analyze usage patterns and optimize storage classes
export const optimizeStorageHandler = async (event: ScheduledEvent) => {
  // getS3Inventory and getObjectAccessTime are app-specific helpers
  // (built on S3 inventory reports and access logs, respectively)
  const s3Inventory = await getS3Inventory();

  for (const object of s3Inventory) {
    const lastAccessed = await getObjectAccessTime(object.key);
    const daysSinceAccess = (Date.now() - lastAccessed) / (1000 * 60 * 60 * 24);

    // Auto-transition based on access patterns
    if (daysSinceAccess > 90 && object.storageClass === 'STANDARD') {
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${object.bucket}/${object.key}`,
        Bucket: object.bucket,
        Key: object.key,
        StorageClass: 'GLACIER',
        MetadataDirective: 'COPY',
      }));

      console.log('Transitioned to Glacier:', { key: object.key, daysSinceAccess });
    }

    // Deep archive for very old files. Caveat: an object already in GLACIER
    // must be restored before it can be copied, so lifecycle rules are the
    // more robust mechanism for this hop in practice.
    if (daysSinceAccess > 365 && object.storageClass === 'GLACIER') {
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${object.bucket}/${object.key}`,
        Bucket: object.bucket,
        Key: object.key,
        StorageClass: 'DEEP_ARCHIVE',
        MetadataDirective: 'COPY',
      }));

      console.log('Transitioned to Deep Archive:', { key: object.key, daysSinceAccess });
    }
  }
};
```

### Usage-Based Billing Integration
```typescript
// Track file access for usage-based billing
const trackFileAccess = async (userId: string, fileId: string, operation: string) => {
  await dynamoClient.send(new PutItemCommand({
    TableName: process.env.USAGE_TRACKING_TABLE!,
    Item: marshall({
      id: `${userId}#${Date.now()}`,
      userId,
      fileId,
      operation, // 'upload', 'download', 'view', 'delete'
      timestamp: new Date().toISOString(),
      month: new Date().toISOString().substring(0, 7), // YYYY-MM for billing
    }),
  }));

  // Update the user's monthly usage counter
  await dynamoClient.send(new UpdateItemCommand({
    TableName: process.env.USER_USAGE_TABLE!,
    Key: marshall({
      userId,
      month: new Date().toISOString().substring(0, 7),
    }),
    UpdateExpression: 'ADD #operation :increment',
    ExpressionAttributeNames: { '#operation': operation },
    ExpressionAttributeValues: marshall({ ':increment': 1 }),
  }));
};
```
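Reading those counters back for a billing run is then a single `GetItem` per user and month (a sketch):

```typescript
import { GetItemCommand } from '@aws-sdk/client-dynamodb';
import { marshall, unmarshall } from '@aws-sdk/util-dynamodb';

// Fetch one user's counters for the current month
const getMonthlyUsage = async (userId: string) => {
  const month = new Date().toISOString().substring(0, 7); // YYYY-MM
  const result = await dynamoClient.send(new GetItemCommand({
    TableName: process.env.USER_USAGE_TABLE!,
    Key: marshall({ userId, month }),
  }));
  return result.Item ? unmarshall(result.Item) : { upload: 0, download: 0 };
};
```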
## Conclusion: When to Choose Each Approach

The signed URL pattern offers significant advantages for large file uploads:
Technical Benefits:
- Substantial cost reduction in compute expenses
- Elimination of timeout issues
- Better scalability through S3's native capacity
- Improved user experience with progress tracking
Implementation Considerations:
- Choose Lambda proxy for files requiring immediate processing
- Use signed URLs for simple storage scenarios
- Consider hybrid approaches for complex workflows
- Plan security measures from the start
Key Takeaways:
- Right tool for the job - Lambda excels at processing, S3 excels at storage
- Security requires multiple layers - Signed URLs, validation, scanning, and monitoring
- Start simple, add complexity as needed - Basic signed URLs first, then multipart uploads
- Monitor and measure - File uploads have many failure modes that need tracking
This pattern works well for any file type where immediate processing isn't required. The key decision point is whether Lambda adds value beyond acting as an expensive proxy for the upload process.