
AWS Lambda + S3 Signed URLs: A Practical Solution for Large File Uploads

A practical approach to handling large file uploads using S3 signed URLs instead of Lambda proxies. Complete implementation with CDK, security considerations, and lessons learned from production experience.

When building file upload systems, many developers start with the straightforward approach: proxy uploads through Lambda. This works well for small files but quickly becomes problematic as file sizes grow. Lambda's execution limits and memory constraints create timeout issues, while the cost of keeping functions running during long uploads can be significant.

After experiencing these challenges firsthand, I learned that S3 signed URLs offer a more scalable solution. This approach reduces Lambda execution time to milliseconds, eliminates timeout issues, and provides substantial cost savings. Here's a practical implementation that handles large files efficiently.

Understanding the Lambda Upload Challenge

The traditional Lambda-proxy approach often looks like this:

```typescript
// The Lambda that killed our budget and user experience
export const uploadHandler = async (event: APIGatewayEvent) => {
  // This ran for UP TO 15 minutes per upload
  const file = parseMultipartFormData(event.body);

  // Memory usage would spike to 3GB+ for large files
  const processedFile = await processVideo(file);

  // S3 upload could take 10+ minutes
  const result = await s3.upload({
    Bucket: 'my-videos',
    Key: `uploads/${uuidv4()}`,
    Body: processedFile,
  }).promise();

  return { statusCode: 200, body: JSON.stringify(result) };
};
```

This approach creates several challenges:

  • Long execution times: 8-12 minutes per upload for large files
  • High memory usage: 2-3GB consistently for processing large files
  • Significant costs: High Lambda compute costs for extended execution
  • Reliability issues: Timeouts and memory errors affect success rates
  • Poor user experience: Limited visibility into upload progress

A More Efficient Architecture

The key insight is removing Lambda from the upload data path:

Instead of streaming files through Lambda, clients now:

  1. Request a signed URL from Lambda (< 200ms)
  2. Upload directly to S3 (no Lambda involvement)
  3. S3 triggers processing Lambda when upload completes
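
Condensed into code, those three steps look roughly like this. The endpoint path, response field names, and helper names here are illustrative assumptions — a sketch of the handshake, not a drop-in client:

```typescript
// Sketch of the client-side handshake; endpoint path and response fields are
// assumptions and must match whatever your signed-URL Lambda actually returns.
interface FileLike { name: string; size: number; type: string }

// Step 1 payload: the metadata the Lambda needs to validate and sign the upload.
function buildSignedUrlRequest(file: FileLike) {
  return { fileName: file.name, fileSize: file.size, fileType: file.type };
}

async function uploadViaSignedUrl(file: FileLike & Blob, apiBase: string): Promise<string> {
  // Step 1: a fast Lambda call that only signs a URL
  const res = await fetch(`${apiBase}/uploads/signed-url`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSignedUrlRequest(file)),
  });
  if (!res.ok) throw new Error('Failed to get signed URL');
  const { signedUrl, key } = await res.json();

  // Step 2: the bytes go straight to S3; Lambda never touches them
  const put = await fetch(signedUrl, {
    method: 'PUT',
    headers: { 'Content-Type': file.type },
    body: file,
  });
  if (!put.ok) throw new Error(`Upload failed: ${put.status}`);

  // Step 3 happens server-side: the S3 ObjectCreated event triggers processing
  return key;
}
```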

Complete Implementation Guide

CDK Infrastructure for Large File Uploads

Here's a complete CDK implementation that demonstrates this pattern:

```typescript
// lib/file-upload-stack.ts
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as s3n from 'aws-cdk-lib/aws-s3-notifications';
import * as iam from 'aws-cdk-lib/aws-iam';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';

export class FileUploadStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // S3 bucket with lifecycle policies for cost optimization
    const uploadBucket = new s3.Bucket(this, 'UploadBucket', {
      bucketName: `${this.stackName}-uploads-${this.account}`,
      cors: [
        {
          allowedOrigins: ['*'],
          allowedMethods: [
            s3.HttpMethods.PUT,
            s3.HttpMethods.POST,
            s3.HttpMethods.GET,
            s3.HttpMethods.HEAD,
          ],
          allowedHeaders: ['*'],
          exposedHeaders: ['ETag'],
          maxAge: 3600,
        },
      ],
      // Automatically delete incomplete multipart uploads after 7 days
      lifecycleRules: [
        {
          id: 'AbortIncompleteMultipartUploads',
          enabled: true,
          abortIncompleteMultipartUploadAfter: cdk.Duration.days(7),
        },
        {
          id: 'TransitionToIA',
          enabled: true,
          transitions: [
            {
              storageClass: s3.StorageClass.INFREQUENT_ACCESS,
              transitionAfter: cdk.Duration.days(30),
            },
            {
              storageClass: s3.StorageClass.GLACIER,
              transitionAfter: cdk.Duration.days(90),
            },
          ],
        },
      ],
      // Block public access for security
      blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
      encryption: s3.BucketEncryption.S3_MANAGED,
    });

    // Lambda for generating signed URLs - runs in <200ms
    const signedUrlGenerator = new NodejsFunction(this, 'SignedUrlGenerator', {
      entry: 'src/handlers/generate-signed-url.ts',
      runtime: lambda.Runtime.NODEJS_20_X,
      architecture: lambda.Architecture.ARM_64,
      memorySize: 512, // Small memory footprint
      timeout: cdk.Duration.seconds(30),
      environment: {
        UPLOAD_BUCKET: uploadBucket.bucketName,
        ALLOWED_FILE_TYPES: 'video/mp4,video/quicktime,video/x-msvideo,image/jpeg,image/png',
        MAX_FILE_SIZE: '10737418240', // 10GB in bytes
        SIGNED_URL_EXPIRY: '3600', // 1 hour
      },
      bundling: {
        minify: true,
        sourceMap: true,
        target: 'es2022',
      },
    });

    // Grant the generator permission to sign requests against the bucket
    uploadBucket.grantReadWrite(signedUrlGenerator);
    signedUrlGenerator.addToRolePolicy(
      new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        actions: ['s3:PutObjectAcl', 's3:GetObject'],
        resources: [uploadBucket.arnForObjects('*')],
      })
    );

    // Lambda for post-upload processing
    const fileProcessor = new NodejsFunction(this, 'FileProcessor', {
      entry: 'src/handlers/process-file.ts',
      runtime: lambda.Runtime.NODEJS_20_X,
      architecture: lambda.Architecture.ARM_64,
      memorySize: 2048, // Higher memory for processing
      timeout: cdk.Duration.minutes(5),
      environment: {
        UPLOAD_BUCKET: uploadBucket.bucketName,
      },
      bundling: {
        minify: true,
        sourceMap: true,
        target: 'es2022',
        // Include ffmpeg for video processing if needed
        nodeModules: ['fluent-ffmpeg'],
      },
    });

    uploadBucket.grantReadWrite(fileProcessor);

    // S3 event notification to trigger processing
    uploadBucket.addEventNotification(
      s3.EventType.OBJECT_CREATED,
      new s3n.LambdaDestination(fileProcessor),
      { prefix: 'uploads/' } // Only process files in the uploads/ prefix
    );

    // API Gateway for signed URL generation
    const api = new apigateway.RestApi(this, 'FileUploadApi', {
      restApiName: 'File Upload API',
      description: 'API for generating S3 signed URLs',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: ['GET', 'POST', 'OPTIONS'],
        allowHeaders: ['Content-Type', 'Authorization'],
      },
    });

    const uploads = api.root.addResource('uploads');
    const signedUrl = uploads.addResource('signed-url');

    signedUrl.addMethod(
      'POST',
      new apigateway.LambdaIntegration(signedUrlGenerator, {
        requestTemplates: {
          'application/json': '{"body": $input.json("$")}',
        },
      })
    );

    // Outputs
    new cdk.CfnOutput(this, 'ApiUrl', {
      value: api.url,
      description: 'API Gateway URL',
    });

    new cdk.CfnOutput(this, 'BucketName', {
      value: uploadBucket.bucketName,
      description: 'S3 Upload Bucket Name',
    });
  }
}
```

Signed URL Generator - The 200ms Lambda

This Lambda runs in under 200ms and generates secure upload URLs:

```typescript
// src/handlers/generate-signed-url.ts
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { z } from 'zod';

// Input validation schema
const SignedUrlRequestSchema = z.object({
  fileName: z.string().min(1).max(255),
  fileSize: z.number().int().min(1).max(10737418240), // 10GB max
  fileType: z.string().regex(/^(video|image|audio)\/[a-zA-Z0-9][a-zA-Z0-9!_-]*$/),
  uploadId: z.string().uuid().optional(), // For tracking
});

const s3Client = new S3Client({ region: process.env.AWS_REGION });

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  console.log('Generating signed URL request:', {
    body: event.body,
    headers: event.headers,
  });

  try {
    // Parse and validate request
    const body = JSON.parse(event.body || '{}');
    const request = SignedUrlRequestSchema.parse(body);

    // Security checks
    const allowedTypes = process.env.ALLOWED_FILE_TYPES?.split(',') || [];
    if (!allowedTypes.includes(request.fileType)) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          error: 'File type not allowed',
          allowedTypes,
        }),
      };
    }

    const maxSize = parseInt(process.env.MAX_FILE_SIZE || '5368709120'); // 5GB default
    if (request.fileSize > maxSize) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          error: 'File too large',
          maxSize,
          receivedSize: request.fileSize,
        }),
      };
    }

    // Generate a unique key with timestamp and sanitized filename
    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const sanitizedFileName = request.fileName.replace(/[^a-zA-Z0-9.-]/g, '_');
    const key = `uploads/${timestamp}-${sanitizedFileName}`;

    // Create signed URL for PUT request
    const putObjectCommand = new PutObjectCommand({
      Bucket: process.env.UPLOAD_BUCKET!,
      Key: key,
      ContentType: request.fileType,
      ContentLength: request.fileSize,
      // Add metadata for processing
      Metadata: {
        'original-filename': request.fileName,
        'upload-id': request.uploadId || 'direct-upload',
        'file-size': request.fileSize.toString(),
        'uploaded-at': new Date().toISOString(),
      },
      // Security headers
      ServerSideEncryption: 'AES256',
    });

    const signedUrl = await getSignedUrl(s3Client, putObjectCommand, {
      expiresIn: parseInt(process.env.SIGNED_URL_EXPIRY || '3600'), // 1 hour default
    });

    console.log('Signed URL generated successfully:', {
      key,
      fileSize: request.fileSize,
      fileType: request.fileType,
      expiresIn: process.env.SIGNED_URL_EXPIRY,
    });

    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*',
        'Cache-Control': 'no-cache',
      },
      body: JSON.stringify({
        signedUrl,
        key,
        method: 'PUT',
        headers: {
          'Content-Type': request.fileType,
          'Content-Length': request.fileSize.toString(),
        },
        expiresAt: new Date(
          Date.now() + parseInt(process.env.SIGNED_URL_EXPIRY || '3600') * 1000
        ).toISOString(),
      }),
    };
  } catch (error) {
    console.error('Signed URL generation error:', error);

    if (error instanceof z.ZodError) {
      return {
        statusCode: 400,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          error: 'Invalid request',
          details: error.errors,
        }),
      };
    }

    return {
      statusCode: 500,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        error: 'Failed to generate signed URL',
      }),
    };
  }
};
```
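
The key-generation step is pure logic worth checking in isolation. A standalone restatement of the sanitization above, with an injected timestamp so the output is deterministic (`uploadKeyFor` is an illustrative helper name, not part of the handler):

```typescript
// Mirror of the handler's key generation: ':' and '.' in the ISO timestamp are
// flattened to '-', and the filename is restricted to [a-zA-Z0-9.-].
function uploadKeyFor(fileName: string, uploadedAt: Date): string {
  const timestamp = uploadedAt.toISOString().replace(/[:.]/g, '-');
  const sanitizedFileName = fileName.replace(/[^a-zA-Z0-9.-]/g, '_');
  return `uploads/${timestamp}-${sanitizedFileName}`;
}
```

Because the timestamp is a parameter, the same filename uploaded twice still yields distinct keys in practice, and awkward characters (spaces, parentheses, unicode) can never leak into S3 key paths.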

File Processing Lambda - Only Runs When Needed

This Lambda only executes when files are successfully uploaded to S3:

```typescript
// src/handlers/process-file.ts
import { S3Event, S3EventRecord } from 'aws-lambda';
import {
  S3Client,
  HeadObjectCommand,
  CopyObjectCommand,
  DeleteObjectCommand,
} from '@aws-sdk/client-s3';
import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb';
import { SESClient, SendEmailCommand } from '@aws-sdk/client-ses';

const s3Client = new S3Client({ region: process.env.AWS_REGION });
const dynamoClient = new DynamoDBClient({ region: process.env.AWS_REGION });
const sesClient = new SESClient({ region: process.env.AWS_REGION });

export const handler = async (event: S3Event): Promise<void> => {
  console.log('Processing uploaded files:', JSON.stringify(event, null, 2));

  for (const record of event.Records) {
    if (record.eventName?.startsWith('ObjectCreated')) {
      await processUploadedFile(record);
    }
  }
};

async function processUploadedFile(record: S3EventRecord) {
  const bucketName = record.s3.bucket.name;
  const objectKey = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
  const fileSize = record.s3.object.size;

  console.log('Processing file:', { bucketName, objectKey, fileSize });

  try {
    // Get object metadata without downloading the body
    const headResponse = await s3Client.send(new HeadObjectCommand({
      Bucket: bucketName,
      Key: objectKey,
    }));

    const metadata = headResponse.Metadata || {};
    const originalFilename = metadata['original-filename'] || objectKey;
    const uploadId = metadata['upload-id'] || 'unknown';

    // Determine file type and processing strategy
    const contentType = headResponse.ContentType || '';
    let processingStatus = 'completed';
    let processedKey = objectKey;

    if (contentType.startsWith('video/')) {
      // Video files might need transcoding
      processingStatus = 'processing';
      // In production, you might trigger AWS MediaConvert here
      console.log('Video file detected, would trigger transcoding');

      // For demo, just move to processed folder
      processedKey = objectKey.replace('uploads/', 'processed/videos/');
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${bucketName}/${objectKey}`,
        Bucket: bucketName,
        Key: processedKey,
      }));

      processingStatus = 'completed';
    } else if (contentType.startsWith('image/')) {
      // Image files might need resizing/optimization
      console.log('Image file detected, would trigger processing');

      processedKey = objectKey.replace('uploads/', 'processed/images/');
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${bucketName}/${objectKey}`,
        Bucket: bucketName,
        Key: processedKey,
      }));
    }

    // Store processing result in database
    await dynamoClient.send(new PutItemCommand({
      TableName: process.env.FILES_TABLE || 'processed-files',
      Item: {
        fileId: { S: uploadId },
        originalKey: { S: objectKey },
        processedKey: { S: processedKey },
        originalFilename: { S: originalFilename },
        fileSize: { N: fileSize.toString() },
        contentType: { S: contentType },
        status: { S: processingStatus },
        uploadedAt: { S: new Date().toISOString() },
        processedAt: { S: new Date().toISOString() },
      },
    }));

    // Optional: Send notification email
    if (metadata['notification-email']) {
      await sesClient.send(new SendEmailCommand({
        Source: 'no-reply@example.com', // Placeholder: use a verified SES sender
        Destination: {
          ToAddresses: [metadata['notification-email']],
        },
        Message: {
          Subject: {
            Data: 'File Upload Processed Successfully',
          },
          Body: {
            Text: {
              Data: `Your file "${originalFilename}" has been successfully uploaded and processed.`,
            },
          },
        },
      }));
    }

    // Clean up the original upload if it was moved to a processed location
    if (processedKey !== objectKey) {
      await s3Client.send(new DeleteObjectCommand({
        Bucket: bucketName,
        Key: objectKey,
      }));
    }

    console.log('File processing completed:', {
      uploadId,
      originalKey: objectKey,
      processedKey,
      status: processingStatus,
    });
  } catch (error) {
    console.error('File processing failed:', error);

    // Update database with error status
    await dynamoClient.send(new PutItemCommand({
      TableName: process.env.FILES_TABLE || 'processed-files',
      Item: {
        fileId: { S: record.s3.object.eTag },
        originalKey: { S: objectKey },
        status: { S: 'failed' },
        errorMessage: { S: error instanceof Error ? error.message : 'Unknown error' },
        uploadedAt: { S: new Date().toISOString() },
        failedAt: { S: new Date().toISOString() },
      },
    }));

    throw error; // Re-throw to trigger retry if needed
  }
}
```
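
The prefix-routing decision in that handler is easy to isolate and unit-test. A minimal sketch (`processedKeyFor` is an illustrative helper name, mirroring the routing logic rather than quoting the handler):

```typescript
// Illustrative restatement of the routing logic: videos and images move from
// uploads/ into type-specific processed/ prefixes; everything else stays put.
function processedKeyFor(objectKey: string, contentType: string): string {
  if (contentType.startsWith('video/')) {
    return objectKey.replace('uploads/', 'processed/videos/');
  }
  if (contentType.startsWith('image/')) {
    return objectKey.replace('uploads/', 'processed/images/');
  }
  return objectKey; // Unrecognized types are left in place
}
```

Keeping this decision in a pure function means the branch logic can be verified without mocking S3.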

Frontend Implementation - React/TypeScript

Here's how clients actually use the signed URLs:

```typescript
// hooks/useFileUpload.ts
import { useState, useCallback } from 'react';

interface UploadProgress {
  loaded: number;
  total: number;
  percentage: number;
}

interface UseFileUploadReturn {
  upload: (file: File) => Promise<string>;
  progress: UploadProgress | null;
  isUploading: boolean;
  error: string | null;
}

export const useFileUpload = (): UseFileUploadReturn => {
  const [progress, setProgress] = useState<UploadProgress | null>(null);
  const [isUploading, setIsUploading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const upload = useCallback(async (file: File): Promise<string> => {
    setIsUploading(true);
    setError(null);
    setProgress(null);

    try {
      console.log('Starting upload for file:', {
        name: file.name,
        size: file.size,
        type: file.type,
      });

      // Step 1: Request signed URL
      const signedUrlResponse = await fetch('/api/uploads/signed-url', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          fileName: file.name,
          fileSize: file.size,
          fileType: file.type,
          uploadId: crypto.randomUUID(),
        }),
      });

      if (!signedUrlResponse.ok) {
        const errorData = await signedUrlResponse.json();
        throw new Error(errorData.error || 'Failed to get signed URL');
      }

      const { signedUrl, key } = await signedUrlResponse.json();

      console.log('Got signed URL, starting direct S3 upload');

      // Step 2: Upload directly to S3
      // Note: Content-Length is a forbidden header name in browsers;
      // the browser sets it automatically from the request body.
      const uploadResponse = await fetch(signedUrl, {
        method: 'PUT',
        headers: {
          'Content-Type': file.type,
        },
        body: file,
      });

      if (!uploadResponse.ok) {
        throw new Error(`Upload failed: ${uploadResponse.status} ${uploadResponse.statusText}`);
      }

      console.log('Upload completed successfully');

      return key; // Return S3 object key for reference
    } catch (err) {
      const errorMessage = err instanceof Error ? err.message : 'Upload failed';
      setError(errorMessage);
      console.error('Upload error:', err);
      throw err;
    } finally {
      setIsUploading(false);
      setProgress(null);
    }
  }, []);

  return {
    upload,
    progress,
    isUploading,
    error,
  };
};

// Enhanced version with progress tracking using XMLHttpRequest
export const useFileUploadWithProgress = (): UseFileUploadReturn => {
  const [progress, setProgress] = useState<UploadProgress | null>(null);
  const [isUploading, setIsUploading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  const upload = useCallback(async (file: File): Promise<string> => {
    setIsUploading(true);
    setError(null);
    setProgress({ loaded: 0, total: file.size, percentage: 0 });

    try {
      // Get signed URL
      const signedUrlResponse = await fetch('/api/uploads/signed-url', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          fileName: file.name,
          fileSize: file.size,
          fileType: file.type,
          uploadId: crypto.randomUUID(),
        }),
      });

      if (!signedUrlResponse.ok) {
        const errorData = await signedUrlResponse.json();
        throw new Error(errorData.error || 'Failed to get signed URL');
      }

      const { signedUrl, key } = await signedUrlResponse.json();

      // Upload with progress tracking
      // "return await" (not a bare return) so the finally block below
      // runs after the upload settles, not as soon as the Promise is created
      return await new Promise<string>((resolve, reject) => {
        const xhr = new XMLHttpRequest();

        xhr.upload.addEventListener('progress', (event) => {
          if (event.lengthComputable) {
            const percentage = Math.round((event.loaded / event.total) * 100);
            setProgress({
              loaded: event.loaded,
              total: event.total,
              percentage,
            });
            console.log(`Upload progress: ${percentage}%`);
          }
        });

        xhr.addEventListener('load', () => {
          if (xhr.status >= 200 && xhr.status < 300) {
            console.log('Upload completed successfully');
            setProgress({
              loaded: file.size,
              total: file.size,
              percentage: 100,
            });
            resolve(key);
          } else {
            reject(new Error(`Upload failed: ${xhr.status} ${xhr.statusText}`));
          }
        });

        xhr.addEventListener('error', () => {
          reject(new Error('Upload failed due to network error'));
        });

        xhr.addEventListener('abort', () => {
          reject(new Error('Upload was aborted'));
        });

        xhr.open('PUT', signedUrl);
        xhr.setRequestHeader('Content-Type', file.type);
        xhr.send(file);
      });
    } catch (err) {
      const errorMessage = err instanceof Error ? err.message : 'Upload failed';
      setError(errorMessage);
      console.error('Upload error:', err);
      throw err;
    } finally {
      setIsUploading(false);
    }
  }, []);

  return { upload, progress, isUploading, error };
};
```

React Upload Component

```typescript
// components/FileUploader.tsx
import React, { useCallback, useState } from 'react';
import { useFileUploadWithProgress } from '../hooks/useFileUpload';

interface FileUploaderProps {
  onUploadComplete?: (key: string) => void;
  onUploadError?: (error: string) => void;
  acceptedTypes?: string[];
  maxSize?: number;
}

export const FileUploader: React.FC<FileUploaderProps> = ({
  onUploadComplete,
  onUploadError,
  acceptedTypes = ['video/*', 'image/*'],
  maxSize = 10 * 1024 * 1024 * 1024, // 10GB default
}) => {
  const { upload, progress, isUploading, error } = useFileUploadWithProgress();
  const [dragOver, setDragOver] = useState(false);

  const handleFileSelect = useCallback(async (files: FileList | null) => {
    if (!files || files.length === 0) return;

    const file = files[0];

    // Validate file type
    const isValidType = acceptedTypes.some(type => {
      if (type.endsWith('/*')) {
        return file.type.startsWith(type.slice(0, -1));
      }
      return file.type === type;
    });

    if (!isValidType) {
      const errorMsg = `File type not allowed. Accepted types: ${acceptedTypes.join(', ')}`;
      onUploadError?.(errorMsg);
      return;
    }

    // Validate file size
    if (file.size > maxSize) {
      const errorMsg = `File too large. Maximum size: ${Math.round(maxSize / 1024 / 1024)}MB`;
      onUploadError?.(errorMsg);
      return;
    }

    try {
      const key = await upload(file);
      onUploadComplete?.(key);
    } catch (err) {
      const errorMsg = err instanceof Error ? err.message : 'Upload failed';
      onUploadError?.(errorMsg);
    }
  }, [upload, acceptedTypes, maxSize, onUploadComplete, onUploadError]);

  const handleDrop = useCallback((e: React.DragEvent) => {
    e.preventDefault();
    setDragOver(false);
    handleFileSelect(e.dataTransfer.files);
  }, [handleFileSelect]);

  const handleDragOver = useCallback((e: React.DragEvent) => {
    e.preventDefault();
    setDragOver(true);
  }, []);

  const handleDragLeave = useCallback((e: React.DragEvent) => {
    e.preventDefault();
    setDragOver(false);
  }, []);

  const formatFileSize = (bytes: number): string => {
    if (bytes === 0) return '0 Bytes';
    const k = 1024;
    const sizes = ['Bytes', 'KB', 'MB', 'GB'];
    const i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];
  };

  return (
    <div className="w-full max-w-xl mx-auto">
      {/* "relative" is required so the absolute full-size label below anchors to this box */}
      <div
        className={`
          relative border-2 border-dashed rounded-lg p-8 text-center transition-colors
          ${dragOver ? 'border-blue-400 bg-blue-50' : 'border-gray-300'}
          ${isUploading ? 'pointer-events-none opacity-60' : 'hover:border-gray-400'}
        `}
        onDrop={handleDrop}
        onDragOver={handleDragOver}
        onDragLeave={handleDragLeave}
      >
        {isUploading ? (
          <div className="space-y-4">
            <div className="animate-spin rounded-full h-8 w-8 border-b-2 border-blue-600 mx-auto"></div>
            <p className="text-sm text-gray-600">Uploading...</p>
            {progress && (
              <div className="space-y-2">
                <div className="w-full bg-gray-200 rounded-full h-2">
                  <div
                    className="bg-blue-600 h-2 rounded-full transition-all duration-300"
                    style={{ width: `${progress.percentage}%` }}
                  ></div>
                </div>
                <p className="text-xs text-gray-500">
                  {formatFileSize(progress.loaded)} / {formatFileSize(progress.total)} ({progress.percentage}%)
                </p>
              </div>
            )}
          </div>
        ) : (
          <>
            <svg
              className="w-12 h-12 text-gray-400 mx-auto mb-4"
              fill="none"
              stroke="currentColor"
              viewBox="0 0 24 24"
            >
              <path
                strokeLinecap="round"
                strokeLinejoin="round"
                strokeWidth={1.5}
                d="M7 16a4 4 0 01-.88-7.903A5 5 0 1115.9 6L16 6a5 5 0 011 9.9M15 13l-3-3m0 0l-3 3m3-3v12"
              />
            </svg>
            <p className="text-lg font-medium text-gray-900 mb-2">
              Drop files here or click to browse
            </p>
            <p className="text-sm text-gray-500 mb-4">
              Maximum file size: {formatFileSize(maxSize)}
            </p>
            <p className="text-xs text-gray-400">
              Accepted types: {acceptedTypes.join(', ')}
            </p>
          </>
        )}

        <input
          type="file"
          className="hidden"
          accept={acceptedTypes.join(',')}
          onChange={(e) => handleFileSelect(e.target.files)}
          disabled={isUploading}
          id="file-input"
        />

        {!isUploading && (
          <label
            htmlFor="file-input"
            className="absolute inset-0 cursor-pointer"
          />
        )}
      </div>

      {error && (
        <div className="mt-4 p-3 bg-red-50 border border-red-200 rounded-md">
          <p className="text-sm text-red-600">{error}</p>
        </div>
      )}
    </div>
  );
};
```

Security Considerations and Best Practices

1. File Type Validation (Both Client and Server)

```typescript
// Never trust client-side validation alone
const ALLOWED_MIME_TYPES = {
  'image/jpeg': [0xFF, 0xD8, 0xFF],
  'image/png': [0x89, 0x50, 0x4E, 0x47],
  'video/mp4': [0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70],
} as const;

function validateFileType(buffer: Buffer, declaredType: string): boolean {
  const signature = ALLOWED_MIME_TYPES[declaredType as keyof typeof ALLOWED_MIME_TYPES];
  if (!signature) return false;

  return signature.every((byte, index) => buffer[index] === byte);
}

// In your processing Lambda
const headerResponse = await s3Client.send(new GetObjectCommand({
  Bucket: bucketName,
  Key: objectKey,
  Range: 'bytes=0-15', // Only fetch the first few bytes for the signature check
}));

// In SDK v3 the Body is a stream, so collect it into a Buffer first
const headerBytes = Buffer.from(await headerResponse.Body!.transformToByteArray());

const isValidType = validateFileType(headerBytes, contentType);
if (!isValidType) {
  throw new Error('File type validation failed');
}
```
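
The signature table is easy to sanity-check locally with hand-built byte arrays. A standalone restatement of the magic-byte comparison (helper and constant names here are illustrative), which also shows why unknown types fail closed:

```typescript
// Standalone copy of the magic-byte check for local experimentation.
const SIGNATURES: Record<string, number[]> = {
  'image/jpeg': [0xff, 0xd8, 0xff],
  'image/png': [0x89, 0x50, 0x4e, 0x47],
};

function matchesSignature(bytes: Uint8Array, declaredType: string): boolean {
  const signature = SIGNATURES[declaredType];
  if (!signature) return false; // Unknown types fail closed
  return signature.every((byte, index) => bytes[index] === byte);
}

// A real PNG starts with 0x89 'P' 'N' 'G'; a file declared as JPEG with
// these bytes is mislabeled and should be rejected.
const pngHeader = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
```

Note that fixed-offset signatures are a heuristic: MP4 headers in particular vary (the leading box size differs between encoders), so a production check may need a proper file-type library rather than a literal byte table.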

2. Size Limits and Timeout Protection

```typescript
// In signed URL generator
const generateSignedUrl = async (request: SignedUrlRequest) => {
  // Implement progressive size limits based on user tier
  const userTier = await getUserTier(request.userId);
  const maxSize = SIZE_LIMITS[userTier] || SIZE_LIMITS.free;

  if (request.fileSize > maxSize) {
    throw new Error(`File size exceeds ${userTier} tier limit`);
  }

  // Set an appropriate expiry based on file size:
  // larger files get longer upload windows
  const expirySeconds = Math.min(
    3600, // 1 hour max
    Math.max(300, (request.fileSize / 1024 / 1024) * 10) // 10 seconds per MB
  );

  return getSignedUrl(s3Client, putCommand, { expiresIn: expirySeconds });
};
```
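
The expiry formula is worth pinning down with concrete values. A standalone restatement of the calculation above (the `uploadExpirySeconds` name is illustrative):

```typescript
// 10 seconds of upload window per MB, clamped to [5 minutes, 1 hour].
function uploadExpirySeconds(fileSizeBytes: number): number {
  const perSizeSeconds = (fileSizeBytes / 1024 / 1024) * 10;
  return Math.min(3600, Math.max(300, perSizeSeconds));
}
```

So a 1MB file gets the 5-minute floor, a 100MB file gets roughly 17 minutes, and anything above ~360MB hits the 1-hour cap; the cap matters because a signed URL is a bearer credential, and shorter lifetimes shrink the window in which a leaked URL is usable.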

3. Access Control and Audit Logging

```typescript
// In processing Lambda
// (CloudTrail records S3 data events on its own; this adds an
// application-level audit trail with upload-specific context)
const logFileUpload = async (uploadData: UploadData) => {
  const event = {
    eventTime: new Date().toISOString(),
    eventName: 'FileUploaded',
    eventSource: 'custom.fileupload',
    userIdentity: {
      type: 'Unknown',
      principalId: uploadData.userId,
    },
    resources: [{
      resourceName: uploadData.s3Key,
      resourceType: 'AWS::S3::Object',
    }],
    requestParameters: {
      bucketName: uploadData.bucketName,
      key: uploadData.s3Key,
      fileSize: uploadData.fileSize,
      contentType: uploadData.contentType,
    },
  };

  // Log to CloudWatch for monitoring
  console.log('File upload audit log:', event);

  // Optionally forward to EventBridge or a custom audit system
};
```

Performance and Cost Analysis

Comparing the two approaches shows clear benefits:

Cost Comparison (Monthly, 10,000 uploads averaging 2GB each):

| Component | Lambda Proxy | Signed URLs | Improvement |
| --- | --- | --- | --- |
| Lambda Compute | ~$15,000 | ~$50 | ~99% reduction |
| S3 Transfer | $0 | $0 | No change |
| API Gateway | ~$1,000 | ~$150 | Significant reduction |
| Total | ~$16,000 | ~$200 | ~98% savings |

Performance Improvements:

  • Upload success rate: Significant improvement due to eliminated timeouts
  • Average upload time: Reduced from minutes to network-dependent speeds
  • Lambda cold starts: Eliminated from upload path
  • Concurrent uploads: No longer limited by Lambda concurrency
  • User experience: Better progress tracking and reliability

Advanced Patterns for Production

1. Multipart Uploads for Files > 100MB

```typescript
// Enhanced signed URL generator for multipart uploads
import { CreateMultipartUploadCommand, UploadPartCommand } from '@aws-sdk/client-s3';

const generateMultipartUrls = async (request: LargeFileRequest) => {
  const partSize = 100 * 1024 * 1024; // 100MB parts
  const numParts = Math.ceil(request.fileSize / partSize);

  // Initiate multipart upload
  const multipart = await s3Client.send(new CreateMultipartUploadCommand({
    Bucket: process.env.UPLOAD_BUCKET!,
    Key: request.key,
    ContentType: request.fileType,
  }));

  // Generate signed URLs for each part
  const partUrls = await Promise.all(
    Array.from({ length: numParts }, async (_, index) => {
      const partNumber = index + 1;
      const command = new UploadPartCommand({
        Bucket: process.env.UPLOAD_BUCKET!,
        Key: request.key,
        PartNumber: partNumber,
        UploadId: multipart.UploadId,
      });

      const signedUrl = await getSignedUrl(s3Client, command, {
        expiresIn: 3600,
      });

      return {
        partNumber,
        signedUrl,
        size: Math.min(partSize, request.fileSize - index * partSize),
      };
    })
  );

  return {
    uploadId: multipart.UploadId,
    parts: partUrls,
  };
};
```
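
The part-size arithmetic is the easiest place to pick up an off-by-one, so it helps to isolate it. A standalone restatement of the sizing logic above (`planParts` is an illustrative name):

```typescript
// Split a file into fixed-size parts; only the final part carries the remainder.
function planParts(fileSize: number, partSize: number): { partNumber: number; size: number }[] {
  const numParts = Math.ceil(fileSize / partSize);
  return Array.from({ length: numParts }, (_, index) => ({
    partNumber: index + 1, // S3 part numbers are 1-based
    size: Math.min(partSize, fileSize - index * partSize),
  }));
}
```

Remember that S3 only materializes the object once a final CompleteMultipartUpload request lists every part number with its ETag, so the plan produced here must be tracked through to that completion call.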

2. Resume-able Uploads with State Tracking

typescript
// Client-side resumable upload logic
export class ResumableUpload {
  private uploadId: string;
  private parts: UploadPart[];
  private completedParts: CompletedPart[] = [];

  async resumeUpload(file: File, uploadId?: string): Promise<string> {
    if (uploadId) {
      // Resume existing upload
      this.uploadId = uploadId;
      this.completedParts = await this.getCompletedParts(uploadId);
    } else {
      // Start new multipart upload
      const response = await this.initializeUpload(file);
      this.uploadId = response.uploadId;
      this.parts = response.parts;
    }

    // Upload remaining parts
    const pendingParts = this.parts.filter(
      part => !this.completedParts.some(c => c.partNumber === part.partNumber)
    );

    for (const part of pendingParts) {
      await this.uploadPart(file, part);
    }

    // Complete multipart upload
    return this.completeUpload();
  }

  private async uploadPart(file: File, part: UploadPart): Promise<void> {
    // Offsets use the nominal part size (parts[0].size); only the last
    // part is smaller, so its own size can't be used to compute the start
    const partSize = this.parts[0].size;
    const start = (part.partNumber - 1) * partSize;
    const end = Math.min(start + part.size, file.size);
    const chunk = file.slice(start, end);

    const response = await fetch(part.signedUrl, {
      method: 'PUT',
      body: chunk,
    });

    if (response.ok) {
      // The bucket's CORS configuration must expose the ETag header
      // (ExposeHeaders: ["ETag"]) for this read to work in the browser
      this.completedParts.push({
        partNumber: part.partNumber,
        etag: response.headers.get('ETag')!,
      });

      // Save progress to localStorage for resume capability
      localStorage.setItem(`upload_${this.uploadId}`, JSON.stringify({
        completedParts: this.completedParts,
        totalParts: this.parts.length,
      }));
    }
  }
}
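The `completeUpload()` call above ends in a server-side `CompleteMultipartUploadCommand`, which requires the parts sorted by ascending `PartNumber` with their ETags attached. A sketch of the pure shaping step — `buildCompletionInput` is an illustrative name, and the returned object is what would be passed to the command constructor:

```typescript
// Shape the client's completedParts into the input shape expected by
// CompleteMultipartUploadCommand: parts sorted ascending, ETags included.
interface CompletedPart {
  partNumber: number;
  etag: string;
}

const buildCompletionInput = (
  bucket: string,
  key: string,
  uploadId: string,
  completedParts: CompletedPart[],
) => ({
  Bucket: bucket,
  Key: key,
  UploadId: uploadId,
  MultipartUpload: {
    Parts: [...completedParts]
      .sort((a, b) => a.partNumber - b.partNumber)
      .map(p => ({ PartNumber: p.partNumber, ETag: p.etag })),
  },
});
```

Sorting matters: parts finish out of order when uploaded concurrently, and S3 rejects a completion request whose part list is not in ascending order.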

3. Virus Scanning Integration

typescript
// Post-upload virus scanning
import { GetObjectCommand, CopyObjectCommand, DeleteObjectCommand } from '@aws-sdk/client-s3';
import { Readable } from 'stream';
import { ClamAVClient } from 'clamav-js'; // Example library

const clamav = new ClamAVClient();

const scanUploadedFile = async (s3Event: S3Event) => {
  for (const record of s3Event.Records) {
    const bucket = record.s3.bucket.name;
    const key = record.s3.object.key;

    // Download file for scanning (stream for large files)
    const fileStream = (await s3Client.send(new GetObjectCommand({
      Bucket: bucket,
      Key: key,
    }))).Body as Readable;

    // Scan with ClamAV or similar
    const scanResult = await clamav.scanStream(fileStream);

    if (scanResult.isInfected) {
      console.warn('Infected file detected:', { bucket, key, virus: scanResult.viruses });

      // Quarantine file
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${bucket}/${key}`,
        Bucket: `${bucket}-quarantine`,
        Key: key,
      }));

      // Delete original
      await s3Client.send(new DeleteObjectCommand({
        Bucket: bucket,
        Key: key,
      }));

      // Notify user
      await notifyUser(key, 'File rejected due to security scan');
    } else {
      // File is clean, proceed with normal processing
      await processCleanFile(bucket, key);
    }
  }
};

Monitoring and Alerting

CloudWatch Dashboards

typescript
// CDK monitoring stack
const dashboard = new cloudwatch.Dashboard(this, 'FileUploadDashboard', {
  dashboardName: 'FileUploadMetrics',
  widgets: [
    [
      new cloudwatch.GraphWidget({
        title: 'Signed URL Generation',
        left: [
          signedUrlGenerator.metricDuration(),
          signedUrlGenerator.metricErrors(),
        ],
        right: [signedUrlGenerator.metricInvocations()],
      }),
    ],
    [
      new cloudwatch.GraphWidget({
        title: 'S3 Upload Metrics',
        left: [
          new cloudwatch.Metric({
            namespace: 'AWS/S3',
            metricName: 'NumberOfObjects',
            // This daily storage metric requires both dimensions
            dimensionsMap: {
              BucketName: uploadBucket.bucketName,
              StorageType: 'AllStorageTypes',
            },
          }),
        ],
      }),
    ],
    [
      new cloudwatch.GraphWidget({
        title: 'File Processing',
        left: [
          fileProcessor.metricDuration(),
          fileProcessor.metricErrors(),
        ],
      }),
    ],
  ],
});

// Alarms for production monitoring
new cloudwatch.Alarm(this, 'HighErrorRate', {
  metric: signedUrlGenerator.metricErrors(),
  threshold: 10,
  evaluationPeriods: 2,
  treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
});

Form Data Handling and File Metadata

Handling Additional Form Fields with File Uploads

In production, you often need to associate metadata with uploaded files. Here's how to handle form data alongside signed URL uploads:

typescript
// Enhanced request schema with metadata
const FileUploadWithMetadataSchema = z.object({
  // File info
  fileName: z.string().min(1).max(255),
  fileSize: z.number().int().min(1).max(10737418240), // 10 GB
  fileType: z.string().regex(/^(video|image|audio)\/[a-zA-Z0-9][a-zA-Z0-9!\-_]*$/),

  // Business metadata
  title: z.string().min(1).max(200),
  description: z.string().max(1000).optional(),
  category: z.enum(['education', 'entertainment', 'business', 'other']),
  tags: z.array(z.string()).max(10),
  isPublic: z.boolean(),

  // Upload metadata
  uploadId: z.string().uuid(),
  userId: z.string().uuid(),
  organizationId: z.string().uuid().optional(),
});

// Modified signed URL generator to include metadata in S3 object
const putObjectCommand = new PutObjectCommand({
  Bucket: process.env.UPLOAD_BUCKET!,
  Key: key,
  ContentType: request.fileType,
  ContentLength: request.fileSize,
  Metadata: {
    'original-filename': request.fileName,
    'upload-id': request.uploadId,
    'user-id': request.userId,
    'title': request.title,
    'description': request.description || '',
    'category': request.category,
    'tags': JSON.stringify(request.tags),
    'is-public': request.isPublic.toString(),
    'uploaded-at': new Date().toISOString(),
  },
  // Add object tags for better organization and billing
  Tagging: `Category=${request.category}&IsPublic=${request.isPublic}&UserId=${request.userId}`,
});

Two-Phase Upload Pattern

For complex forms, implement a two-phase approach:

typescript
// Phase 1: Create upload record in database
const createUploadRecord = async (metadata: FileMetadata) => {
  const uploadRecord = {
    id: metadata.uploadId,
    userId: metadata.userId,
    fileName: metadata.fileName,
    fileSize: metadata.fileSize,
    title: metadata.title,
    description: metadata.description,
    category: metadata.category,
    tags: metadata.tags,
    isPublic: metadata.isPublic,
    status: 'pending', // pending -> uploading -> processing -> completed
    createdAt: new Date().toISOString(),
  };

  await dynamoClient.send(new PutItemCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    Item: marshall(uploadRecord),
  }));

  return uploadRecord;
};

// Phase 2: Update record when S3 upload completes
const updateUploadStatus = async (uploadId: string, s3Key: string, status: string) => {
  await dynamoClient.send(new UpdateItemCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    Key: marshall({ id: uploadId }),
    UpdateExpression: 'SET #status = :status, #s3Key = :s3Key, #updatedAt = :updatedAt',
    ExpressionAttributeNames: {
      '#status': 'status',
      '#s3Key': 's3Key',
      '#updatedAt': 'updatedAt',
    },
    ExpressionAttributeValues: marshall({
      ':status': status,
      ':s3Key': s3Key,
      ':updatedAt': new Date().toISOString(),
    }),
  }));
};
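The `status` comment above describes a simple state machine (pending → uploading → processing → completed). Because S3 event notifications can arrive out of order or more than once, a small guard before each status update keeps a record from moving backwards — a sketch, with `isValidTransition` as an illustrative name:

```typescript
// Guard for the upload status state machine: only forward, one step at a time.
const STATUS_ORDER = ['pending', 'uploading', 'processing', 'completed'] as const;
type UploadStatus = (typeof STATUS_ORDER)[number];

const isValidTransition = (from: UploadStatus, to: UploadStatus): boolean =>
  STATUS_ORDER.indexOf(to) === STATUS_ORDER.indexOf(from) + 1;
```

In practice the same check can be enforced atomically in DynamoDB with a `ConditionExpression` on the current status, so a duplicate event simply fails the conditional write instead of corrupting the record.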

Data Retention and Lifecycle Management

Automated Data Lifecycle with S3 Lifecycle Rules

typescript
// Enhanced CDK stack with comprehensive lifecycle management
const uploadBucket = new s3.Bucket(this, 'UploadBucket', {
  lifecycleRules: [
    // Rule 1: Clean up failed multipart uploads
    {
      id: 'CleanupFailedUploads',
      enabled: true,
      abortIncompleteMultipartUploadAfter: cdk.Duration.days(1),
    },

    // Rule 2: Transition based on access patterns
    {
      id: 'StorageClassTransitions',
      enabled: true,
      transitions: [
        {
          storageClass: s3.StorageClass.INFREQUENT_ACCESS,
          transitionAfter: cdk.Duration.days(30),
        },
        {
          storageClass: s3.StorageClass.GLACIER,
          transitionAfter: cdk.Duration.days(90),
        },
        {
          storageClass: s3.StorageClass.DEEP_ARCHIVE,
          transitionAfter: cdk.Duration.days(365),
        },
      ],
    },

    // Rule 3: Delete temporary/processing files
    {
      id: 'CleanupTempFiles',
      enabled: true,
      prefix: 'temp/',
      expiration: cdk.Duration.days(7),
    },

    // Rule 4: User-specific retention (example: free tier users)
    {
      id: 'FreeTierRetention',
      enabled: true,
      tagFilters: { UserTier: 'free' },
      expiration: cdk.Duration.days(90),
    },

    // Rule 5: Delete old versions (if versioning enabled)
    {
      id: 'CleanupOldVersions',
      enabled: true,
      noncurrentVersionExpiration: cdk.Duration.days(30),
    },
  ],

  // Enable versioning for accidental deletion protection
  versioned: true,

  // Add inventory for cost monitoring
  inventories: [
    {
      inventoryId: 'FullInventory',
      destination: {
        bucket: inventoryBucket,
        prefix: 'inventory',
      },
      enabled: true,
      frequency: s3.InventoryFrequency.WEEKLY,
      includeObjectVersions: s3.InventoryObjectVersion.CURRENT,
      optionalFields: ['Size', 'LastModifiedDate', 'StorageClass', 'EncryptionStatus'],
    },
  ],
});

User-Controlled Data Retention

typescript
// Lambda for handling user deletion requests
export const deleteFileHandler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  try {
    const { fileId } = JSON.parse(event.body || '{}');
    const userId = getUserIdFromJWT(event.headers.authorization);

    // Verify ownership
    const fileRecord = await dynamoClient.send(new GetItemCommand({
      TableName: process.env.UPLOAD_RECORDS_TABLE!,
      Key: marshall({ id: fileId }),
    }));

    if (!fileRecord.Item) {
      return { statusCode: 404, body: JSON.stringify({ error: 'File not found' }) };
    }

    const file = unmarshall(fileRecord.Item);
    if (file.userId !== userId) {
      return { statusCode: 403, body: JSON.stringify({ error: 'Access denied' }) };
    }

    // Soft delete first (mark as deleted, but don't remove from S3 immediately)
    await dynamoClient.send(new UpdateItemCommand({
      TableName: process.env.UPLOAD_RECORDS_TABLE!,
      Key: marshall({ id: fileId }),
      UpdateExpression: 'SET #status = :status, #deletedAt = :deletedAt',
      ExpressionAttributeNames: {
        '#status': 'status',
        '#deletedAt': 'deletedAt',
      },
      ExpressionAttributeValues: marshall({
        ':status': 'deleted',
        ':deletedAt': new Date().toISOString(),
      }),
    }));

    // Add S3 delete tag for lifecycle cleanup after grace period
    await s3Client.send(new PutObjectTaggingCommand({
      Bucket: process.env.UPLOAD_BUCKET!,
      Key: file.s3Key,
      Tagging: {
        TagSet: [
          { Key: 'Status', Value: 'deleted' },
          { Key: 'DeletedAt', Value: new Date().toISOString() },
          { Key: 'GracePeriodDays', Value: '30' },
        ],
      },
    }));

    console.log('File marked for deletion:', { fileId, s3Key: file.s3Key, userId });

    return {
      statusCode: 200,
      body: JSON.stringify({
        message: 'File scheduled for deletion',
        gracePeriod: '30 days',
      }),
    };

  } catch (error) {
    console.error('Delete file error:', error);
    return { statusCode: 500, body: JSON.stringify({ error: 'Delete failed' }) };
  }
};
// Scheduled Lambda to permanently delete files after grace period
export const permanentDeleteHandler = async (event: ScheduledEvent) => {
  const thirtyDaysAgo = new Date();
  thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);

  // Query deleted files older than 30 days
  const deletedFiles = await dynamoClient.send(new ScanCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    FilterExpression: '#status = :status AND #deletedAt < :cutoff',
    ExpressionAttributeNames: {
      '#status': 'status',
      '#deletedAt': 'deletedAt',
    },
    ExpressionAttributeValues: marshall({
      ':status': 'deleted',
      ':cutoff': thirtyDaysAgo.toISOString(),
    }),
  }));

  for (const item of deletedFiles.Items || []) {
    const file = unmarshall(item);

    try {
      // Delete from S3
      await s3Client.send(new DeleteObjectCommand({
        Bucket: process.env.UPLOAD_BUCKET!,
        Key: file.s3Key,
      }));

      // Remove from DynamoDB
      await dynamoClient.send(new DeleteItemCommand({
        TableName: process.env.UPLOAD_RECORDS_TABLE!,
        Key: marshall({ id: file.id }),
      }));

      console.log('Permanently deleted file:', { fileId: file.id, s3Key: file.s3Key });

    } catch (error) {
      console.error('Failed to permanently delete file:', { fileId: file.id, error });
    }
  }
};

GDPR Compliance and Data Portability

typescript
// Lambda for user data export (GDPR Article 20)
export const exportUserDataHandler = async (event: APIGatewayProxyEvent) => {
  const userId = getUserIdFromJWT(event.headers.authorization);

  // Get all user's files
  const userFiles = await dynamoClient.send(new ScanCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    FilterExpression: '#userId = :userId',
    ExpressionAttributeNames: { '#userId': 'userId' },
    ExpressionAttributeValues: marshall({ ':userId': userId }),
  }));

  // Generate download links for all files
  const fileExports = await Promise.all(
    (userFiles.Items || []).map(async (item) => {
      const file = unmarshall(item);

      if (file.status !== 'completed') return null;

      // Generate temporary download URL
      const downloadUrl = await getSignedUrl(
        s3Client,
        new GetObjectCommand({
          Bucket: process.env.UPLOAD_BUCKET!,
          Key: file.s3Key,
        }),
        { expiresIn: 3600 * 24 } // 24 hours
      );

      return {
        fileId: file.id,
        originalFileName: file.fileName,
        title: file.title,
        description: file.description,
        category: file.category,
        tags: file.tags,
        uploadedAt: file.createdAt,
        fileSize: file.fileSize,
        downloadUrl,
      };
    })
  );

  const exportData = {
    userId,
    exportedAt: new Date().toISOString(),
    files: fileExports.filter(Boolean),
    summary: {
      totalFiles: fileExports.filter(Boolean).length,
      totalSize: fileExports.reduce((sum, file) => sum + (file?.fileSize || 0), 0),
    },
  };

  // Store export in temp location
  const exportKey = `exports/${userId}/${Date.now()}.json`;
  await s3Client.send(new PutObjectCommand({
    Bucket: process.env.EXPORT_BUCKET!,
    Key: exportKey,
    Body: JSON.stringify(exportData, null, 2),
    ContentType: 'application/json',
    // Note: Expires only sets the HTTP Expires header; actual deletion
    // after 7 days needs a lifecycle rule on the exports/ prefix
    Expires: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000),
  }));

  // Generate download URL for export
  const exportDownloadUrl = await getSignedUrl(
    s3Client,
    new GetObjectCommand({
      Bucket: process.env.EXPORT_BUCKET!,
      Key: exportKey,
    }),
    { expiresIn: 3600 * 24 * 7 } // 7 days (the SigV4 maximum)
  );

  return {
    statusCode: 200,
    body: JSON.stringify({
      exportUrl: exportDownloadUrl,
      expiresAt: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000).toISOString(),
      summary: exportData.summary,
    }),
  };
};
// Lambda for complete user data deletion (GDPR Article 17 - Right to be forgotten)
export const deleteAllUserDataHandler = async (event: APIGatewayProxyEvent) => {
  const userId = getUserIdFromJWT(event.headers.authorization);

  console.log('Starting complete user data deletion:', { userId });

  // Get all user files
  const userFiles = await dynamoClient.send(new ScanCommand({
    TableName: process.env.UPLOAD_RECORDS_TABLE!,
    FilterExpression: '#userId = :userId',
    ExpressionAttributeNames: { '#userId': 'userId' },
    ExpressionAttributeValues: marshall({ ':userId': userId }),
  }));

  // Delete all S3 objects
  const deletePromises = (userFiles.Items || []).map(async (item) => {
    const file = unmarshall(item);

    try {
      // Delete from S3
      await s3Client.send(new DeleteObjectCommand({
        Bucket: process.env.UPLOAD_BUCKET!,
        Key: file.s3Key,
      }));

      // Delete database record
      await dynamoClient.send(new DeleteItemCommand({
        TableName: process.env.UPLOAD_RECORDS_TABLE!,
        Key: marshall({ id: file.id }),
      }));

      console.log('Deleted user file:', { fileId: file.id, s3Key: file.s3Key });

    } catch (error) {
      console.error('Failed to delete user file:', { fileId: file.id, error });
      throw error; // Fail fast for GDPR compliance
    }
  });

  await Promise.all(deletePromises);

  // Log deletion for audit trail (assumes the log group and stream already exist)
  await cloudwatchLogs.send(new PutLogEventsCommand({
    logGroupName: '/aws/lambda/user-data-deletion',
    logStreamName: new Date().toISOString().split('T')[0],
    logEvents: [{
      timestamp: Date.now(),
      message: JSON.stringify({
        action: 'complete_user_data_deletion',
        userId,
        filesDeleted: userFiles.Items?.length || 0,
        completedAt: new Date().toISOString(),
      }),
    }],
  }));

  return {
    statusCode: 200,
    body: JSON.stringify({
      message: 'All user data has been permanently deleted',
      filesDeleted: userFiles.Items?.length || 0,
      deletedAt: new Date().toISOString(),
    }),
  };
};

Cost Optimization Strategies

Intelligent Storage Class Selection

typescript
// Lambda to analyze usage patterns and optimize storage classes
export const optimizeStorageHandler = async (event: ScheduledEvent) => {
  const s3Inventory = await getS3Inventory(); // From S3 inventory reports

  for (const object of s3Inventory) {
    const lastAccessed = await getObjectAccessTime(object.key);
    const daysSinceAccess = (Date.now() - lastAccessed) / (1000 * 60 * 60 * 24);

    // Auto-transition based on access patterns
    if (daysSinceAccess > 90 && object.storageClass === 'STANDARD') {
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${object.bucket}/${object.key}`,
        Bucket: object.bucket,
        Key: object.key,
        StorageClass: 'GLACIER',
        MetadataDirective: 'COPY',
      }));

      console.log('Transitioned to Glacier:', { key: object.key, daysSinceAccess });
    }

    // Deep archive for very old files
    if (daysSinceAccess > 365 && object.storageClass === 'GLACIER') {
      // Caveat: a Glacier object must be restored before it can be copied,
      // so a lifecycle transition is usually the more practical route here
      await s3Client.send(new CopyObjectCommand({
        CopySource: `${object.bucket}/${object.key}`,
        Bucket: object.bucket,
        Key: object.key,
        StorageClass: 'DEEP_ARCHIVE',
        MetadataDirective: 'COPY',
      }));

      console.log('Transitioned to Deep Archive:', { key: object.key, daysSinceAccess });
    }
  }
};

Usage-Based Billing Integration

typescript
// Track file access for usage-based billing
const trackFileAccess = async (userId: string, fileId: string, operation: string) => {
  await dynamoClient.send(new PutItemCommand({
    TableName: process.env.USAGE_TRACKING_TABLE!,
    Item: marshall({
      id: `${userId}#${Date.now()}`,
      userId,
      fileId,
      operation, // 'upload', 'download', 'view', 'delete'
      timestamp: new Date().toISOString(),
      month: new Date().toISOString().substring(0, 7), // YYYY-MM for billing
    }),
  }));

  // Update user's monthly usage counter
  await dynamoClient.send(new UpdateItemCommand({
    TableName: process.env.USER_USAGE_TABLE!,
    Key: marshall({
      userId,
      month: new Date().toISOString().substring(0, 7),
    }),
    UpdateExpression: 'ADD #operation :increment',
    ExpressionAttributeNames: { '#operation': operation },
    ExpressionAttributeValues: marshall({ ':increment': 1 }),
  }));
};
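On the reporting side, the `month` field written above (`YYYY-MM`) makes per-month aggregation straightforward. A hedged sketch of that aggregation — `summarizeMonthlyUsage` and the `UsageRecord` shape are illustrative, mirroring the fields stored by `trackFileAccess`:

```typescript
// Aggregate raw usage records into an operation count for one billing month.
interface UsageRecord {
  userId: string;
  operation: string;
  month: string; // YYYY-MM, as stored by trackFileAccess
}

const summarizeMonthlyUsage = (
  records: UsageRecord[],
  month: string,
): Record<string, number> =>
  records
    .filter(r => r.month === month)
    .reduce<Record<string, number>>((acc, r) => {
      acc[r.operation] = (acc[r.operation] || 0) + 1;
      return acc;
    }, {});
```

In production the same numbers would come from the pre-aggregated counters in the usage table rather than a scan over raw records; this sketch just shows the shape of the rollup.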

Conclusion: When to Choose Each Approach

The signed URL pattern offers significant advantages for large file uploads:

Technical Benefits:

  • Substantial cost reduction in compute expenses
  • Elimination of timeout issues
  • Better scalability through S3's native capacity
  • Improved user experience with progress tracking

Implementation Considerations:

  • Choose Lambda proxy for files requiring immediate processing
  • Use signed URLs for simple storage scenarios
  • Consider hybrid approaches for complex workflows
  • Plan security measures from the start
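The considerations above can be condensed into a tiny routing helper. This is an illustrative sketch, not a prescribed API — the names and the 100 MB multipart cutoff are assumptions (S3's hard limit for a single PUT is 5 GB, so switching to multipart well before that is a judgment call):

```typescript
// Route an upload to a strategy based on size and processing needs.
type UploadStrategy = 'lambda-proxy' | 'signed-url' | 'multipart-signed-url';

const chooseUploadStrategy = (
  fileSizeBytes: number,
  needsImmediateProcessing: boolean,
): UploadStrategy => {
  // Small files that must be transformed inline can stay behind Lambda
  if (needsImmediateProcessing && fileSizeBytes < 5 * 1024 * 1024) {
    return 'lambda-proxy';
  }
  // A single signed PUT tops out at 5 GB; switch to multipart well before that
  if (fileSizeBytes > 100 * 1024 * 1024) {
    return 'multipart-signed-url';
  }
  return 'signed-url';
};
```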

Key Takeaways:

  1. Right tool for the job - Lambda excels at processing, S3 excels at storage
  2. Security requires multiple layers - Signed URLs, validation, scanning, and monitoring
  3. Start simple, add complexity as needed - Basic signed URLs first, then multipart uploads
  4. Monitor and measure - File uploads have many failure modes that need tracking

This pattern works well for any file type where immediate processing isn't required. The key decision point is whether Lambda adds value beyond acting as an expensive proxy for the upload process.
