Multi-Tenancy, Permission Libraries, and Architectural Decisions

Add multi-tenant isolation to your permission system, evaluate CASL as a library alternative, and use decision frameworks to choose the right authorization architecture.

Abstract

Post 101 established seven goals for a permission system and exposed the scattered-check anti-pattern. Post 102 centralized authorization inside a service layer. Post 103 added type-safe RBAC. Post 104 replaced the role-permission matrix with an ABAC policy engine. Post 105 extended ABAC with environment rules, field-level read/write permissions, and database query filtering.

Three production concerns remain. First, the system has no tenant boundary -- a user in Organization A can access Organization B's resources if the query is crafted correctly. Second, the custom ABAC engine is functional but the question arises: should the team maintain bespoke authorization code or migrate to a library like CASL? Third, there is no comprehensive decision framework for choosing between RBAC, custom ABAC, library-based ABAC, and external policy engines.

This capstone post closes all three gaps: multi-tenancy as a first-class permission concern, a working CASL migration with honest friction analysis, and the series' definitive comparison across every approach.

Multi-Tenancy Models

The Tenant Concept

In SaaS, a tenant is an organization, workspace, or account that groups users and their resources. Slack workspaces, GitHub organizations, and Notion workspaces are all tenants. The tenant boundary is the outermost permission boundary -- before checking roles, ownership, or field access, the system must verify the user belongs to the tenant that owns the resource.

Extending the Domain Model

The series' domain model gains a tenant dimension:

typescript
interface User {
  userId: string;
  role: Role;
  departmentId?: string;
  tenantId: string;        // which organization this user belongs to
  tenantRole?: TenantRole; // role within the tenant (owner, admin, member)
}

interface Document {
  id: string;
  title: string;
  content: string;
  authorId: string;
  status: 'draft' | 'published' | 'archived';
  projectId: string;
  departmentId: string;
  tenantId: string; // which tenant owns this document
}

interface Project {
  id: string;
  name: string;
  ownerId: string;
  departmentId: string;
  tenantId: string; // which tenant owns this project
}

type TenantRole = 'owner' | 'admin' | 'member';

Every resource now carries a tenantId. Every user belongs to exactly one tenant (for simplicity -- multi-tenant membership is possible but adds complexity outside this scope).

Three Isolation Strategies

Row-Level Isolation (shared schema): All tenants share the same tables, and every table has a tenant_id column. This is the simplest and cheapest option to run, but a single forgotten WHERE tenant_id = ? clause leaks cross-tenant data. PostgreSQL Row-Level Security (RLS) can enforce the filter at the database level as a safety net.

Schema-Level Isolation: Each tenant gets a separate schema within the same database. Isolation is stronger because a query with no tenant filter still sees only the schema it runs against. The cost is operational: migrations must run across N schemas. This model is viable for dozens to hundreds of tenants.

Database-Level Isolation: Each tenant gets a dedicated database instance. This gives maximum isolation and the strongest compliance posture, at the highest cost and operational complexity.

| Aspect | Row-Level | Schema-Level | Database-Level |
| --- | --- | --- | --- |
| Infrastructure cost | Low | Medium | High |
| Isolation strength | Application-enforced | DB-schema-enforced | Physical isolation |
| Tenant count scalability | Thousands+ | Hundreds | Dozens |
| Migration complexity | Single migration | N migrations | N migrations + N databases |
| Cross-tenant query risk | High (missing WHERE) | Low (wrong schema = error) | None |
| Per-tenant customization | Limited | Moderate | Full |
| Compliance suitability | Standard | SOC 2 / ISO | HIPAA / PCI-DSS |

This series uses the row-level isolation model because it is the most common starting point and the most challenging from a permission perspective. Schema and database isolation solve tenant boundaries at the infrastructure level. Row-level isolation requires the application to enforce those boundaries.
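One way to reduce the "forgotten WHERE clause" risk in a shared schema is to bind the tenant once, at construction time, so that individual queries cannot omit it. The sketch below illustrates the idea with an in-memory array standing in for a shared table. `TenantScopedRepository`, `TenantRow`, and the sample rows are hypothetical names for illustration, not part of the series' codebase.

```typescript
// Hypothetical sketch: bind the tenant once so no query can omit the filter.
interface TenantRow {
  id: string;
  tenantId: string;
}

class TenantScopedRepository<T extends TenantRow> {
  private readonly rows: readonly T[];
  private readonly tenantId: string;

  constructor(rows: readonly T[], tenantId: string) {
    this.rows = rows; // stands in for a shared table; real code would issue SQL
    this.tenantId = tenantId;
  }

  // Every read path applies the tenant predicate before the caller's predicate.
  findMany(predicate: (row: T) => boolean = () => true): T[] {
    return this.rows.filter(
      (row) => row.tenantId === this.tenantId && predicate(row),
    );
  }

  findById(id: string): T | undefined {
    return this.findMany((row) => row.id === id)[0];
  }
}

interface DocumentRow extends TenantRow {
  title: string;
}

const sharedTable: DocumentRow[] = [
  { id: 'd1', tenantId: 'org-a', title: 'Roadmap' },
  { id: 'd2', tenantId: 'org-b', title: 'Payroll' },
];

const orgARepo = new TenantScopedRepository(sharedTable, 'org-a');
const visibleToOrgA = orgARepo.findMany().map((row) => row.id); // ['d1']
const crossTenantLookup = orgARepo.findById('d2'); // undefined
```

A wrapper like this narrows the surface where a leak can occur, but it does not replace the permission layer: role and ownership checks still have to run on whatever the repository returns.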

Tenant-Aware Permission Layer

The Scattered Tenant Check Anti-Pattern

Without tenant-aware permissions, every service method manually checks tenant isolation:

typescript
// Anti-pattern: manual tenant check in every service method
async function getDocumentById(documentId: string) {
  const session = await requireSession();
  const document = await db.document.findUnique({ where: { id: documentId } });

  // findUnique returns null for a missing row; deny instead of dereferencing null
  if (!document) {
    throw new ForbiddenError();
  }

  // Manual tenant check -- easy to forget
  if (document.tenantId !== session.tenantId) {
    throw new ForbiddenError();
  }

  // Then the regular ABAC check
  if (!can(session, 'read', 'document', document)) {
    throw new ForbiddenError();
  }

  return filterFields(session, 'document', document);
}

This is Post 101's scattered-check pattern reappearing at the tenant level. If a developer forgets the tenant check on one endpoint, they create a cross-tenant data leak -- the most dangerous class of authorization bug because it exposes other customers' data.

Tenant Isolation as a Global ABAC Condition

The correct approach: tenant isolation becomes a built-in condition that runs automatically for every permission check:

typescript
// Tenant isolation as a built-in global condition
const permissions = new PermissionBuilder()
  // Global condition: applies to ALL roles, ALL resources
  .global((user, data) => {
    // Any resource that carries a tenantId must match the user's tenant.
    // Resources without a tenantId (shared resources) pass through.
    if ('tenantId' in data && user.tenantId !== data.tenantId) {
      return false; // Cross-tenant access: DENY
    }
    return true; // Same tenant: continue to role-specific checks
  })

  .role('admin')
    .can('manage', 'document')
    .can('manage', 'project')

  .role('editor')
    .can(['read', 'update'], 'document', [
      (user, doc) => user.departmentId === doc.departmentId,
    ])

  // ... remaining policies from Posts 104-105
  .build();

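The global() condition runs before any role-specific conditions and acts as an implicit WHERE clause on every permission check that receives resource data. Even if a developer creates a new role or a new resource type, tenant isolation is enforced automatically. Two caveats follow from the code as written: the check applies only to resources that carry a tenantId, and a can() call made without data skips it, so instance-level checks should always pass the resource.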

Updating the can() Function

The can() function evaluates global conditions first:

typescript
function can<R extends Resource>(
  user: User,
  action: Action,
  resource: R,
  data?: ResourceDataMap[R],
  env?: Environment
): boolean {
  // Step 1: Evaluate global conditions (tenant isolation)
  if (data) {
    for (const globalCondition of permissions.globalConditions) {
      if (!globalCondition(user, data as Record<string, unknown>)) {
        return false; // Global condition failed (e.g., wrong tenant)
      }
    }
  }

  // Step 2: Find matching entries for role + resource + action
  // (same logic as Posts 104-105). An unknown role yields no entries.
  const entries = (permissions[user.role] ?? []) as PermissionEntry<R>[];
  for (const entry of entries) {
    if (entry.resource !== resource) continue;
    if (!entry.actions.includes(action)) continue;

    if (!entry.conditions || entry.conditions.length === 0) return true;
    if (data) {
      const allMet = entry.conditions.every(c => c.evaluate(user, data, env));
      if (allMet) return true;
    }
  }

  return false; // Deny by default
}
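To see the evaluation order in isolation, here is a stripped-down, self-contained sketch of the same two-step check. It is not the series' engine: the `Doc` type, the `policies` table, and `canAccess` are simplified stand-ins, and resource data is always required.

```typescript
// Simplified stand-ins for the series' types (assumptions for illustration).
type Role = 'admin' | 'editor';

interface User {
  userId: string;
  role: Role;
  tenantId: string;
  departmentId?: string;
}

interface Doc {
  id: string;
  tenantId: string;
  departmentId: string;
}

type Condition = (user: User, doc: Doc) => boolean;

interface Entry {
  actions: string[];
  conditions?: Condition[];
}

// Step 1 input: conditions that apply to every role.
const globalConditions: Condition[] = [
  (user, doc) => user.tenantId === doc.tenantId,
];

// Step 2 input: role-specific entries.
const policies: Record<Role, Entry[]> = {
  admin: [{ actions: ['read', 'update', 'delete'] }],
  editor: [
    {
      actions: ['read', 'update'],
      conditions: [(user, doc) => user.departmentId === doc.departmentId],
    },
  ],
};

function canAccess(user: User, action: string, doc: Doc): boolean {
  // Global conditions run first; any failure denies regardless of role.
  if (!globalConditions.every((condition) => condition(user, doc))) {
    return false;
  }
  // Then at least one role entry must match the action and all its conditions.
  return policies[user.role].some(
    (entry) =>
      entry.actions.includes(action) &&
      (entry.conditions ?? []).every((condition) => condition(user, doc)),
  );
}

const orgAAdmin: User = { userId: 'u1', role: 'admin', tenantId: 'org-a' };
const orgADoc: Doc = { id: 'd1', tenantId: 'org-a', departmentId: 'sales' };
const orgBDoc: Doc = { id: 'd9', tenantId: 'org-b', departmentId: 'sales' };

const sameTenant = canAccess(orgAAdmin, 'read', orgADoc); // true
const crossTenant = canAccess(orgAAdmin, 'read', orgBDoc); // false
```

The admin entry has no conditions at all, yet the cross-tenant read is still denied, which is the property the global condition is meant to guarantee.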

Updating Database Query Filtering

The toWhereClause() function from Post 105 must include tenant filtering:

typescript
function toWhereClause<R extends Resource>(
  user: User,
  resource: R,
  action: Action,
  env?: Environment
): WhereClause<R> | null {
  // Always include tenant filter
  const tenantFilter = { tenantId: user.tenantId };

  const roleFilter = buildRoleFilter(user, resource, action, env);
  if (roleFilter === null) return null; // No access

  // Combine tenant filter with role-specific filter.
  // tenantFilter is spread last so a role filter can never override tenantId.
  return { ...roleFilter, ...tenantFilter };
}

The tenant filter is always present. Even if buildRoleFilter() returns {} (no additional filter for an admin), the query still includes WHERE tenantId = ?.
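Object spread resolves key collisions in favor of whichever object is spread last, so the order of the merge matters if a role filter ever contains its own tenantId. A sketch of an alternative that sidesteps the question by composing the two filters with AND is shown below. The `Where` type and the Prisma-style `AND` array are assumptions; the series' actual `WhereClause<R>` type may differ.

```typescript
// Assumed minimal filter shape; the series' WhereClause<R> may be stricter.
type Where = Record<string, unknown>;

// Spread-merge: a later key silently replaces an earlier one.
function mergeBySpread(tenantId: string, roleFilter: Where): Where {
  return { tenantId, ...roleFilter };
}

// AND-composition: both clauses survive, so a conflict matches no rows.
function mergeWithAnd(tenantId: string, roleFilter: Where): { AND: Where[] } {
  return { AND: [{ tenantId }, roleFilter] };
}

// A faulty role filter that (incorrectly) carries another tenant's id.
const faultyRoleFilter: Where = { tenantId: 'org-b', status: 'published' };

const spreadResult = mergeBySpread('org-a', faultyRoleFilter);
// spreadResult.tenantId is 'org-b': the tenant filter was overwritten.

const andResult = mergeWithAnd('org-a', faultyRoleFilter);
// andResult keeps { tenantId: 'org-a' } as its own clause.
```

With AND-composition a conflicting role filter produces a query that returns nothing, which fails closed instead of leaking another tenant's rows.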

Cross-Tenant Access: The Exception

Some scenarios require cross-tenant access:

  • Platform admins (super-admins) who manage all tenants
  • Shared resources (templates, public content) that exist outside any tenant
  • Support tools for customer service to view tenant data

typescript
// Platform admin bypasses tenant isolation
.role('platform_admin')
  // Role-scoped override of the global tenant check.
  // Note: the can() shown earlier requires every global condition to pass,
  // so it would need to consult this override before running them.
  .global(() => true)
  .can('manage', 'document')
  .can('manage', 'project')
  .can('manage', 'tenant')

// Shared resources have no tenantId
interface SharedTemplate {
  id: string;
  title: string;
  // No tenantId -- accessible to all tenants
}

Warning: Cross-tenant exceptions must be explicit and auditable. The platform admin role should be separate from tenant-level admin, with additional authentication requirements (MFA, IP restrictions) enforced via environment conditions from Post 105.
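As a rough illustration of "explicit and auditable", the sketch below gates cross-tenant access on role, MFA, and an IP allow-list, and records every cross-tenant attempt. All names here (`Actor`, `RequestEnv`, `canAccessTenant`, the allow-list address) are hypothetical; a real system would persist the audit trail and source the environment from the request.

```typescript
interface Actor {
  userId: string;
  role: 'platform_admin' | 'admin' | 'member';
  tenantId: string;
}

interface RequestEnv {
  mfaVerified: boolean;
  ip: string;
}

interface AuditEntry {
  userId: string;
  targetTenantId: string;
  reason: string;
  allowed: boolean;
}

// In-memory stand-ins; assumptions for illustration only.
const auditLog: AuditEntry[] = [];
const allowedAdminIps = new Set(['10.0.0.5']);

function canAccessTenant(
  actor: Actor,
  targetTenantId: string,
  env: RequestEnv,
  reason: string,
): boolean {
  // Same-tenant access passes the tenant boundary; role checks still follow.
  if (actor.tenantId === targetTenantId) return true;

  // Cross-tenant access needs the platform role plus MFA and an allowed IP.
  const allowed =
    actor.role === 'platform_admin' &&
    env.mfaVerified &&
    allowedAdminIps.has(env.ip);

  // Record every cross-tenant attempt, allowed or not.
  auditLog.push({ userId: actor.userId, targetTenantId, reason, allowed });
  return allowed;
}

const platformAdmin: Actor = { userId: 'p1', role: 'platform_admin', tenantId: 'platform' };
const tenantAdmin: Actor = { userId: 'a1', role: 'admin', tenantId: 'org-a' };
const trustedEnv: RequestEnv = { mfaVerified: true, ip: '10.0.0.5' };

const supportAccess = canAccessTenant(platformAdmin, 'org-b', trustedEnv, 'ticket review');
const tenantAdminAttempt = canAccessTenant(tenantAdmin, 'org-b', trustedEnv, 'curiosity');
```

Keeping the cross-tenant decision in one function also gives a single place to attach alerting or approval workflows later.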

Why Use a Permission Library?

The Build vs. Library Decision

Posts 101-105 built a custom permission system covering RBAC, ABAC, field-level permissions, DB query filtering, environment rules, and now multi-tenancy. This is approximately 300-500 lines of core permission logic. At what point does maintaining this code become more expensive than adopting a library?

Custom Implementation Strengths

  1. Zero dependencies: No third-party code in the critical security path
  2. Full control over API surface: The can() signature evolves exactly as needed
  3. Perfect TypeScript integration: Generic constraints, builder patterns, and type inference designed for the specific domain
  4. No serialization overhead: Plain functions, no class instances, RSC-compatible
  5. Team understanding: Every condition is a function the team wrote -- no black boxes
  6. Predictable behavior: Debugging follows standard function-call stacks

Custom Implementation Weaknesses

  1. Maintenance burden: The team owns bugs, edge cases, and security patches
  2. Limited community testing: Unusual edge cases may not surface until production
  3. Feature reimplementation: Field permissions, DB query conversion, condition operators ($in, $ne, $gte) -- rebuilding what libraries already provide
  4. Onboarding cost: New team members learn a bespoke API instead of a documented library

Library Strengths

  1. Community-tested: Thousands of projects, edge cases discovered and fixed
  2. Built-in features: Field permissions, MongoDB-style conditions, Prisma/Mongoose adapters
  3. Documentation and community: Tutorials, Stack Overflow answers, conference talks
  4. Reduced maintenance: Security patches and feature additions handled by maintainers

Library Weaknesses

  1. API constraints: The library's API may not match the series' can() signature
  2. Dependency risk: Library maintenance can slow or stop
  3. Integration friction: Class-based libraries clash with React Server Components
  4. Black box behavior: Debugging permission denials requires understanding library internals

Build vs. Library Decision Framework

In-Code vs. DSL-Based Approaches

| Approach | Examples | Strengths | Weaknesses |
| --- | --- | --- | --- |
| In-code (TypeScript) | Custom, CASL | Type-safe, no runtime overhead, familiar language | Coupled to deployment, no runtime changes |
| DSL / Policy language | OPA/Rego, Cedar, Cerbos (YAML) | Decoupled from app code, non-dev editable, auditable | Learning curve, tooling overhead, latency |
| Hybrid | Permit.io, custom DB-stored rules | Runtime-configurable + code-based defaults | Complexity, consistency challenges |

CASL Integration

Why CASL for This Series

CASL is one of the most widely used JavaScript/TypeScript authorization libraries, with a core of roughly 6KB. It is isomorphic (works on server and client) and supports ABAC conditions, field-level permissions, and database query conversion. Since the series has already built equivalents of CASL's main features, a direct feature-for-feature comparison is possible.

bash
npm install @casl/ability @casl/prisma

Migration: AbilityBuilder

The custom PermissionBuilder from Post 104 maps to CASL's AbilityBuilder:

Custom (Posts 104-105):

typescript
const permissions = new PermissionBuilder()
  .role('admin')
    .can('manage', 'document')
  .role('editor')
    .can(['read', 'update'], 'document', [
      (user, doc) => user.departmentId === doc.departmentId,
    ])
  .role('author')
    .can(['read', 'update'], 'document', [
      (user, doc) => doc.authorId === user.userId,
    ])
  .build();

CASL equivalent:

typescript
import { AbilityBuilder, createMongoAbility, MongoAbility } from '@casl/ability';

type Actions = 'create' | 'read' | 'update' | 'delete' | 'manage';
type Subjects = 'Document' | 'Project' | 'all';
type AppAbility = MongoAbility<[Actions, Subjects]>;

function defineAbilitiesFor(user: User): AppAbility {
  const { can, cannot, build } = new AbilityBuilder<AppAbility>(
    createMongoAbility
  );

  if (user.role === 'admin') {
    can('manage', 'all');
  }

  if (user.role === 'editor') {
    can(['read', 'update'], 'Document', { departmentId: user.departmentId });
  }

  if (user.role === 'author') {
    can(['read', 'update'], 'Document', { authorId: user.userId });
    can('create', 'Document');
  }

  if (user.role === 'viewer') {
    can('read', 'Document', { status: 'published' });
  }

  return build();
}

Key API differences:

  • Conditions are MongoDB-style objects ({ authorId: user.userId }) instead of functions
  • Roles are selected with if branches on user.role rather than builder-pattern chaining
  • cannot() defines negative rules, which the custom system does not support
  • 'manage' is CASL's keyword for any action, and 'all' is its keyword for any subject

The subject() Helper and Its Friction

CASL needs to know the type of an object being checked. With classes, this is automatic (via the class name). With plain objects -- which TypeScript applications typically use -- the subject() helper is required:

typescript
import { subject } from '@casl/ability';

// CASL requires wrapping plain objects
ability.can('update', subject('Document', document));

// Problem: subject() mutates the object by adding __caslSubjectType__
// This conflicts with React Server Components (objects must be serializable)

Workaround 1: Object spreading

typescript
// Create a copy to avoid mutating the original
ability.can('update', subject('Document', { ...document }));

Workaround 2: Custom detectSubjectType

typescript
import { createMongoAbility } from '@casl/ability';

const ability = createMongoAbility(rules, {
  detectSubjectType: (object) => {
    // Use a custom property instead of class name
    return object.__type || object.constructor?.modelName || 'unknown';
  },
});

// In the service layer, add __type to returned objects
function toDocumentDTO(doc: Document): DocumentDTO & { __type: 'Document' } {
  return { ...doc, __type: 'Document' };
}

Workaround 3: PureAbility with lambda matcher (RSC-compatible)

typescript
import {
  PureAbility,
  AbilityBuilder,
  type AbilityTuple,
  type MatchConditions,
} from '@casl/ability';

type AppAbility = PureAbility<AbilityTuple, MatchConditions>;
const lambdaMatcher = (matchConditions: MatchConditions) => matchConditions;

function defineAbilityFor(user: User): AppAbility {
  const { can, build } = new AbilityBuilder<AppAbility>(PureAbility);

  // Lambda conditions instead of MongoDB-style -- works without classes
  can('read', 'Document', ({ authorId }) => authorId === user.userId);

  return build({ conditionsMatcher: lambdaMatcher });
}

Tip: The PureAbility + lambda matcher approach is the most RSC-compatible option, but it loses CASL's MongoDB-style query operators and Prisma integration. There is a real tradeoff between CASL's full feature set and modern React compatibility.

Tenant Isolation in CASL

typescript
function defineAbilitiesFor(user: User): AppAbility {
  const { can, cannot, build } = new AbilityBuilder<AppAbility>(
    createMongoAbility
  );

  // Tenant isolation: tenantId must be added to EVERY rule
  // CASL does not have a global() condition

  if (user.role === 'admin') {
    can('manage', 'Document', { tenantId: user.tenantId });
    can('manage', 'Project', { tenantId: user.tenantId });
  }

  if (user.role === 'editor') {
    can(['read', 'update'], 'Document', {
      tenantId: user.tenantId,
      departmentId: user.departmentId,
    });
  }

  // Platform admin: no tenantId filter
  if (user.role === 'platform_admin') {
    can('manage', 'all');
  }

  return build();
}

The custom system had a global() condition that applied tenant isolation automatically to every rule. CASL requires adding tenantId to every rule individually. Missing it on one rule creates a cross-tenant leak. This is a significant ergonomic difference.
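One way to narrow this gap is to wrap the rule-defining function so that tenantId is injected into every rule's conditions. The sketch below shows the idea without importing CASL: `DefineRule` is a simplified, assumed signature (it omits CASL's optional fields argument), `withTenant` is a hypothetical helper, and `recordRule` is a recording stub standing in for AbilityBuilder's can(). Adapting it to CASL's real types may require adjustments.

```typescript
type Conditions = Record<string, unknown>;

// Simplified, assumed shape of a rule-defining function.
type DefineRule = (
  action: string | string[],
  subject: string,
  conditions?: Conditions,
) => void;

// Returns a rule-definer that always adds tenantId to the conditions.
function withTenant(can: DefineRule, tenantId: string): DefineRule {
  return (action, subject, conditions = {}) => {
    // tenantId is spread last so caller-supplied conditions cannot override it.
    can(action, subject, { ...conditions, tenantId });
  };
}

// Recording stub in place of the library's can(), for demonstration.
const recorded: { action: string | string[]; subject: string; conditions?: Conditions }[] = [];
const recordRule: DefineRule = (action, subject, conditions) => {
  recorded.push({ action, subject, conditions });
};

const tenantCan = withTenant(recordRule, 'org-a');
tenantCan('manage', 'Project');
tenantCan(['read', 'update'], 'Document', { departmentId: 'sales' });
```

Platform-admin rules would keep using the unwrapped function, which makes the cross-tenant exception visible at the call site.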

CASL Field and DB Integration

Field-Level Permissions with permittedFieldsOf

typescript
import { permittedFieldsOf } from '@casl/ability/extra';

// Define field-level rules
can('read', 'Document', ['title', 'content', 'status'], {
  status: 'published',
});
can(
  'read',
  'Document',
  ['title', 'content', 'status', 'internalNotes', 'reviewComments'],
  { authorId: user.userId }
);

// Get permitted fields for the Document subject type
// (pass subject('Document', doc) instead to evaluate conditions against one document)
const fields = permittedFieldsOf(ability, 'read', 'Document', {
  fieldsFrom: (rule) =>
    rule.fields || [
      'title',
      'content',
      'status',
      'authorId',
      'internalNotes',
      'reviewComments',
      'publishedAt',
    ],
});

Compare to the custom getVisibleFields() from Post 105 -- the concept is the same, the API is different. CASL requires a fieldsFrom callback that returns all possible fields when a rule has no field restriction.
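Whichever API produces the list of permitted fields, the service layer still has to apply it to the record before returning it. A minimal sketch of that last step follows; `pickFields` is a hypothetical helper, not the series' pickPermittedFields() and not a CASL function.

```typescript
// Keep only the keys that appear in the permitted list.
function pickFields<T extends object>(
  record: T,
  permitted: readonly string[],
): Partial<T> {
  const allowed = new Set(permitted);
  return Object.fromEntries(
    Object.entries(record).filter(([key]) => allowed.has(key)),
  ) as Partial<T>;
}

const fullDocument = {
  title: 'Q3 Plan',
  content: 'Draft text',
  status: 'published',
  internalNotes: 'Do not share',
};

// Field list as it might come back for a viewer (assumed for illustration).
const viewerFields = ['title', 'content', 'status'];
const viewerDocument = pickFields(fullDocument, viewerFields);
```

Filtering by an allow-list means a newly added column stays hidden until a rule explicitly exposes it.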

CASL AST to Prisma Query Conversion

typescript
import { accessibleBy } from '@casl/prisma';

// Convert CASL rules to a Prisma where clause
const documents = await prisma.document.findMany({
  where: accessibleBy(ability).Document,
});

// Combine with business logic filters
// (named differently so both queries can live in the same scope)
const projectDocuments = await prisma.document.findMany({
  where: {
    AND: [accessibleBy(ability).Document, { projectId: projectId }],
  },
});

Compare to the custom toWhereClause() from Post 105:

  • CASL's accessibleBy() converts MongoDB-style conditions into Prisma where syntax
  • The custom toWhereClause() uses condition descriptors with toFilter callbacks
  • CASL automatically handles OR logic across multiple matching rules
  • CASL throws ForbiddenError if no rules match at all (fail-closed)

Warning: accessibleBy() needs declarative object conditions, such as those defined through createMongoAbility (or createPrismaAbility from @casl/prisma). It cannot translate lambda conditions (PureAbility), because a function cannot be converted into a query. If you use the RSC-compatible PureAbility pattern, you lose Prisma query conversion. This is a hard tradeoff in the current CASL architecture.
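The OR-across-rules and fail-closed behavior described above can be pictured with a small stand-alone function. This is a conceptual sketch only, not CASL's implementation: it ignores cannot() rules and the 'manage'/'all' keywords, and `Rule`, `NoAccessError`, and `rulesToWhere` are hypothetical names.

```typescript
type Where = Record<string, unknown>;

interface Rule {
  action: string;
  subject: string;
  conditions?: Where;
}

class NoAccessError extends Error {}

function rulesToWhere(rules: Rule[], action: string, subject: string): Where {
  const matching = rules.filter(
    (rule) => rule.action === action && rule.subject === subject,
  );

  // Fail closed: no matching rule means no query should run at all.
  if (matching.length === 0) {
    throw new NoAccessError(`No rule allows ${action} on ${subject}`);
  }

  // A rule without conditions grants unrestricted access to the subject.
  if (matching.some((rule) => rule.conditions === undefined)) {
    return {};
  }

  // Otherwise a row is visible if it satisfies any one rule.
  return { OR: matching.map((rule) => rule.conditions as Where) };
}

const authorRules: Rule[] = [
  { action: 'read', subject: 'Document', conditions: { status: 'published' } },
  { action: 'read', subject: 'Document', conditions: { authorId: 'u1' } },
];

const readWhere = rulesToWhere(authorRules, 'read', 'Document');
// readWhere is { OR: [{ status: 'published' }, { authorId: 'u1' }] }
```

Throwing instead of returning an empty filter matters: an empty where clause would match every row, which is the opposite of denying access.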

The Comprehensive Comparison

RBAC vs. Custom ABAC vs. CASL ABAC

| Dimension | RBAC (Post 103) | Custom ABAC (Posts 104-105) | CASL ABAC (This Post) |
| --- | --- | --- | --- |
| Core logic | Role-permission lookup | Policy engine with conditions | AbilityBuilder with MongoDB conditions |
| Lines of auth code | ~80 (matrix + can()) | ~300-500 (builder + engine + field + DB) | ~50 (defineAbilitiesFor) + library |
| can() signature | can(role, resource, action) | can(user, action, resource, data?, env?) | ability.can(action, subject(type, data)) |
| Contextual conditions | No (requires helpers) | Yes (inline in policy builder) | Yes (MongoDB-style objects or lambdas) |
| Field-level permissions | No | Yes (getVisibleFields, pickPermittedFields) | Yes (permittedFieldsOf) |
| DB query filtering | No | Yes (toWhereClause()) | Yes (accessibleBy() with Prisma) |
| Environment rules | No | Yes (time, IP, flags) | Partial (via custom conditions) |
| Multi-tenancy | Manual check per method | Global condition (automatic) | Per-rule tenantId (manual per rule) |
| Type safety | Full (generics, mapped types) | Full (resource-action generics) | Good (typed actions/subjects, weaker on conditions) |
| RSC compatibility | Full (plain functions) | Full (plain functions) | Partial (subject() mutation issue) |
| Negative rules | No | No | Yes (cannot()) |
| Maintenance | Team-owned | Team-owned | Library-maintained core |
| Bundle size | 0 (built-in) | 0 (built-in) | ~6KB (core) + adapters |

Decision Framework: Which System Should You Choose?

When to Choose Each

RBAC (Post 103) -- The Default Choice

  • Team: Any size
  • App: Internal tools, simple SaaS, content platforms with clear roles
  • Complexity: Low -- 2-4 roles, permissions depend only on role
  • Signal to choose: Permission requirements map cleanly to "this role can do these things"
  • Signal to upgrade: Helper functions start proliferating alongside can()

Custom ABAC (Posts 104-105) -- Full Control

  • Team: Has authorization expertise, willing to maintain auth code
  • App: SaaS with complex business rules, field-level visibility, large datasets
  • Complexity: High -- ownership, department, status, time conditions
  • Signal to choose: can() must evaluate 3+ contextual conditions per resource
  • Signal to upgrade: Team bandwidth for auth maintenance decreases; need for DB query adapters across multiple ORMs

CASL ABAC (This Post) -- Community-Tested Library

  • Team: Wants to focus on business logic, not auth internals
  • App: SaaS using Prisma/MongoDB, needs field permissions and DB filtering
  • Complexity: High -- but team prefers library API over custom code
  • Signal to choose: The custom ABAC feature set matches CASL's capabilities
  • Signal to avoid: Heavy RSC usage with plain objects; need for environment conditions; need for global tenant isolation

External PDP (Cerbos, OPA, Cedar) -- Authorization as a Service

  • Team: Dedicated platform/security team
  • App: Microservices, polyglot stack, shared authorization decisions across services
  • Complexity: Very high -- multiple services need consistent authorization
  • Signal to choose: Multiple backends need the same authorization decisions; compliance requires decoupled, auditable policy management

Common Pitfalls

  1. Forgetting tenant isolation on one CASL rule: CASL has no global condition. Missing tenantId on one rule creates a cross-tenant leak. Write a lint rule or unit test that verifies every non-platform-admin rule includes tenantId.

  2. Assuming CASL works seamlessly with RSC: The subject() helper mutates objects. React Server Components require serializable data. Use one of the three workarounds from the CASL Integration section.

  3. PureAbility loses Prisma integration: The RSC-compatible PureAbility + lambda matcher pattern cannot convert conditions to Prisma where clauses. Teams must choose between RSC compatibility and DB query filtering.

  4. Over-engineering early: Jumping to ABAC or CASL before RBAC fails is premature. The series' progression mirrors real-world evolution: start simple, add complexity when current limitations appear.

  5. Multi-tenancy as an afterthought: Adding tenant_id to every table after the schema is established is a painful migration. Design tenant isolation from the beginning, even if the first version has only one tenant.

  6. Confusing platform admin with tenant admin: Platform admins manage all tenants (cross-tenant access). Tenant admins manage their own tenant only. Mixing these roles creates either overly permissive tenant admins or insufficiently permissive platform admins.

  7. Choosing an external PDP too early: Cerbos, OPA, and Cedar add infrastructure complexity. For a monolithic Next.js app, in-process authorization (custom or CASL) is simpler and faster. External PDPs make sense when authorization decisions must be shared across independently deployed services.

  8. Not testing cross-tenant scenarios: Unit tests often use a single tenant ID. Add explicit test cases where User A (tenant 1) attempts to access User B's document (tenant 2). These tests catch missing tenant filters.
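
For pitfall 1, the "unit test that verifies every rule includes tenantId" can be approximated with a small helper that scans a rule list. The sketch below assumes rules shaped like CASL's raw rules (action, subject, conditions, inverted), for example the array exposed by an ability's rules property. `RawRuleLike` and `findRulesMissingTenant` are hypothetical names, and inverted (cannot) rules are skipped on the assumption that they only restrict access.

```typescript
// Assumed minimal rule shape, modeled on CASL's raw rule objects.
interface RawRuleLike {
  action: string | string[];
  subject: string | string[];
  conditions?: Record<string, unknown>;
  inverted?: boolean;
}

// Returns every granting rule that lacks a tenantId condition.
function findRulesMissingTenant(rules: RawRuleLike[]): RawRuleLike[] {
  return rules.filter(
    (rule) =>
      !rule.inverted &&
      !(rule.conditions !== undefined && 'tenantId' in rule.conditions),
  );
}

// Sample rules for a tenant-level editor; the second one has the bug.
const editorRules: RawRuleLike[] = [
  { action: 'read', subject: 'Document', conditions: { tenantId: 'org-a' } },
  { action: 'update', subject: 'Document', conditions: { departmentId: 'sales' } },
  { action: 'delete', subject: 'Document', inverted: true },
];

const offenders = findRulesMissingTenant(editorRules);
// offenders contains only the 'update' rule.
```

Running this check for every non-platform-admin role in a test suite turns a silent omission into a failing build.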

Series Retrospective

The Seven Goals Scorecard

Post 101 established seven goals for any permission system. Here is how each approach scores:

| Goal | Scattered (101) | Service Layer (102) | RBAC (103) | Custom ABAC (104-105) | CASL (106) |
| --- | --- | --- | --- | --- | --- |
| Prevent unauthorized access | Partial | Yes | Yes | Yes | Yes |
| Consistent (single source of truth) | No | Yes | Yes | Yes | Yes |
| Auto-enforce | No | Architectural | Architectural | Architectural + Global | Architectural |
| Easy to update | No | Moderate | Yes (matrix) | Yes (builder) | Yes (rules) |
| Auditable | No | Moderate | Yes (matrix) | Yes (builder) | Yes (rules) |
| Performant | Varies | Yes + cache | Yes (O(1) lookup) | Yes (condition eval) | Yes (condition eval) |
| Type-safe | No | Partial | Full | Full | Good |

Series Architecture Evolution

The key insight across all six posts: the service layer from Post 102 never changes. It is the enforcement point for every authorization approach. The decision engine inside it evolves from simple role checks to RBAC to ABAC to CASL, but the architecture remains constant. This is what makes the progressive approach work -- each upgrade is contained within the service layer.

Microservices Authorization: A Forward Look

The series focused on a monolithic Next.js application. As applications grow, authorization decisions must work across service boundaries. Three patterns emerge:

Pattern 1: Centralized Authorization Service

A single service evaluates all permission decisions, and other services call it via gRPC or HTTP. This gives a single source of truth, but the service becomes a single point of failure and adds network latency to every request.

Pattern 2: Embedded PDP (Sidecar)

Each microservice runs its own policy engine (OPA sidecar, Cerbos sidecar). Policies are managed centrally and distributed to all sidecars. Decisions stay local to each service, but policy synchronization adds complexity and sidecars can drift onto different policy versions.

Pattern 3: Token-Based Claims

Authorization data (roles, permissions, tenantId) is embedded in JWT claims, and services trust the token without additional policy checks. This is the simplest infrastructure, but claims go stale until the token is reissued, and a token cannot express resource-level authorization.

For teams moving from monolith to microservices: start with Pattern 3 (token claims) for service-to-service auth, and add Pattern 2 (embedded PDP) when fine-grained resource-level authorization is needed across services.
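To make the limits of Pattern 3 concrete, here is a sketch of what a claims-only check can and cannot decide. It assumes the token's signature has already been verified elsewhere. The `AccessClaims` shape and `authorizeFromClaims` are hypothetical, with `exp` following the usual JWT convention of seconds since the epoch.

```typescript
// Assumed claim layout; real tokens will differ per identity provider.
interface AccessClaims {
  sub: string;
  tenantId: string;
  permissions: string[];
  exp: number; // expiry, in seconds since the epoch
}

function authorizeFromClaims(
  claims: AccessClaims,
  requiredPermission: string,
  resourceTenantId: string,
  nowSeconds: number,
): boolean {
  if (claims.exp <= nowSeconds) return false; // expired token
  if (claims.tenantId !== resourceTenantId) return false; // tenant boundary
  // Coarse check only: the token says nothing about this specific resource.
  return claims.permissions.includes(requiredPermission);
}

const claims: AccessClaims = {
  sub: 'u1',
  tenantId: 'org-a',
  permissions: ['document:read'],
  exp: 1_700_000_600,
};

const withinLifetime = authorizeFromClaims(claims, 'document:read', 'org-a', 1_700_000_000);
const afterExpiry = authorizeFromClaims(claims, 'document:read', 'org-a', 1_700_000_601);
const otherTenant = authorizeFromClaims(claims, 'document:read', 'org-b', 1_700_000_000);
```

Nothing in this function can answer "is this user the author of document d1?", which is the point at which an embedded PDP or an in-process engine becomes necessary.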

Permission Storage: Code vs. Database

| Approach | Strengths | Weaknesses | When to Use |
| --- | --- | --- | --- |
| Code-only (this series) | Type-safe, version-controlled, CI/CD testable | Requires deployment for changes | Permission rules change with app code |
| Database-stored | Runtime-configurable, tenant-customizable | No compile-time safety, migration complexity | Tenants need custom roles/permissions |
| Hybrid | Default rules in code + overrides in DB | Complexity of two systems, conflict resolution | SaaS with per-tenant customization |

The hybrid pattern works well for production SaaS: define the default permission set in code (type-safe, tested), allow tenants to override specific rules via a database table. The can() function checks code-based rules first, then applies database overrides.
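A minimal sketch of that lookup order follows, under stated assumptions: permissions are flattened to "resource:action" strings, overrides arrive as rows already loaded from a database table, and an explicit deny wins over an explicit allow. `defaultRules`, `Override`, and `canWithOverrides` are hypothetical names; the series' can() has a richer signature.

```typescript
type Role = 'admin' | 'editor' | 'viewer';

// Code-based defaults: version-controlled and testable.
const defaultRules: Record<Role, ReadonlySet<string>> = {
  admin: new Set(['document:read', 'document:update', 'document:delete']),
  editor: new Set(['document:read', 'document:update']),
  viewer: new Set(['document:read']),
};

// Assumed shape of a per-tenant override row.
interface Override {
  tenantId: string;
  role: Role;
  permission: string;
  effect: 'allow' | 'deny';
}

function canWithOverrides(
  tenantId: string,
  role: Role,
  permission: string,
  overrides: readonly Override[],
): boolean {
  // 1. Start from the code-based default.
  const base = defaultRules[role].has(permission);

  // 2. Apply this tenant's overrides for the same role and permission.
  const matching = overrides.filter(
    (o) => o.tenantId === tenantId && o.role === role && o.permission === permission,
  );
  if (matching.some((o) => o.effect === 'deny')) return false; // deny wins
  if (matching.some((o) => o.effect === 'allow')) return true;
  return base;
}

const tenantOverrides: Override[] = [
  { tenantId: 'org-a', role: 'viewer', permission: 'document:update', effect: 'allow' },
  { tenantId: 'org-a', role: 'editor', permission: 'document:update', effect: 'deny' },
];

const viewerCanUpdateInOrgA = canWithOverrides('org-a', 'viewer', 'document:update', tenantOverrides);
const viewerCanUpdateInOrgB = canWithOverrides('org-b', 'viewer', 'document:update', tenantOverrides);
```

Deciding the conflict rule up front (here, deny wins) is the part that tends to cause trouble later if left implicit.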

Series Recap

Over six posts, the permission system evolved from scattered if-statements to a production-grade authorization architecture:

  1. Post 101: Identified the problem -- scattered checks, inconsistent enforcement, no fail-closed default
  2. Post 102: Established the architecture -- service layer as the single enforcement point
  3. Post 103: Added the first decision engine -- type-safe RBAC with generic constraints
  4. Post 104: Replaced role-based lookup with attribute-based policies -- ownership, department, status conditions
  5. Post 105: Extended ABAC with environment rules, field-level permissions, and database query filtering
  6. Post 106 (this post): Added multi-tenancy, evaluated CASL as a library alternative, and provided the definitive decision framework

The service layer is the one constant. The decision engine inside it can be RBAC, custom ABAC, CASL, or an external PDP. The architecture from Post 102 supports all of them without structural changes. That was the goal from the beginning.

Permission Systems that Scale

A comprehensive guide to building scalable permission systems in TypeScript and Next.js, progressing from naive checks through RBAC and ABAC to production-grade multi-tenant authorization.
