System Architecture

ZServed is built on a modern, serverless architecture that prioritizes performance, scalability, and developer experience. This document outlines the key architectural decisions and how the system components work together.

Architecture Principles

Edge-First Design

Global Distribution: Cloudflare Workers deploy to 300+ locations worldwide
Low Latency: Sub-100ms response times from edge locations
Auto-scaling: Automatic scaling based on demand without cold starts
Regional Data: Data stored close to users for optimal performance

Multi-Tenant Architecture

Tenant Isolation: Complete data separation between law firms
Subdomain Routing: Each tenant gets a custom subdomain (e.g., smithlegal.zserved.com)
Resource Sharing: Efficient resource utilization while maintaining security
Independent Scaling: Tenants scale independently based on usage

Serverless-First

No Infrastructure Management: Focus on business logic, not servers
Pay-per-Use: Cost scales with actual usage
Instant Scaling: Handle traffic spikes without pre-provisioning
Built-in Reliability: Automatic failover and redundancy

System Components

Frontend Layer

Astro Framework

Islands Architecture: Selective hydration for optimal performance
Static Generation: Pre-built pages for fast loading
TypeScript: Type safety throughout the application
Tailwind CSS: Utility-first styling with design consistency

Mobile Applications

Capacitor: Native iOS and Android apps from web codebase
Responsive Design: Optimized for all screen sizes
Offline Capabilities: Essential features work without internet
Push Notifications: Real-time updates for mobile users

API Layer

Cloudflare Workers

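// API structure example
// Router is assumed to be an itty-router-style class imported elsewhere;
// the handlers are defined in their own modules.
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const router = new Router();

    // Multi-tenant routing
    router.get('/api/client-portal/*', handleClientPortal);
    router.post('/api/ai/*', handleAIServices);
    router.all('/api/admin/*', handleAdmin);

    // Pass env and ctx through so handlers can reach bindings
    return router.handle(request, env, ctx);
  }
};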
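
The Router used in the example above is not defined in this document. As a rough sketch of how method and wildcard matching could work, assuming a trailing "/*" means "any path under this prefix", the following uses hypothetical names (Route, pathMatches, matchRoute) and stores handler names as plain strings:

```typescript
// Hypothetical route table mirroring the example above; handler names are labels only.
type Method = 'GET' | 'POST' | 'ALL';

interface Route {
  method: Method;
  pattern: string; // exact path, or a prefix ending in "/*"
  handler: string; // name of the handler to dispatch to
}

const routes: Route[] = [
  { method: 'GET', pattern: '/api/client-portal/*', handler: 'handleClientPortal' },
  { method: 'POST', pattern: '/api/ai/*', handler: 'handleAIServices' },
  { method: 'ALL', pattern: '/api/admin/*', handler: 'handleAdmin' },
];

// Returns true when `path` satisfies `pattern`.
function pathMatches(pattern: string, path: string): boolean {
  if (pattern.endsWith('/*')) {
    const prefix = pattern.slice(0, -1); // drop "*", keep the trailing slash
    return path.startsWith(prefix);
  }
  return pattern === path;
}

// Finds the first route whose method and pattern both match, or null.
function matchRoute(method: string, path: string): Route | null {
  const upper = method.toUpperCase();
  for (const route of routes) {
    const methodOk = route.method === 'ALL' || route.method === upper;
    if (methodOk && pathMatches(route.pattern, path)) {
      return route;
    }
  }
  return null;
}
```

A null result gives the Worker a clear point at which to answer with a 404.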

Key Features:

File-based Routing: Endpoints under src/pages map directly to API routes
Automatic HTTPS: SSL/TLS termination at the edge
Request/Response Streaming: Handle large files efficiently
Built-in Caching: Intelligent response caching

Data Layer

Cloudflare D1 (SQLite)

SQL Database: Familiar relational database model
Edge Replication: Read replicas at edge locations
ACID Transactions: Data consistency guarantees
Automatic Backups: Point-in-time recovery capabilities

Schema Design:

-- Core entities
CREATE TABLE tenants (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  slug TEXT UNIQUE NOT NULL,
  subdomain TEXT UNIQUE NOT NULL
);

CREATE TABLE users (
  id TEXT PRIMARY KEY,
  tenant_id TEXT NOT NULL REFERENCES tenants(id),
  email TEXT NOT NULL,
  role TEXT NOT NULL
);

CREATE TABLE jobs (
  id TEXT PRIMARY KEY,
  tenant_id TEXT NOT NULL REFERENCES tenants(id),
  status TEXT NOT NULL,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

Cloudflare R2 (Object Storage)

File Storage: Legal documents, images, and attachments
S3-Compatible: Standard API for easy integration
Global CDN: Fast file delivery worldwide
Encryption: Data encrypted at rest and in transit

Cloudflare KV (Key-Value Store)

Caching: Session data and frequently accessed information
Configuration: Tenant settings and feature flags
Eventually Consistent: Optimized for read-heavy workloads
TTL Support: Automatic expiration for temporary data

AI & Intelligence Layer

Cloudflare AutoRAG

Document Retrieval: Semantic search across legal documents
Context-Aware Responses: AI answers based on knowledge base
Multi-tenant Isolation: Separate knowledge bases per tenant
Automatic Indexing: Real-time document processing

Vectorize (Vector Database)

Semantic Search: Find similar documents and cases
Embedding Generation: Convert text to vector representations
Similarity Matching: Match client inquiries to past cases
Scalable Storage: Handle millions of document vectors

Custom AI Services

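// AI service integration
// AutoRAGClient is a hypothetical wrapper type; the actual binding's API may differ.
interface AutoRAGClient {
  search(query: string, options: { tenant: string; limit: number; threshold: number }): Promise<unknown[]>;
}

class LegalRAGService {
  constructor(private autorag: AutoRAGClient) {}

  async getLegalGuidance(query: string, tenantId: string) {
    // Search only this tenant's knowledge base
    const results = await this.autorag.search(query, {
      tenant: tenantId,
      limit: 5,        // return at most five matches
      threshold: 0.7   // ignore matches scoring below 0.7
    });

    // generateGuidance (not shown) turns the matches into a response
    return this.generateGuidance(results);
  }
}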
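
To make the limit and threshold parameters in the service above more concrete, here is a small sketch of threshold-filtered top-k retrieval. It assumes cosine similarity as the scoring metric, which this document does not state; cosineSimilarity and topMatches are illustrative names, not part of any Cloudflare API.

```typescript
interface EmbeddedDoc {
  id: string;
  vector: number[];
}

interface ScoredDoc {
  id: string;
  score: number;
}

// Cosine similarity of two equal-length vectors; 0 if either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every document, drops those below `threshold`,
// and returns at most `limit` results, best match first.
function topMatches(
  query: number[],
  docs: EmbeddedDoc[],
  limit: number,
  threshold: number
): ScoredDoc[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .filter((d) => d.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A managed vector index would avoid scoring every document, but the filtering and ordering semantics are the same.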

Data Flow Architecture

Request Processing Flow

Client Request → Browser/Mobile App
Edge Routing → Cloudflare Edge Network
Worker Processing → Cloudflare Workers
Data Access → D1, R2, KV, AutoRAG
Response → JSON/HTML/Files back to client

sequenceDiagram
    participant C as Client
    participant E as Edge
    participant W as Worker
    participant D as Database
    participant A as AutoRAG

    C->>E: HTTP Request
    E->>W: Route to Worker
    W->>D: Query Data
    W->>A: AI Processing
    A-->>W: AI Response
    D-->>W: Data Response
    W-->>E: Formatted Response
    E-->>C: HTTP Response

Multi-Tenant Data Isolation

Subdomain Resolution

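function getTenantFromHost(host: string): string {
  // Strip any port and normalize case before taking the first label.
  // The result must still be checked against tenants.subdomain before use.
  const hostname = host.split(':')[0].toLowerCase();
  return hostname.split('.')[0];
}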

// All database queries include tenant_id
const jobs = await db
  .select()
  .from('jobs')
  .where('tenant_id', tenantId);
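
Taking the first label of the host accepts any hostname, including the bare domain or an unrelated one. A stricter resolver might look like the sketch below. It assumes the base domain is zserved.com (taken from the example subdomain earlier in this document) and uses an illustrative reserved-label list; resolveTenantSlug is a hypothetical name.

```typescript
const BASE_DOMAIN = 'zserved.com'; // from the example subdomain in this document
const RESERVED_LABELS = new Set(['www', 'api', 'admin']); // assumption: illustrative only

// Returns the tenant subdomain for "<tenant>.zserved.com", or null otherwise.
function resolveTenantSlug(host: string): string | null {
  const hostname = host.toLowerCase().split(':')[0]; // drop any port
  const suffix = '.' + BASE_DOMAIN;
  if (!hostname.endsWith(suffix)) {
    return null; // bare domain or a foreign host
  }
  const label = hostname.slice(0, -suffix.length);
  // Require exactly one DNS label: letters, digits, inner hyphens.
  if (!/^[a-z0-9]([a-z0-9-]*[a-z0-9])?$/.test(label)) {
    return null;
  }
  if (RESERVED_LABELS.has(label)) {
    return null;
  }
  return label;
}
```

The returned slug would still need to be looked up in the tenants table before any tenant-scoped query runs.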

Resource Scoping

Database queries filtered by tenant_id
File storage organized by tenant folders
KV keys prefixed with tenant identifier
AutoRAG knowledge bases scoped by tenant
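
The scoping rules above can be sketched as small key-building helpers. The exact key layouts ("tenant:<id>:<key>" for KV, "tenants/<id>/<path>" for file storage) are assumptions made for illustration, as are the function names.

```typescript
// Rejects tenant ids that could break out of a key prefix.
function assertTenantId(tenantId: string): void {
  if (!/^[A-Za-z0-9_-]+$/.test(tenantId)) {
    throw new Error(`Invalid tenant id: ${tenantId}`);
  }
}

// KV key prefixed with the tenant identifier, e.g. "tenant:t1:settings".
function tenantKvKey(tenantId: string, key: string): string {
  assertTenantId(tenantId);
  return `tenant:${tenantId}:${key}`;
}

// Object key under the tenant's folder, e.g. "tenants/t1/docs/a.pdf".
// Object keys are flat strings, but rejecting dot segments keeps them predictable.
function tenantObjectKey(tenantId: string, path: string): string {
  assertTenantId(tenantId);
  const segments = path.split('/').filter((s) => s.length > 0);
  if (segments.some((s) => s === '.' || s === '..')) {
    throw new Error('Dot segments are not allowed in object paths');
  }
  return `tenants/${tenantId}/${segments.join('/')}`;
}

// True when an object key lives under the given tenant's folder.
function belongsToTenant(tenantId: string, objectKey: string): boolean {
  return objectKey.startsWith(`tenants/${tenantId}/`);
}
```

Routing every storage access through helpers like these makes a missing tenant prefix a code-review problem rather than a runtime surprise.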

Security Architecture

Authentication & Authorization

Session Management

HTTP-only Cookies: Secure session storage
CSRF Protection: Token-based request validation
Session Expiration: Automatic timeout and renewal
Multi-device Support: Concurrent session management
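
As an illustration of the cookie and CSRF points above, the sketch below builds a Set-Cookie header value and compares CSRF tokens. The attribute choices (Path=/, SameSite=Lax) and the function names are assumptions, not settings taken from ZServed.

```typescript
interface SessionCookieOptions {
  name: string;
  value: string;
  maxAgeSeconds: number; // session lifetime before automatic expiry
}

// Builds a Set-Cookie header value for an HTTP-only session cookie.
function buildSessionCookie(opts: SessionCookieOptions): string {
  const parts = [
    `${opts.name}=${encodeURIComponent(opts.value)}`,
    'Path=/',
    `Max-Age=${Math.floor(opts.maxAgeSeconds)}`,
    'HttpOnly', // not readable from client-side scripts
    'Secure', // sent over HTTPS only
    'SameSite=Lax',
  ];
  return parts.join('; ');
}

// Compares two CSRF tokens without returning early on the first mismatch.
function csrfTokensMatch(expected: string, provided: string): boolean {
  if (expected.length !== provided.length) {
    return false;
  }
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= expected.charCodeAt(i) ^ provided.charCodeAt(i);
  }
  return diff === 0;
}
```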

Role-Based Access Control (RBAC)

enum UserRole {
  ADMIN = 'admin',
  LAWYER = 'lawyer',
  PARALEGAL = 'paralegal',
  CLIENT = 'client'
}

const permissions = {
  [UserRole.ADMIN]: ['*'],
  [UserRole.LAWYER]: ['cases.*', 'clients.*', 'documents.*'],
  [UserRole.CLIENT]: ['cases.view', 'documents.view']
};
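
A permission check over the table above could be sketched as follows, assuming "*" grants everything and a trailing ".*" grants every action on a resource. The paralegal role has no entry in the table above, so this sketch denies it by default; grantCovers and can are hypothetical names.

```typescript
type Role = 'admin' | 'lawyer' | 'paralegal' | 'client';

// Same grants as the table above, keyed by role value.
// Paralegal permissions are not specified in this document, so none are listed.
const rolePermissions: Partial<Record<Role, string[]>> = {
  admin: ['*'],
  lawyer: ['cases.*', 'clients.*', 'documents.*'],
  client: ['cases.view', 'documents.view'],
};

// True when a single grant covers the requested permission.
function grantCovers(grant: string, permission: string): boolean {
  if (grant === '*') {
    return true;
  }
  if (grant.endsWith('.*')) {
    // "cases.*" covers anything starting with "cases."
    return permission.startsWith(grant.slice(0, -1));
  }
  return grant === permission;
}

// True when the role holds at least one grant covering the permission.
function can(role: Role, permission: string): boolean {
  const grants = rolePermissions[role] ?? [];
  return grants.some((g) => grantCovers(g, permission));
}
```

Denying roles without an entry keeps the failure mode closed: a role added to the enum but not to the table gets no access.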

Data Protection

Encryption

At Rest: All data encrypted using AES-256
In Transit: TLS 1.3 for all communications
End-to-End: Sensitive documents encrypted client-side
Key Management: Cloudflare-managed encryption keys

Privacy Controls

Data Minimization: Collect only necessary information
Retention Policies: Automatic data purging after retention period
Access Logging: Comprehensive audit trails
GDPR Compliance: Right to deletion and data portability

Performance Architecture

Caching Strategy

Edge Caching

Static Assets: Long-term caching for CSS, JS, images
API Responses: Intelligent caching based on data sensitivity
Database Queries: Query result caching with TTL
File Downloads: CDN caching for document delivery
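
One way to express "caching based on data sensitivity" is to pick a Cache-Control header per resource class. The classes and durations below are assumptions chosen for illustration, not values taken from ZServed.

```typescript
// Hypothetical resource classes; a real system may need finer distinctions.
type ResourceKind = 'static-asset' | 'public-api' | 'tenant-api';

// Returns a Cache-Control value for the given class of response.
function cacheControlFor(kind: ResourceKind): string {
  switch (kind) {
    case 'static-asset':
      // Fingerprinted CSS, JS, and images can be cached for a year.
      return 'public, max-age=31536000, immutable';
    case 'public-api':
      // Non-sensitive data: short shared cache lifetime.
      return 'public, max-age=60';
    case 'tenant-api':
      // Tenant or user data: never stored by shared or browser caches.
      return 'private, no-store';
  }
}
```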

Cache Invalidation

// Smart cache invalidation
class CacheManager {
  async invalidateUserData(userId: string) {
    await Promise.all([
      this.kv.delete(`user:${userId}`),
      this.kv.delete(`user:${userId}:permissions`),
      this.purgeEdgeCache(`/api/users/${userId}`)
    ]);
  }
}

Database Optimization

Query Optimization

Prepared Statements: Prevent SQL injection and improve performance
Index Strategy: Optimized indexes for common queries
Binding-based Access: Workers reach D1 through bindings, so there is no connection pool to manage
Read Replicas: Distribute read queries to edge replicas
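
To show how prepared statements and tenant filtering fit together, the sketch below builds a parameterized query that always includes tenant_id. The table allow-list and the tenantQuery name are assumptions; only the users and jobs tables from the schema above are used.

```typescript
interface ParameterizedQuery {
  sql: string; // SQL text with "?" placeholders only
  params: string[]; // values to bind, in placeholder order
}

// Tables from the schema above that carry a tenant_id column.
const TENANT_TABLES = ['users', 'jobs'];

// Builds a SELECT that is always scoped to one tenant.
// Values never appear in the SQL text; they are returned for binding.
function tenantQuery(
  table: string,
  tenantId: string,
  filters: Record<string, string> = {}
): ParameterizedQuery {
  if (TENANT_TABLES.indexOf(table) === -1) {
    throw new Error(`Table is not tenant-scoped: ${table}`);
  }
  const clauses = ['tenant_id = ?'];
  const params = [tenantId];
  for (const column of Object.keys(filters)) {
    // Column names cannot be bound, so restrict them to safe characters.
    if (!/^[a-z_]+$/.test(column)) {
      throw new Error(`Invalid column name: ${column}`);
    }
    clauses.push(`${column} = ?`);
    params.push(filters[column]);
  }
  return {
    sql: `SELECT * FROM ${table} WHERE ${clauses.join(' AND ')}`,
    params,
  };
}
```

With a D1 binding such as env.DB, the result could be executed as env.DB.prepare(q.sql).bind(...q.params).all().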

Data Partitioning

Tenant-based Partitioning: Logical separation by tenant
Time-based Partitioning: Archive old data automatically
Geographic Distribution: Data close to users

Monitoring & Observability

Logging Strategy

Structured Logging: JSON format for easy parsing
Request Tracing: Track requests across system components
Error Tracking: Automatic error collection and alerting
Performance Metrics: Response times and resource usage
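
A structured log entry with a request identifier for tracing might be formatted as in the sketch below. The field names (requestId, tenantId, fields) and the formatLogEntry function are illustrative assumptions.

```typescript
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

interface LogContext {
  requestId: string; // shared by every entry emitted for one request
  tenantId?: string;
}

// Serializes one log entry as a single JSON line.
// `now` is injectable so output is reproducible in tests.
function formatLogEntry(
  level: LogLevel,
  message: string,
  context: LogContext,
  fields: Record<string, unknown> = {},
  now: Date = new Date()
): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    requestId: context.requestId,
    tenantId: context.tenantId ?? null,
    fields, // free-form details kept separate from the fixed keys
  });
}
```

Keeping caller-supplied details under a separate fields key means they cannot overwrite the fixed keys that log queries depend on.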

Health Monitoring

// Health check endpoint
export const healthCheck = async (env: Env) => {
  const checks = await Promise.allSettled([
    checkDatabase(env.DB),
    checkStorage(env.zserved_files),
    checkAutoRAG(env.LEGAL_RAG)
  ]);

  return {
    status: checks.every(c => c.status === 'fulfilled') ? 'healthy' : 'degraded',
    checks: checks.map(formatCheckResult)
  };
};
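
The health check above maps each settled result through formatCheckResult, which is not shown in this document. One possible shape for it is sketched below; it assumes the check names follow the order of the checks in the array, and it defines its own Settled type rather than relying on a particular compiler target.

```typescript
// Mirrors the shape of the results produced by Promise.allSettled.
type Settled<T> =
  | { status: 'fulfilled'; value: T }
  | { status: 'rejected'; reason: unknown };

interface CheckReport {
  name: string;
  ok: boolean;
  error?: string; // present only when the check failed
}

// Assumption: same order as the checks passed to Promise.allSettled.
const CHECK_NAMES = ['database', 'storage', 'autorag'];

// Converts one settled result into a compact, serializable report.
function formatCheckResult(result: Settled<unknown>, index: number): CheckReport {
  const name = CHECK_NAMES[index] ?? `check-${index}`;
  if (result.status === 'fulfilled') {
    return { name, ok: true };
  }
  const reason = result.reason;
  const error = reason instanceof Error ? reason.message : String(reason);
  return { name, ok: false, error };
}
```

Because the function takes the array index as its second argument, it can be passed directly to map as in the health check.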

Scalability Considerations

Horizontal Scaling

Stateless Workers: No local state, so instances scale out without coordination
Database Scaling: Read replicas spread query load across locations
File Storage: Unlimited object storage with global CDN
AI Services: Automatic scaling based on request volume

Performance Optimization

Code Splitting: Load only necessary JavaScript
Lazy Loading: Defer non-critical resource loading
Image Optimization: Automatic resizing and format conversion
Prefetching: Intelligent resource prefetching

Development Architecture

Code Organization

src/
├── pages/        # API endpoints and page routes
├── components/   # Reusable UI components
├── layouts/      # Page layout templates
├── services/     # Business logic and external integrations
├── utils/        # Helper functions and utilities
└── types/        # TypeScript type definitions

Build Pipeline

TypeScript Compilation: Type checking and JavaScript generation
Astro Build: Static site generation with optimization
Worker Bundling: Code bundling for Cloudflare Workers
Asset Processing: CSS and image optimization

Testing Strategy

Unit Tests: Individual component and function testing
Integration Tests: API endpoint and database testing
E2E Tests: Full user workflow testing
Performance Tests: Load testing and benchmark analysis

Together, these choices let ZServed scale with demand, keep each law firm's data isolated, and run without server management, while keeping the codebase productive to work in.