System Architecture
ZServed is built on a modern, serverless architecture that prioritizes performance, scalability, and developer experience. This document outlines the key architectural decisions and how the system components work together.
Architecture Principles
Edge-First Design
- Global Distribution: Cloudflare Workers deploy to 300+ locations worldwide
- Low Latency: Sub-100ms response times from edge locations
- Auto-scaling: Automatic scaling based on demand without cold starts
- Regional Data: Data stored close to users for optimal performance
Multi-Tenant Architecture
- Tenant Isolation: Complete data separation between law firms
- Subdomain Routing: Each tenant gets a custom subdomain (e.g., `smithlegal.zserved.com`)
- Resource Sharing: Efficient resource utilization while maintaining security
- Independent Scaling: Tenants scale independently based on usage
Serverless-First
- No Infrastructure Management: Focus on business logic, not servers
- Pay-per-Use: Cost scales with actual usage
- Instant Scaling: Handle traffic spikes without pre-provisioning
- Built-in Reliability: Automatic failover and redundancy
System Components
Frontend Layer
Astro Framework
- Islands Architecture: Selective hydration for optimal performance
- Static Generation: Pre-built pages for fast loading
- TypeScript: Type safety throughout the application
- Tailwind CSS: Utility-first styling with design consistency
Mobile Applications
- Capacitor: Native iOS and Android apps from web codebase
- Responsive Design: Optimized for all screen sizes
- Offline Capabilities: Essential features work without internet
- Push Notifications: Real-time updates for mobile users
API Layer
Cloudflare Workers
```typescript
// API structure example
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext) {
    const router = new Router();

    // Multi-tenant routing
    router.get('/api/client-portal/*', handleClientPortal);
    router.post('/api/ai/*', handleAIServices);
    router.all('/api/admin/*', handleAdmin);

    return router.handle(request);
  }
};
```
Key Features:
- File-based Routing: Endpoints map to files under src/pages/ for intuitive API organization
- Automatic HTTPS: SSL/TLS termination at the edge
- Request/Response Streaming: Handle large files efficiently
- Built-in Caching: Intelligent response caching
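The Worker example above relies on a `Router` with `get`, `post`, `all`, and `handle`, but the document does not say which library provides it. As a rough illustration of how such a router could match methods and trailing-wildcard paths, here is a minimal sketch. `MiniRouter`, `RouteRequest`, and `matchesPattern` are hypothetical names, and the request shape is simplified to a method and a path so the block stays self-contained.

```typescript
// Hypothetical minimal router, for illustration only (not the Router used above).
interface RouteRequest {
  method: string; // e.g. "GET"
  path: string;   // URL path only, e.g. "/api/admin/users"
}

type Handler<Res> = (request: RouteRequest) => Res;

interface Route<Res> {
  method: string; // an HTTP method, or "ALL" to match any method
  pattern: string;
  handler: Handler<Res>;
}

// A trailing "/*" matches any path under the prefix; otherwise match exactly.
function matchesPattern(pattern: string, path: string): boolean {
  if (pattern.endsWith("/*")) {
    return path.startsWith(pattern.slice(0, -1)); // slice keeps the trailing "/"
  }
  return pattern === path;
}

class MiniRouter<Res> {
  private routes: Route<Res>[] = [];
  private notFound: Handler<Res>;

  constructor(notFound: Handler<Res>) {
    this.notFound = notFound;
  }

  get(pattern: string, handler: Handler<Res>): this {
    return this.add("GET", pattern, handler);
  }

  post(pattern: string, handler: Handler<Res>): this {
    return this.add("POST", pattern, handler);
  }

  all(pattern: string, handler: Handler<Res>): this {
    return this.add("ALL", pattern, handler);
  }

  // Dispatch to the first registered route that matches method and path.
  handle(request: RouteRequest): Res {
    const method = request.method.toUpperCase();
    const route = this.routes.find(
      (r) =>
        (r.method === "ALL" || r.method === method) &&
        matchesPattern(r.pattern, request.path)
    );
    return (route ? route.handler : this.notFound)(request);
  }

  private add(method: string, pattern: string, handler: Handler<Res>): this {
    this.routes.push({ method, pattern, handler });
    return this;
  }
}
```

Routes are checked in registration order, so more specific patterns should be registered before broader ones. Note that in this sketch `/api/admin/*` matches `/api/admin/users` but not `/api/admin` itself.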
Data Layer
Cloudflare D1 (SQLite)
- SQL Database: Familiar relational database model
- Read Replication: Read replicas placed in regions close to users
- ACID Transactions: Data consistency guarantees
- Automatic Backups: Point-in-time recovery capabilities
Schema Design:
```sql
-- Core entities
CREATE TABLE tenants (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  slug TEXT UNIQUE NOT NULL,
  subdomain TEXT UNIQUE NOT NULL
);

CREATE TABLE users (
  id TEXT PRIMARY KEY,
  tenant_id TEXT REFERENCES tenants(id),
  email TEXT NOT NULL,
  role TEXT NOT NULL
);

CREATE TABLE jobs (
  id TEXT PRIMARY KEY,
  tenant_id TEXT REFERENCES tenants(id),
  status TEXT NOT NULL,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
```
Cloudflare R2 (Object Storage)
- File Storage: Legal documents, images, and attachments
- S3-Compatible: Standard API for easy integration
- Global CDN: Fast file delivery worldwide
- Encryption: Data encrypted at rest and in transit
Cloudflare KV (Key-Value Store)
- Caching: Session data and frequently accessed information
- Configuration: Tenant settings and feature flags
- Eventually Consistent: Optimized for read-heavy workloads
- TTL Support: Automatic expiration for temporary data
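To make the caching and TTL points concrete, the following sketch stores session data under a tenant-prefixed key with an expiry. It uses only the `get` and `put` methods of the Workers KV binding, declared locally as `KVLike` so the block is self-contained. The key layout (`tenant:<id>:<kind>:<id>`), the one-hour lifetime, and the helper names are assumptions, not ZServed's actual scheme.

```typescript
// Minimal subset of the Workers KV binding methods used in this sketch.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(
    key: string,
    value: string,
    options?: { expirationTtl?: number }
  ): Promise<void>;
}

// Assumed session lifetime. KV requires expirationTtl to be at least 60 seconds.
const SESSION_TTL_SECONDS = 60 * 60;

// Build a tenant-prefixed key so entries from different tenants do not collide.
// Parts may not be empty or contain ":" because ":" is the separator.
function tenantKey(tenantId: string, kind: string, id: string): string {
  for (const part of [tenantId, kind, id]) {
    if (part === "" || part.includes(":")) {
      throw new Error(`Invalid key part: "${part}"`);
    }
  }
  return `tenant:${tenantId}:${kind}:${id}`;
}

// Store session data as JSON; KV deletes it automatically after the TTL.
async function saveSession(
  kv: KVLike,
  tenantId: string,
  sessionId: string,
  data: unknown
): Promise<void> {
  await kv.put(tenantKey(tenantId, "session", sessionId), JSON.stringify(data), {
    expirationTtl: SESSION_TTL_SECONDS,
  });
}

// Return the parsed session data, or null if it is missing or has expired.
async function loadSession(
  kv: KVLike,
  tenantId: string,
  sessionId: string
): Promise<unknown> {
  const raw = await kv.get(tenantKey(tenantId, "session", sessionId));
  return raw === null ? null : JSON.parse(raw);
}
```

Because KV is eventually consistent, a value written in one location may not be visible elsewhere immediately, so this pattern suits data that tolerates brief staleness.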
AI & Intelligence Layer
Cloudflare AutoRAG
- Document Retrieval: Semantic search across legal documents
- Context-Aware Responses: AI answers based on knowledge base
- Multi-tenant Isolation: Separate knowledge bases per tenant
- Automatic Indexing: Real-time document processing
Vectorize (Vector Database)
- Semantic Search: Find similar documents and cases
- Embedding Generation: Convert text to vector representations
- Similarity Matching: Match client inquiries to past cases
- Scalable Storage: Handle millions of document vectors
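Similarity matching over embeddings usually comes down to comparing vectors with a distance metric. The sketch below uses cosine similarity to rank stored cases against a query embedding. It is an in-memory illustration of the idea only: in production the vector database performs this search, and the choice of cosine as the metric, along with the names `cosineSimilarity` and `rankBySimilarity`, is an assumption.

```typescript
// Cosine similarity: 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction, so treat it as dissimilar
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedCase {
  id: string;
  embedding: number[];
}

interface ScoredCase {
  id: string;
  score: number;
}

// Score every case against the query and return the top `limit` matches.
function rankBySimilarity(
  query: number[],
  cases: EmbeddedCase[],
  limit: number
): ScoredCase[] {
  return cases
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A brute-force scan like this is linear in the number of stored vectors, which is why a dedicated vector index is used once the collection grows.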
Custom AI Services
```typescript
// AI service integration
class LegalRAGService {
  async getLegalGuidance(query: string, tenantId: string) {
    const results = await this.autorag.search(query, {
      tenant: tenantId,
      limit: 5,
      threshold: 0.7
    });

    return this.generateGuidance(results);
  }
}
```
Data Flow Architecture
Request Processing Flow
- Client Request → Browser/Mobile App
- Edge Routing → Cloudflare Edge Network
- Worker Processing → Cloudflare Workers
- Data Access → D1, R2, KV, AutoRAG
- Response → JSON/HTML/Files back to client
```mermaid
sequenceDiagram
    participant C as Client
    participant E as Edge
    participant W as Worker
    participant D as Database
    participant A as AutoRAG

    C->>E: HTTP Request
    E->>W: Route to Worker
    W->>D: Query Data
    W->>A: AI Processing
    A-->>W: AI Response
    D-->>W: Data Response
    W-->>E: Formatted Response
    E-->>C: HTTP Response
```
Multi-Tenant Data Isolation
Subdomain Resolution
```typescript
function getTenantFromHost(host: string): string {
  const subdomain = host.split('.')[0];
  return subdomain;
}

// All database queries include tenant_id
const jobs = await db
  .select()
  .from('jobs')
  .where('tenant_id', tenantId);
```
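The `getTenantFromHost` helper above takes the first label of the host as-is, so a bare `zserved.com`, a host with a port, or an unrelated domain would all yield a "tenant". A stricter variant might look like the sketch below. The base domain comes from the example subdomain in this document; the reserved-subdomain list and the name `resolveTenantSlug` are assumptions.

```typescript
// Base domain taken from the example subdomain smithlegal.zserved.com.
const BASE_DOMAIN = "zserved.com";

// Assumed list of subdomains that are not tenants.
const RESERVED_SUBDOMAINS = new Set(["www", "api", "app"]);

// A single DNS label: letters, digits, and inner hyphens only.
const SLUG_PATTERN = /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/;

// Return the tenant slug for a Host header value, or null if the host
// is not a valid single-label subdomain of the base domain.
function resolveTenantSlug(host: string): string | null {
  const hostname = host.toLowerCase().split(":")[0]; // drop any port
  const suffix = `.${BASE_DOMAIN}`;
  if (!hostname.endsWith(suffix)) {
    return null; // bare base domain or a different domain
  }
  const slug = hostname.slice(0, -suffix.length);
  if (!SLUG_PATTERN.test(slug) || RESERVED_SUBDOMAINS.has(slug)) {
    return null; // nested subdomain, invalid characters, or reserved name
  }
  return slug;
}
```

Returning `null` rather than a guess lets the caller respond with a 404 before any tenant-scoped query runs. The slug would still need to be looked up in the `tenants` table to obtain the tenant `id`.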
Resource Scoping
- Database queries filtered by `tenant_id`
- File storage organized by tenant folders
- KV keys prefixed with tenant identifier
- AutoRAG knowledge bases scoped by tenant
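For file storage, "tenant folders" in an object store are just key prefixes. The sketch below shows one way to build and verify such keys. The `tenants/<tenantId>/...` layout and both function names are assumptions made for illustration.

```typescript
// Build an object key under the tenant's prefix. Segments are validated so a
// caller cannot escape the prefix with "..", an empty part, or an embedded "/".
function tenantObjectKey(tenantId: string, ...segments: string[]): string {
  const parts = [tenantId, ...segments];
  for (const part of parts) {
    if (part === "" || part === "." || part === ".." || part.includes("/")) {
      throw new Error(`Invalid key segment: "${part}"`);
    }
  }
  return `tenants/${parts.join("/")}`;
}

// Check that a key lies under the given tenant's prefix before serving it.
// The trailing "/" prevents "smith" from matching "smithlegal".
function belongsToTenant(key: string, tenantId: string): boolean {
  return key.startsWith(`tenants/${tenantId}/`);
}
```

For example, `tenantObjectKey("smithlegal", "jobs", "job-42", "affidavit.pdf")` yields `tenants/smithlegal/jobs/job-42/affidavit.pdf`. Checking `belongsToTenant` on every download path gives a second line of defense if a key ever arrives from user input.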
Security Architecture
Authentication & Authorization
Session Management
- HTTP-only Cookies: Secure session storage
- CSRF Protection: Token-based request validation
- Session Expiration: Automatic timeout and renewal
- Multi-device Support: Concurrent session management
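The session properties listed above map onto standard `Set-Cookie` attributes. As a hedged sketch, a helper that builds such a header value could look like this. The cookie name `session`, the `SameSite=Lax` choice, and the function name are assumptions; only the `HttpOnly` flag and expiry are stated in this document.

```typescript
interface SessionCookieOptions {
  maxAgeSeconds: number; // session lifetime; the browser drops the cookie afterwards
  domain?: string;       // optional Domain attribute
}

// Build a Set-Cookie header value for a session identifier.
// HttpOnly hides the cookie from page scripts, Secure restricts it to HTTPS,
// and SameSite=Lax limits cross-site sending as a complement to CSRF tokens.
function buildSessionCookie(
  sessionId: string,
  options: SessionCookieOptions
): string {
  const attributes = [
    `session=${encodeURIComponent(sessionId)}`,
    "HttpOnly",
    "Secure",
    "SameSite=Lax",
    "Path=/",
    `Max-Age=${options.maxAgeSeconds}`,
  ];
  if (options.domain) {
    attributes.push(`Domain=${options.domain}`);
  }
  return attributes.join("; ");
}
```

Omitting `Domain` keeps the cookie bound to the exact tenant subdomain that set it, which fits the subdomain-per-tenant model; setting it to the base domain would share the cookie across all tenants.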
Role-Based Access Control (RBAC)
```typescript
enum UserRole {
  ADMIN = 'admin',
  LAWYER = 'lawyer',
  PARALEGAL = 'paralegal',
  CLIENT = 'client'
}

const permissions = {
  [UserRole.ADMIN]: ['*'],
  [UserRole.LAWYER]: ['cases.*', 'clients.*', 'documents.*'],
  [UserRole.CLIENT]: ['cases.view', 'documents.view']
};
```
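The permission map above uses wildcard grants such as `*` and `cases.*`, but the matching logic is not shown. A minimal sketch of how a check could work follows. The roles and grants are copied from the map above, using string literals instead of the enum so the block stands alone; `grantCovers` and `can` are hypothetical names. Since the map lists no grants for the paralegal role, this sketch denies that role by default rather than guessing its permissions.

```typescript
type Role = "admin" | "lawyer" | "paralegal" | "client";

// Grants copied from the permission map above; roles without an entry get none.
const rolePermissions: Partial<Record<Role, string[]>> = {
  admin: ["*"],
  lawyer: ["cases.*", "clients.*", "documents.*"],
  client: ["cases.view", "documents.view"],
};

// "*" covers everything, "cases.*" covers any action starting with "cases.",
// and any other grant must match the action exactly.
function grantCovers(grant: string, action: string): boolean {
  if (grant === "*") {
    return true;
  }
  if (grant.endsWith(".*")) {
    return action.startsWith(grant.slice(0, -1)); // slice keeps the trailing "."
  }
  return grant === action;
}

// Deny by default: a role is allowed only if one of its grants covers the action.
function can(role: Role, action: string): boolean {
  const grants = rolePermissions[role] ?? [];
  return grants.some((grant) => grantCovers(grant, action));
}
```

With these rules, `can("lawyer", "cases.delete")` is true, while `can("client", "documents.delete")` and `can("paralegal", "cases.view")` are false.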
Data Protection
Encryption
- At Rest: All data encrypted using AES-256
- In Transit: TLS 1.3 for all communications
- End-to-End: Sensitive documents encrypted client-side
- Key Management: Cloudflare managed encryption keys
Privacy Controls
- Data Minimization: Collect only necessary information
- Retention Policies: Automatic data purging after retention period
- Access Logging: Comprehensive audit trails
- GDPR Compliance: Right to deletion and data portability
Performance Architecture
Caching Strategy
Edge Caching
- Static Assets: Long-term caching for CSS, JS, images
- API Responses: Intelligent caching based on data sensitivity
- Database Queries: Query result caching with TTL
- File Downloads: CDN caching for document delivery
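One common way to express "caching based on data sensitivity" is to pick a `Cache-Control` header per response class. The sketch below shows the idea. The three classes and the specific lifetimes are assumptions chosen for illustration, not ZServed's actual policy.

```typescript
// Hypothetical response classes, ordered from least to most sensitive.
type CacheClass = "static-asset" | "shared-api" | "user-specific";

// Map a response class to a Cache-Control header value.
function cacheControlFor(cacheClass: CacheClass): string {
  switch (cacheClass) {
    case "static-asset":
      // Fingerprinted CSS, JS, and images can be cached for a year.
      return "public, max-age=31536000, immutable";
    case "shared-api":
      // Shared caches may keep the response for 60 seconds (s-maxage),
      // while browsers revalidate on every use (max-age=0).
      return "public, max-age=0, s-maxage=60";
    case "user-specific":
      // Responses containing client or case data are never stored.
      return "private, no-store";
  }
}
```

Defaulting anything that touches tenant or client data to `user-specific` is the conservative choice for legal documents: a missed cache hit costs latency, whereas a wrongly cached response could leak data.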
Cache Invalidation
```typescript
// Smart cache invalidation
class CacheManager {
  async invalidateUserData(userId: string) {
    await Promise.all([
      this.kv.delete(`user:${userId}`),
      this.kv.delete(`user:${userId}:permissions`),
      this.purgeEdgeCache(`/api/users/${userId}`)
    ]);
  }
}
```
Database Optimization
Query Optimization
- Prepared Statements: Prevent SQL injection and improve performance
- Index Strategy: Optimized indexes for common queries
- Connection Pooling: Efficient database connection management
- Read Replicas: Distribute read queries to edge replicas
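As an illustration of the prepared-statement point, the sketch below runs a tenant-scoped query through D1's `prepare().bind().all()` chain. The `D1Like` interfaces declare only the subset of the binding used here so the block is self-contained. The columns come from the `jobs` table in the schema above, while `listJobsByStatus` and `JobRow` are hypothetical names.

```typescript
// Minimal subset of the D1 binding API used in this sketch.
interface D1StatementLike {
  bind(...values: unknown[]): D1StatementLike;
  all<T>(): Promise<{ results: T[] }>;
}

interface D1Like {
  prepare(sql: string): D1StatementLike;
}

// Row shape matching the jobs table in the schema above.
interface JobRow {
  id: string;
  tenant_id: string;
  status: string;
  created_at: string;
}

// List a tenant's jobs with a given status, newest first.
// Values are bound to "?" placeholders instead of being concatenated into
// the SQL string, which is what prevents SQL injection.
async function listJobsByStatus(
  db: D1Like,
  tenantId: string,
  status: string
): Promise<JobRow[]> {
  const { results } = await db
    .prepare(
      "SELECT id, tenant_id, status, created_at FROM jobs " +
        "WHERE tenant_id = ? AND status = ? ORDER BY created_at DESC"
    )
    .bind(tenantId, status)
    .all<JobRow>();
  return results;
}
```

Taking `tenantId` as a required parameter, rather than leaving the filter to each caller, makes it harder to write a query that forgets tenant scoping.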
Data Partitioning
- Tenant-based Partitioning: Logical separation by tenant
- Time-based Partitioning: Archive old data automatically
- Geographic Distribution: Data close to users
Monitoring & Observability
Logging Strategy
- Structured Logging: JSON format for easy parsing
- Request Tracing: Track requests across system components
- Error Tracking: Automatic error collection and alerting
- Performance Metrics: Response times and resource usage
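A structured log line that supports request tracing can be as small as a JSON object carrying a request ID. The following sketch shows one possible shape. The field names and the `formatLog` helper are assumptions; the document only specifies JSON format, tracing, and timing metrics.

```typescript
// Hypothetical log entry shape; requestId ties together all lines for one request.
interface LogEntry {
  level: "info" | "warn" | "error";
  message: string;
  requestId: string;
  tenantId?: string;
  durationMs?: number;
}

// Serialize an entry as a single JSON line with an ISO 8601 timestamp.
// `now` is injectable so the output is deterministic in tests.
function formatLog(entry: LogEntry, now: Date = new Date()): string {
  return JSON.stringify({ timestamp: now.toISOString(), ...entry });
}
```

In a Worker, a line like this would typically be written with `console.log(formatLog(...))` and the same `requestId` passed to every downstream call made while handling the request.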
Health Monitoring
```typescript
// Health check endpoint
export const healthCheck = async (env: Env) => {
  const checks = await Promise.allSettled([
    checkDatabase(env.DB),
    checkStorage(env.zserved_files),
    checkAutoRAG(env.LEGAL_RAG)
  ]);

  return {
    status: checks.every(c => c.status === 'fulfilled') ? 'healthy' : 'degraded',
    checks: checks.map(formatCheckResult)
  };
};
```
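The health check above passes each settled result to `formatCheckResult`, which is not shown. One possible implementation is sketched below. It takes the result and its index, matching how `Array.prototype.map` calls it, and labels each result using the order of the checks above (database, storage, AutoRAG). The report shape and the `CHECK_NAMES` list are assumptions.

```typescript
// Same shape as the results produced by Promise.allSettled.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

interface CheckReport {
  name: string;
  ok: boolean;
  error?: string; // present only when the check failed
}

// Names in the same order as the checks passed to Promise.allSettled above.
const CHECK_NAMES = ["database", "storage", "autorag"];

// Convert one settled result into a serializable report entry.
function formatCheckResult(
  result: Settled<unknown>,
  index: number
): CheckReport {
  const name = CHECK_NAMES[index] ?? `check-${index}`;
  if (result.status === "fulfilled") {
    return { name, ok: true };
  }
  const reason = result.reason;
  return {
    name,
    ok: false,
    error: reason instanceof Error ? reason.message : String(reason),
  };
}
```

Reporting only the error message, not the full error object, keeps stack traces and internal details out of a public health endpoint.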
Scalability Considerations
Horizontal Scaling
- Stateless Workers: No local state, so any instance can serve any request and capacity scales horizontally
- Database Scaling: Read replicas and connection pooling
- File Storage: Unlimited object storage with global CDN
- AI Services: Automatic scaling based on request volume
Performance Optimization
- Code Splitting: Load only necessary JavaScript
- Lazy Loading: Defer non-critical resource loading
- Image Optimization: Automatic resizing and format conversion
- Prefetching: Intelligent resource prefetching
Development Architecture
Code Organization
```
src/
├── pages/        # API endpoints and page routes
├── components/   # Reusable UI components
├── layouts/      # Page layout templates
├── services/     # Business logic and external integrations
├── utils/        # Helper functions and utilities
└── types/        # TypeScript type definitions
```
Build Pipeline
- TypeScript Compilation: Type checking and JavaScript generation
- Astro Build: Static site generation with optimization
- Worker Bundling: Code bundling for Cloudflare Workers
- Asset Processing: CSS and image optimization
Testing Strategy
- Unit Tests: Individual component and function testing
- Integration Tests: API endpoint and database testing
- E2E Tests: Full user workflow testing
- Performance Tests: Load testing and benchmark analysis
This architecture provides a solid foundation for building scalable, secure, and performant legal document service applications while maintaining developer productivity and operational simplicity.