Software Architectures¶
Software architecture refers to the high-level structure of a software system - the way in which components are organized, interact with each other, and address the technical and operational requirements of the system. A well-designed architecture provides a blueprint for building robust, scalable, and maintainable software systems.
1. Monolithic Architecture¶
1. Definition and Core Concept¶
Monolithic architecture refers to a software design pattern where an entire application is built as a single, unified unit. All components—such as the user interface (UI), business logic, data access layer, and any supporting services—are tightly integrated and compiled into one executable or deployable artifact. This contrasts with modular or distributed architectures like microservices, where components are broken into independent services.
At its essence, a monolith is "self-contained." For example, in a web application, the frontend (HTML/CSS/JS), backend (server-side logic), and database interactions are all part of the same codebase and runtime process. Changes to any part require rebuilding and redeploying the entire application. This approach stems from traditional software engineering principles, emphasizing simplicity in development and deployment for smaller-scale systems.
Historically, monolithic designs dominated early software development (e.g., in the 1970s–1990s) because computing resources were limited, and the focus was on efficiency within a single process. Think of it as a "big ball of mud"—a term coined by Brian Foote and Joseph Yoder to describe how monoliths can evolve into complex, intertwined systems if not managed well.
2. Key Characteristics¶
Monolithic architectures exhibit several defining traits:
- Single Codebase: All code lives in one repository. This includes modules for authentication, data processing, API endpoints, and UI rendering. Version control (e.g., Git) manages the entire app as one entity.
- Tight Coupling: Components are highly interdependent. For instance, a change in the database schema might require updates across UI controllers, business logic, and data access objects (DAOs). This coupling can lead to ripple effects during modifications.
- Shared Memory and Resources: Since everything runs in the same process (or a few closely linked processes), components share memory space, databases, and other resources. This enables fast inter-component communication via function calls rather than network APIs.
- Unified Deployment: The application is deployed as a single unit—e.g., a WAR file for Java apps, a Docker container, or an executable binary. Scaling involves replicating the entire monolith (horizontal scaling) or upgrading hardware (vertical scaling).
- Technology Stack Uniformity: Typically, a monolith uses a single programming language and framework stack (e.g., Ruby on Rails or Java Spring Boot for the whole app). While polyglot elements can be introduced (e.g., embedding Python scripts in a Java app), they remain within the monolith's boundaries.
- Synchronous Communication: Internal interactions are usually synchronous (e.g., direct method calls), which simplifies debugging but can create bottlenecks.
In terms of layers, a typical monolith follows a layered architecture:
- Presentation Layer: Handles UI and user inputs.
- Application Layer: Manages business rules and orchestration.
- Domain Layer: Core business entities and logic.
- Infrastructure Layer: Deals with external concerns like databases, caching, or third-party APIs.
These layers are logical separations within the same codebase, not physical ones.
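As a rough sketch, the four layers above can live in one process as plain modules whose interactions are ordinary function calls rather than network hops. All class and function names here are illustrative, not from any particular framework:

```python
# Logical layers of a monolith in one process: each "layer" is just a
# class, and calls between layers are plain function calls.

class InfrastructureLayer:
    """Stands in for a database; here it is an in-memory dict."""
    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class DomainLayer:
    """Core business rule: usernames must be non-empty and lowercase."""
    @staticmethod
    def normalize_username(name):
        if not name.strip():
            raise ValueError("username must not be empty")
        return name.strip().lower()


class ApplicationLayer:
    """Orchestrates a use case across domain and infrastructure."""
    def __init__(self, infra):
        self.infra = infra

    def register_user(self, raw_name):
        name = DomainLayer.normalize_username(raw_name)
        self.infra.save(name, {"username": name})
        return name


class PresentationLayer:
    """Turns a 'request' into a call on the application layer."""
    def __init__(self, app):
        self.app = app

    def handle_signup(self, form):
        username = self.app.register_user(form["username"])
        return f"Welcome, {username}!"


app = ApplicationLayer(InfrastructureLayer())
ui = PresentationLayer(app)
print(ui.handle_signup({"username": "  Alice "}))  # Welcome, alice!
```

Note that all four objects live in the same runtime: a change to any layer still means rebuilding and redeploying the whole artifact.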
3. Advantages of Monolithic Architecture¶
Monoliths shine in certain scenarios due to their simplicity:
- Ease of Development: Developers work in a single codebase, reducing context-switching. IDEs like IntelliJ or VS Code provide seamless navigation, autocompletion, and refactoring across the app. For small teams, this accelerates initial development—e.g., a startup can build an MVP (Minimum Viable Product) quickly without worrying about service boundaries.
- Simplified Testing and Debugging: End-to-end tests cover the entire app in one go. Debugging is straightforward since you can step through code in a single process. Tools like JUnit (Java) or RSpec (Ruby) integrate easily.
- Performance Efficiency: No network overhead for internal calls means lower latency. For compute-intensive tasks, shared memory allows efficient data passing (e.g., an in-process cache or plain local variables).
- Straightforward Deployment and Operations: Deploying one artifact simplifies CI/CD pipelines (e.g., using Jenkins or GitHub Actions). Monitoring is centralized—tools like New Relic or Prometheus can instrument the whole app without distributed tracing complexities.
- Cost-Effectiveness for Small-Scale Apps: Less infrastructure overhead; a single server or cloud instance suffices. Scaling vertically (adding CPU/RAM) is often cheaper initially than managing multiple services.
- Transactional Consistency: Easier to maintain ACID properties in databases since transactions span the monolith without distributed sagas or two-phase commits.
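The transactional-consistency advantage is easy to demonstrate: a single local database transaction can cover work belonging to two logical modules. A sketch using Python's built-in sqlite3 (the table names are illustrative):

```python
import sqlite3

# In a monolith, one local transaction can cover work that belongs to
# two logical modules (orders and inventory) -- no saga or 2PC needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, qty INTEGER)")
conn.execute("CREATE TABLE stock (item TEXT, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 10)")
conn.commit()

try:
    with conn:  # single ACID transaction: both writes commit or neither does
        conn.execute("INSERT INTO orders VALUES (1, 3)")
        conn.execute("UPDATE stock SET qty = qty - 3 WHERE item = 'widget'")
except sqlite3.Error:
    pass  # on failure, both statements roll back together

remaining = conn.execute(
    "SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]
print(remaining)  # 7
```

If the same two writes lived in separate services with separate databases, this guarantee would require a saga or two-phase commit instead of one `with conn:` block.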
4. Disadvantages and Limitations¶
As systems grow, monoliths reveal pain points:
- Scalability Challenges: You can't scale individual components independently. If the payment module is CPU-heavy, you must scale the entire app, leading to resource waste. Horizontal scaling helps but replicates everything, increasing costs.
- Maintenance Hell: Large codebases become unwieldy (e.g., millions of lines). Tight coupling makes changes risky—a bug in one module can crash the whole app. Refactoring is daunting, often leading to "technical debt."
- Slow Build and Deployment Times: Rebuilding the monolith for minor changes can take minutes or hours, slowing release cycles. This frustrates agile teams aiming for frequent deployments.
- Technology Lock-In: Switching frameworks or languages requires a full rewrite. For example, migrating from monolithic PHP to Node.js is a massive undertaking.
- Team Collaboration Issues: As teams grow, concurrent development leads to merge conflicts and coordination overhead. Monoliths don't lend themselves well to domain-driven design (DDD) with bounded contexts.
- Reliability Risks: A single point of failure—if the monolith crashes, the entire app goes down. No inherent fault isolation.
- Innovation Barriers: Experimenting with new tech (e.g., adopting AI models) is hard without disrupting the core.
In extreme cases, monoliths evolve into "distributed monoliths," where services are split but still tightly coupled via synchronous calls, inheriting the worst of both worlds.
5. Implementation Details and Best Practices¶
To build an effective monolith:
- Modular Design Within the Monolith: Use techniques like hexagonal architecture (ports and adapters) or clean architecture to decouple layers. For example, define interfaces for data access, allowing easy swapping of databases (e.g., from MySQL to PostgreSQL).
- Code Organization: Structure by feature (vertical slices) rather than layers. E.g., group user authentication files together instead of separating all controllers/models.
- Dependency Management: Use tools like Maven (Java), npm (Node.js), or Composer (PHP) to handle libraries. Avoid global state; prefer dependency injection (e.g., Spring DI in Java).
- Database Integration: Often uses a single relational database (e.g., SQL Server). Employ ORMs like Hibernate (Java) or Entity Framework (.NET) for abstraction.
- API Exposure: Expose functionality via a unified API gateway or REST endpoints within the monolith.
- Scaling Strategies: Implement load balancing (e.g., NGINX), caching (Redis), and asynchronous tasks (e.g., via message queues like RabbitMQ, even in a monolith).
- Testing Pyramid: Focus on unit tests (~70%), integration tests (~20%), and E2E tests (~10%). Use mocks for external dependencies.
- Monitoring and Logging: Integrate the ELK stack (Elasticsearch, Logstash, Kibana) for centralized logs.
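The first practice above (interfaces for data access plus dependency injection) might look like this in outline. The repository and service names are invented for illustration; a MySQL or PostgreSQL adapter would implement the same interface:

```python
from abc import ABC, abstractmethod

# Port: business code depends on this interface, not on any database.
class UserRepository(ABC):
    @abstractmethod
    def add(self, user_id: str) -> None: ...

    @abstractmethod
    def exists(self, user_id: str) -> bool: ...


# Adapter: one concrete implementation; a PostgreSQL-backed adapter
# could be swapped in without touching the service below.
class InMemoryUserRepository(UserRepository):
    def __init__(self):
        self._ids = set()

    def add(self, user_id):
        self._ids.add(user_id)

    def exists(self, user_id):
        return user_id in self._ids


class UserService:
    def __init__(self, repo: UserRepository):  # dependency injection
        self.repo = repo

    def register(self, user_id):
        if self.repo.exists(user_id):
            raise ValueError("duplicate user")
        self.repo.add(user_id)


svc = UserService(InMemoryUserRepository())
svc.register("u1")
print(svc.repo.exists("u1"))  # True
```

The service never imports a database driver; swapping storage means writing a new adapter and changing one constructor argument.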
A key best practice is the "Majestic Monolith" philosophy (popularized by DHH of Basecamp): Keep it monolithic but highly modular, avoiding premature microservices migration.
6. Real-World Examples¶
- WordPress: A classic PHP monolith powering millions of sites. Plugins extend it, but core is unified.
- Shopify (early versions): Started as a Ruby on Rails monolith, scaling to handle e-commerce for thousands.
- Netflix (pre-microservices): Began monolithic but migrated as scale demanded.
- Etsy: Maintained a monolithic Perl/PHP app for years, using feature flags for safe deployments.
- Basecamp: A Rails monolith handling project management without distributed services.
Many legacy enterprise systems (e.g., banking software) remain monolithic due to reliability needs.
7. Comparison to Other Architectures¶
To contextualize, here's a table comparing monolithic to microservices and serverless architectures:
| Aspect | Monolithic | Microservices | Serverless |
|---|---|---|---|
| Structure | Single unit, tight coupling | Independent services, loose coupling | Event-driven functions, no servers |
| Scalability | Vertical/horizontal (whole app) | Independent per service | Auto-scales per function |
| Development Speed | Fast initially, slows with size | Slower setup, faster iterations | Very fast for small tasks |
| Complexity | Low (simple ops) | High (distributed systems) | Medium (vendor lock-in) |
| Fault Tolerance | Low (single failure point) | High (isolation) | High (managed by provider) |
| Use Case | Small-medium apps, MVPs | Large-scale, complex domains | APIs, event processing |
| Tech Flexibility | Limited (uniform stack) | High (polyglot) | High (but provider-dependent) |
| Overhead | Low (no network calls internally) | High (API calls, service discovery) | Low (pay-per-use) |
Monoliths are often the right choice for small-to-medium apps and simple domains; as scale and team size grow, consider a Strangler Fig migration to microservices.
8. Evolution and Modern Relevance¶
Monoliths aren't obsolete—they've evolved. With containers (Docker) and orchestration (Kubernetes), "modular monoliths" allow internal modularity while deploying as one. Frameworks like Laravel (PHP) or Django (Python) encourage monolithic designs with built-in modularity.
The rise of microservices in the 2010s (pioneered by Amazon, Netflix) highlighted monolith flaws, but many companies (e.g., Stripe) stick with monoliths for speed. Trends like "monorepos" (single repo for multiple services) blend aspects.
In cloud-native eras, monoliths can leverage serverless elements (e.g., AWS Lambda for side tasks) via hybrids.
9. Common Challenges and Mitigation¶
- Challenge: Codebase Bloat → Mitigate with code reviews, linters (e.g., ESLint), and automated refactoring tools.
- Challenge: Deployment Downtime → Use blue-green deployments or canary releases.
- Challenge: Performance Bottlenecks → Profile with tools like YourKit (Java) or Flame Graphs; optimize hotspots.
- Challenge: Team Scaling → Adopt feature teams and trunk-based development.
- Migration Path: If outgrowing a monolith, use the Strangler Fig pattern—gradually replace parts with microservices while keeping the core intact.
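The Strangler Fig migration path can be sketched as a thin router that sends already-migrated paths to a new service and everything else to the legacy monolith. The handlers below are stand-ins for real HTTP backends:

```python
# Strangler Fig sketch: a routing shim in front of the monolith. Paths
# are moved to MIGRATED_PREFIXES one at a time; the monolith shrinks
# gradually while the system keeps serving traffic.

def legacy_monolith(path):
    return f"monolith handled {path}"

def new_orders_service(path):
    return f"orders-service handled {path}"

# Paths already "strangled" out of the monolith.
MIGRATED_PREFIXES = ["/orders"]

def route(path):
    for prefix in MIGRATED_PREFIXES:
        if path.startswith(prefix):
            return new_orders_service(path)
    return legacy_monolith(path)

print(route("/orders/42"))  # orders-service handled /orders/42
print(route("/users/7"))    # monolith handled /users/7
```

In practice this routing layer is usually an API gateway or reverse proxy (e.g., NGINX location rules) rather than application code, but the decision logic is the same.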
2. Microservices Architecture¶
1. Core Definition (What “Microservices” Actually Means)¶
Microservices architecture is an architectural style that structures an application as a collection of loosely coupled, independently deployable services that:
- Each implement a single business capability (or a small, cohesive set of capabilities)
- Own their own data store (when needed)
- Communicate with lightweight protocols (almost always HTTP/JSON, gRPC, or asynchronous events)
- Can be written in different languages and frameworks
- Are deployed, scaled, monitored, and versioned independently
The seminal definition comes from Lewis and Fowler (2014):
“The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms… These services are built around business capabilities and independently deployable by fully automated deployment machinery.”
2. The Nine Key Characteristics (The Real Checklist)¶
If your system doesn’t satisfy most of these, you don’t have true microservices—you have a distributed monolith.
| # | Characteristic | What It Really Means in Practice |
|---|---|---|
| 1 | Componentization via Services | A service is the unit of deployment and runtime isolation |
| 2 | Organized Around Business Capabilities | Teams and services follow Domain-Driven Design bounded contexts (e.g., Order, Catalog, Payment) |
| 3 | Products Not Projects | Services are long-lived products owned by permanent teams, not temporary projects |
| 4 | Smart Endpoints, Dumb Pipes | Business logic lives in services, not in ESBs or heavy middleware |
| 5 | Decentralized Governance | Teams choose their own tech stack, data storage, etc. |
| 6 | Decentralized Data Management | Each service has its own database (polyglot persistence); no shared schema |
| 7 | Infrastructure Automation | CI/CD, IaC, automated testing, and zero-downtime deployments are mandatory |
| 8 | Design for Failure | Services expect network failures, timeouts, and other services to be down |
| 9 | Evolutionary Design | Services can be replaced, rewritten, or killed without affecting the rest of the system |
3. Advantages (When They Actually Materialize)¶
| Advantage | Real-World Impact (only if you do microservices well) |
|---|---|
| Independent Deployability | Deploy 50 times a day without coordinating with 10 other teams |
| Independent Scalability | Scale only the video-transcoding service during peak, leave billing untouched |
| Technology Heterogeneity | Use Go for low-latency services, Node.js for admin UI, Python for ML, Rust for crypto, etc. |
| Resilience & Fault Isolation | One service crashing doesn’t take down checkout (Netflix Chaos Monkey culture) |
| Team Autonomy & Velocity | Small teams (3–8 people) own a service end-to-end → faster decisions |
| Replaceability | Rewrite the entire recommendation engine in a new stack without touching anything else |
4. The Dark Side: Costs and Trade-offs¶
Microservices are an expensive architecture. Most of the pain comes from distribution, not size.
| Pain Point | Concrete Consequences |
|---|---|
| Distributed System Complexity | Eventual consistency, distributed tracing, latency, cascading failures |
| Data Consistency | You lose ACID across services → sagas, CQRS, event sourcing required |
| Operational Overhead | Hundreds of containers, service discovery, observability, secrets management |
| Testing Complexity | Contract testing (Pact), consumer-driven contracts, end-to-end tests become hard |
| Debugging in Production | Correlating logs across 50 services (trace IDs, OpenTelemetry) |
| Deployment Risk | More moving parts = higher chance of partial failures |
| Developer Cognitive Load | New hires need to understand service boundaries, event flows, and 12-factor principles |
| Higher Infrastructure Cost | At small/medium scale, microservices are 3–10× more expensive than a well-written monolith |
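Losing ACID across services is what forces patterns like sagas. A minimal orchestrated-saga sketch, assuming each step pairs an action with a compensating action; the step names are illustrative:

```python
# Orchestrated saga sketch: each step pairs an action with a compensating
# action; when a step fails, completed steps are undone in reverse order.

log = []

def reserve_stock():
    log.append("reserve stock")

def release_stock():
    log.append("release stock")

def charge_payment():
    raise RuntimeError("payment failed")

def refund_payment():
    log.append("refund payment")  # never runs: payment never succeeded

def run_saga(steps):
    """steps: list of (action, compensation) pairs."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        for compensation in reversed(completed):
            compensation()
        return "rolled back"
    return "committed"

result = run_saga([
    (reserve_stock, release_stock),
    (charge_payment, refund_payment),
])
print(result)  # rolled back
print(log)     # ['reserve stock', 'release stock']
```

Real sagas must also handle the orchestrator crashing mid-flight (persistent state, retries, idempotent compensations), which is exactly the operational overhead the table above warns about.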
5. When You Should Actually Use Microservices¶
| Situation | Verdict | Why |
|---|---|---|
| < 10 engineers, building an MVP | Avoid | Premature decomposition adds distributed-systems cost before the domain boundaries are understood |
| Multiple teams forced to work on one monolith | Strong candidate | The monolith becomes an integration and coordination bottleneck |
| Very different scaling or performance needs | Strong candidate | E.g., chat needs 100k RPS, billing needs 100 RPS |
| Regulatory or security boundaries | Often required | PCI-compliant payment service must be isolated |
| You want to experiment with ML, Web3, etc. | Good fit | You can add a new Rust-based crypto service without touching the Java monolith |
| Your domain is simple CRUD | Usually overkill | A modular monolith is 10× simpler and faster |
6. Real-World Examples (2025)¶
| Company | Scale (approx.) | Key Observations |
|---|---|---|
| Netflix | ~2,000 services | Pioneer; Chaos Engineering, Spinnaker, full async with Kafka |
| Amazon | >10,000 services | “Two-pizza teams”, strict API contracts, service ownership culture |
| Uber | ~4,000 services | Went from monolith → microservices 2015–2018; now uses domain-oriented boundaries |
| Shopify | Modular monolith + some services | Proves you can scale to billions without full microservices |
| SoundCloud | Failed migration (2015) | Classic example of “distributed monolith” – services split but still tightly coupled |
| Zalando | ~2,000 services | Open-source Nakadi (event platform), radical team autonomy |
7. Common Architectural Patterns in Microservices¶
| Pattern | When to Use |
|---|---|
| API Gateway | Single entry point (e.g., Netflix Zuul, Kong, AWS API Gateway) |
| Backend for Frontend (BFF) | Mobile vs Web vs Partner have different needs → separate aggregation services |
| Event Sourcing + CQRS | When you need audit, replay, or multiple read models |
| Saga Pattern | Long-running business transactions across services (choreography or orchestration) |
| Strangler Fig | Incremental migration from monolith → new services replace old ones gradually |
| Circuit Breaker | Prevent cascading failures (Hystrix → Resilience4j, Istio) |
| Service Mesh | Sidecar pattern for traffic management (Istio, Linkerd) |
| Database per Service | Strong isolation (but join problems → use materialized views or API composition) |
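The circuit breaker from the table above reduces to a few lines: count consecutive failures, and once a threshold is reached, fail fast instead of calling the downstream service. This sketch omits the half-open state with timed recovery that real implementations such as Resilience4j add:

```python
# Minimal circuit-breaker sketch: after `threshold` consecutive failures
# the breaker opens and callers fail fast instead of hitting the service.

class CircuitBreaker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.threshold

    def call(self, fn, *args):
        if self.is_open:
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success resets the failure count
        return result


breaker = CircuitBreaker(threshold=2)

def flaky():
    raise ConnectionError("downstream timeout")

for _ in range(2):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass

print(breaker.is_open)  # True
```

Failing fast is what prevents the cascading failures listed in section 4: callers stop queueing up behind a dead dependency.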
8. Modern Tech Stack (2025 Reality)¶
| Layer | Popular Choices (2025) |
|---|---|
| Runtime | Go, Rust, Kotlin, Node.js, .NET 8+, Java 21+ (virtual threads help a lot) |
| Container Orchestration | Kubernetes (99% of large companies), Nomad or ECS for simpler cases |
| Service Mesh | Istio, Linkerd, Cilium |
| Event Streaming | Kafka, Pulsar, Redpanda |
| Observability | OpenTelemetry + Grafana Tempo (traces), Loki (logs), Prometheus/Mimir (metrics) |
| API Gateway | Kong, Traefik, Envoy, GraphQL Federation (Apollo Router) |
| CI/CD | ArgoCD + Tekton, GitHub Actions, Harness |
9. The "Modular Monolith" Counter-Movement (2025)¶
Many companies (Shopify, GitHub, Basecamp, Intercom) now advocate:
- Start with a very modular monolith (clear bounded contexts, interfaces, no package-by-layer)
- Extract to microservices only when you hit concrete pain (scaling, team boundaries, tech diversity) → This is often called the Majestic Monolith or Modular Monolith pattern.
3. Service-Oriented Architecture (SOA)¶
1. Precise Definition¶
Service-Oriented Architecture (SOA) is an architectural style that structures an application as a collection of services that:
- Communicate over a network (usually via standardized protocols)
- Are discoverable and have explicit contracts
- Are reusable across the enterprise
- Are loosely coupled and composable
- Are typically owned and governed centrally (this is the key difference from microservices)
First popularized in the early 2000s (Gartner coined the term in 1996, but it exploded ~2005–2012).
2. The Original SOA Vision (circa 2005–2010)¶
| Pillar | What the Vendors Promised | What Actually Happened (Reality) |
|---|---|---|
| Enterprise Service Bus (ESB) | One magical middleware to rule them all | Became the heaviest, most expensive monolith in history |
| WS-* Standards | Interoperability via SOAP, WSDL, WS-Security, WS-Addressing… | Extremely verbose XML, terrible performance |
| Service Registry / Repository | Central catalog of all services (UDDI) | Almost nobody used UDDI in production |
| Business-IT Alignment | Business people would drag-and-drop services in a BPM tool | Never happened |
| Reuse at Enterprise Scale | Write once, use everywhere | Reuse failed spectacularly — services were too coarse or too generic |
3. Classic SOA Technology Stack (2006–2014)¶
| Layer | Typical Products (the infamous “SOA Suite”) |
|---|---|
| ESB | IBM WebSphere ESB, Oracle Service Bus, Mule ESB, TIBCO, webMethods |
| Messaging | JMS, IBM MQ, SonicMQ |
| Protocols | SOAP, XML, XSD, WS-*, BPEL |
| Orchestration | BPEL engines (Oracle BPEL PM, ActiveVOS) |
| Governance | HP Systinet, SOA Software, Software AG CentraSite |
| Identity | WS-Security, SAML, XACML |
| Adapters | Hundreds of JCA adapters to talk to SAP, Siebel, mainframes |
These stacks cost millions of dollars in licenses and required armies of integration specialists.
4. Why Classic SOA Failed So Hard (The Real Reasons)¶
| # | Failure Mode | Explanation |
|---|---|---|
| 1 | The ESB became a distributed monolith | All logic moved into the bus → single point of failure, performance nightmare |
| 2 | Services were too big and too generic | “Customer Service”, “Order Service” that tried to serve 47 departments |
| 3 | Over-standardization (WS-*) | 500 KB SOAP envelopes to transfer 200 bytes of data |
| 4 | Central governance strangled velocity | Every service change required architecture review board approval |
| 5 | Reuse myth | Departments refused to depend on someone else’s service (NIH syndrome) |
| 6 | Vendor lock-in and insane costs | $5M+ just for licenses before writing a single line of business code |
| 7 | Orchestration hell | Long-running BPEL processes that were impossible to debug |
Result: By 2013–2014, the industry declared “SOA is dead” (Anne Thomas Manes, 2009: “SOA is dead; long live services”).
5. Where SOA Actually Succeeded (And Still Lives in 2025)¶
Despite the failures, some organizations got huge value from SOA principles:
| Company / Industry | How They Made SOA Work |
|---|---|
| Banks (global tier-1) | Used SOA to create canonical data models + reusable integration services for core banking |
| Airlines | Sabre, Amadeus — massive SOA systems (still running in 2025) for reservation, inventory |
| Government & Healthcare | Built integration layers on top of legacy mainframes using ESB + adapters |
| Large retailers | Used SOA for master data management (MDM) and cross-channel orchestration |
These successes had three things in common:
- Strong enterprise architecture governance
- Deep pockets
- Very stable, slowly changing domains
6. SOA vs Microservices — The Real Comparison (2025 Lens)¶
| Dimension | Traditional SOA (2005–2014) | Microservices (2015–2025) |
|---|---|---|
| Service Granularity | Coarse-grained (enterprise services) | Fine-grained (bounded context) |
| Governance | Central (architecture board) | Decentralized (team owns its stack) |
| Communication | Heavy (SOAP, ESB, WS-*) | Light (REST/JSON, gRPC, events) |
| Middleware | Smart pipes (ESB does routing, transformation) | Dumb pipes (Kafka, raw HTTP, service mesh) |
| Data Management | Shared databases common | Database per service (strict) |
| Technology Choice | Enforced standards (Java + Oracle stack) | Polyglot freedom |
| Deployment | Big-bang releases through ESB | Independent CI/CD per service |
| Primary Goal | Reuse & integration | Speed & autonomy |
| Typical Outcome | Expensive failure or slow success | Faster teams, higher operational cost |
Bottom line: Microservices = SOA done right, after learning all the painful lessons.
7. Modern SOA (2025) — It Never Really Died¶
Today, “SOA” has been rebranded and lives on in more pragmatic forms:
| Modern Name | What It Actually Is (SOA 2.0) |
|---|---|
| Enterprise Integration Architecture | Using Kafka + lightweight APIs instead of ESB |
| Event-Driven Architecture | SOA’s orchestration replaced with choreographed events |
| API-led Connectivity (MuleSoft) | Marketing term for “SOA with REST instead of SOAP” |
| Domain-Driven Design + Services | Bounded contexts exposed as services (this is microservices) |
| Internal Platforms / Platform Engineering | Central teams providing shared services (logging, auth, etc.) — this is controlled SOA |
8. When You Might Still Choose “SOA-style” in 2025¶
| Scenario | Recommended Style |
|---|---|
| Heavy mainframe/legacy integration | Modern SOA (Kafka + connectors) |
| Strict regulatory governance (banking, pharma) | Controlled SOA with central governance |
| You already own MuleSoft/Anypoint, webMethods, etc. | Keep using it — don’t rip and replace |
| Greenfield product company with <100 engineers | Avoid classic SOA completely |
4. Layered Architecture¶
1. Definition and Core Concept¶
Layered architecture (also known as n-tier architecture) is one of the most common architectural patterns in software development. It organizes the system into horizontal layers, where each layer has a specific role and responsibility. Components within a layer deal with functionality specific to that layer, and each layer provides services to the layer above it and consumes services from the layer below it.
The fundamental principle is separation of concerns—each layer focuses on a distinct aspect of the application, making the system easier to develop, test, maintain, and scale. This pattern has been the backbone of enterprise software development for decades and remains highly relevant today.
2. Standard Layers¶
The classic layered architecture consists of four primary layers:
┌─────────────────────────────────────────┐
│ Presentation Layer │ ← User interface, views, controllers
├─────────────────────────────────────────┤
│ Business Layer │ ← Business rules, workflows, validation
├─────────────────────────────────────────┤
│ Persistence Layer │ ← Data access, ORM, repositories
├─────────────────────────────────────────┤
│ Database Layer │ ← Database, file system, external data
└─────────────────────────────────────────┘
Presentation Layer:
- Handles all user interface and browser communication logic
- Responsible for presenting information to the user and interpreting user commands
- Includes controllers, views, and UI components
- Technologies: HTML/CSS/JavaScript, React, Angular, Vue.js, or server-side templates
Business Layer (Domain Layer):
- Contains the core business logic and rules
- Implements use cases, workflows, and domain-specific calculations
- Validates data and enforces business constraints
- Should be technology-agnostic and free of UI or database concerns
Persistence Layer (Data Access Layer):
- Manages data access and persistence operations
- Contains repositories, data mappers, and ORM configurations
- Abstracts the database implementation from business logic
- Technologies: Hibernate, Entity Framework, SQLAlchemy, Prisma
Database Layer:
- The actual data storage mechanism
- Can include relational databases, NoSQL stores, file systems, or external APIs
- Technologies: PostgreSQL, MySQL, MongoDB, Redis
3. Layer Communication Rules¶
Strict layered architecture enforces these rules:
| Rule | Description |
|---|---|
| Closed Layers | A request must pass through each layer sequentially; no skipping layers |
| Open Layers | Layers can be bypassed for performance (but reduces isolation) |
| Downward Dependencies | Layers only depend on layers below them, never above |
| No Circular Dependencies | Layer A cannot depend on Layer B if Layer B depends on Layer A |
Closed vs. Open Layers:
In a closed architecture, a request from the presentation layer must go through business → persistence → database. This maximizes isolation but can add latency.
In an open architecture, the presentation layer might directly access the persistence layer for simple CRUD operations, bypassing business logic. This improves performance but increases coupling.
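The difference between closed and open layering shows up directly in the call path. In this sketch (all names illustrative), the open path skips the business layer and therefore skips its rule:

```python
# Closed vs open layering: the closed path goes presentation -> business
# -> persistence; the open path lets presentation hit persistence directly.

class Repository:
    def get(self, item_id):
        return {"id": item_id, "name": "widget"}

class Service:
    def __init__(self, repo):
        self.repo = repo

    def get_item(self, item_id):
        item = self.repo.get(item_id)
        item["name"] = item["name"].title()  # business rule applied here
        return item

repo = Repository()
service = Service(repo)

closed_result = service.get_item(1)  # closed: goes through the service
open_result = repo.get(2)            # open: bypasses the business layer

print(closed_result["name"])  # Widget
print(open_result["name"])    # widget
```

The open path is faster for trivial reads, but any rule the service enforces (here, the title-casing) is silently lost, which is the coupling risk the text describes.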
4. Advantages¶
| Advantage | Explanation |
|---|---|
| Separation of Concerns | Each layer has a focused responsibility, making code easier to understand |
| Testability | Layers can be tested in isolation using mocks for dependencies |
| Maintainability | Changes in one layer don't ripple through the entire system |
| Team Organization | Different teams can work on different layers with minimal coordination |
| Technology Flexibility | Database or UI technology can change without affecting business logic |
| Familiarity | Most developers understand this pattern, reducing onboarding time |
5. Disadvantages and Limitations¶
| Disadvantage | Explanation |
|---|---|
| Performance Overhead | Requests traverse multiple layers, adding latency |
| Monolithic Tendency | Can evolve into a "big ball of mud" if layers aren't well-defined |
| Sinkhole Anti-Pattern | Requests that pass through layers without any processing |
| Difficulty Scaling | Hard to scale individual layers independently |
| Deployment Coupling | Changes require redeploying the entire application |
The Sinkhole Anti-Pattern:
This occurs when requests flow through layers without meaningful processing—the business layer simply passes data from presentation to persistence without adding value. If more than 20% of requests are sinkholes, consider whether layered architecture is appropriate.
6. Variants¶
Three-Tier Architecture:
- Presentation tier (client)
- Application tier (server)
- Data tier (database)
- Often deployed on separate physical machines
N-Tier Architecture:
- Extends three-tier with additional layers
- May include caching tier, integration tier, security tier
- Common in enterprise applications
Layered with Domain-Driven Design:
┌─────────────────────────────────────────┐
│ User Interface Layer │
├─────────────────────────────────────────┤
│ Application Layer │ ← Use cases, orchestration
├─────────────────────────────────────────┤
│ Domain Layer │ ← Entities, value objects, domain services
├─────────────────────────────────────────┤
│ Infrastructure Layer │ ← Repositories, external services
└─────────────────────────────────────────┘
7. Real-World Examples¶
- Traditional Java EE Applications: Servlet/JSP → EJB → JDBC → Database
- Spring Boot Applications: Controllers → Services → Repositories → JPA
- ASP.NET MVC: Views/Controllers → Services → Entity Framework → SQL Server
- Django Applications: Templates/Views → Forms/Serializers → Models → ORM
8. Best Practices¶
- Keep layers focused: Each layer should have a single, well-defined responsibility
- Use interfaces between layers: Depend on abstractions, not implementations
- Avoid layer leakage: Don't let database entities reach the presentation layer
- Consider DTOs: Use Data Transfer Objects to pass data between layers
- Watch for sinkholes: If a layer isn't adding value, question the architecture
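The DTO practice above can be sketched with dataclasses: the persistence entity carries a password hash that must never leak upward, while the DTO exposes only what the presentation layer needs. Field names here are illustrative:

```python
from dataclasses import dataclass

# DTO sketch: the persistence entity never crosses into the presentation
# layer; a mapper produces a DTO carrying only UI-safe fields.

@dataclass
class UserEntity:        # persistence-layer object
    id: int
    email: str
    password_hash: str   # must not leak upward

@dataclass(frozen=True)
class UserDTO:           # what the presentation layer receives
    id: int
    email: str

def to_dto(entity: UserEntity) -> UserDTO:
    return UserDTO(id=entity.id, email=entity.email)

entity = UserEntity(1, "a@example.com", "hashed-secret")
dto = to_dto(entity)
print(dto)  # UserDTO(id=1, email='a@example.com')
```

Freezing the DTO also prevents the presentation layer from mutating it and expecting the change to persist, keeping the downward-dependency rule intact.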
5. Event-Driven Architecture¶
1. Definition and Core Concept¶
Event-Driven Architecture (EDA) is an architectural paradigm that promotes the production, detection, consumption, and reaction to events. An event represents a significant change in state—something that happened in the system (e.g., "OrderPlaced," "PaymentReceived," "UserRegistered").
Unlike request-response systems where components actively call each other, EDA inverts control: components emit events when something interesting happens, and other components react to those events asynchronously. This creates a highly decoupled, scalable, and responsive system.
EDA has become the foundation of modern distributed systems, enabling real-time data processing, microservices communication, and reactive applications at massive scale.
2. Core Components¶
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Event │ emit │ Event │ consume │ Event │
│ Producer │────────▶│ Channel │────────▶│ Consumer │
└──────────────┘ │ (Broker) │ └──────────────┘
└──────────────┘
│
▼
┌──────────────────────┐
│ Event Store │
│ (optional persistence)│
└──────────────────────┘
Event Producers:
- Generate events when state changes occur
- Don't know (or care) who consumes their events
- Examples: User service emitting "UserCreated," Payment service emitting "PaymentProcessed"
Event Channels (Brokers):
- Transport mechanism for events
- May provide persistence, ordering, replay
- Technologies: Apache Kafka, RabbitMQ, AWS EventBridge, Google Pub/Sub, Redis Streams
Event Consumers (Processors):
- Subscribe to events of interest
- React by executing business logic
- May emit new events (event chaining)
Event Store:
- Persists events for replay, auditing, and recovery
- Enables event sourcing pattern
- Technologies: EventStore, Apache Kafka (with retention), AWS DynamoDB Streams
3. Event-Driven Topologies¶
Mediator Topology:
A central event mediator orchestrates the flow of events through multiple processing steps:
┌─────────────────┐
│ Event Mediator │
│ (Orchestrator) │
└────────┬────────┘
┌────────────┼────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│Processor│ │Processor│ │Processor│
│ A │ │ B │ │ C │
└─────────┘ └─────────┘ └─────────┘
- Use when: Complex event flows requiring coordination
- Pros: Centralized control, easier to understand flow
- Cons: Single point of failure, mediator can become bottleneck
Broker Topology:
Events flow through a message broker without central orchestration:
┌─────────┐ ┌─────────────┐ ┌─────────┐
│Producer │─────▶│ Broker │─────▶│Consumer │
└─────────┘ │ (Kafka) │ └─────────┘
│ │ ┌─────────┐
┌─────────┐ │ │─────▶│Consumer │
│Producer │─────▶│ │ └─────────┘
└─────────┘ └─────────────┘ ┌─────────┐
│─────────────▶│Consumer │
└─────────┘
- Use when: High throughput, decoupled processing
- Pros: Highly scalable, no single point of failure
- Cons: Harder to track event flow, eventual consistency challenges
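The decoupling in the broker topology can be seen in a toy in-memory broker (a sketch, not a real broker—no persistence, ordering, or delivery guarantees): the producer publishes to a topic and never learns who, or how many, consumers receive the event.

```python
from collections import defaultdict

class Broker:
    """Toy in-memory broker: producers publish to a topic,
    consumers subscribe; neither side knows about the other."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)

broker = Broker()
audit, billing = [], []
broker.subscribe("orders", audit.append)    # two independent consumers
broker.subscribe("orders", billing.append)
broker.publish("orders", {"type": "OrderPlaced", "order_id": 42})
```

Adding a third consumer later requires no change to the producer—exactly the extensibility property listed below.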
4. Event Patterns¶
| Pattern | Description | Use Case |
|---|---|---|
| Event Notification | Simple notification that something happened; minimal data | Triggering downstream processes |
| Event-Carried State Transfer | Event contains all data needed by consumers | Reduce coupling, enable offline processing |
| Event Sourcing | Store all state changes as immutable events | Audit trails, time travel, rebuilding state |
| CQRS | Separate read and write models | Optimize reads and writes independently |
Event Sourcing Deep Dive:
Instead of storing current state, store the sequence of events that led to that state:
Traditional: Account { balance: 150 }
Event Sourced:
1. AccountOpened { initial_deposit: 100 }
2. MoneyDeposited { amount: 100 }
3. MoneyWithdrawn { amount: 50 }
→ Replay to get current balance: 150
Benefits:
- Complete audit trail
- Can rebuild state at any point in time
- Natural fit for event-driven systems
Challenges:
- Event schema evolution
- Rebuilding large aggregates can be slow (use snapshots)
- Requires different mental model
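The replay step can be sketched as a small fold over the event list; the event dicts below mirror the account example above (field names are illustrative).

```python
# Rebuild current balance by replaying the stored events in order
events = [
    {"type": "AccountOpened", "initial_deposit": 100},
    {"type": "MoneyDeposited", "amount": 100},
    {"type": "MoneyWithdrawn", "amount": 50},
]

def replay(events):
    balance = 0
    for e in events:
        if e["type"] == "AccountOpened":
            balance += e["initial_deposit"]
        elif e["type"] == "MoneyDeposited":
            balance += e["amount"]
        elif e["type"] == "MoneyWithdrawn":
            balance -= e["amount"]
    return balance

balance = replay(events)  # 150, matching the traditional stored state
```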
5. Advantages¶
| Advantage | Explanation |
|---|---|
| Loose Coupling | Producers don't know about consumers; components evolve independently |
| Scalability | Add consumers without affecting producers; parallelize processing |
| Responsiveness | Async processing doesn't block users; immediate acknowledgment |
| Fault Tolerance | Events can be replayed; consumers can recover from failures |
| Extensibility | Add new consumers to existing event streams without changes |
| Real-Time Processing | Natural fit for streaming data and real-time analytics |
| Temporal Decoupling | Producers and consumers don't need to be online simultaneously |
6. Disadvantages and Challenges¶
| Challenge | Explanation |
|---|---|
| Eventual Consistency | Data may be stale; harder to reason about system state |
| Event Ordering | Ensuring correct order across partitions/consumers |
| Debugging Complexity | Following event flow across services is challenging |
| Duplicate Events | At-least-once delivery requires idempotent consumers |
| Event Schema Evolution | Changing event structure without breaking consumers |
| Error Handling | Dead letter queues, retry policies, poison message handling |
Handling Event Ordering:
- Use partition keys to ensure related events go to same partition
- Include sequence numbers or timestamps in events
- Design consumers to handle out-of-order events gracefully
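One way to combine the sequence-number and graceful-handling advice is a consumer that buffers early arrivals and applies events strictly in order. This is a sketch (field names and the in-memory buffer are illustrative; a real consumer would also bound the buffer and handle gaps that never fill):

```python
import heapq

class OrderedConsumer:
    """Buffers out-of-order events and applies them by sequence number."""
    def __init__(self):
        self.next_seq = 1
        self.buffer = []   # min-heap of (seq, event)
        self.applied = []

    def receive(self, seq, event):
        heapq.heappush(self.buffer, (seq, event))
        # Drain every event that is now contiguous with what we've applied
        while self.buffer and self.buffer[0][0] == self.next_seq:
            _, ev = heapq.heappop(self.buffer)
            self.applied.append(ev)
            self.next_seq += 1

c = OrderedConsumer()
c.receive(2, "b")  # arrives early: buffered
c.receive(1, "a")  # fills the gap: "a" then "b" are applied
c.receive(3, "c")
```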
Ensuring Idempotency:
def handle_payment_event(event):
    # Check if already processed
    if event_store.already_processed(event.id):
        return  # Idempotent: skip duplicate
    process_payment(event)
    event_store.mark_processed(event.id)
7. Technologies (2025)¶
| Category | Technologies |
|---|---|
| Message Brokers | Apache Kafka, RabbitMQ, Amazon SQS/SNS, Google Pub/Sub |
| Event Streaming | Apache Kafka, Apache Pulsar, AWS Kinesis, Redpanda |
| Event Stores | EventStore, Axon Server, Apache Kafka (with retention) |
| Serverless Events | AWS EventBridge, Azure Event Grid, Google Eventarc |
| Stream Processing | Apache Flink, Kafka Streams, Apache Spark Streaming |
8. Real-World Examples¶
- Netflix: Uses event-driven architecture for content recommendations, viewing history, and personalization across 200M+ subscribers
- Uber: Real-time driver/rider matching, surge pricing, and trip tracking
- LinkedIn: Activity feeds, notifications, and real-time analytics
- Financial Services: Fraud detection, trade processing, market data distribution
9. When to Use Event-Driven Architecture¶
| Scenario | Verdict |
|---|---|
| Real-time data processing | Excellent fit |
| Microservices communication | Strong candidate |
| IoT sensor data ingestion | Excellent fit |
| Audit logging and compliance | Strong candidate |
| Simple CRUD application | Usually overkill |
| Strong consistency requirements | Consider alternatives |
6. Serverless Architecture¶
Serverless architecture delegates infrastructure management to cloud providers, allowing developers to focus solely on code.
- Characteristics: Function as a Service (FaaS), managed services, statelessness
- Advantages: Reduced operational complexity, automatic scaling, pay-per-use pricing
- Disadvantages: Vendor lock-in, cold start latency, complex debugging
- Use cases: Event-driven applications, microservices, applications with variable workloads
1. Precise Definition¶
Serverless is an execution model where:
- You write pure business logic (functions or containers)
- The cloud provider fully manages the runtime, scaling, patching, capacity planning, and high availability
- You are billed only for exact compute time consumed (per 1 ms or per invocation)
Two dominant forms exist today:
| Type | Name | Examples | Typical Duration Limit | Cold Start Reality |
|---|---|---|---|---|
| FaaS (Function as a Service) | Classic Serverless | AWS Lambda, Google Cloud Functions, Azure Functions, Cloudflare Workers | 15 min (AWS), 60 min (GCP 2025) | 50 ms – 2.5 s (language + memory) |
| CaaS (Container as a Service) | Serverless Containers | AWS Fargate, Google Cloud Run, Azure Container Apps, Fly.io GPU, Modal | No hard limit | 200 ms – 8 s (image size) |
In practice, most “serverless” discussions still mean FaaS (Lambda-style functions), but serverless containers are taking over for heavy workloads (ML, video, long-running APIs).
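A minimal FaaS sketch helps make the definition concrete. AWS Lambda's Python convention is a `handler(event, context)` function; the event shape and response body here are illustrative, and the function is called directly so the sketch runs locally.

```python
import json

def handler(event, context=None):
    # Lambda-style entry point: a pure function of the incoming event;
    # the provider owns scaling, patching, and the runtime around it
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }

resp = handler({"name": "dev"})  # invoked locally for illustration
```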
2. The Full Technical Layers of a Real Serverless System (2025)¶
| Layer | 2025 Production Choices (what actually works at scale) |
|---|---|
| Compute | AWS Lambda (70 %), Cloud Run (20 %), Cloudflare Workers (edge), Azure Functions |
| API Gateway | AWS API Gateway REST/HTTP, ALB + Lambda target groups, Cloudflare Workers, Fastly Compute |
| Authentication | Cognito / Auth0 / Firebase Auth → JWT → Lambda Authorizer or Cloud Run IAM |
| Synchronous State | DynamoDB, PlanetScale, TiDB Serverless, Neon Serverless Postgres |
| Asynchronous State / Events | SQS, EventBridge, Kafka (MSK Serverless or Upstash Kafka), Google Pub/Sub |
| File / Object Storage | S3 + CloudFront, R2, GCS |
| Caching | ElastiCache Serverless (Redis), Momento, Dragonfly, Cloudflare KV |
| Observability | AWS X-Ray / Lumigo / Epsagon / Thundra, OpenTelemetry → Honeycomb or Lightstep |
| CI/CD | GitHub Actions → AWS SAM, Serverless Framework, Terraform, SST, Vercel, Railway |
| Local Development | LocalStack, SST Live Lambda, Docker + aws-sam-cli local, Cloud Run emulator |
3. Cold Starts in 2025 – The Numbers Everyone Lies About¶
| Language + Memory | Cold Start Duration (p50) | Cold Start Duration (p99) | Provisioned Concurrency Cost |
|---|---|---|---|
| Node.js 1024 MB | 180–350 ms | 800 ms–1.4 s | ~$0.0000045 per GB-second |
| Python 1024 MB | 220–450 ms | 1.2–2.5 s | — |
| Go 1024 MB | 110–220 ms | 600 ms–1.1 s | — |
| Java 11/21 2048 MB | 1.8–3.2 s | 6–12 s | Use SnapStart → drops to ~400 ms |
| .NET 8 1024 MB | 400–900 ms | 2–4 s | — |
| Rust (custom runtime) | 60–140 ms | 300–700 ms | — |
Mitigations that actually work in 2025:
- AWS Lambda SnapStart (Java) → 10× reduction
- Provisioned Concurrency (expensive but eliminates p99)
- Cloudflare Workers → effectively no cold starts (V8 isolates at the edge)
- Vercel / Netlify Functions → warm across the planet
4. Real-World Patterns That Actually Work at Scale¶
| Pattern | When It Wins | Example Implementation |
|---|---|---|
| API Backend | 90% of new HTTP APIs in 2025 | Lambda + API Gateway + DynamoDB single-table design |
| Event-Driven Backbone | Replacing Kafka in many cases | S3 → EventBridge → multiple Lambdas (fan-out) |
| WebSockets / Real-time | Live dashboards, chat | Lambda + WebSocket API Gateway + DynamoDB Streams |
| Cron Jobs | Anything < 15 min | EventBridge Scheduler → Lambda |
| ML Inference | < 60 s inference | Lambda with container images (up to 10 GB), or Fargate/Cloud Run for GPU |
| Data Processing Pipelines | TB-scale ETL | S3 → Lambda (container) or EMR Serverless or Glue |
| Backend-for-Frontend (BFF) | Mobile + Web different payloads | Separate Lambda per client type |
5. The Hidden Limits You Will Hit (2025 Hard Limits)¶
| Limit | AWS Lambda | Google Cloud Run | Azure Functions |
|---|---|---|---|
| Max memory | 10 GB | 32 GB (2025) | 8 GB |
| Max execution time | 15 minutes | 60 min per request (jobs can run longer) | 10 minutes |
| Deployment package size | 250 MB (zipped) / 10 GB container | 10 GB (container) | 1.5 GB |
| Concurrent executions | 1,000–100,000 (soft) | Unlimited (pay) | 200 per plan |
| Outbound connection reuse | Must reuse TCP connections | Same problem | Same |
6. Cost Reality (2025)¶
| Workload | Monthly Cost (1 M requests + 2 GB-s each) | Equivalent EC2/Fargate cost |
|---|---|---|
| Low-traffic API (100 RPS peak) | $15–40 | $80–150 |
| Medium (10 k RPS peak) | $400–1,200 | $800–2,000 |
| Spiky ML inference (100 GPU-h/day) | $8,000–12,000 (Lambda GPU or Cloud Run GPU) | $6,000–9,000 (reserved) |
Serverless is cheaper until ~40–60% average utilization — then reserved instances win.
7. Where Serverless Fails Spectacularly (Don’t Do This)¶
| Use Case | Why It Breaks |
|---|---|
| Long-running WebSocket connections (>15 min) | Lambda times out |
| Stateful TCP protocols (FTP, SMTP, game servers) | No connection reuse across invocations |
| Heavy background workers (>15 min) | Use Fargate/Cloud Run or ECS |
| Monolithic Java apps with 15-second startup | Cold starts kill latency |
| Vendor lock-in sensitive projects | You’re married to AWS/Google/Azure |
8. The 2025 Winner Stack (What the Cool Kids Actually Run)¶
| Layer | Stack That Scales to $1B+ ARR |
|---|---|
| Frontend | Next.js / Remix / SvelteKit on Vercel or Cloudflare |
| API | Lambda (Node.js/Go/Rust) or Cloud Run (Docker) |
| Database | DynamoDB single-table or PlanetScale/TiDB Serverless |
| Auth | Clerk, Auth0, or WorkOS |
| File uploads | Upload directly to S3/R2 via presigned URLs |
| Background jobs | Inngest, Trigger.dev, or Temporal Serverless |
| Observability | OpenTelemetry → Honeycomb or Lightstep |
7. Hexagonal Architecture (Ports and Adapters Architecture)¶
1. Definition and Core Concept¶
Hexagonal Architecture, introduced by Alistair Cockburn in 2005, structures an application so that the core business logic is isolated at the center, completely independent of external concerns like databases, web frameworks, or third-party services. The architecture uses ports (interfaces) and adapters (implementations) to connect the core to the outside world.
The hexagonal shape is symbolic—it emphasizes that the application has multiple facets for interaction, not just a top-to-bottom flow. The core doesn't know whether it's being driven by a REST API, a CLI, or a test suite; nor does it care whether data is stored in PostgreSQL, MongoDB, or an in-memory store.
This architecture is also known as Ports and Adapters; Clean Architecture, popularized by Robert C. Martin, is a closely related variant.
2. Core Structure¶
┌─────────────────────────────────────┐
│ ADAPTERS │
│ ┌─────────────────────────────┐ │
REST API ─────┼─▶│ PRIMARY PORTS │ │
│ │ (Driving Ports) │ │
CLI ───────┼─▶│ │ │
│ │ ┌─────────────────────┐ │ │
Tests ───────┼─▶│ │ │ │ │
│ │ │ APPLICATION │ │ │
│ │ │ CORE │ │ │
│ │ │ (Business Logic) │ │ │
│ │ │ │ │ │
│ │ └─────────────────────┘ │ │
│ │ │ │
│ │ SECONDARY PORTS │────┼───▶ Database
│ │ (Driven Ports) │────┼───▶ External APIs
│ │ │────┼───▶ Message Queue
│ └─────────────────────────────┘ │
└─────────────────────────────────────┘
The Application Core:
- Contains all business logic, domain entities, and use cases
- Has zero dependencies on frameworks, databases, or external services
- Defines its own interfaces (ports) that adapters must implement
- Written in pure language constructs (plain objects, no framework annotations)
Ports:
- Primary Ports (Driving Ports): Interfaces that define how external actors interact with the application (e.g., OrderService, UserRegistrationUseCase)
- Secondary Ports (Driven Ports): Interfaces that the application uses to interact with external systems (e.g., OrderRepository, PaymentGateway, EmailSender)
Adapters:
- Primary Adapters (Driving Adapters): Implementations that translate external requests into calls to primary ports (e.g., REST controllers, GraphQL resolvers, CLI handlers)
- Secondary Adapters (Driven Adapters): Implementations of secondary ports that connect to actual external systems (e.g., PostgreSQL repository, Stripe payment adapter, SendGrid email adapter)
3. The Dependency Rule¶
The most important rule in hexagonal architecture:
Dependencies always point inward toward the core. The core knows nothing about the outer layers.
Adapters → Ports → Application Core
│ ▲
│ │
└────────────────────┘
(Adapters implement ports defined by core)
This is achieved through Dependency Inversion:
- The core defines interfaces (ports)
- Adapters implement those interfaces
- At runtime, adapters are injected into the core
4. Detailed Example¶
Consider an e-commerce order system:
Primary Port (defined by core):
# ports/input/order_service.py
from abc import ABC, abstractmethod

class OrderService(ABC):
    @abstractmethod
    def place_order(self, order_request: OrderRequest) -> OrderResult:
        pass

    @abstractmethod
    def get_order(self, order_id: str) -> Order:
        pass
Secondary Ports (defined by core):
# ports/output/order_repository.py
from abc import ABC, abstractmethod
from typing import Optional

class OrderRepository(ABC):
    @abstractmethod
    def save(self, order: Order) -> None:
        pass

    @abstractmethod
    def find_by_id(self, order_id: str) -> Optional[Order]:
        pass

# ports/output/payment_gateway.py
class PaymentGateway(ABC):
    @abstractmethod
    def process_payment(self, payment: PaymentRequest) -> PaymentResult:
        pass
Application Core (Use Case):
# core/use_cases/place_order.py
class PlaceOrderUseCase(OrderService):
    def __init__(
        self,
        order_repository: OrderRepository,  # Injected secondary port
        payment_gateway: PaymentGateway     # Injected secondary port
    ):
        self.order_repository = order_repository
        self.payment_gateway = payment_gateway

    def place_order(self, request: OrderRequest) -> OrderResult:
        # Pure business logic—no framework dependencies
        order = Order.create(request.items, request.customer_id)
        payment_result = self.payment_gateway.process_payment(
            PaymentRequest(order.total, request.payment_method)
        )
        if payment_result.successful:
            order.mark_paid()
            self.order_repository.save(order)
            return OrderResult.success(order.id)
        return OrderResult.failure(payment_result.error)
Primary Adapter (REST Controller):
# adapters/input/rest/order_controller.py
class OrderController:
    def __init__(self, order_service: OrderService):
        self.order_service = order_service

    @app.post("/orders")
    def create_order(self, request: Request):
        order_request = self._map_to_domain(request.json)
        result = self.order_service.place_order(order_request)
        return self._map_to_response(result)
Secondary Adapters:
# adapters/output/postgres_order_repository.py
class PostgresOrderRepository(OrderRepository):
    def save(self, order: Order) -> None:
        # PostgreSQL-specific implementation
        self.session.add(self._to_entity(order))
        self.session.commit()

# adapters/output/stripe_payment_gateway.py
class StripePaymentGateway(PaymentGateway):
    def process_payment(self, payment: PaymentRequest) -> PaymentResult:
        # Stripe-specific implementation
        stripe.Charge.create(amount=payment.amount, ...)
5. Advantages¶
| Advantage | Explanation |
|---|---|
| Testability | Core can be tested with mock adapters; no database or network needed |
| Framework Independence | Switch from Flask to FastAPI, or Spring to Micronaut, without touching business logic |
| Database Independence | Change from PostgreSQL to MongoDB by swapping the repository adapter |
| Delay Infrastructure Decisions | Start with in-memory adapters, add real implementations later |
| Parallel Development | Teams can work on adapters independently once ports are defined |
| Business Logic Focus | Core is free from technical concerns; pure domain modeling |
6. Disadvantages and Trade-offs¶
| Disadvantage | Explanation |
|---|---|
| Increased Complexity | More interfaces, more classes, more indirection |
| Boilerplate Code | Mapping between adapter models and domain models |
| Learning Curve | Team must understand dependency inversion and ports/adapters |
| Overkill for Simple Apps | Simple CRUD apps don't benefit from this complexity |
| Potential Over-Engineering | Easy to create too many abstractions |
7. When to Use Hexagonal Architecture¶
| Scenario | Verdict |
|---|---|
| Complex business logic | Excellent fit |
| Long-lived applications | Strong candidate |
| Multiple integration points (APIs, queues, databases) | Excellent fit |
| High test coverage requirements | Excellent fit |
| Simple CRUD application | Probably overkill |
| Prototype or MVP | Start simpler |
| Team unfamiliar with the pattern | Consider training first |
8. Relationship to Other Architectures¶
Hexagonal vs. Clean Architecture:
Clean Architecture (Robert C. Martin) is essentially hexagonal architecture with more specific layer names:
- Entities (innermost)
- Use Cases
- Interface Adapters
- Frameworks & Drivers (outermost)
Hexagonal vs. Onion Architecture:
Onion Architecture (Jeffrey Palermo) is very similar, with concentric layers:
- Domain Model (center)
- Domain Services
- Application Services
- Infrastructure (outer)
All three architectures share the same core principle: dependencies point inward, and the domain is at the center.
9. Best Practices¶
- Keep the core pure: No framework annotations, no database dependencies
- Use Dependency Injection: Wire adapters to ports at application startup
- Define clear port contracts: Ports are the API of your core
- Map at adapter boundaries: Convert between adapter DTOs and domain models
- Start simple: Begin with few ports; add more as complexity grows
- Test the core first: Unit tests for use cases with mock adapters
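The "test the core first" practice can be shown with a condensed, self-contained sketch. Class and method names here are simplified stand-ins for the ports defined earlier (not the exact interfaces above): an in-memory repository and a stub gateway drive the use case with no database or network.

```python
from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    total: int
    paid: bool = False

class InMemoryOrderRepository:
    """Test double for the repository port: a plain dict."""
    def __init__(self):
        self.orders = {}
    def save(self, order):
        self.orders[order.order_id] = order
    def find_by_id(self, order_id):
        return self.orders.get(order_id)

class AlwaysApprovePaymentGateway:
    """Test double for the payment port: every payment succeeds."""
    def process_payment(self, amount):
        return True

class PlaceOrder:
    def __init__(self, repository, gateway):
        self.repository = repository
        self.gateway = gateway
    def execute(self, order_id, total):
        order = Order(order_id, total)
        if self.gateway.process_payment(total):
            order.paid = True
            self.repository.save(order)
            return True
        return False

repo = InMemoryOrderRepository()
ok = PlaceOrder(repo, AlwaysApprovePaymentGateway()).execute("o-1", 100)
```

Swapping in a failing gateway double exercises the rejection path the same way; the core never notices which adapter it was given.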
8. Microkernel Architecture¶
1. Definition and Core Concept¶
Microkernel Architecture (also known as the Plugin Architecture) structures an application around a minimal core system that provides essential functionality, with additional features implemented as independent, pluggable modules. The core system defines extension points, and plugins connect to these points to extend or customize behavior.
This architecture originated in operating system design (Mach, QNX, MINIX) where the kernel provides only the most essential services (memory management, IPC), while everything else (file systems, device drivers, networking) runs as separate user-space services. The same principle applies to application software—a stable core with swappable components.
The key insight is that the core is stable and minimal, while variability lives in the plugins. This enables a product platform that can be customized for different customers, markets, or use cases without changing the core.
2. Core Structure¶
┌─────────────────────────────────────────────────────────────────┐
│ PLUGINS │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Plugin A │ │ Plugin B │ │ Plugin C │ │ Plugin D │ │
│ │ (Feature)│ │ (Feature)│ │ (Feature)│ │ (Feature)│ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ PLUGIN REGISTRY / MANAGER │ │
│ │ (Discovers, loads, manages plugins) │ │
│ └──────────────────────────┬───────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ CORE SYSTEM │ │
│ │ ┌─────────────────────────────────────────────────┐ │ │
│ │ │ Minimal Functionality + Extension Points │ │ │
│ │ │ • Plugin API definitions │ │ │
│ │ │ • Core services (logging, config, events) │ │ │
│ │ │ • Plugin lifecycle management │ │ │
│ │ └─────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Core System:
- Contains minimal functionality required for the application to operate
- Defines extension points (contracts) that plugins implement
- Provides services that plugins can use (configuration, logging, data access)
- Manages plugin lifecycle (discovery, loading, initialization, shutdown)
- Should be stable—changes to the core are rare and carefully managed
Plugin Registry/Manager:
- Discovers available plugins (file scanning, configuration, runtime registration)
- Loads and initializes plugins
- Manages plugin dependencies and conflicts
- Handles plugin versioning and compatibility
- Provides plugin isolation (optional)
Plugins:
- Self-contained modules implementing specific functionality
- Implement interfaces defined by the core
- Can be added, removed, or updated without affecting the core
- May depend on other plugins or core services
- Loaded dynamically at runtime or startup
3. Extension Point Types¶
| Type | Description | Example |
|---|---|---|
| Point Extensions | Single implementation slot | Theme plugin for UI |
| Set Extensions | Multiple implementations allowed | Payment processors |
| Event Hooks | Plugins react to core events | Pre-save validation |
| Pipeline Extensions | Plugins form a processing chain | Request middleware |
| Override Extensions | Plugins replace core behavior | Custom authentication |
Example: Set Extension (Payment Processors)
# Core defines the extension point
from abc import ABC, abstractmethod
from typing import List

class PaymentProcessor(ABC):
    @abstractmethod
    def process(self, payment: Payment) -> PaymentResult:
        pass

    @abstractmethod
    def supports(self, method: PaymentMethod) -> bool:
        pass

# Plugins implement the interface
class StripePlugin(PaymentProcessor):
    def supports(self, method): return method == PaymentMethod.CREDIT_CARD
    def process(self, payment): ...

class PayPalPlugin(PaymentProcessor):
    def supports(self, method): return method == PaymentMethod.PAYPAL
    def process(self, payment): ...

# Core uses all registered plugins
class PaymentService:
    def __init__(self, processors: List[PaymentProcessor]):
        self.processors = processors

    def process_payment(self, payment: Payment):
        for processor in self.processors:
            if processor.supports(payment.method):
                return processor.process(payment)
        raise UnsupportedPaymentMethod()
4. Plugin Discovery and Loading¶
Static Discovery:
- Plugins listed in configuration file
- Loaded at application startup
- Simple but requires restart to add plugins
Dynamic Discovery:
- Core scans plugin directory for plugin manifests
- Plugins can be added/removed at runtime
- More complex but more flexible
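The loading step in either scheme often boils down to importing a class named in configuration. This is a sketch using the standard library's importlib; the "module:ClassName" spec format is illustrative, and a stdlib class stands in for a real plugin so the example runs anywhere.

```python
import importlib

def load_plugins(specs):
    # Each spec is "module:ClassName", e.g. read from a manifest or config
    plugins = []
    for spec in specs:
        module_name, class_name = spec.split(":")
        module = importlib.import_module(module_name)
        plugins.append(getattr(module, class_name)())  # instantiate plugin
    return plugins

# Stdlib class used as a stand-in "plugin" so the sketch is runnable
plugins = load_plugins(["collections:Counter"])
```

A production loader would additionally validate versions against the manifest and isolate failures so one bad plugin does not abort startup.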
Plugin Manifest Example:
# plugins/stripe-payment/plugin.yaml
name: stripe-payment
version: 2.1.0
description: Stripe payment processor
author: Acme Corp
entry_point: stripe_plugin.StripePaymentPlugin
dependencies:
- core: ">=3.0.0"
- http-client: ">=1.0.0"
extension_points:
- payment-processor
config:
api_key:
type: string
required: true
secret: true
5. Advantages¶
| Advantage | Explanation |
|---|---|
| Extensibility | Add new features without modifying core |
| Customization | Different customers get different plugin sets |
| Isolation | Plugin bugs don't crash the core (with proper isolation) |
| Independent Deployment | Update plugins without redeploying core |
| Third-Party Ecosystem | External developers can create plugins |
| Reduced Core Complexity | Core stays small and stable |
| Feature Toggles | Enable/disable features by loading/unloading plugins |
6. Disadvantages and Challenges¶
| Challenge | Explanation |
|---|---|
| Plugin Versioning | Managing compatibility between plugins and core versions |
| Performance Overhead | Indirection through plugin interfaces adds latency |
| Testing Complexity | Must test all plugin combinations |
| Security Risks | Malicious or buggy plugins can compromise system |
| Dependency Management | Plugin dependencies may conflict |
| Core API Stability | Changing extension points breaks plugins |
7. Real-World Examples¶
| Application | Core | Plugins |
|---|---|---|
| VS Code | Editor shell, file system, extension API | Languages, themes, debuggers, linters |
| Eclipse IDE | Workbench, OSGi runtime | Java tools, Git, Maven, Spring |
| WordPress | Content management, theme engine | SEO, caching, forms, e-commerce |
| Grafana | Dashboard framework, data source API | Prometheus, CloudWatch, Elasticsearch |
| Jenkins | CI/CD engine, job execution | Git, Docker, Kubernetes, Slack |
| Chrome Browser | Rendering engine, tab management | Ad blockers, password managers |
8. Implementation Patterns¶
Plugin Interface Contract:
from abc import ABC, abstractmethod

class Plugin(ABC):
    @abstractmethod
    def get_name(self) -> str: pass

    @abstractmethod
    def get_version(self) -> str: pass

    @abstractmethod
    def initialize(self, context: PluginContext) -> None: pass

    @abstractmethod
    def shutdown(self) -> None: pass
Plugin Isolation Strategies:
| Strategy | Isolation Level | Performance | Complexity |
|---|---|---|---|
| Same process, shared memory | None | Excellent | Low |
| Same process, separate classloader | Partial | Good | Medium |
| Separate process | Strong | Lower | High |
| Container/sandbox | Very strong | Lower | High |
9. When to Use Microkernel Architecture¶
| Scenario | Verdict |
|---|---|
| Product that needs customer-specific customization | Excellent fit |
| Application with optional features | Strong candidate |
| Platform enabling third-party extensions | Excellent fit |
| IDE, editor, or development tools | Excellent fit |
| Simple application with fixed features | Overkill |
| Performance-critical hot paths | May add too much overhead |
10. Best Practices¶
- Design stable extension points: Changing plugin APIs breaks the ecosystem
- Version your plugin API: Use semantic versioning for compatibility
- Provide plugin SDK: Documentation, templates, testing tools
- Implement graceful degradation: System works even if plugins fail
- Sandbox untrusted plugins: Limit what third-party plugins can access
- Log plugin lifecycle: Track loading, errors, and performance
9. Pipes and Filters Architecture¶
1. Definition and Core Concept¶
Pipes and Filters Architecture structures a system as a series of processing components (filters) connected by data channels (pipes). Each filter performs a single, well-defined transformation on its input data and produces output for the next filter. Data flows through the pipeline, being progressively transformed until it reaches its final form.
This architecture originated from Unix shell commands where simple utilities could be chained together: cat file.txt | grep "error" | sort | uniq -c. Each command (filter) reads from stdin (input pipe), processes data, and writes to stdout (output pipe).
The key principle is that filters are independent and reusable—a filter doesn't know what comes before or after it in the pipeline. This enables powerful composition of simple components into complex data processing systems.
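This composition principle maps naturally onto Python generators: each generator is a filter that lazily consumes the previous stage, mirroring the Unix pipeline above (the sample lines are illustrative).

```python
def grep(lines, needle):
    # Tester filter: pass only lines containing the needle
    return (line for line in lines if needle in line)

def to_upper(lines):
    # Transformer filter: uppercase each line
    return (line.upper() for line in lines)

lines = ["error: disk full", "ok", "error: timeout"]
result = list(to_upper(grep(lines, "error")))
# result == ["ERROR: DISK FULL", "ERROR: TIMEOUT"]
```

Neither filter knows about the other; reordering or inserting a new stage requires no changes to existing filters.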
2. Core Structure¶
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ SOURCE │────▶│ Filter A │────▶│ Filter B │────▶│ Filter C │────▶ SINK
│ (Input) │pipe │ Transform │pipe │ Transform │pipe │ Transform │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
│ │ │
▼ ▼ ▼
Read → Process Read → Process Read → Process
→ Write → Write → Write
Pipes:
- Transport data between filters
- May be synchronous (blocking) or asynchronous (buffered)
- Define the data format (bytes, records, JSON objects, etc.)
- Can implement backpressure to handle slow consumers
- Examples: Unix pipes, message queues, in-memory channels, TCP streams
Filters:
- Self-contained processing units
- Read from input pipe, transform data, write to output pipe
- Stateless (preferred) or stateful (when necessary)
- Can be parallelized or scaled independently
- Examples: Parser, validator, transformer, enricher, aggregator
Source:
- Entry point for data into the pipeline
- Reads from files, databases, APIs, sensors, message queues
- May generate data or receive it from external systems
Sink:
- Exit point for processed data
- Writes to files, databases, APIs, displays, or other systems
- May trigger side effects or store results
3. Filter Types¶
| Type | Description | Example |
|---|---|---|
| Producer | Generates data, no input | File reader, sensor, API poller |
| Transformer | Converts data format or structure | JSON to XML, resize image |
| Tester/Filter | Passes or blocks data based on criteria | Spam filter, validation |
| Enricher | Adds information to data | Geocoding, lookup, annotation |
| Aggregator | Combines multiple data items | Sum, average, grouping |
| Splitter | Divides one input into multiple outputs | Routing by type |
| Consumer | Receives data, no output | Database writer, display |
4. Pipeline Topologies¶
Linear Pipeline:
Source → Filter A → Filter B → Filter C → Sink
Simple, sequential processing. Each filter has exactly one input and one output.
Parallel Pipeline:
┌→ Filter A1 ─┐
Source → Split├→ Filter A2 ─├→ Merge → Sink
└→ Filter A3 ─┘
Same processing on partitioned data for throughput.
Fan-Out Pipeline:
┌→ Filter A → Sink A
Source → Split├→ Filter B → Sink B
└→ Filter C → Sink C
Different processing paths based on data characteristics.
Fan-In Pipeline:
Source A → Filter A ─┐
Source B → Filter B ─├→ Merge → Filter D → Sink
Source C → Filter C ─┘
Multiple sources combined into single processing path.
Complex DAG (Directed Acyclic Graph):
Source A ──→ Filter 1 ──┐
├─→ Filter 3 ──→ Sink
Source B ──→ Filter 2 ──┘
│
└─────→ Filter 4 ──→ Sink B
Arbitrary pipeline structures for complex processing.
5. Implementation Patterns¶
Push Model:
Upstream filters push data to downstream filters. Simple but can overwhelm slow consumers.
class PushFilter:
    def __init__(self, next_filter):
        self.next = next_filter

    def process(self, data):
        result = self.transform(data)
        if self.next:
            self.next.process(result)
Pull Model:
Downstream filters pull data from upstream filters. Natural backpressure but can leave upstream idle.
class PullFilter:
    def __init__(self, source):
        self.source = source

    def get_next(self):
        data = self.source.get_next()
        return self.transform(data)
Active Filters:
Each filter runs in its own thread/process, pulling from input and pushing to output asynchronously.
class ActiveFilter(Thread):
    def __init__(self, input_queue, output_queue):
        super().__init__(daemon=True)  # required when subclassing Thread
        self.input = input_queue
        self.output = output_queue

    def run(self):
        while True:
            data = self.input.get()
            result = self.transform(data)
            self.output.put(result)
6. Advantages¶
| Advantage | Explanation |
|---|---|
| Modularity | Filters are independent, self-contained units |
| Reusability | Same filter can be used in multiple pipelines |
| Flexibility | Easy to add, remove, or reorder filters |
| Parallelism | Filters can run concurrently on multi-core systems |
| Testability | Filters tested in isolation with mock input/output |
| Understandability | Data flow is explicit and easy to trace |
| Incremental Processing | Process data as it arrives, not all at once |
7. Disadvantages and Challenges¶
| Challenge | Explanation |
|---|---|
| Data Transformation Overhead | Converting between filter formats costs CPU/memory |
| Error Handling Complexity | Failed record handling across distributed filters |
| Latency | Multiple filter hops add end-to-end latency |
| State Management | Stateful filters are harder to scale and parallelize |
| Ordering Guarantees | Parallel processing may reorder data |
| Debugging Difficulty | Tracking data through many filters is challenging |
| Backpressure | Slow consumers can cause pipeline backup |
8. Real-World Examples¶
| System | Filters | Use Case |
|---|---|---|
| Unix Shell | grep, sed, awk, sort, uniq | Text processing |
| Apache Kafka Streams | Map, filter, aggregate, join | Real-time event processing |
| ETL Pipelines | Extract, transform, load | Data warehousing |
| Image Processing | Resize, crop, filter, compress | Media processing |
| Compilers | Lexer, parser, optimizer, code generator | Code compilation |
| Network Protocols | TCP/IP layer stack | Network communication |
| Apache NiFi | Ingestion, routing, transformation | Data flow management |
| UNIX Pipes | cat \| grep \| sort \| uniq | Shell scripting |
9. Technologies (2025)¶
| Category | Technologies |
|---|---|
| Stream Processing | Apache Kafka Streams, Apache Flink, Apache Spark Streaming |
| ETL/Data Pipelines | Apache Airflow, Dagster, Prefect, dbt |
| Data Flow | Apache NiFi, StreamSets, AWS Glue |
| Messaging | Apache Kafka, RabbitMQ, AWS Kinesis |
| In-Process | Java Streams, RxJava, Project Reactor, Python generators |
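The "In-Process" row above can be shown concretely: Python generators compose naturally into a pipes-and-filters chain in which each stage lazily pulls items from the one before it. A minimal sketch (the stage names are illustrative):

```python
def source(lines):
    # Source: emit raw items one at a time
    for line in lines:
        yield line

def grep(pattern, stream):
    # Filter: pass through only lines containing the pattern
    for line in stream:
        if pattern in line:
            yield line

def to_upper(stream):
    # Transform: uppercase each surviving line
    for line in stream:
        yield line.upper()

# Compose the pipeline, analogous to a shell chain like: cat | grep error | tr a-z A-Z
pipeline = to_upper(grep("error", source(["error: disk", "ok", "error: net"])))
print(list(pipeline))  # ['ERROR: DISK', 'ERROR: NET']
```

Because each stage pulls from its upstream on demand, this is a pull-model pipeline and gives incremental processing for free.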
10. When to Use Pipes and Filters¶
| Scenario | Verdict |
|---|---|
| Data transformation/ETL pipelines | Excellent fit |
| Stream/event processing | Excellent fit |
| Batch data processing | Strong candidate |
| Image/video/audio processing | Excellent fit |
| Log processing and analysis | Excellent fit |
| Real-time low-latency systems | Consider overhead |
| Simple CRUD applications | Overkill |
11. Best Practices¶
- Keep filters simple: One filter, one responsibility
- Use standard data formats: Consistent format reduces transformation overhead
- Design for failure: Handle errors gracefully, support retry and dead-letter queues
- Implement backpressure: Prevent fast producers from overwhelming slow consumers
- Monitor the pipeline: Track throughput, latency, and error rates at each stage
- Consider idempotency: Filters should handle duplicate data safely
10. Service-Based Architecture¶
1. Definition and Core Concept¶
Service-Based Architecture (SBA) is a hybrid architectural style that sits between monolithic and microservices architectures. It organizes an application into a collection of coarse-grained, domain-aligned services (typically 4-12 services) that share a common database or use a mix of shared and independent data stores.
Unlike microservices, which emphasize fine-grained services with strict database-per-service rules, service-based architecture takes a more pragmatic approach. Services are larger, there are fewer of them, and they may share a database when it makes sense. This makes it easier to adopt than microservices while still providing better modularity than a monolith.
Think of it as a "distributed monolith done right"—structured decomposition with practical trade-offs.
2. Core Structure¶
┌─────────────────────────────────────────────────────────────────────────┐
│ USER INTERFACE │
│ (Single UI or Micro-frontends) │
└───────────────────────────────┬─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ API GATEWAY (Optional) │
└───────────────────────────────┬─────────────────────────────────────────┘
│
┌───────────────────────┼───────────────────────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Order │ │ Customer │ │ Inventory │
│ Service │ │ Service │ │ Service │
│ (Domain A) │ │ (Domain B) │ │ (Domain C) │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
│ │ │
│ ▼ │
│ ┌───────────────┐ │
└──────────────▶│ SHARED │◀──────────────┘
│ DATABASE │
│ (or hybrid) │
└───────────────┘
Key Characteristics:
| Characteristic | Service-Based | Microservices | Monolith |
|---|---|---|---|
| Number of services | 4-12 | Dozens to hundreds | 1 |
| Service granularity | Coarse (domain) | Fine (capability) | N/A |
| Database sharing | Shared or hybrid | Strict per-service | Single |
| Deployment | Independent | Independent | Single unit |
| Team size per service | 5-15 people | 3-8 people | Entire team |
3. Service Granularity¶
Services in SBA are aligned with business domains rather than technical capabilities:
Microservices approach:
├── User Authentication Service
├── User Profile Service
├── User Preferences Service
├── User Activity Service
└── User Notification Service
Service-Based approach:
└── User Service (handles all user-related functionality)
Guidelines for Service Boundaries:
- Domain-Driven Design: One service per bounded context
- Team Ownership: One team can own and maintain the entire service
- Change Frequency: Group functionality that changes together
- Data Cohesion: Keep closely related data in the same service
4. Database Strategies¶
Shared Database (Common in SBA):
┌──────────┐ ┌──────────┐ ┌──────────┐
│Service A │ │Service B │ │Service C │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└────────────┼────────────┘
▼
┌──────────────┐
│ Shared │
│ Database │
└──────────────┘
Pros: ACID transactions, simpler queries, no data duplication
Cons: Schema coupling, database bottleneck, coordination for changes
Hybrid Approach:
┌──────────┐ ┌──────────┐ ┌──────────┐
│Service A │ │Service B │ │Service C │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
▼ ▼ │
┌─────────┐ ┌─────────┐ │
│ DB A │ │ DB B │ │
└─────────┘ └─────────┘ │
▼
┌──────────────┐
│ Shared DB │
│ (Reference │
│ Data) │
└──────────────┘
Some services have dedicated databases; others share a common database for reference data.
5. Communication Patterns¶
Synchronous (REST/gRPC):
- Service-to-service HTTP calls
- Simpler to implement
- Creates runtime dependencies
Asynchronous (Messaging):
- Event-driven communication via message queues
- Better decoupling
- More complex error handling
API Composition:
```python
# Order Service composes data from multiple services
class OrderService:
    def get_order_details(self, order_id):
        order = self.order_repo.get(order_id)
        customer = self.customer_service.get(order.customer_id)
        products = self.inventory_service.get_products(order.product_ids)
        return OrderDetails(order, customer, products)
```
6. Advantages¶
| Advantage | Explanation |
|---|---|
| Simpler than Microservices | Fewer services, less operational complexity |
| Better Modularity than Monolith | Clear service boundaries, independent deployment |
| Pragmatic Database Approach | Shared DB when sensible, separate when needed |
| Incremental Migration Path | Easy to evolve from monolith or toward microservices |
| Team Scalability | Services map to teams (4-12 services, 4-12 teams) |
| ACID Transactions | Shared database enables cross-domain transactions |
| Reduced Network Complexity | Fewer service-to-service calls |
7. Disadvantages¶
| Disadvantage | Explanation |
|---|---|
| Database Coupling | Shared schema changes affect multiple services |
| Less Scalability | Can't scale services independently if sharing DB |
| Partial Fault Tolerance | Shared DB is single point of failure |
| Service Boundary Drift | Services can grow into mini-monoliths |
| Testing Complexity | Integration tests need shared database setup |
8. When to Use Service-Based Architecture¶
| Scenario | Verdict |
|---|---|
| Medium-sized team (20-100 engineers) | Excellent fit |
| Enterprise applications with clear domains | Strong candidate |
| Migrating from monolith | Excellent first step |
| Need transactions across domains | Strong candidate (shared DB) |
| Extreme scalability requirements | Consider microservices |
| Very small team (<10 engineers) | Monolith may be simpler |
9. Service-Based vs. Microservices¶
| Aspect | Service-Based | Microservices |
|---|---|---|
| Services count | 4-12 | 20-200+ |
| Database | Often shared | Always per-service |
| Deployment | Independent | Independent |
| Complexity | Medium | High |
| Scalability | Good | Excellent |
| Transaction support | ACID possible | Eventual consistency |
| Team structure | Domain teams | Two-pizza teams |
| Migration effort | Moderate | Significant |
10. Best Practices¶
- Define clear domain boundaries: Use Domain-Driven Design to identify services
- Start with shared database: Split only when you have concrete scaling needs
- Minimize cross-service calls: Prefer coarse-grained interfaces with batch operations
- Use API versioning: Enable independent service evolution
- Implement centralized logging: Tracing across services is essential
- Plan for database splitting: Design schemas to support future separation
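The "minimize cross-service calls" practice above usually means exposing coarse-grained batch operations instead of chatty per-item endpoints. A hypothetical sketch (class and method names are illustrative; a plain dict stands in for the service's data store):

```python
class CustomerService:
    def __init__(self, store):
        self._store = store  # id -> customer record; stands in for the real data store

    def get_customer(self, customer_id):
        # Fine-grained call: a caller needing N customers makes N round trips
        return self._store.get(customer_id)

    def get_customers(self, customer_ids):
        # Coarse-grained batch call: one round trip for N customers
        return {cid: self._store[cid] for cid in customer_ids if cid in self._store}

svc = CustomerService({1: "Ada", 2: "Grace"})
print(svc.get_customers([1, 2, 3]))  # {1: 'Ada', 2: 'Grace'}
```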
11. Space-Based Architecture¶
1. Definition and Core Concept¶
Space-Based Architecture, also known as Cloud Architecture or Tuple Space Architecture, is designed to achieve extreme scalability by eliminating the database as a central bottleneck. Instead of relying on a traditional database for data storage and synchronization, the architecture uses in-memory data grids where all processing units share a virtualized, distributed memory space.
The name "space-based" comes from the concept of tuple spaces—a distributed shared memory paradigm where processes communicate by placing and retrieving tuples (data) from a shared space. This idea, which originated with the Linda coordination language (1985), enables massive parallelism.
The key insight is that the database is often the bottleneck in high-traffic systems. By moving data into memory and replicating it across processing units, you eliminate database contention and achieve linear horizontal scaling.
2. Core Structure¶
┌─────────────────────────────────────┐
│ VIRTUALIZED MIDDLEWARE │
│ ┌─────────────────────────────────┐ │
│ │ MESSAGING GRID │ │
│ │ (Event-driven messaging) │ │
│ └─────────────────────────────────┘ │
│ ┌─────────────────────────────────┐ │
│ │ DATA GRID │ │
│ │ (In-memory distributed cache) │ │
│ └─────────────────────────────────┘ │
│ ┌─────────────────────────────────┐ │
│ │ PROCESSING GRID │ │
│ │ (Manages processing units) │ │
│ └─────────────────────────────────┘ │
│ ┌─────────────────────────────────┐ │
│ │ DEPLOYMENT MANAGER │ │
│ │ (Scaling, health, deployment) │ │
│ └─────────────────────────────────┘ │
└──────────────────┬──────────────────┘
│
┌──────────────────────────────────────┼──────────────────────────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│ Processing Unit │ │ Processing Unit │ │ Processing Unit │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │ Application │ │ │ │ Application │ │ │ │ Application │ │
│ │ Modules │ │ │ │ Modules │ │ │ │ Modules │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │ In-Memory │◀─┼────sync────────┼─▶│ In-Memory │◀─┼────sync────────┼─▶│ In-Memory │ │
│ │ Data Grid │ │ │ │ Data Grid │ │ │ │ Data Grid │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │Data Replicat│ │ │ │Data Replicat│ │ │ │Data Replicat│ │
│ │ Engine │ │ │ │ Engine │ │ │ │ Engine │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
└───────────────────┘ └───────────────────┘ └───────────────────┘
│ │ │
└──────────────────────────────────────┼──────────────────────────────────────┘
│
▼
┌─────────────────────┐
│ DATA PUMPS │
│ (Async DB writers) │
└──────────┬──────────┘
│
▼
┌─────────────────────┐
│ DATABASE │
│ (Async persistence) │
└─────────────────────┘
3. Key Components¶
Processing Units:
- Self-contained units with application logic, in-memory data, and replication engine
- Stateless processing with local in-memory data access
- Can be scaled horizontally by adding more units
- Each unit can handle any request (no sticky sessions)
Virtualized Middleware:
| Component | Purpose |
|---|---|
| Data Grid | Manages distributed in-memory data, replication, and consistency |
| Messaging Grid | Handles communication between processing units |
| Processing Grid | Coordinates processing requests across units |
| Deployment Manager | Handles scaling, health monitoring, and deployment |
Data Pumps:
- Asynchronously write data from in-memory grid to persistent storage
- Decouple processing from database latency
- Enable eventual consistency with the database
Data Writers/Readers:
- Handle asynchronous database operations
- Data readers populate in-memory grid on startup
- Data writers persist changes periodically
4. Data Replication Models¶
Replicated Cache:
All processing units have a complete copy of the data. Changes are synchronously replicated to all units.
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Unit A │ │ Unit B │ │ Unit C │
│ [Full │◀─▶│ [Full │◀─▶│ [Full │
│ Data] │ │ Data] │ │ Data] │
└──────────┘ └──────────┘ └──────────┘
- Pros: Any unit can handle any request, simple
- Cons: Memory usage scales with data size, replication overhead
Distributed Cache:
Data is partitioned across units. Each unit owns a subset of the data.
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Unit A │ │ Unit B │ │ Unit C │
│ [Data │ │ [Data │ │ [Data │
│ 1-100] │ │ 101-200]│ │ 201-300]│
└──────────┘ └──────────┘ └──────────┘
- Pros: Memory usage distributed, better for large datasets
- Cons: Cross-partition requests need routing, more complex
Hybrid (Replicated + Distributed):
Frequently accessed data is replicated; large datasets are distributed.
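The distributed model above requires routing each key to the unit that owns its partition. A minimal sketch using modulo hashing (an assumption for brevity; production data grids typically use consistent hashing so that adding a unit does not remap most keys):

```python
class PartitionedCache:
    def __init__(self, n_units):
        # One in-memory store per processing unit
        self.units = [dict() for _ in range(n_units)]

    def _owner(self, key):
        # Deterministically route the key to its owning unit
        return self.units[hash(key) % len(self.units)]

    def put(self, key, value):
        self._owner(key)[key] = value

    def get(self, key):
        return self._owner(key).get(key)

cache = PartitionedCache(n_units=3)
cache.put("user:42", {"name": "Ada"})
print(cache.get("user:42"))  # {'name': 'Ada'}
```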
5. Advantages¶
| Advantage | Explanation |
|---|---|
| Extreme Scalability | Linear scaling by adding processing units |
| High Performance | In-memory data access eliminates database latency |
| No Database Bottleneck | Database removed from the critical path |
| Fault Tolerance | Data replicated across units; no single point of failure |
| Elastic Scaling | Add/remove units dynamically based on load |
| Variable Load Handling | Ideal for spiky traffic patterns (ticket sales, auctions) |
6. Disadvantages and Challenges¶
| Challenge | Explanation |
|---|---|
| High Memory Cost | All data in memory is expensive |
| Complexity | Distributed caching, replication, and consistency are hard |
| Data Consistency | Eventual consistency; not suitable for ACID requirements |
| Testing Difficulty | Hard to simulate distributed in-memory grid locally |
| Transactional Support | Cross-unit transactions are complex or impossible |
| Data Size Limits | Dataset must fit in aggregate memory of all units |
| Recovery Complexity | Rebuilding in-memory state after failure takes time |
7. When to Use Space-Based Architecture¶
| Scenario | Verdict |
|---|---|
| Concert ticket sales (extreme spikes) | Excellent fit |
| Online auctions | Excellent fit |
| Social media feeds | Strong candidate |
| Real-time bidding (advertising) | Excellent fit |
| High-frequency trading | Strong candidate |
| Simple CRUD applications | Massive overkill |
| Strong ACID requirements | Not suitable |
| Budget-constrained projects | Memory costs may be prohibitive |
The Classic Use Case: Concert Ticket Sales
When tickets go on sale for a popular concert:

- Traffic spikes 1000x in seconds
- Traditional databases collapse under load
- Space-based architecture handles the spike by:
    - Pre-loading ticket inventory into memory
    - Processing requests entirely in memory
    - Scaling processing units horizontally
    - Asynchronously persisting purchases to database
8. Technologies (2025)¶
| Category | Technologies |
|---|---|
| In-Memory Data Grids | Hazelcast, Apache Ignite, Redis Cluster, Oracle Coherence |
| Distributed Caching | Redis, Memcached, AWS ElastiCache |
| Data Grid Platforms | GigaSpaces, VMware GemFire/Apache Geode |
| Cloud Solutions | AWS ElastiCache, Azure Cache for Redis, Google Memorystore |
9. Implementation Patterns¶
Collocated Processing:
Application logic runs in the same JVM as the data grid, enabling direct memory access.
// Data is local—no network hop
User user = dataGrid.get(userId);
user.updateLastLogin(now);
dataGrid.put(userId, user); // Replicated to other units
Near Cache Pattern:
Frequently accessed data cached locally in each processing unit for sub-millisecond access.
Write-Behind Pattern:
Changes queued and asynchronously written to database, improving write performance.
Processing Unit → In-Memory Grid → Write Queue → Data Pump → Database
(async)
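The flow above can be sketched with a queue drained by a background writer thread; a dict stands in for the database (all names are illustrative):

```python
import queue
import threading

grid = {}                    # in-memory data grid: fast synchronous access
write_queue = queue.Queue()  # changes waiting to be persisted
database = {}                # stands in for the real database

def data_pump():
    # Drains queued changes and persists them asynchronously
    while True:
        key, value = write_queue.get()
        database[key] = value
        write_queue.task_done()

threading.Thread(target=data_pump, daemon=True).start()

def put(key, value):
    grid[key] = value              # the caller sees only this in-memory write
    write_queue.put((key, value))  # persistence happens later, off the critical path

put("order:1", "confirmed")
write_queue.join()  # demo only: wait until the pump has caught up
print(database)     # {'order:1': 'confirmed'}
```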
10. Space-Based vs. Other Architectures¶
| Aspect | Space-Based | Microservices | Traditional |
|---|---|---|---|
| Primary bottleneck | None (distributed) | Network | Database |
| Data location | In-memory, replicated | Per-service DB | Central DB |
| Scalability | Extreme | High | Limited |
| Consistency | Eventual | Eventual | Strong (ACID) |
| Cost | High (memory) | Medium | Lower |
| Complexity | Very high | High | Low-Medium |
11. Best Practices¶
- Right-size data replication: Replicate hot data, distribute cold data
- Plan for recovery: Pre-load strategies, data snapshots, failover procedures
- Monitor memory usage: Alert before hitting limits
- Test at scale: Simulate production load in staging
- Design for eventual consistency: Application logic must handle stale reads
- Use collocated processing: Minimize network hops for performance
- Implement circuit breakers: Protect against cascade failures during grid issues
Fundamental Software Architecture Patterns¶
1. Inbox and Outbox Pattern¶
The Inbox and Outbox pattern structures an application to handle reliable message processing by using persistent storage for incoming (inbox) and outgoing (outbox) messages, ensuring guaranteed delivery and processing in distributed systems.
- Characteristics: Persistent message queues, inbox for incoming messages, outbox for outgoing messages, transactional integration
- Advantages: Reliable message delivery, fault tolerance, simplified integration with transactional systems
- Disadvantages: Increased storage overhead, complexity in managing message state, potential latency in message processing
- Use cases: Distributed systems, microservices requiring reliable messaging, event-driven applications with transactional consistency
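A minimal outbox sketch using SQLite: the business change and the outgoing message commit in one local transaction, so a separate relay can deliver the message later without risk of losing it (table and column names are illustrative):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)")

def place_order(order_id):
    # Business write and outbox entry succeed or fail atomically
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)",
                     (json.dumps({"event": "OrderPlaced", "order_id": order_id}),))

def relay_outbox(publish):
    # A relay process polls unsent rows, publishes them, and marks them sent
    rows = conn.execute("SELECT id, payload FROM outbox WHERE sent = 0").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    conn.commit()

place_order(1)
delivered = []
relay_outbox(delivered.append)
print(delivered)  # [{'event': 'OrderPlaced', 'order_id': 1}]
```

If the relay crashes after publishing but before the `sent` update, the message is redelivered, which is why the inbox side (or the consumer) should deduplicate.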
2. Queue-Based Load Leveling (Storage First Pattern)¶
Queue-Based Load Leveling, also known as the Storage First Pattern, structures an application by using a queue as a buffer between an invoker service (e.g., API Gateway) and the destination (e.g., compute resources), smoothing out demand spikes and ensuring stable processing.
- Characteristics: Asynchronous message queue, buffering of requests, decoupled invoker and destination, rate limiting
- Advantages: Improved system stability, scalable processing, fault tolerance, reduced resource contention
- Disadvantages: Potential message processing delays, increased complexity in queue management, monitoring overhead
- Use cases: High-traffic systems, applications with variable workloads, microservices requiring decoupled processing
3. Backends for Frontends Pattern¶
The Backends for Frontends (BFF) pattern structures an application by creating dedicated backend services tailored to the specific needs of individual frontend clients, such as web, mobile, or other interfaces.
- Characteristics: Dedicated backend per frontend, client-specific APIs, decoupled frontend-backend interactions
- Advantages: Optimized client experience, simplified frontend development, independent scaling of backends
- Disadvantages: Increased backend complexity, potential duplication of logic, higher maintenance overhead
- Use cases: Applications with diverse client types, microservices architectures, systems requiring customized API responses
4. Public versus Published Interfaces Pattern¶
The Public versus Published Interfaces pattern structures an application by distinguishing between internal (public) interfaces, which are flexible and subject to change, and external (published) interfaces, which are stable and designed for third-party or cross-team consumption.
- Characteristics: Public interfaces for internal use, published interfaces for external stability, clear versioning, contract-based design
- Advantages: Flexibility for internal changes, stability for external consumers, improved maintainability
- Disadvantages: Increased design complexity, overhead in maintaining stable published interfaces, potential versioning challenges
- Use cases: APIs for third-party developers, cross-team service integrations, systems requiring long-term interface stability
5. Asynchronous Messaging Pattern¶
The Asynchronous Messaging pattern structures an application by enabling components to communicate through messages sent to a queue or broker, allowing senders and receivers to operate independently without waiting for immediate responses.
- Characteristics: Message queues or brokers, decoupled sender-receiver interactions, asynchronous communication
- Advantages: Improved scalability, fault tolerance, loose coupling, enhanced system responsiveness
- Disadvantages: Complexity in message handling, potential message loss or duplication, increased latency for some operations
- Use cases: Distributed systems, event-driven applications, workflows requiring decoupled processing
6. Batch Request (Request Bundle Pattern)¶
The Batch Request pattern, also known as the Request Bundle pattern, structures an application to combine multiple individual requests into a single batch request, reducing network overhead and improving efficiency in client-server communication.
- Characteristics: Bundled requests, single API call for multiple operations, optimized network usage
- Advantages: Reduced latency, lower network overhead, improved performance for high-volume requests
- Disadvantages: Increased complexity in request handling, potential for larger payload sizes, error handling challenges
- Use cases: APIs with frequent small requests, mobile applications with limited bandwidth, systems requiring optimized communication
7. Blackboard Design Pattern¶
The Blackboard design pattern structures an application around a shared data repository (the blackboard) where multiple independent components, or knowledge sources, collaborate to solve complex problems by reading from and writing to the blackboard.
- Characteristics: Centralized data repository, independent knowledge sources, collaborative problem-solving, decoupled components
- Advantages: High flexibility, modularity, supports incremental problem-solving, easy to add new knowledge sources
- Disadvantages: Potential performance bottlenecks at the blackboard, complexity in managing data consistency, coordination overhead
- Use cases: Complex problem-solving systems, AI and expert systems, collaborative data processing applications
8. Circuit Breaker Design Pattern¶
The Circuit Breaker design pattern structures an application to monitor and control interactions with external services, preventing cascading failures by stopping requests when a service fails repeatedly and resuming when it recovers.
- Characteristics: State-based control (closed, open, half-open), failure threshold tracking, automatic recovery attempts
- Advantages: Improved system resilience, prevention of cascading failures, graceful degradation of service
- Disadvantages: Configuration complexity, potential for premature tripping, monitoring overhead
- Use cases: Distributed systems, microservices architectures, applications requiring fault tolerance with external dependencies
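A minimal sketch of the closed/open/half-open state machine (the threshold and timeout values are illustrative):

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds before a trial call is allowed
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn):
        if self.state == "open":
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: request rejected")
            self.state = "half-open"  # timeout elapsed: allow one trial request
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        self.state = "closed"
        return result
```

After `failure_threshold` consecutive failures the breaker rejects calls immediately; once `reset_timeout` passes, a single trial call decides whether to close again or re-open.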
9. Client–Server Model¶
The Client–Server model structures an application by separating responsibilities between clients, which initiate requests, and servers, which process those requests and return responses, typically communicating over a network.
- Characteristics: Distinct client and server roles, request-response communication, centralized server resources
- Advantages: Centralized management, scalability of server resources, simplified client logic
- Disadvantages: Potential server bottlenecks, network dependency, increased latency for remote clients
- Use cases: Web applications, database-driven systems, networked services requiring centralized processing
10. Competing Consumers Pattern¶
The Competing Consumers pattern structures an application by allowing multiple consumer instances to process messages concurrently from a shared message queue, enabling load balancing and scalable processing.
- Characteristics: Shared message queue, multiple concurrent consumers, message-based work distribution
- Advantages: Improved throughput, scalable processing, fault tolerance through consumer redundancy
- Disadvantages: Potential message contention, complexity in ensuring message order, consumer coordination overhead
- Use cases: High-volume message processing systems, distributed task queues, event-driven microservices
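A minimal sketch with worker threads competing for messages on one shared queue (the doubling "work" is a stand-in for real processing):

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def consumer(worker_id):
    while True:
        item = tasks.get()
        if item is None:  # poison pill: shut this worker down
            tasks.task_done()
            return
        with results_lock:
            results.append((worker_id, item * 2))  # simulated processing
        tasks.task_done()

workers = [threading.Thread(target=consumer, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for n in range(10):
    tasks.put(n)
for _ in workers:
    tasks.put(None)  # one pill per worker
tasks.join()
print(sorted(value for _, value in results))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Which worker handles which message is nondeterministic; ordering guarantees, if needed, must be layered on top.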
11. Model–View–Controller Pattern¶
The Model–View–Controller (MVC) pattern structures an application by separating data management (Model), user interface (View), and user input handling (Controller), promoting modular development and maintainability.
- Characteristics: Separation of concerns, Model for data logic, View for display, Controller for input handling
- Advantages: Improved modularity, easier maintenance, testable components, reusable models across views
- Disadvantages: Potential complexity in large applications, overhead in small projects, tight coupling if poorly implemented
- Use cases: Web applications, desktop GUI applications, systems requiring clear separation of UI and business logic
12. Claim-Check Pattern¶
The Claim-Check pattern structures an application by storing large message payloads in a separate storage system and sending a reference (claim check) through the messaging system, reducing message size and improving efficiency.
- Characteristics: Separation of payload and message, external storage for large data, claim check as a reference
- Advantages: Reduced message overhead, improved performance, scalable handling of large data
- Disadvantages: Increased complexity in managing external storage, potential latency in retrieving payloads, data consistency challenges
- Use cases: Messaging systems with large payloads, distributed systems, applications requiring efficient message transmission
13. Peer-to-Peer (P2P) Architecture Pattern¶
The Peer-to-Peer (P2P) architecture pattern structures a system as a network of equally privileged nodes (peers) that both provide and consume services. Unlike client-server systems, there is no central coordinating node; responsibility and resources are distributed across participants.
- Characteristics: Decentralized network of peers, each acting as both client and server, dynamic topology, direct peer-to-peer communication, resource sharing (bandwidth, storage, computing)
- Advantages: High scalability, fault tolerance with no single point of failure, efficient resource utilization, reduced reliance on central infrastructure
- Disadvantages: Complexity in coordination and consistency, potential security and trust issues, variable performance depending on peers, data availability challenges when peers disconnect
- Use cases: File sharing systems (BitTorrent, IPFS), blockchain and cryptocurrency networks (Bitcoin, Ethereum), decentralized communication (VoIP, messaging), collaborative applications (distributed computation, multiplayer gaming)
14. Publish–Subscribe Pattern¶
The Publish–Subscribe pattern structures an application by enabling components (publishers) to broadcast messages without knowing the subscribers. Subscribers register interest in specific events or topics and receive notifications asynchronously when those events occur.
- Characteristics: Asynchronous communication, event/topic-based message distribution, decoupling between publishers and subscribers, often implemented with message brokers
- Advantages: Loose coupling between components, high scalability, supports dynamic subscriber registration, improved system flexibility and extensibility
- Disadvantages: Increased complexity in managing message delivery and ordering, potential latency due to intermediaries, debugging and monitoring can be challenging, risk of message overload if subscribers can’t keep up
- Use cases: Event-driven systems, messaging platforms, real-time notifications, log aggregation, distributed microservices communication
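A minimal in-process sketch of topic-based publish–subscribe (the broker API names are illustrative):

```python
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher addresses a topic, never a concrete subscriber
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
received = []
broker.subscribe("orders", received.append)
broker.publish("orders", {"id": 1})
broker.publish("payments", {"id": 2})  # no subscriber registered: message is dropped
print(received)  # [{'id': 1}]
```

A real broker adds the hard parts this sketch omits: durable per-subscriber queues, delivery retries, and ordering.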
15. Rate Limiting Pattern¶
The Rate Limiting pattern structures an application to control the number of requests or operations a client can perform within a defined time window. This helps protect systems from overuse, abuse, or denial-of-service attacks by ensuring fair resource consumption.
- Characteristics: Request throttling, quotas per client or user, configurable time windows, enforcement mechanisms such as token bucket or leaky bucket algorithms
- Advantages: Protects system stability, prevents abuse or resource exhaustion, ensures fair usage among clients, improves resilience against denial-of-service attacks
- Disadvantages: Added latency when requests are delayed, complexity in distributed enforcement, potential negative impact on user experience if limits are too restrictive
- Use cases: APIs exposed to external clients, SaaS platforms with tiered usage plans, microservices communication, systems requiring protection from burst traffic
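A minimal sketch of the token bucket algorithm mentioned above (the capacity and refill rate are illustrative):

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # request admitted
        return False      # request throttled

bucket = TokenBucket(capacity=3, refill_rate=1.0)
print([bucket.allow() for _ in range(5)])  # [True, True, True, False, False]
```

In a distributed deployment the bucket state typically lives in a shared store such as Redis so that all instances enforce the same quota.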
16. Request–Response Pattern¶
The Request–Response pattern structures an application around synchronous communication, where a client sends a request to a service or component and waits for a corresponding response. This is one of the most fundamental interaction models in distributed systems.
- Characteristics: Synchronous communication, client-initiated interaction, direct response from the service, typically over protocols like HTTP, gRPC, or TCP
- Advantages: Simple and intuitive interaction model, immediate feedback to clients, widespread support in frameworks and protocols, easier error handling
- Disadvantages: Tight coupling between client and service, limited scalability under heavy load, potential bottlenecks and latency issues, less resilient to service failures
- Use cases: Web APIs, database queries, microservices requiring direct responses, traditional client-server applications
17. Retry Pattern¶
The Retry pattern structures an application to automatically reattempt a failed operation, typically with configurable intervals and limits, to handle transient faults in distributed systems or unreliable networks.
- Characteristics: Automatic re-execution of failed operations, configurable retry policies (e.g., fixed interval, exponential backoff, jitter), transient fault handling, integration with error handling mechanisms
- Advantages: Improves resilience and fault tolerance, reduces impact of temporary failures, enhances reliability of external service calls
- Disadvantages: Potential increased latency, risk of overwhelming services with repeated retries, complexity in tuning retry policies, may mask underlying systemic issues
- Use cases: Distributed systems with transient network errors, API calls to external services, database operations with occasional contention, microservices communication requiring reliability
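A minimal retry sketch with exponential backoff and jitter (the policy values are illustrative):

```python
import random
import time

def retry(fn, attempts=4, base_delay=0.1):
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the failure to the caller
            # Exponential backoff (base, 2*base, 4*base, ...) plus random jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

calls = {"n": 0}
def flaky():
    # Fails twice with a transient error, then succeeds
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient fault")
    return "ok"

print(retry(flaky, base_delay=0.01))  # ok
```

Jitter matters: without it, many clients that failed together retry together, re-creating the very spike that caused the failure.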
18. Rule-Based Pattern¶
The Rule-Based pattern structures an application around a set of declarative rules that define system behavior. Instead of hardcoding logic into the application, rules are externalized and evaluated by a rule engine or framework, allowing flexible and dynamic decision-making.
- Characteristics: Declarative rules, separation of business logic from application code, inference engine or rule processor, dynamic rule evaluation and execution
- Advantages: High flexibility and adaptability, easier to update and maintain business logic, supports complex decision-making, empowers non-developers to modify rules
- Disadvantages: Added complexity in integrating and managing rule engines, potential performance overhead, debugging can be challenging, risk of inconsistent rules if not governed properly
- Use cases: Business process management systems, fraud detection, recommendation engines, policy enforcement, expert systems
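A rule engine can be sketched as a list of declarative condition/action pairs evaluated against a set of facts. This is a toy illustration of the idea (real engines such as Drools add inference, conflict resolution, and rule governance); the fraud-screening rules and field names below are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    """A declarative rule: a named condition paired with an action."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]

class RuleEngine:
    """Evaluates each rule against the facts; fired rules may update the facts."""
    def __init__(self, rules):
        self.rules = rules

    def run(self, facts: dict) -> list:
        fired = []
        for rule in self.rules:
            if rule.condition(facts):
                rule.action(facts)
                fired.append(rule.name)
        return fired

# Hypothetical fraud-screening rules, kept outside the application code
# so they can be changed without touching the rest of the system.
rules = [
    Rule("large-amount", lambda f: f["amount"] > 10_000,
         lambda f: f.setdefault("flags", []).append("review")),
    Rule("new-account", lambda f: f["account_age_days"] < 7,
         lambda f: f.setdefault("flags", []).append("verify-identity")),
]

engine = RuleEngine(rules)
facts = {"amount": 25_000, "account_age_days": 3}
print(engine.run(facts))  # ['large-amount', 'new-account']
print(facts["flags"])     # ['review', 'verify-identity']
```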
19. Saga Pattern¶
The Saga pattern structures an application to manage distributed transactions by breaking them into a sequence of smaller local transactions, each with a corresponding compensating action to undo its effects if necessary. This ensures data consistency across multiple services without requiring a global transaction manager.
- Characteristics: Sequence of local transactions, compensating actions for rollback, coordination through choreography (event-based) or orchestration (central controller), eventual consistency
- Advantages: Enables distributed transaction management without two-phase commit, improves system resilience, supports long-running business processes, scalable across microservices
- Disadvantages: Increased complexity in design and error handling, difficult debugging and monitoring, compensating actions may not fully restore previous state, eventual consistency may not suit all use cases
- Use cases: Microservices requiring distributed transaction management, e-commerce order processing, payment workflows, travel booking systems, any system with multi-step business processes across services
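The orchestration variant of a saga can be sketched as a sequence of (action, compensation) pairs: if any local transaction fails, the compensations for the steps that already completed run in reverse order. The order-processing steps below are hypothetical, and in a real system each action would be a call to a separate service.

```python
class SagaError(Exception):
    """Raised after compensations for all completed steps have run."""

def run_saga(steps):
    """Run each (action, compensation) local transaction in order;
    on failure, compensate completed steps in reverse (orchestration style)."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for comp in reversed(completed):
                comp()  # best effort: may not fully restore the prior state
            raise SagaError(f"saga rolled back: {exc}") from exc
        completed.append(compensate)

# Hypothetical order-processing saga in which the third step fails.
log = []

def fail_shipping():
    raise RuntimeError("shipping unavailable")

steps = [
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"),   lambda: log.append("refund card")),
    (fail_shipping,                       lambda: log.append("cancel shipping")),
]

try:
    run_saga(steps)
except SagaError:
    pass
print(log)  # ['reserve stock', 'charge card', 'refund card', 'release stock']
```

The reversed compensation order mirrors rollback in a traditional transaction; the difference, as noted above, is that the result is only eventually consistent and compensation is application-level, not guaranteed by a transaction manager.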
20. Strangler Fig Pattern¶
The Strangler Fig pattern structures application modernization as an incremental replacement of a legacy system by new services. New functionality is built around the old system, and over time the legacy components are “strangled” and retired, leaving only the modern architecture.

- Characteristics: Incremental migration, coexistence of legacy and new systems, routing layer to direct requests, gradual replacement of functionality
- Advantages: Reduced migration risk, allows continuous delivery of value, easier rollback compared to big-bang rewrites, minimizes disruption to users
- Disadvantages: Requires careful integration and routing, increased complexity during transition, longer migration timelines, potential duplication of functionality during overlap
- Use cases: Legacy system modernization, monolith-to-microservices migration, cloud adoption projects, gradual replacement of critical business systems
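The routing layer at the heart of this pattern can be sketched as a façade that sends migrated paths to the new system and everything else to the legacy one. In practice this role is usually played by an API gateway or reverse proxy; the in-process router and handler functions below are simplified stand-ins.

```python
class StranglerRouter:
    """Routing façade: migrated paths go to the modern system,
    everything else falls through to the legacy system."""
    def __init__(self, legacy, modern):
        self.legacy = legacy
        self.modern = modern
        self.migrated = set()

    def migrate(self, path):
        # Each migration "strangles" one more piece of the legacy system.
        self.migrated.add(path)

    def handle(self, path, request):
        target = self.modern if path in self.migrated else self.legacy
        return target(path, request)

# Hypothetical handlers standing in for the two systems.
legacy = lambda path, req: f"legacy:{path}"
modern = lambda path, req: f"modern:{path}"

router = StranglerRouter(legacy, modern)
print(router.handle("/orders", {}))  # legacy:/orders
router.migrate("/orders")
print(router.handle("/orders", {}))  # modern:/orders
```

Because the router is the single entry point, each migration step is also easy to roll back: removing a path from the migrated set restores the legacy behavior, which is exactly the low-risk property the pattern is chosen for.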
21. Throttling Pattern¶
The Throttling pattern structures an application to control the rate at which clients can access a service, limiting the number of requests over a given time period. This helps protect system resources, maintain quality of service, and prevent abuse or overload.
- Characteristics: Rate limits per client or user, configurable thresholds, enforcement mechanisms (e.g., token bucket, leaky bucket), monitoring of request patterns
- Advantages: Protects system stability, prevents resource exhaustion, ensures fair usage among clients, improves resilience under high load
- Disadvantages: Added latency or request rejection when limits are reached, complexity in distributed enforcement, potential negative impact on user experience if limits are too strict
- Use cases: APIs exposed to external clients, SaaS platforms, microservices communication, systems requiring protection from burst traffic or abusive behavior
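The token bucket mentioned above can be sketched as follows: the bucket refills at a steady rate up to a burst capacity, and each request consumes one token or is rejected. This is a single-process sketch; distributed enforcement (e.g. a shared counter in Redis) is where the real complexity lies. The injectable clock is only there to make the example deterministic.

```python
import time

class TokenBucket:
    """Token-bucket limiter: refills at `rate` tokens/second,
    allowing bursts of up to `capacity` requests."""
    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.now = now
        self.last = now()

    def allow(self):
        t = self.now()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should reject, queue, or delay the request

# Usage with a fake clock so the behavior is deterministic.
clock = [0.0]
bucket = TokenBucket(rate=1, capacity=2, now=lambda: clock[0])
print([bucket.allow() for _ in range(3)])  # [True, True, False] - burst spent
clock[0] += 1.0
print(bucket.allow())                      # True - one token has refilled
```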
Architectural Quality Attributes¶
Software architectures are evaluated based on various quality attributes:
Functional Attributes¶
- Correctness: The ability of the system to perform its required functions accurately
- Completeness: The extent to which all required functions are present
- Security: Protection against unauthorized access and data breaches
Non-Functional Attributes¶
- Performance: Response time, throughput, resource utilization
- Scalability: Ability to handle growing workloads
- Reliability: System's ability to function without failure
- Availability: System's accessibility when needed
- Maintainability: Ease of modifying the system
- Testability: Ease of testing the system
- Usability: Ease of use by end users
Architectural Documentation¶
Documenting software architecture effectively is crucial for communication and future maintenance:
Documentation Components¶
- Views: Different perspectives (structural, behavioral, etc.) of the architecture
- Context diagrams: Showing the system in its environment
- Component diagrams: Illustrating system components and their relationships
- Sequence diagrams: Depicting interactions between components
- Deployment diagrams: Showing the physical deployment of the system
Architectural Description Languages (ADLs)¶
Formal languages for describing software architecture:
- UML (Unified Modeling Language)
- ArchiMate
- SysML
- AADL (Architecture Analysis & Design Language)
Emerging Architectural Trends¶
1. Cloud-Native Architectures¶
Designed specifically for cloud environments, leveraging cloud services and capabilities.
- Key components: Containerization, orchestration, service meshes, immutable infrastructure
- Technologies: Kubernetes, Docker, Istio, Prometheus
2. DevOps-Oriented Architectures¶
Architectures designed to support DevOps practices and continuous delivery.
- Characteristics: Infrastructure as Code, deployment pipelines, monitoring integration
- Benefits: Faster deployment, improved collaboration, enhanced reliability
3. Edge Computing¶
Pushing computational capabilities closer to where data is generated.
- Applications: IoT, real-time analytics, content delivery
- Advantages: Reduced latency, bandwidth efficiency, offline capabilities
4. AI/ML-Driven Architectures¶
Architectures optimized for artificial intelligence and machine learning workloads.
- Considerations: Data pipelines, model training/serving, resource optimization
- Technologies: TensorFlow Serving, Kubeflow, MLflow
Architectural Decision Making¶
Trade-offs in Architecture¶
Architectural decisions often involve balancing competing concerns:
- Performance vs. Maintainability: Optimized code may be harder to maintain
- Flexibility vs. Simplicity: More flexible designs often increase complexity
- Security vs. Usability: Enhanced security may impact user experience
- Time-to-market vs. Quality: Faster delivery might compromise quality
Architectural Decision Records (ADRs)¶
Documenting architectural decisions and their context:
- Components: Title, status, context, decision, consequences, alternatives considered
- Benefits: Knowledge preservation, onboarding assistance, future decision support
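The components listed above are typically captured in a short, numbered markdown file kept alongside the code. The decision and details below are invented purely to show the shape of a record:

```markdown
# ADR 0007: Use the Saga pattern for order processing

- Status: Accepted
- Date: (decision date)

## Context
Orders span the inventory, payment, and shipping services; a global
transaction manager is not available across them.

## Decision
Coordinate the order workflow as an orchestrated saga with compensating
actions for each completed step.

## Consequences
Eventual consistency between services; compensation logic must be
written and tested for every step.

## Alternatives Considered
Two-phase commit (rejected: tight coupling, coordinator availability).
```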
Architecture Evaluation Methods¶
1. Architecture Tradeoff Analysis Method (ATAM)¶
A structured method for evaluating architecture against business goals and quality attributes.
- Steps: Present business drivers, present the architecture, identify quality attribute scenarios, analyze architectural approaches, identify risks, sensitivity points, and tradeoffs
2. Cost Benefit Analysis Method (CBAM)¶
Evaluates architectural decisions based on economic considerations.
3. Active Reviews for Intermediate Designs (ARID)¶
A lightweight method for evaluating partial architectural designs.
Conclusion¶
Effective software architecture is crucial for building successful software systems. It requires balancing various quality attributes, applying sound design principles, and making informed decisions based on specific requirements and constraints. As technology evolves, software architectures continue to adapt, incorporating new patterns, technologies, and approaches to address emerging challenges and opportunities in software development.