Software Testing¶
Software Testing is a systematic process of evaluating and verifying that a software application or system meets specified requirements and functions correctly. It involves executing software components using manual or automated tools to identify defects, ensure quality, and validate that the product meets stakeholder expectations. Testing is a core quality control (QC) activity; for the broader process of building quality in (planning, standards, audits), see Software Quality Assurance.
Why Testing Matters¶
Testing is not merely about finding bugs. It is a critical quality activity that provides:
- Risk Mitigation: Identifies defects early when they're cheaper to fix
- Quality Assurance: Validates software meets functional and non-functional requirements
- Customer Confidence: Ensures reliable, stable products reach end users
- Cost Reduction: Prevents expensive production failures and maintenance
- Documentation: Test cases serve as executable specifications
- Design Feedback: Tests often reveal architectural and design issues
- Regulatory Compliance: Many industries require documented testing (healthcare, finance, aerospace)
The Cost of Defects (The 1-10-100 Rule)¶
The cost of fixing a defect rises sharply the later it is found in the SDLC:
| Stage | Relative Cost | Example |
|---|---|---|
| Requirements | 1x | $100 to clarify a requirement |
| Design | 3-6x | $300-600 to fix design flaw |
| Implementation | 10x | $1,000 to fix during coding |
| Testing | 15-40x | $1,500-4,000 to fix discovered bug |
| Production | 30-100x | $3,000-10,000+ including hotfix, deployment, reputation |
Fundamental Testing Concepts¶
The Seven Testing Principles (ISTQB)¶
These principles form the foundation of effective software testing:
| # | Principle | Explanation |
|---|---|---|
| 1 | Testing shows presence of defects | Testing can show defects exist, but cannot prove their absence |
| 2 | Exhaustive testing is impossible | Testing everything is impractical; risk-based prioritization is essential |
| 3 | Early testing | Start testing as early as possible (shift-left) |
| 4 | Defect clustering | A small number of modules typically contain most defects (Pareto principle) |
| 5 | Pesticide paradox | Repeated tests become ineffective; tests must evolve |
| 6 | Testing is context dependent | Testing approach depends on the application context |
| 7 | Absence-of-errors fallacy | Finding no defects doesn't mean the software is useful or meets user needs |
Verification vs Validation¶
| Aspect | Verification | Validation |
|---|---|---|
| Question | "Are we building the product right?" | "Are we building the right product?" |
| Focus | Process conformance | User satisfaction |
| Methods | Reviews, inspections, walkthroughs | Testing, demonstrations, prototypes |
| Timing | During development | End of development phases |
| Example | Code review checking standards | UAT confirming business requirements |
Test Oracle¶
A test oracle is a mechanism for determining whether a test has passed or failed. It's the source of expected results against which actual results are compared.
Types of Oracles:
- Specified: Requirements, specifications, documentation
- Derived: Previous system versions, competitor products
- Implicit: Common sense, domain expertise
- Statistical: Probabilistic expectations, performance baselines
- Consistent: Same input should produce same output
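A derived oracle can be sketched as a comparison against a trusted reference implementation; in the example below (function names are illustrative) a slow but obviously correct sort serves as the oracle for an optimized one:

```python
def reference_sort(items):
    # Oracle: a slow but obviously correct selection sort
    result = list(items)
    for i in range(len(result)):
        for j in range(i + 1, len(result)):
            if result[j] < result[i]:
                result[i], result[j] = result[j], result[i]
    return result

def fast_sort(items):
    # Implementation under test (stands in for an optimized version)
    return sorted(items)

def test_fast_sort_matches_oracle():
    cases = [[], [1], [3, 1, 2], [5, 5, 1], [-2, 0, 7, -9]]
    for case in cases:
        assert fast_sort(case) == reference_sort(case)
```

Derived oracles like this are common when rewriting a legacy component: the old implementation acts as the oracle for the new one.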
Test Levels (Testing Pyramid)¶
The Test Pyramid is a conceptual framework describing the ideal distribution of automated tests. It was introduced by Mike Cohn in "Succeeding with Agile".
/\
/ \
/ E2E \ ~5-10%
/ Tests \ (Slow, Expensive, Brittle)
/──────────\
/ \
/ Integration \ ~15-20%
/ Tests \ (Moderate Speed/Cost)
/──────────────────\
/ \
/ Unit Tests \ ~70-80%
/ \ (Fast, Cheap, Stable)
/──────────────────────────\
Unit Testing¶
Unit testing verifies individual components (functions, methods, classes) in isolation from dependencies.
Characteristics¶
| Aspect | Description |
|---|---|
| Scope | Single function, method, or class |
| Isolation | Dependencies mocked/stubbed |
| Speed | Milliseconds per test |
| Ownership | Developers |
| Execution | Every commit, pre-commit hooks |
Unit Test Best Practices¶
The FIRST Principles:
- Fast: Hundreds of tests per second
- Independent: No test depends on another
- Repeatable: Same result every time
- Self-validating: Pass or fail, no manual inspection
- Timely: Written at the same time as code (or before in TDD)
Arrange-Act-Assert (AAA) Pattern:
def test_calculate_discount():
# Arrange - Set up test data and conditions
cart = ShoppingCart()
cart.add_item(Item("Widget", price=100))
discount = PercentageDiscount(10)
# Act - Execute the behavior being tested
result = cart.apply_discount(discount)
# Assert - Verify the expected outcome
assert result.total == 90
assert result.discount_applied == 10
Given-When-Then (BDD Style):
def test_user_login_with_valid_credentials():
# Given a registered user
user = create_user(email="test@example.com", password="secure123")
# When they attempt to login with correct credentials
result = login_service.authenticate("test@example.com", "secure123")
# Then they should be successfully authenticated
assert result.is_authenticated is True
assert result.user.email == "test@example.com"
What to Unit Test¶
| Test | Don't Test |
|---|---|
| Business logic | Framework code |
| Algorithms | Trivial getters/setters |
| Edge cases | Third-party libraries |
| Error handling | Private methods directly |
| State transitions | Database queries (integration) |
| Calculations | External service calls |
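Edge cases and error handling from the left column can be pinned down with a few focused tests; a sketch using pytest (the `parse_quantity` function is hypothetical):

```python
import pytest

def parse_quantity(raw: str) -> int:
    # Hypothetical function under test: parse a positive integer quantity
    value = int(raw)  # raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value

def test_parses_valid_quantity():
    assert parse_quantity("3") == 3

def test_rejects_zero():
    with pytest.raises(ValueError):
        parse_quantity("0")

def test_rejects_non_numeric():
    with pytest.raises(ValueError):
        parse_quantity("abc")
```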
Mocking, Stubbing, and Faking¶
| Technique | Purpose | Example |
|---|---|---|
| Mock | Verify interactions (behavior verification) | Verify email service was called |
| Stub | Provide canned responses | Return fixed user data |
| Fake | Working implementation (simplified) | In-memory database |
| Spy | Record calls for later verification | Log all method calls |
| Dummy | Fill parameter requirements | Placeholder object |
Example with Mocking (Python/pytest):
from unittest.mock import Mock, patch
def test_order_sends_confirmation_email():
# Arrange
mock_email_service = Mock()
order_service = OrderService(email_service=mock_email_service)
order = Order(customer_email="user@example.com", items=["Widget"])
# Act
order_service.complete_order(order)
# Assert - Verify the email service was called correctly
mock_email_service.send_confirmation.assert_called_once_with(
to="user@example.com",
order_id=order.id
)
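Where the mock above verifies an interaction, a fake provides a working but simplified implementation. A minimal sketch of an in-memory repository standing in for a database (class and method names are assumptions):

```python
class InMemoryUserRepository:
    # Fake: behaves like a real repository but stores data in a dict
    def __init__(self):
        self._users = {}
        self._next_id = 1

    def save(self, name: str) -> int:
        user_id = self._next_id
        self._users[user_id] = name
        self._next_id += 1
        return user_id

    def find(self, user_id: int):
        return self._users.get(user_id)

def test_saved_users_can_be_found():
    repo = InMemoryUserRepository()
    user_id = repo.save("Ada")
    assert repo.find(user_id) == "Ada"
    assert repo.find(999) is None
```

Fakes shine when many tests exercise the same dependency: each test reads like production code, with no per-test stubbing.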
Integration Testing¶
Integration testing verifies that multiple components work together correctly, testing the interfaces and interactions between integrated units.
Types of Integration Testing¶
| Type | Description | Approach |
|---|---|---|
| Big Bang | Integrate all components at once | High risk, hard to isolate failures |
| Top-Down | Start from top-level modules | Stubs for lower modules |
| Bottom-Up | Start from lowest-level modules | Drivers for higher modules |
| Sandwich/Hybrid | Combine top-down and bottom-up | Balance of both approaches |
| Incremental | Integrate one component at a time | Easier fault isolation |
What to Integration Test¶
- Database interactions (repositories, DAOs)
- API endpoints with database
- Message queue producers/consumers
- Cache integrations
- Third-party service integrations
- File system operations
- Cross-module communication
Example Integration Test (Python/FastAPI):
import pytest
from httpx import AsyncClient
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
@pytest.fixture
async def db_session():
engine = create_async_engine("postgresql+asyncpg://test:test@localhost/testdb")
async with AsyncSession(engine) as session:
yield session
await session.rollback()
@pytest.fixture
async def client(db_session):
app.dependency_overrides[get_db] = lambda: db_session
async with AsyncClient(app=app, base_url="http://test") as ac:
yield ac
@pytest.mark.asyncio
async def test_create_and_retrieve_user(client, db_session):
# Create user
response = await client.post("/users", json={
"email": "test@example.com",
"name": "Test User"
})
assert response.status_code == 201
user_id = response.json()["id"]
# Retrieve user
response = await client.get(f"/users/{user_id}")
assert response.status_code == 200
assert response.json()["email"] == "test@example.com"
# Verify database state
user = await db_session.get(User, user_id)
assert user is not None
assert user.email == "test@example.com"
Testcontainers¶
Testcontainers provides lightweight, disposable containers for integration testing:
import pytest
from testcontainers.postgres import PostgresContainer
from testcontainers.redis import RedisContainer
@pytest.fixture(scope="module")
def postgres_container():
with PostgresContainer("postgres:15") as postgres:
yield postgres
@pytest.fixture(scope="module")
def redis_container():
with RedisContainer("redis:7") as redis:
yield redis
def test_with_real_database(postgres_container):
    # Connect to the real PostgreSQL instance and run a sanity query
    from sqlalchemy import create_engine, text
    engine = create_engine(postgres_container.get_connection_url())
    with engine.connect() as conn:
        assert conn.execute(text("SELECT 1")).scalar() == 1
System Testing (End-to-End Testing)¶
System testing validates the complete, integrated system against specified requirements, testing the entire application stack from the user's perspective.
Characteristics¶
| Aspect | Description |
|---|---|
| Scope | Entire application |
| Environment | Production-like |
| Speed | Seconds to minutes per test |
| Stability | More brittle than unit tests |
| Value | High confidence in user workflows |
E2E Testing Strategies¶
Critical Path Testing: Focus on the most important user journeys:
- User registration and login
- Core business transactions
- Payment processing
- Data creation and retrieval
Example E2E Test (Playwright/Python):
import pytest
from playwright.sync_api import Page, expect
def test_user_can_complete_purchase(page: Page):
# Navigate to home page
page.goto("https://myshop.example.com")
# Search for product
page.fill("[data-testid=search-input]", "laptop")
page.click("[data-testid=search-button]")
# Select first product
page.click("[data-testid=product-card]:first-child")
# Add to cart
page.click("[data-testid=add-to-cart]")
expect(page.locator("[data-testid=cart-count]")).to_have_text("1")
# Proceed to checkout
page.click("[data-testid=checkout-button]")
# Fill shipping information
page.fill("[data-testid=shipping-name]", "John Doe")
page.fill("[data-testid=shipping-address]", "123 Main St")
page.fill("[data-testid=shipping-city]", "New York")
# Complete payment
page.fill("[data-testid=card-number]", "4242424242424242")
page.fill("[data-testid=card-expiry]", "12/25")
page.fill("[data-testid=card-cvc]", "123")
page.click("[data-testid=pay-button]")
# Verify success
expect(page.locator("[data-testid=order-confirmation]")).to_be_visible()
expect(page.locator("[data-testid=order-number]")).to_be_visible()
Acceptance Testing¶
Acceptance testing determines whether the system satisfies the acceptance criteria and is ready for delivery.
Types of Acceptance Testing¶
| Type | Performed By | Purpose |
|---|---|---|
| User Acceptance Testing (UAT) | End users/customers | Validate business requirements |
| Business Acceptance Testing (BAT) | Business stakeholders | Verify business processes |
| Contract Acceptance Testing | Customer | Verify contract specifications |
| Regulatory Acceptance Testing | Compliance team | Ensure regulatory requirements |
| Alpha Testing | Internal users | Early feedback in dev environment |
| Beta Testing | External users | Real-world feedback before release |
Acceptance Criteria Format¶
User Story with Acceptance Criteria:
As a [user role]
I want [feature]
So that [business value]
Acceptance Criteria:
- Given [context]
When [action]
Then [expected result]
Example:
Feature: Shopping Cart
Scenario: Add item to cart
Given I am on the product page for "Laptop Pro"
And the product is in stock
When I click "Add to Cart"
Then the cart icon should show "1" item
And I should see a confirmation message "Laptop Pro added to cart"
And the product should appear in my cart with quantity 1
Scenario: Remove item from cart
Given I have "Laptop Pro" in my cart
When I click "Remove" on the cart item
Then the item should be removed from my cart
And the cart should show as empty
Test Types by Purpose¶
Functional Testing¶
Functional testing verifies that the system functions according to specified requirements.
| Type | Focus | Example |
|---|---|---|
| Smoke Testing | Critical functionality works | Can users log in? |
| Sanity Testing | Specific functionality after changes | Does the fixed bug stay fixed? |
| Regression Testing | No new defects from changes | Run full test suite |
| Feature Testing | New feature works as specified | Test new payment method |
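Smoke and regression runs are often carved out of a single suite with test markers rather than maintained as separate projects; a pytest sketch (the marker names are assumptions):

```ini
# pytest.ini
[pytest]
markers =
    smoke: critical-path checks run on every deployment
    regression: full suite run nightly
```

Then `pytest -m smoke` runs only the critical-path checks, while the nightly job runs everything.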
Non-Functional Testing¶
Non-functional testing evaluates system attributes beyond functionality.
Performance Testing¶
Performance testing measures how the system performs under various conditions.
| Type | Purpose | Metrics |
|---|---|---|
| Load Testing | Behavior under expected load | Response time at 1000 concurrent users |
| Stress Testing | Breaking point identification | System behavior at 10x normal load |
| Spike Testing | Response to sudden load increases | Recovery after traffic spike |
| Endurance/Soak Testing | Behavior over extended time | Memory leaks over 24 hours |
| Scalability Testing | Capacity growth handling | Performance with added resources |
| Volume Testing | Large data handling | Performance with 10M records |
Key Performance Metrics:
| Metric | Description | Target Example |
|---|---|---|
| Response Time | Time to complete a request | < 200ms for API calls |
| Throughput | Requests processed per second | 1000 RPS |
| Error Rate | Percentage of failed requests | < 0.1% |
| Latency (p50/p95/p99) | Response time percentiles | p99 < 500ms |
| Concurrent Users | Simultaneous active users | Support 10,000 |
| Apdex Score | User satisfaction index | > 0.9 |
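Latency percentiles are simple to compute but easy to misread: p99 is dominated by the slowest tail, not the average. A nearest-rank sketch (the sample values are made up):

```python
import math

def percentile(samples, pct):
    # Nearest-rank percentile: smallest value with at least pct%
    # of samples at or below it
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 15, 14, 210, 13, 16, 18, 14, 480, 15]
assert percentile(latencies_ms, 50) == 15   # a typical request is fast
assert percentile(latencies_ms, 95) == 480  # the tail tells another story
```

The mean of these samples (about 81 ms) would hide the two slow outliers, which is why targets are usually stated as p95/p99 rather than averages.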
Example Load Test (Locust/Python):
from locust import HttpUser, task, between
class WebsiteUser(HttpUser):
wait_time = between(1, 5)
def on_start(self):
# Login once per user
self.client.post("/login", json={
"username": "testuser",
"password": "testpass"
})
@task(3)
def view_products(self):
self.client.get("/api/products")
@task(2)
def view_product_detail(self):
self.client.get("/api/products/123")
@task(1)
def add_to_cart(self):
self.client.post("/api/cart", json={
"product_id": 123,
"quantity": 1
})
@task(1)
def checkout(self):
with self.client.post("/api/checkout", catch_response=True) as response:
if response.status_code == 200:
response.success()
elif response.status_code == 503:
response.failure("Service unavailable during checkout")
Security Testing¶
Security testing identifies vulnerabilities and ensures data protection.
| Type | Focus | Tools |
|---|---|---|
| SAST (Static) | Source code vulnerabilities | SonarQube, Semgrep, CodeQL |
| DAST (Dynamic) | Runtime vulnerabilities | OWASP ZAP, Burp Suite |
| IAST (Interactive) | Real-time analysis | Contrast Security |
| Penetration Testing | Exploit vulnerabilities | Manual + automated |
| Dependency Scanning | Vulnerable libraries | Snyk, Dependabot, Trivy |
OWASP Top 10 Testing Checklist:
| # | Vulnerability | Test Approach |
|---|---|---|
| 1 | Broken Access Control | Test authorization bypasses |
| 2 | Cryptographic Failures | Verify encryption, key management |
| 3 | Injection | SQL, NoSQL, OS command injection tests |
| 4 | Insecure Design | Threat modeling, abuse case testing |
| 5 | Security Misconfiguration | Configuration audits |
| 6 | Vulnerable Components | Dependency scanning |
| 7 | Authentication Failures | Credential testing, session management |
| 8 | Data Integrity Failures | Verify signatures, integrity checks |
| 9 | Logging Failures | Verify security event logging |
| 10 | SSRF | Test server-side request handling |
Example Security Test:
import pytest
from httpx import AsyncClient
@pytest.mark.security
class TestAuthorizationBypass:
async def test_user_cannot_access_other_users_data(self, client: AsyncClient):
# Create two users
user1 = await create_user(email="user1@test.com")
user2 = await create_user(email="user2@test.com")
# Authenticate as user1
token = await get_auth_token(user1)
# Attempt to access user2's data
response = await client.get(
f"/api/users/{user2.id}/profile",
headers={"Authorization": f"Bearer {token}"}
)
# Should be forbidden
assert response.status_code == 403
async def test_sql_injection_prevention(self, client: AsyncClient):
malicious_inputs = [
"'; DROP TABLE users; --",
"1 OR 1=1",
"admin'--",
"1; SELECT * FROM users"
]
for payload in malicious_inputs:
response = await client.get(f"/api/search?q={payload}")
            # Should not cause a server error or leak SQL details
            assert response.status_code in [200, 400]
            assert "sql" not in response.text.lower()
Usability Testing¶
Usability testing evaluates how easy and intuitive the software is for end users.
| Method | Description | When to Use |
|---|---|---|
| Heuristic Evaluation | Expert review against usability principles | Early design phase |
| User Testing | Real users perform tasks | Prototype and beta phases |
| A/B Testing | Compare two versions | Optimize specific features |
| Eye Tracking | Monitor user visual attention | Complex UI optimization |
| Think-Aloud Protocol | Users verbalize thoughts | Deep usability insights |
Nielsen's Usability Heuristics:
- Visibility of system status
- Match between system and real world
- User control and freedom
- Consistency and standards
- Error prevention
- Recognition rather than recall
- Flexibility and efficiency of use
- Aesthetic and minimalist design
- Help users recognize, diagnose, and recover from errors
- Help and documentation
Accessibility Testing (a11y)¶
Accessibility testing ensures software is usable by people with disabilities.
WCAG 2.1 Compliance Levels:
| Level | Description | Examples |
|---|---|---|
| A | Minimum accessibility | Alt text, keyboard navigation |
| AA | Standard compliance | Color contrast, resize text |
| AAA | Enhanced accessibility | Sign language, reading level |
Accessibility Testing Tools:
- axe-core: Automated accessibility testing
- WAVE: Web accessibility evaluation
- Pa11y: CI/CD accessibility testing
- Lighthouse: Chrome DevTools audit
Example Accessibility Test (Playwright + axe):
from playwright.sync_api import Page
from axe_playwright_python.sync_playwright import Axe
def test_homepage_accessibility(page: Page):
page.goto("https://mysite.example.com")
axe = Axe()
results = axe.run(page)
# Check for critical violations
violations = results.get("violations", [])
critical = [v for v in violations if v["impact"] == "critical"]
assert len(critical) == 0, f"Critical a11y violations: {critical}"
Compatibility Testing¶
Compatibility testing verifies software works across different environments.
| Type | Scope | Considerations |
|---|---|---|
| Browser | Chrome, Firefox, Safari, Edge | CSS, JavaScript compatibility |
| Device | Desktop, tablet, mobile | Screen sizes, touch vs mouse |
| OS | Windows, macOS, Linux, iOS, Android | Native behaviors, file paths |
| Database | Different DB versions | SQL dialects, features |
| Network | Various speeds, offline | Performance, resilience |
Test Design Techniques¶
Black-Box Testing Techniques¶
Black-box testing focuses on inputs and outputs without knowledge of internal implementation.
Equivalence Partitioning¶
Divides input data into partitions that should be treated equivalently.
Example: Age validation (valid range 18-65)
| Partition | Range | Test Value | Expected Result |
|---|---|---|---|
| Invalid (low) | < 18 | 10 | Rejected |
| Valid | 18-65 | 35 | Accepted |
| Invalid (high) | > 65 | 70 | Rejected |
Boundary Value Analysis¶
Tests at the edges of equivalence partitions where defects often occur.
Example: Age validation boundaries
| Boundary | Values to Test |
|---|---|
| Minimum boundary | 17, 18, 19 |
| Maximum boundary | 64, 65, 66 |
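Both techniques map directly onto parametrized tests; a sketch for the age rule above (the validator name is an assumption):

```python
import pytest

def is_valid_age(age: int) -> bool:
    # Hypothetical validator for the 18-65 rule
    return 18 <= age <= 65

# Equivalence partitioning: one representative per partition
@pytest.mark.parametrize("age,expected", [(10, False), (35, True), (70, False)])
def test_partitions(age, expected):
    assert is_valid_age(age) == expected

# Boundary value analysis: values around each edge
@pytest.mark.parametrize("age,expected", [
    (17, False), (18, True), (19, True),
    (64, True), (65, True), (66, False),
])
def test_boundaries(age, expected):
    assert is_valid_age(age) == expected
```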
Decision Table Testing¶
Tests combinations of conditions and their resulting actions.
Example: Shipping cost calculation
| Condition | R1 | R2 | R3 | R4 |
|---|---|---|---|---|
| Order > $100 | Y | Y | N | N |
| Premium member | Y | N | Y | N |
| Action | | | | |
| Free shipping | X | X | | |
| Standard shipping | | | | X |
| Express shipping | | | X | |
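Each column of a decision table becomes one test. A sketch implementing one plausible reading of this table (the exact rule-to-action mapping is an assumption):

```python
def shipping_method(order_total: float, is_premium: bool) -> str:
    # Encodes rules R1-R4 from the decision table
    if order_total > 100:
        return "free"        # R1, R2
    if is_premium:
        return "express"     # R3
    return "standard"        # R4

# One test per rule column
assert shipping_method(150, True) == "free"       # R1
assert shipping_method(150, False) == "free"      # R2
assert shipping_method(50, True) == "express"     # R3
assert shipping_method(50, False) == "standard"   # R4
```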
State Transition Testing¶
Tests system behavior across different states.
[Locked]
↑
(3 failed attempts)
|
[Logged Out] ←── (logout) ←── [Logged In]
| ↑
└──── (valid credentials) ───┘
State Transition Table:
| Current State | Event | Condition | Action | Next State |
|---|---|---|---|---|
| Logged Out | Login | Valid credentials | Grant access | Logged In |
| Logged Out | Login | Invalid (attempts < 3) | Show error | Logged Out |
| Logged Out | Login | Invalid (attempts = 3) | Lock account | Locked |
| Logged In | Logout | - | End session | Logged Out |
| Locked | Unlock | Admin action | Reset attempts | Logged Out |
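The transition table translates almost line-for-line into a small state machine and its tests; a sketch (class, method, and state names are assumptions):

```python
class Account:
    def __init__(self, password: str):
        self._password = password
        self.state = "logged_out"
        self.failed_attempts = 0

    def login(self, password: str) -> None:
        if self.state != "logged_out":
            return  # no login transition from Logged In or Locked
        if password == self._password:
            self.state = "logged_in"
            self.failed_attempts = 0
        else:
            self.failed_attempts += 1
            if self.failed_attempts >= 3:
                self.state = "locked"

    def logout(self) -> None:
        if self.state == "logged_in":
            self.state = "logged_out"

    def unlock(self) -> None:
        # Admin action: reset attempts and return to Logged Out
        if self.state == "locked":
            self.failed_attempts = 0
            self.state = "logged_out"

def test_three_failed_attempts_lock_the_account():
    account = Account("secret")
    for _ in range(3):
        account.login("wrong")
    assert account.state == "locked"
    account.login("secret")  # ignored while locked
    assert account.state == "locked"
    account.unlock()
    assert account.state == "logged_out"
```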
Pairwise/Combinatorial Testing¶
Tests interactions between parameters without testing all combinations.
Example: 3 parameters with multiple values each
| Browser | OS | Language |
|---|---|---|
| Chrome | Windows | English |
| Firefox | macOS | Spanish |
| Safari | Linux | French |
| Edge | Windows | French |
| Chrome | Linux | Spanish |
| Firefox | Windows | English |
Tools: PICT (Microsoft), AllPairs
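A pairwise suite can be checked mechanically: every pair of values drawn from any two parameters must appear together in at least one row. A checker sketch on a smaller 2x2x2 example (the table above, with 4 browsers and 3 operating systems, would need more rows for full pairwise coverage):

```python
from itertools import combinations

def uncovered_pairs(rows, n_params):
    # Return the parameter-value pairs that never appear together
    covered = set()
    for row in rows:
        for (i, a), (j, b) in combinations(enumerate(row), 2):
            covered.add((i, a, j, b))
    domains = [set(row[i] for row in rows) for i in range(n_params)]
    missing = []
    for i, j in combinations(range(n_params), 2):
        for a in domains[i]:
            for b in domains[j]:
                if (i, a, j, b) not in covered:
                    missing.append((i, a, j, b))
    return missing

# Four rows cover all 12 value pairs of a 2x2x2 parameter space
tests = [
    ("Chrome", "Windows", "English"),
    ("Chrome", "Linux", "Spanish"),
    ("Firefox", "Windows", "Spanish"),
    ("Firefox", "Linux", "English"),
]
assert uncovered_pairs(tests, 3) == []
```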
White-Box Testing Techniques¶
White-box testing uses knowledge of internal code structure.
Statement Coverage¶
Every line of code is executed at least once.
def calculate_discount(price, is_member):
discount = 0 # Line 1
if is_member: # Line 2
discount = price * 0.1 # Line 3
if price > 100: # Line 4
discount += price * 0.05 # Line 5
return price - discount # Line 6
# Test for 100% statement coverage:
# Test 1: calculate_discount(150, True) → covers all lines
Branch/Decision Coverage¶
Every decision branch (if/else, loops) is executed.
def process_order(quantity, is_express):
if quantity > 0: # Branch 1a (True), 1b (False)
if is_express: # Branch 2a (True), 2b (False)
return "Express shipping"
else:
return "Standard shipping"
else:
return "Invalid quantity"
# Tests for 100% branch coverage:
# Test 1: process_order(5, True) → 1a, 2a
# Test 2: process_order(5, False) → 1a, 2b
# Test 3: process_order(0, True) → 1b
Path Coverage¶
Every possible path through the code is executed.
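Full path coverage is rarely practical because path counts explode: n independent decisions create up to 2^n paths. A minimal sketch contrasting it with branch coverage:

```python
def label(a: bool, b: bool) -> str:
    # Two independent decisions -> 2 x 2 = 4 distinct paths
    result = ""
    if a:
        result += "A"
    if b:
        result += "B"
    return result

# Branch coverage needs only two tests, e.g. (True, True) and (False, False),
# but path coverage needs all four input combinations:
assert label(True, True) == "AB"
assert label(True, False) == "A"
assert label(False, True) == "B"
assert label(False, False) == ""
```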
Condition Coverage¶
Every boolean sub-expression is evaluated to both true and false.
# Condition: (a > 0 AND b < 10) OR c == 'X'
# Tests for full condition coverage:
# Test 1: a=1, b=5, c='Y' → (T AND T) OR F = T
# Test 2: a=0, b=15, c='X' → (F AND F) OR T = T
# Test 3: a=0, b=5, c='Y' → (F AND T) OR F = F
# Test 4: a=1, b=15, c='Y' → (T AND F) OR F = F
Experience-Based Techniques¶
Exploratory Testing¶
Simultaneous learning, test design, and execution.
Session-Based Test Management (SBTM):
- Charter: Mission statement for the session
- Time-box: Fixed duration (60-90 minutes)
- Notes: Observations, questions, bugs found
- Debrief: Share findings with team
Exploratory Testing Heuristics:
| Heuristic | Description |
|---|---|
| SFDPOT | Structure, Function, Data, Platform, Operations, Time |
| FEW HICCUPS | Features, Explorability, Workflows, History, Interoperability, Claims, Configuration, User scenarios, Platform, States |
| Goldilocks | Too big, too small, just right |
Error Guessing¶
Testing based on tester's experience and intuition about likely defects.
Common Error-Prone Areas:
- Null/empty inputs
- Boundary conditions
- Special characters
- Concurrent operations
- Error handling paths
- Data type conversions
- Date/time handling
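Error guessing pairs well with parametrization: collect the inputs experience says will break things and run them all through one robustness test. A sketch (the `normalize_name` function is hypothetical):

```python
import pytest

ERROR_PRONE_INPUTS = [
    None, "", "   ",                 # null / empty / whitespace-only
    "a" * 10_000,                    # oversized input
    "<script>alert(1)</script>",     # markup and special characters
    "O'Brien",                       # embedded quote
]

def normalize_name(raw):
    # Hypothetical function under test: trim and cap a display name
    if raw is None or not raw.strip():
        raise ValueError("name is required")
    return raw.strip()[:100]

@pytest.mark.parametrize("raw", ERROR_PRONE_INPUTS)
def test_hostile_input_never_crashes(raw):
    try:
        assert len(normalize_name(raw)) <= 100
    except ValueError:
        pass  # an explicit validation error is fine; a crash is not
```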
Checklist-Based Testing¶
Testing against predefined checklists.
Test-Driven Development (TDD)¶
Test-Driven Development is a development methodology where tests are written before the implementation code.
The TDD Cycle (Red-Green-Refactor)¶
┌─────────────────────────────────────────────┐
│ │
▼ │
┌───────┐ ┌───────┐ ┌──────────┐ │
│ RED │ ──────► │ GREEN │ ──────► │ REFACTOR │ ─┘
│ │ │ │ │ │
│ Write │ │ Write │ │ Improve │
│failing│ │minimal│ │ code │
│ test │ │ code │ │ │
└───────┘ └───────┘ └──────────┘
TDD in Practice¶
Step-by-Step Example: Building a Password Validator
# Step 1: RED - Write a failing test
def test_password_must_be_at_least_8_characters():
validator = PasswordValidator()
assert validator.is_valid("short") == False
assert validator.is_valid("longenough") == True
# Run test → FAILS (PasswordValidator doesn't exist)
# Step 2: GREEN - Write minimal code to pass
class PasswordValidator:
def is_valid(self, password: str) -> bool:
return len(password) >= 8
# Run test → PASSES
# Step 3: RED - Add next requirement
def test_password_must_contain_uppercase():
validator = PasswordValidator()
assert validator.is_valid("lowercase") == False
assert validator.is_valid("HasUpper1") == True
# Run test → FAILS
# Step 4: GREEN - Extend implementation
class PasswordValidator:
def is_valid(self, password: str) -> bool:
if len(password) < 8:
return False
if not any(c.isupper() for c in password):
return False
return True
# Run test → PASSES
# Step 5: REFACTOR - Improve code structure
class PasswordValidator:
MIN_LENGTH = 8
def is_valid(self, password: str) -> bool:
return all([
self._has_minimum_length(password),
self._has_uppercase(password),
])
def _has_minimum_length(self, password: str) -> bool:
return len(password) >= self.MIN_LENGTH
def _has_uppercase(self, password: str) -> bool:
return any(c.isupper() for c in password)
# Continue adding tests for numbers, special chars, etc.
TDD Benefits¶
| Benefit | Explanation |
|---|---|
| Better Design | Forces modular, testable code |
| Living Documentation | Tests describe expected behavior |
| Confidence in Changes | Instant feedback on regressions |
| Reduced Debugging | Defects found immediately |
| Higher Coverage | Tests written for all new code |
TDD Challenges and Solutions¶
| Challenge | Solution |
|---|---|
| Initial slowdown | Productivity increases with practice |
| Legacy code | Use characterization tests first |
| UI testing | Separate logic from presentation |
| External dependencies | Use mocks and dependency injection |
| Team resistance | Start with pair programming |
TDD Anti-Patterns¶
| Anti-Pattern | Problem | Solution |
|---|---|---|
| The Giant | Tests are too large | One assertion per test |
| The Mockery | Everything is mocked | Test behavior, not implementation |
| The Inspector | Testing private methods | Test public interface |
| The Slow Poke | Tests are too slow | Isolate unit tests |
| The Nevergreen | Tests intermittently fail | Fix flaky tests immediately |
Behavior-Driven Development (BDD)¶
BDD extends TDD by writing tests in natural language that describes behavior from the user's perspective.
BDD Structure: Given-When-Then¶
Feature: User Authentication
Background:
Given the authentication service is running
And the database is seeded with test users
Scenario: Successful login with valid credentials
Given a registered user with email "user@example.com"
And the user's password is "SecurePass123!"
When the user submits the login form with correct credentials
Then the user should be redirected to the dashboard
And a success message "Welcome back!" should be displayed
And a session token should be created
Scenario: Failed login with invalid password
Given a registered user with email "user@example.com"
When the user submits the login form with password "WrongPass"
Then the login should be rejected
And an error message "Invalid credentials" should be displayed
And no session should be created
Scenario Outline: Password validation rules
Given a user is registering a new account
When they enter password "<password>"
Then the password validation should be "<result>"
Examples:
| password | result |
| short | invalid |
| NoNumber! | invalid |
| nonumber123 | invalid |
| ValidPass123! | valid |
BDD Implementation (Python/Behave)¶
Step Definitions:
# features/steps/auth_steps.py
from behave import given, when, then
from app.services import AuthService
from app.models import User
@given('a registered user with email "{email}"')
def step_create_user(context, email):
context.user = User.create(email=email, password="SecurePass123!")
@given("the user's password is \"{password}\"")
def step_set_password(context, password):
context.user.set_password(password)
@when('the user submits the login form with correct credentials')
def step_login_correct(context):
context.result = AuthService.login(
email=context.user.email,
password="SecurePass123!"
)
@when('the user submits the login form with password "{password}"')
def step_login_with_password(context, password):
context.result = AuthService.login(
email=context.user.email,
password=password
)
@then('the user should be redirected to the dashboard')
def step_check_redirect(context):
assert context.result.redirect_url == "/dashboard"
@then('a success message "{message}" should be displayed')
def step_check_success_message(context, message):
assert context.result.message == message
@then('the login should be rejected')
def step_check_rejected(context):
assert context.result.success is False
@then('an error message "{message}" should be displayed')
def step_check_error_message(context, message):
assert context.result.error == message
BDD vs TDD¶
| Aspect | TDD | BDD |
|---|---|---|
| Language | Code (developer-focused) | Natural language (stakeholder-readable) |
| Focus | Technical correctness | Business behavior |
| Audience | Developers | Developers, QA, Product, Business |
| Granularity | Unit/function level | Feature/scenario level |
| Documentation | Code serves as docs | Living documentation |
BDD Tools by Language¶
| Language | Framework | Notes |
|---|---|---|
| Python | Behave, pytest-bdd | Gherkin syntax |
| JavaScript | Cucumber.js, Jest-Cucumber | Full Cucumber support |
| Java | Cucumber-JVM, JBehave | Enterprise-ready |
| Ruby | Cucumber (original) | Created by Aslak Hellesøy |
| .NET | SpecFlow | Visual Studio integration |
Test Automation¶
Test Automation Pyramid (Extended)¶
/\
/ \ Manual Exploratory Testing
/────\ (Human intuition, edge cases)
/ \
/ Visual \ Visual Regression Tests
/ Tests \ (Screenshot comparison)
/────────────\
/ \
/ E2E Tests \ UI Automation
/ (User Journeys) \ (Playwright, Cypress)
/────────────────────\
/ \
/ API/Contract Tests \ Service Integration
/ (Consumer-driven) \ (Pact, REST Assured)
/────────────────────────────\
/ \
/ Integration Tests \ Component Integration
/ (Database, Services) \ (Testcontainers)
/────────────────────────────────────\
/ \
/ Unit Tests (Foundation) \ Pure Logic
/──────────────────────────────────────────\
Choosing What to Automate¶
| Automate | Don't Automate |
|---|---|
| Regression tests | Exploratory testing |
| Data-driven tests | Usability testing |
| Cross-browser tests | One-time tests |
| Performance tests | Tests that change frequently |
| Security scans | Tests requiring human judgment |
| CI/CD pipeline tests | Ad-hoc investigations |
Test Automation Best Practices¶
Page Object Model (POM):
# pages/login_page.py
class LoginPage:
URL = "/login"
def __init__(self, page):
self.page = page
self.email_input = page.locator("[data-testid=email]")
self.password_input = page.locator("[data-testid=password]")
self.submit_button = page.locator("[data-testid=submit]")
self.error_message = page.locator("[data-testid=error]")
def navigate(self):
self.page.goto(self.URL)
return self
def login(self, email: str, password: str):
self.email_input.fill(email)
self.password_input.fill(password)
self.submit_button.click()
return self
def get_error(self) -> str:
return self.error_message.text_content()
# tests/test_login.py
def test_invalid_login_shows_error(page):
login_page = LoginPage(page).navigate()
login_page.login("invalid@test.com", "wrongpass")
assert login_page.get_error() == "Invalid credentials"
Data-Driven Testing:
```python
import pytest

@pytest.mark.parametrize("email,password,expected_error", [
    ("", "password", "Email is required"),
    ("invalid", "password", "Invalid email format"),
    ("test@test.com", "", "Password is required"),
    ("test@test.com", "short", "Password too short"),
])
def test_login_validation(client, email, password, expected_error):
    response = client.post("/login", json={
        "email": email,
        "password": password,
    })
    assert response.status_code == 400
    assert response.json()["error"] == expected_error
```
Test Automation Frameworks by Layer¶
| Layer | Python | JavaScript | Java |
|---|---|---|---|
| Unit | pytest, unittest | Jest, Vitest, Mocha | JUnit 5, TestNG |
| Integration | pytest + requests | Supertest, axios | REST Assured, Spring Test |
| E2E (Browser) | Playwright, Selenium | Playwright, Cypress, Puppeteer | Selenium, Playwright |
| API | httpx, requests | axios, supertest | REST Assured, OkHttp |
| Mobile | Appium | Detox, Appium | Appium, Espresso |
| Performance | Locust, k6 | k6, Artillery | JMeter, Gatling |
Contract Testing¶
Contract testing verifies that services communicate according to agreed-upon contracts, without requiring full integration tests.
Consumer-Driven Contract Testing¶
┌──────────────┐ Contract ┌──────────────┐
│ Consumer │ ◄─────────────────────────► │ Provider │
│ (Frontend) │ │ (API) │
└──────────────┘ └──────────────┘
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Consumer Test│ │Provider Test │
│ (generates │ ───────────────────────► │ (verifies │
│ contract) │ Pact Broker │ contract) │
└──────────────┘ └──────────────┘
Pact Example (Python)¶
Consumer Side:
```python
# test_consumer.py
import pytest
from pact import Consumer, Provider

# UserClient is the consumer's own HTTP client under test

pact = Consumer('frontend').has_pact_with(
    Provider('user-service'),
    publish_to_broker=True,
    broker_base_url='https://broker.example.com',
)

@pytest.fixture
def pact_setup():
    pact.start_service()
    yield pact
    pact.stop_service()

def test_get_user(pact_setup):
    expected = {
        'id': 1,
        'name': 'John Doe',
        'email': 'john@example.com',
    }
    (pact_setup
     .given('a user with id 1 exists')
     .upon_receiving('a request for user 1')
     .with_request('GET', '/users/1')
     .will_respond_with(200, body=expected))

    with pact_setup:
        result = UserClient(pact_setup.uri).get_user(1)
        assert result == expected
```
Provider Side:
```python
# test_provider.py
from pact import Verifier

def test_provider():
    verifier = Verifier(
        provider='user-service',
        provider_base_url='http://localhost:8000',
    )
    # verify_with_broker pulls the consumer contracts from the Pact Broker
    success, logs = verifier.verify_with_broker(
        broker_url='https://broker.example.com',
        enable_pending=True,
        provider_states_setup_url='http://localhost:8000/_pact/provider-states',
    )
    assert success == 0  # exit code 0 means all interactions verified
```
Property-Based Testing¶
Property-based testing generates random test inputs to verify that properties hold true for all inputs, rather than specific examples.
Properties vs Examples¶
| Example-Based | Property-Based |
|---|---|
| `add(2, 3) == 5` | `add(a, b) == add(b, a)` (commutative) |
| Single test case | Hundreds of generated cases |
| Tests specific values | Tests properties of behavior |
| May miss edge cases | Discovers unexpected edge cases |
Hypothesis Example (Python)¶
```python
from hypothesis import given, strategies as st, assume
from hypothesis.stateful import RuleBasedStateMachine, rule, invariant

# Testing a sort function (my_sort is the implementation under test)
@given(st.lists(st.integers()))
def test_sort_produces_sorted_output(lst):
    result = my_sort(lst)
    assert all(result[i] <= result[i + 1] for i in range(len(result) - 1))

@given(st.lists(st.integers()))
def test_sort_preserves_elements(lst):
    result = my_sort(lst)
    assert sorted(lst) == sorted(result)
    assert len(lst) == len(result)

# Testing a stack data structure against a trusted list model
class StackMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.stack = Stack()  # implementation under test
        self.model = []       # reference model with known-good behavior

    @rule(value=st.integers())
    def push(self, value):
        self.stack.push(value)
        self.model.append(value)

    @rule()
    def pop(self):
        assume(len(self.model) > 0)
        actual = self.stack.pop()
        expected = self.model.pop()
        assert actual == expected

    @invariant()
    def size_matches(self):
        assert len(self.stack) == len(self.model)

TestStack = StackMachine.TestCase
```
When to Use Property-Based Testing¶
| Good Fit | Poor Fit |
|---|---|
| Pure functions | UI interactions |
| Data transformations | External service calls |
| Serialization/deserialization | Time-dependent behavior |
| Mathematical operations | Database operations |
| Parsers and validators | Non-deterministic systems |
Mutation Testing¶
Mutation testing measures test quality by introducing small changes (mutations) to the code and checking if tests detect them.
How Mutation Testing Works¶
┌─────────────┐
│ Source Code │
└──────┬──────┘
│
▼
┌─────────────┐ ┌─────────────────────────────────┐
│ Mutator │────►│ Mutants (modified code) │
│ Engine │ │ - Mutant 1: change > to >= │
└─────────────┘ │ - Mutant 2: change + to - │
│ - Mutant 3: remove statement │
└──────────────┬──────────────────┘
│
▼
┌─────────────────────────────────┐
│ Run Tests Against Mutants │
└──────────────┬──────────────────┘
│
┌───────────────────────────┴───────────────────────────┐
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ Killed │ Test failed (mutation detected) │ Survived │
│ Mutant │ ✓ Good - tests are effective │ Mutant │
└─────────────┘ └─────────────┘
Tests passed (mutation undetected)
✗ Bad - tests need improvement
Mutation Score¶
Mutation Score = (Killed Mutants / Total Mutants) × 100%
| Score | Interpretation |
|---|---|
| > 80% | Good test quality |
| 60-80% | Adequate, room for improvement |
| < 60% | Tests need significant improvement |
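To make the score concrete, here is a small illustrative sketch (not actual tool output) of the boundary mutant from the diagram above: a weak assertion set lets the `>=` to `>` mutant survive, while a boundary-value test kills it.

```python
def is_adult(age: int) -> bool:
    return age >= 18

def mutant_is_adult(age: int) -> bool:
    # Simulated mutant: ">=" changed to ">"
    return age > 18

def weak_suite(fn) -> bool:
    # Never exercises the boundary value 18, so it passes against
    # both versions: the mutant SURVIVES.
    return fn(30) is True and fn(5) is False

def strong_suite(fn) -> bool:
    # The boundary case distinguishes original from mutant: mutant KILLED.
    return fn(18) is True and fn(17) is False
```

Here `weak_suite` kills 0 of 1 mutants (score 0%), while `strong_suite` kills 1 of 1 (score 100%).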
Mutation Testing Tools¶
| Language | Tool |
|---|---|
| Python | mutmut, Cosmic Ray |
| JavaScript | Stryker |
| Java | PIT (Pitest) |
| C#/.NET | Stryker.NET |
| Go | go-mutesting |
Example (Python/mutmut)¶
```shell
# Run mutation testing
mutmut run

# View results
mutmut results

# Show survived mutants (need better tests)
mutmut show 42
```
Chaos Engineering and Resilience Testing¶
Chaos Engineering is the discipline of experimenting on a system to build confidence in its capability to withstand turbulent conditions.
Principles of Chaos Engineering¶
- Build a hypothesis around steady state behavior
- Vary real-world events (server failures, network issues)
- Run experiments in production (carefully)
- Automate experiments to run continuously
- Minimize blast radius with controlled experiments
Types of Chaos Experiments¶
| Experiment | What It Tests | Tools |
|---|---|---|
| Kill service | Service discovery, failover | Chaos Monkey |
| Network latency | Timeouts, circuit breakers | Toxiproxy, tc |
| Network partition | Split-brain handling | iptables, Chaos Mesh |
| CPU/Memory stress | Resource limits, autoscaling | stress-ng |
| Clock skew | Time-dependent logic | libfaketime |
| Disk full | Error handling, cleanup | fallocate |
| DNS failure | Fallback mechanisms | Chaos Mesh |
Chaos Engineering Tools¶
| Tool | Platform | Features |
|---|---|---|
| Chaos Monkey | AWS | Random instance termination |
| Gremlin | Multi-cloud | Managed chaos platform |
| Chaos Mesh | Kubernetes | K8s-native chaos |
| LitmusChaos | Kubernetes | Open-source K8s chaos |
| Toxiproxy | Any | Network condition simulation |
| Pumba | Docker | Container chaos |
Chaos Experiment Example¶
```yaml
# chaos-mesh experiment: network delay
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: payment-service-delay
spec:
  action: delay
  mode: one
  selector:
    namespaces:
      - production
    labelSelectors:
      app: payment-service
  delay:
    latency: "500ms"
    correlation: "100"
    jitter: "100ms"
  duration: "5m"
  scheduler:
    cron: "@every 1h"
```
Test Metrics and Coverage¶
Code Coverage Types¶
| Type | Measures | Example Target |
|---|---|---|
| Line Coverage | % of lines executed | 80%+ |
| Branch Coverage | % of branches taken | 75%+ |
| Function Coverage | % of functions called | 90%+ |
| Statement Coverage | % of statements executed | 80%+ |
| Condition Coverage | % of boolean conditions | 70%+ |
| Path Coverage | % of execution paths | 50%+ (often impractical for 100%) |
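The gap between line and branch coverage is easiest to see on a small example (illustrative, not from a real suite): one test can execute every line of a function while still leaving a branch untested.

```python
def apply_discount(price: float, is_member: bool) -> float:
    if is_member:
        price = price * 0.9  # member discount
    return price

# This one test achieves 100% line coverage: every line above runs.
# Branch coverage is only 50%: the if-False path is never taken,
# so a bug in the non-member path would go unnoticed.
def test_member_discount():
    assert apply_discount(100.0, True) == 90.0
```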
Coverage Metrics Interpretation¶
High Coverage
│
┌────────────────┴────────────────┐
│ │
High Quality Low Quality
Tests Tests
│ │
"Real confidence" "False confidence"
Tests verify behavior Tests may just execute
without meaningful assertions
Beyond Code Coverage¶
| Metric | Description | Formula |
|---|---|---|
| Mutation Score | % of mutants killed | Killed / Total |
| Defect Density | Bugs per KLOC | Bugs / (LOC / 1000) |
| Test-to-Code Ratio | Test code vs production | Test LOC / Prod LOC |
| Flaky Test Rate | % of inconsistent tests | Flaky / Total Tests |
| Test Execution Time | Total suite duration | Sum of test durations |
| MTTR (Tests) | Mean time to fix failures | Avg fix time |
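The formulas above translate directly into code; a quick sketch:

```python
def mutation_score(killed: int, total: int) -> float:
    """Percentage of mutants killed by the test suite."""
    return killed / total * 100

def defect_density(bugs: int, loc: int) -> float:
    """Bugs per thousand lines of code (KLOC)."""
    return bugs / (loc / 1000)

def flaky_test_rate(flaky: int, total_tests: int) -> float:
    """Percentage of tests that fail intermittently."""
    return flaky / total_tests * 100
```

For example, 80 killed of 100 mutants gives a mutation score of 80%, and 15 bugs in 30,000 LOC gives a defect density of 0.5 bugs/KLOC.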
Quality Gates¶
Example CI/CD Quality Gate:
```yaml
# .github/workflows/test.yml
- name: Run Tests with Coverage
  run: pytest --cov=app --cov-fail-under=80

- name: Check Mutation Score
  run: |
    mutmut run --CI
    SCORE=$(mutmut results | grep "score" | awk '{print $2}')
    if [ "$SCORE" -lt "70" ]; then exit 1; fi

- name: SonarQube Analysis
  uses: sonarsource/sonarqube-scan-action@v2
  with:
    args: |
      -Dsonar.qualitygate.wait=true
```
Testing in CI/CD Pipelines¶
Test Strategy by Pipeline Stage¶
┌─────────────────────────────────────────────────────────────────────────┐
│ CI/CD Pipeline │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Commit ───► Build ───► Test ───► Deploy ───► Release │
│ │ │ │ │ │ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌──────┐ ┌──────┐ ┌──────────┐ ┌──────┐ ┌──────────┐ │
│ │Lint │ │Unit │ │Integration│ │Smoke │ │Canary │ │
│ │Pre- │ │Tests │ │Tests │ │Tests │ │Tests │ │
│ │commit│ │ │ │ │ │ │ │ │ │
│ └──────┘ │SAST │ │Contract │ │E2E │ │Synthetic │ │
│ │ │ │Tests │ │Tests │ │Monitoring│ │
│ └──────┘ │ │ │ │ │ │ │
│ │DAST │ │Perf │ │A/B Tests │ │
│ └──────────┘ └──────┘ └──────────┘ │
│ │
│ Speed: ◄────────────────────────────────────────────────────► Slower │
│ Faster │
│ │
│ Confidence: ◄───────────────────────────────────────────────► Higher │
│ Lower │
└─────────────────────────────────────────────────────────────────────────┘
Parallel Test Execution¶
```yaml
# GitHub Actions: parallel test jobs
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/unit -n auto  # requires pytest-xdist

  integration-tests:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
    steps:
      - uses: actions/checkout@v4
      - run: pytest tests/integration

  e2e-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        browser: [chromium, firefox, webkit]
    steps:
      - uses: actions/checkout@v4
      - run: playwright install ${{ matrix.browser }}
      - run: pytest tests/e2e --browser ${{ matrix.browser }}

  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: semgrep scan --config auto
```
Handling Flaky Tests¶
Strategies:
| Strategy | Implementation |
|---|---|
| Quarantine | Mark flaky tests, track separately |
| Retry mechanism | Retry failed tests (with limits) |
| Root cause analysis | Investigate and fix underlying issues |
| Test isolation | Ensure tests don't share state |
| Deterministic data | Use fixed test data, avoid randomness |
| Timing controls | Use explicit waits, avoid sleeps |
```ini
# pytest.ini (retries require the pytest-rerunfailures plugin)
[pytest]
addopts = --reruns 2 --reruns-delay 1
```

```python
# Mark an individual test as flaky
@pytest.mark.flaky(reruns=3)
def test_external_api_integration():
    ...
```
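For the "timing controls" strategy above, a polling wait is the usual replacement for fixed sleeps: it returns as soon as the condition holds and fails loudly on timeout. A minimal sketch; `wait_until` is an illustrative helper, not a pytest API:

```python
import time

def wait_until(condition, timeout: float = 5.0, interval: float = 0.1):
    """Poll `condition` until it returns a truthy value or the timeout expires.

    Unlike a fixed sleep, this returns as soon as the condition holds,
    keeping tests both fast and deterministic.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"Condition not met within {timeout}s")
```

A call like `wait_until(lambda: job.status == "done", timeout=10)` (with a hypothetical `job` object) stays fast on the happy path and gives a clear failure on the slow one.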
Mobile Testing¶
Mobile Testing Types¶
| Type | Focus | Tools |
|---|---|---|
| Unit | Business logic | XCTest (iOS), JUnit (Android) |
| UI Automation | User interactions | Appium, Detox, XCUITest, Espresso |
| Performance | Speed, memory, battery | Instruments, Android Profiler |
| Compatibility | Device/OS variations | BrowserStack, Sauce Labs |
| Network | Offline, slow connections | Charles Proxy, Network Link Conditioner |
Mobile Test Automation with Appium¶
```python
from appium import webdriver
from appium.webdriver.common.appiumby import AppiumBy

def test_login_flow():
    capabilities = {
        "platformName": "Android",
        "platformVersion": "13",
        "deviceName": "Pixel 7",
        "app": "/path/to/app.apk",
        "automationName": "UiAutomator2",
    }
    driver = webdriver.Remote("http://localhost:4723", capabilities)
    try:
        # Find and interact with elements
        email_field = driver.find_element(AppiumBy.ACCESSIBILITY_ID, "email-input")
        email_field.send_keys("test@example.com")

        password_field = driver.find_element(AppiumBy.ACCESSIBILITY_ID, "password-input")
        password_field.send_keys("password123")

        login_button = driver.find_element(AppiumBy.ACCESSIBILITY_ID, "login-button")
        login_button.click()

        # Verify navigation to dashboard
        dashboard = driver.find_element(AppiumBy.ACCESSIBILITY_ID, "dashboard-title")
        assert dashboard.is_displayed()
    finally:
        driver.quit()
```
API Testing¶
REST API Testing¶
Key Areas to Test:
| Area | Tests |
|---|---|
| Status Codes | Correct codes for success/failure |
| Response Body | Schema validation, correct data |
| Headers | Content-Type, Authorization, CORS |
| Error Handling | Proper error messages, codes |
| Performance | Response time, throughput |
| Security | Authentication, authorization, injection |
Comprehensive API Test Example:
```python
import pytest
from httpx import AsyncClient
from jsonschema import validate

USER_SCHEMA = {
    "type": "object",
    "required": ["id", "email", "name", "created_at"],
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string", "format": "email"},
        "name": {"type": "string", "minLength": 1},
        "created_at": {"type": "string", "format": "date-time"},
    },
}

@pytest.mark.asyncio
class TestUserAPI:
    async def test_create_user_returns_201(self, client: AsyncClient):
        response = await client.post("/api/users", json={
            "email": "new@example.com",
            "name": "New User",
            "password": "SecurePass123!",
        })
        assert response.status_code == 201
        assert "Location" in response.headers
        data = response.json()
        validate(data, USER_SCHEMA)
        assert data["email"] == "new@example.com"

    async def test_create_user_duplicate_email_returns_409(self, client: AsyncClient, existing_user):
        response = await client.post("/api/users", json={
            "email": existing_user.email,  # duplicate
            "name": "Another User",
            "password": "SecurePass123!",
        })
        assert response.status_code == 409
        assert response.json()["error"] == "Email already exists"

    async def test_get_user_requires_authentication(self, client: AsyncClient):
        response = await client.get("/api/users/1")
        assert response.status_code == 401
        assert response.json()["error"] == "Authentication required"

    async def test_get_user_returns_404_for_nonexistent(self, authenticated_client: AsyncClient):
        response = await authenticated_client.get("/api/users/99999")
        assert response.status_code == 404

    @pytest.mark.parametrize("invalid_data,expected_error", [
        ({"email": "invalid", "name": "Test", "password": "Pass123!"}, "Invalid email format"),
        ({"email": "test@test.com", "name": "", "password": "Pass123!"}, "Name is required"),
        ({"email": "test@test.com", "name": "Test", "password": "short"}, "Password too weak"),
    ])
    async def test_create_user_validation(self, client: AsyncClient, invalid_data, expected_error):
        response = await client.post("/api/users", json=invalid_data)
        assert response.status_code == 400
        assert expected_error in response.json()["error"]
```
GraphQL Testing¶
```python
import pytest
from httpx import AsyncClient

@pytest.mark.asyncio
class TestGraphQLAPI:
    async def test_query_user(self, client: AsyncClient, test_user):
        query = """
            query GetUser($id: ID!) {
                user(id: $id) {
                    id
                    email
                    name
                    posts {
                        title
                    }
                }
            }
        """
        response = await client.post("/graphql", json={
            "query": query,
            "variables": {"id": str(test_user.id)},
        })
        assert response.status_code == 200
        data = response.json()
        assert "errors" not in data
        assert data["data"]["user"]["email"] == test_user.email

    async def test_mutation_create_post(self, authenticated_client: AsyncClient):
        mutation = """
            mutation CreatePost($input: CreatePostInput!) {
                createPost(input: $input) {
                    id
                    title
                    content
                    author {
                        id
                    }
                }
            }
        """
        response = await authenticated_client.post("/graphql", json={
            "query": mutation,
            "variables": {
                "input": {
                    "title": "Test Post",
                    "content": "Test content",
                },
            },
        })
        assert response.status_code == 200
        data = response.json()
        assert "errors" not in data
        assert data["data"]["createPost"]["title"] == "Test Post"
```
Test Management and Documentation¶
Test Plan Components¶
| Section | Contents |
|---|---|
| Overview | Scope, objectives, approach |
| Test Items | Features to be tested |
| Features Not Tested | Explicit exclusions |
| Test Strategy | Types, levels, techniques |
| Entry/Exit Criteria | When to start/stop |
| Test Environment | Hardware, software, data |
| Schedule | Milestones, deadlines |
| Resources | Team, tools, budget |
| Risks | Identified risks, mitigations |
| Deliverables | Reports, artifacts |
Test Case Documentation¶
Test Case Template:
## TC-001: User Login with Valid Credentials
**Priority:** High
**Type:** Functional
**Automated:** Yes
### Preconditions
- User account exists with email "test@example.com"
- User is not currently logged in
### Test Data
- Email: test@example.com
- Password: ValidPass123!
### Steps
| # | Action | Expected Result |
|---|--------|-----------------|
| 1 | Navigate to login page | Login form is displayed |
| 2 | Enter email in email field | Email is accepted |
| 3 | Enter password in password field | Password is masked |
| 4 | Click "Login" button | Form is submitted |
| 5 | Observe result | User is redirected to dashboard |
### Expected Result
- User sees dashboard
- Welcome message displays user's name
- Session is created
### Postconditions
- User is authenticated
- Session token is stored
Bug Report Template¶
## BUG-123: Login fails with special characters in password
**Severity:** High
**Priority:** P1
**Status:** Open
**Environment:** Production, Chrome 120, macOS
### Description
Users cannot login when their password contains certain special characters (< > &).
### Steps to Reproduce
1. Create user with password "Test<>Pass&123"
2. Attempt to login with those credentials
3. Observe error
### Expected Behavior
User should be able to login successfully.
### Actual Behavior
Error message: "Invalid credentials" even with correct password.
### Root Cause Analysis
Password field is HTML-encoding special characters before validation.
### Screenshots/Logs
[Attach relevant screenshots and log snippets]
### Workaround
Users can reset password to one without < > & characters.
Testing Anti-Patterns¶
Common Anti-Patterns and Solutions¶
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Ice Cream Cone | Too many E2E, few unit tests | Follow test pyramid |
| Testing Implementation | Tests break on refactoring | Test behavior, not code |
| Flaky Tests | Intermittent failures ignored | Fix or quarantine immediately |
| Slow Test Suite | Long feedback cycles | Parallelize, optimize, split |
| Test Data Coupling | Tests depend on shared data | Isolate test data |
| No Assertions | Tests execute but don't verify | Require meaningful assertions |
| Testing Third-Party | Testing framework/library code | Trust vendors, test integration |
| Manual Testing Addiction | Everything tested manually | Automate regression tests |
| 100% Coverage Goal | Coverage over quality | Focus on critical paths |
| Copy-Paste Tests | Duplicated test code | Use fixtures, parameterization |
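For the "copy-paste tests" row, pytest fixtures give duplicated setup code a single home. A minimal sketch with illustrative names:

```python
import pytest

def make_user(email="test@example.com", name="Test User"):
    """One place to build test data, instead of copy-pasting dicts into each test."""
    return {"email": email, "name": name}

@pytest.fixture
def user():
    # Each test gets a fresh, isolated user object
    return make_user()

def test_user_has_email(user):
    assert "@" in user["email"]

def test_user_has_name(user):
    assert user["name"]
```

When the shape of the test data changes, only `make_user` needs updating rather than every test.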
The Testing Trophy (Kent C. Dodds)¶
An alternative to the test pyramid, emphasizing integration tests:
/\
/ \ Static Analysis (TypeScript, ESLint)
/────\
/ \
/ E2E \
/ Tests \
/────────────\
/ \
/ Integration \
/ Tests \ ← Emphasis here
/────────────────────\
/ \
/ Unit Tests \
/──────────────────────────\
Modern Testing Trends (2025)¶
AI-Assisted Testing¶
| Application | Tools/Techniques |
|---|---|
| Test Generation | AI-generated test cases from code/specs |
| Visual Testing | AI-powered visual regression (Percy, Applitools) |
| Maintenance | Auto-healing locators, self-updating tests |
| Analysis | AI-driven test prioritization, impact analysis |
| Exploration | AI-guided exploratory testing |
Shift-Left and Shift-Right Testing¶
Traditional Testing
│
◄──────────────┴──────────────►
Shift-Left Shift-Right
(Earlier) (Later)
│ │
▼ ▼
┌─────────────┐ ┌─────────────────┐
│ Requirements│ │ Production │
│ - Reviews │ │ - Monitoring │
│ - Static │ │ - A/B Testing │
│ analysis │ │ - Chaos Eng │
│ - TDD │ │ - Observability │
└─────────────┘ └─────────────────┘
Testing in Production¶
| Technique | Description |
|---|---|
| Feature Flags | Test features with subset of users |
| Canary Releases | Gradual rollout with monitoring |
| A/B Testing | Compare versions with real users |
| Synthetic Monitoring | Automated tests against production |
| Dark Launching | Deploy without enabling |
| Traffic Shadowing | Mirror production traffic to test |
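Percentage-based techniques such as feature flags and canary releases usually rely on deterministic bucketing, so a given user stays in or out of the rollout across requests. A minimal, hypothetical sketch:

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically place a user in a 0-100% rollout bucket.

    Hashing (feature, user) keeps assignment stable across requests
    and independent between features.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent
```

Because the bucket is derived from a hash rather than `random()`, raising `percent` from 10 to 20 only adds users; nobody already in the rollout drops out.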
Test Observability¶
```python
# Example: OpenTelemetry test instrumentation
from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer("test-suite")

def test_user_registration():
    with tracer.start_as_current_span("test_user_registration") as span:
        span.set_attribute("test.type", "integration")
        span.set_attribute("test.priority", "high")
        try:
            # Test implementation
            user = create_user(...)
            assert user.is_active
            span.set_status(Status(StatusCode.OK))
            span.set_attribute("test.result", "passed")
        except AssertionError as e:
            span.set_status(Status(StatusCode.ERROR, str(e)))
            span.set_attribute("test.result", "failed")
            raise
```
Testing Tools Reference¶
By Category¶
| Category | Tools |
|---|---|
| Unit Testing | pytest, Jest, JUnit, NUnit, Go test |
| Integration | Testcontainers, WireMock, LocalStack |
| E2E Browser | Playwright, Cypress, Selenium, Puppeteer |
| API Testing | Postman, Insomnia, httpx, REST Assured |
| Performance | k6, Locust, JMeter, Gatling, Artillery |
| Security | OWASP ZAP, Burp Suite, Snyk, Semgrep |
| Contract | Pact, Spring Cloud Contract |
| Mocking | WireMock, MockServer, Responses, httpretty |
| Coverage | Coverage.py, Istanbul, JaCoCo |
| Mutation | mutmut, Stryker, PIT |
| Visual | Percy, Applitools, Chromatic |
| Accessibility | axe-core, Pa11y, WAVE |
| Mobile | Appium, Detox, XCUITest, Espresso |
| Chaos | Chaos Mesh, Gremlin, LitmusChaos |
Tool Selection Criteria¶
| Factor | Considerations |
|---|---|
| Language Support | Native or binding availability |
| CI/CD Integration | Pipeline compatibility |
| Learning Curve | Team expertise, documentation |
| Community | Active development, support |
| Performance | Execution speed, resource usage |
| Reporting | Output formats, integrations |
| Cost | Open source vs commercial |
| Maintenance | Update frequency, stability |
Conclusion¶
Effective software testing is a multifaceted discipline that requires:
- Strategic Thinking: Choose the right tests for the right level
- Technical Skills: Master automation frameworks and tools
- Process Integration: Embed testing throughout the SDLC
- Continuous Improvement: Evolve tests with the codebase
- Quality Culture: Testing is everyone's responsibility
Key Takeaways¶
| Principle | Action |
|---|---|
| Shift Left | Start testing early in development |
| Automate Wisely | Automate regression, not exploration |
| Test Pyramid | More unit tests, fewer E2E tests |
| Coverage vs Quality | Meaningful tests over metrics |
| Fast Feedback | Quick test execution in CI/CD |
| Fix Flaky Tests | Don't ignore intermittent failures |
| Production Testing | Monitor and test in production |
Ultimately, testing is not just about finding bugs; it is about building confidence that your software delivers value to users reliably and consistently. The best testing strategy is one that evolves with your team, technology, and product needs.