Schema

Standards Security Notes Jan 9, 2026 JSON
schema json-schema validation data-modeling api-design

Definition

A schema is a blueprint that defines what data looks like - its structure, types, required fields, validation rules, and constraints. In API contexts, schemas are typically written in JSON Schema, a vocabulary that lets you annotate and validate JSON documents. Think of a schema as a contract: it tells API clients exactly what shape the data must have for requests to succeed, and what shape to expect in responses.

Schemas are both human-readable documentation and machine-executable validation logic. A developer can read a schema to understand that a “user” object requires an email field (string, email format) and an age field (integer, 18-120). Tools can automatically validate incoming data against that same schema, rejecting invalid requests before they reach business logic.

In OpenAPI, schemas define request bodies, response bodies, parameters, and reusable data models. In databases, schemas define table structures. In GraphQL, schemas define the types, queries, and mutations an API exposes. Everywhere you find structured data, you’ll find schemas describing it.

Example

E-commerce Product Schema: An online store’s API defines a Product schema requiring fields like SKU (string, pattern: ^[A-Z0-9-]+$), price (number, minimum: 0.01), stock (integer, minimum: 0), and optional fields like description. This schema validates products from suppliers before importing them into the catalog.

User Registration Schema: A social network defines a schema for user signup requiring email (format: email), password (minLength: 12, pattern: must include uppercase, lowercase, number, symbol), username (pattern: alphanumeric, 3-20 chars), and optional fields like bio (maxLength: 500). Invalid registrations are rejected before database writes.
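The password rule in this example can be written as a single pattern with one lookahead per character class. The following is a minimal sketch: the schema object, the exact character classes, and the choice to treat any non-alphanumeric character as a symbol are assumptions made for illustration.

```javascript
// Hypothetical schema for the signup example; property names mirror the prose.
const registrationSchema = {
  type: "object",
  required: ["email", "password", "username"],
  properties: {
    email: { type: "string", format: "email" },
    password: {
      type: "string",
      minLength: 12,
      // One lookahead per required class: lowercase, uppercase, digit, symbol.
      pattern: "^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[^A-Za-z0-9]).*$"
    },
    username: { type: "string", pattern: "^[a-zA-Z0-9]{3,20}$" },
    bio: { type: "string", maxLength: 500 }
  }
};

// Quick check of the password rule outside a validator.
// Note: String.length counts UTF-16 code units, while minLength counts characters.
const passwordPattern = new RegExp(registrationSchema.properties.password.pattern);
function passwordOk(value) {
  return value.length >= registrationSchema.properties.password.minLength
    && passwordPattern.test(value);
}

console.log(passwordOk("Correct-Horse-42")); // true
console.log(passwordOk("alllowercase123"));  // false (no uppercase, no symbol)
```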

Payment Processing Schema: A payment API in the style of Stripe’s validates credit card data with a schema - cardNumber (string, pattern: 13-19 digits), expiryDate (pattern: MM/YY), CVV (pattern: 3-4 digits). These field names are illustrative rather than Stripe’s actual parameter names. The schema uses conditional validation: if paymentMethod is “card”, these fields are required; if “bank_transfer”, different fields apply.
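Conditional validation of this kind is usually expressed with the JSON Schema keywords if, then, and else. The sketch below shows one possible shape. The bank-transfer fields (iban, accountHolder) are invented for illustration, and missingFields is a deliberately tiny stand-in for a real validator that only understands this one schema shape.

```javascript
// Hypothetical payment schema; iban and accountHolder are invented field names.
const paymentSchema = {
  type: "object",
  required: ["paymentMethod"],
  properties: {
    paymentMethod: { enum: ["card", "bank_transfer"] },
    cardNumber: { type: "string", pattern: "^[0-9]{13,19}$" },
    expiryDate: { type: "string", pattern: "^(0[1-9]|1[0-2])/[0-9]{2}$" },
    cvv: { type: "string", pattern: "^[0-9]{3,4}$" },
    iban: { type: "string" },
    accountHolder: { type: "string" }
  },
  // When `if` matches, `then` applies; otherwise `else` applies.
  if: { properties: { paymentMethod: { const: "card" } }, required: ["paymentMethod"] },
  then: { required: ["cardNumber", "expiryDate", "cvv"] },
  else: { required: ["iban", "accountHolder"] }
};

// Stand-in for a validator: handles only the `const` + `required` shape above,
// to show which branch a payload falls into and what it is missing.
function missingFields(schema, data) {
  const [[key, rule]] = Object.entries(schema.if.properties);
  const branch = data[key] === rule.const ? schema.then : schema.else;
  return branch.required.filter((field) => !(field in data));
}

console.log(missingFields(paymentSchema, { paymentMethod: "card", cardNumber: "4242424242424242" }));
// [ 'expiryDate', 'cvv' ]
console.log(missingFields(paymentSchema, { paymentMethod: "bank_transfer", iban: "DE00", accountHolder: "A. Example" }));
// []
```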

IoT Sensor Data Schema: A smart home platform defines schemas for sensor readings. Temperature sensors must send readings as floats between -50.0 and 150.0 with a timestamp (ISO 8601 format) and sensorId (UUID). Invalid readings are flagged before storage.
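To make the sensor rules concrete, here is a hand-written check that mirrors them. It is not a JSON Schema validator. The property names (temperature, timestamp, sensorId) and the narrow UTC-only timestamp profile are assumptions; in a schema the same rules would map to minimum/maximum, format: date-time, and format: uuid.

```javascript
// Hypothetical field names based on the sensor example in the prose.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
// Narrow ISO 8601 profile: UTC date-time such as 2026-01-09T12:00:00Z.
const ISO_UTC_RE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// Returns a list of problems; an empty list means the reading can be stored.
function checkReading(reading) {
  const problems = [];
  const t = reading.temperature;
  if (typeof t !== "number" || !Number.isFinite(t) || t < -50 || t > 150) {
    problems.push("temperature must be a number between -50.0 and 150.0");
  }
  const ts = reading.timestamp;
  if (typeof ts !== "string" || !ISO_UTC_RE.test(ts) || Number.isNaN(Date.parse(ts))) {
    problems.push("timestamp must be an ISO 8601 UTC date-time");
  }
  if (typeof reading.sensorId !== "string" || !UUID_RE.test(reading.sensorId)) {
    problems.push("sensorId must be a UUID");
  }
  return problems;
}

const flagged = checkReading({ temperature: 200, timestamp: "yesterday", sensorId: "abc" });
console.log(flagged.length); // 3
```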

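Configuration File Schema: Docker Compose files are described by a published JSON Schema. It defines services (an object keyed by service name, each with specific properties) and the optional top-level networks and volumes. The top-level version field, which the legacy v2 and v3 file formats required, is obsolete in the current Compose Specification. IDEs use this schema to provide autocomplete and validation while editing docker-compose.yml files.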

Analogy

The Building Code: Just as building codes specify that electrical wiring must be 14-gauge copper, run through conduits, and terminate at properly grounded outlets, a schema specifies that an “address” field must be a string, max 200 characters, and match a pattern that looks like a real address. Inspectors (validators) check buildings against the code just like API gateways check requests against schemas.

The Recipe Card: A recipe schema defines ingredients (array of objects with name, quantity, unit), instructions (array of strings), prepTime (integer, minutes), servings (integer, positive). Just as you can’t bake a cake without flour (required field), you can’t submit a recipe without ingredients. The schema ensures every recipe follows the same structure.

The Job Application Form: A schema is like a standardized job application form - it specifies which fields are required (name, email), which are optional (cover letter), what formats are acceptable (phone number must be 10 digits), and what values are valid (graduation year must be between 1950 and current year). Applications that don’t follow the form get rejected immediately.

Code Example


{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/user.json",
  "title": "User",
  "description": "A registered user in the system",
  "type": "object",
  "required": ["id", "email", "username"],
  "properties": {
    "id": {
      "type": "string",
      "format": "uuid",
      "description": "Unique identifier for the user"
    },
    "email": {
      "type": "string",
      "format": "email",
      "description": "User's email address"
    },
    "username": {
      "type": "string",
      "pattern": "^[a-zA-Z0-9_]{3,20}$",
      "description": "Username (3-20 alphanumeric characters or underscores)"
    },
    "age": {
      "type": "integer",
      "minimum": 18,
      "maximum": 120,
      "description": "User's age"
    },
    "role": {
      "type": "string",
      "enum": ["admin", "moderator", "user"],
      "default": "user"
    },
    "preferences": {
      "type": "object",
      "properties": {
        "newsletter": {
          "type": "boolean",
          "default": false
        },
        "theme": {
          "type": "string",
          "enum": ["light", "dark", "auto"],
          "default": "auto"
        }
      }
    },
    "createdAt": {
      "type": "string",
      "format": "date-time",
      "description": "Account creation timestamp"
    }
  },
  "additionalProperties": false
}

Validating data against schema (JavaScript):


const Ajv = require("ajv");
const addFormats = require("ajv-formats");

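// allErrors: true collects every violation; by default Ajv stops at the first one.
const ajv = new Ajv({ allErrors: true });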
addFormats(ajv);

const schema = {
  type: "object",
  required: ["email", "username"],
  properties: {
    email: { type: "string", format: "email" },
    username: {
      type: "string",
      pattern: "^[a-zA-Z0-9_]{3,20}$"
    },
    age: {
      type: "integer",
      minimum: 18,
      maximum: 120
    }
  }
};

const validate = ajv.compile(schema);

// Valid data
const validUser = {
  email: "alice@example.com",
  username: "alice_123",
  age: 25
};
console.log(validate(validUser)); // true

// Invalid data
const invalidUser = {
  email: "not-an-email",
  username: "ab", // too short
  age: 15 // too young
};
console.log(validate(invalidUser)); // false
console.log(validate.errors);
// Abbreviated: each Ajv error object also includes keyword, schemaPath, and params.
// [
//   { instancePath: '/email', message: 'must match format "email"' },
//   { instancePath: '/username', message: 'must match pattern ...' },
//   { instancePath: '/age', message: 'must be >= 18' }
// ]

Diagram

graph TB
    subgraph "Schema Validation Flow"
        CLIENT[Client] -->|Sends Data| GATE[API Gateway]
        GATE -->|Validates| SCHEMA[JSON Schema]

        SCHEMA -->|Valid| HANDLER[Request Handler]
        SCHEMA -->|Invalid| ERROR[400 Bad Request<br/>Validation Errors]

        HANDLER -->|Response| SCHEMA2[Response Schema]
        SCHEMA2 -->|Validates| RESP[Send Response]
        SCHEMA2 -->|Invalid| LOG[Log Schema Violation<br/>Alert Developers]
    end

    subgraph "Schema Definition"
        DEF[Schema File]
        DEF --> TYPES[Type Definitions<br/>string, number, object]
        DEF --> RULES[Validation Rules<br/>min, max, pattern]
        DEF --> REQ[Required Fields]
        DEF --> FORMAT[Formats<br/>email, uuid, date-time]
    end

    SCHEMA -.References.- DEF
    SCHEMA2 -.References.- DEF

    style SCHEMA fill:#90EE90
    style ERROR fill:#FF6B6B
    style HANDLER fill:#87CEEB

Security Notes

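CRITICAL: API schemas define the contract between client and server. Validate every request against the schema so that malformed or malicious input is rejected before it reaches business logic.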

Schema Types:

  • JSON Schema: Define JSON structure and constraints
  • OpenAPI: REST API specification (formerly Swagger)
  • GraphQL Schema: Type definitions for GraphQL APIs
  • WSDL: SOAP web service definition
  • Protobuf: Binary serialization format with message types defined in .proto schema files

Schema Validation:

  • Enforce validation: Validate all requests against schema
  • Reject invalid: Return 400 Bad Request for schema violations
  • Type checking: Validate data types (string, number, boolean)
  • Format validation: Validate formats (email, URI, date-time)
  • Constraint checking: Validate min/max length, ranges, patterns

Security Benefits:

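  • Input validation: Schema validation narrows what input reaches your code, reducing exposure to injection attacks (it complements parameterized queries and output encoding rather than replacing them)
  • Type safety: Enforced types prevent type confusion attacks
  • Denial of service: Limit payload size, string lengths, and collection sizes (maxLength, maxItems, maxProperties) to prevent resource exhaustion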
  • Data integrity: Ensure data meets requirements
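
The resource-exhaustion point can be sketched in two layers: a byte limit applied before parsing, and size keywords in the schema itself. The limits, the comment schema, and the parseBounded helper below are hypothetical and should be tuned to real payloads.

```javascript
// Hypothetical limit; tune it to the API's real payloads.
const MAX_BODY_BYTES = 64 * 1024;

// Size keywords bound what a validator has to walk once the body is parsed.
const commentSchema = {
  type: "object",
  additionalProperties: false,
  maxProperties: 3,
  properties: {
    text: { type: "string", maxLength: 2000 },
    tags: { type: "array", maxItems: 10, items: { type: "string", maxLength: 32 } }
  }
};

// Reject oversized bodies before JSON.parse, so neither the parser nor the
// validator ever sees them.
function parseBounded(rawBody) {
  const size = Buffer.byteLength(rawBody, "utf8");
  if (size > MAX_BODY_BYTES) {
    return { ok: false, status: 413, error: `body is ${size} bytes, limit is ${MAX_BODY_BYTES}` };
  }
  try {
    return { ok: true, value: JSON.parse(rawBody) };
  } catch (err) {
    return { ok: false, status: 400, error: "body is not valid JSON" };
  }
}

console.log(parseBounded('{"text":"hi"}').ok);         // true
console.log(parseBounded("x".repeat(70000)).status);   // 413
```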

Schema Evolution:

  • Backward compatible: New schema versions should accept old data
  • Forward compatible: Old clients should ignore new fields
  • Versioning: Version schema alongside API
  • Deprecation: Deprecate fields before removal
  • Migration path: Provide migration tools for major changes
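
One way to enforce backward compatibility is to compare schema versions in CI and flag changes that would reject data the old version accepted. The sketch below inspects only required, properties, type, and additionalProperties; findBreakingChanges is a hypothetical helper, not a complete compatibility checker.

```javascript
// Lists changes in newSchema that would reject data oldSchema accepted.
// Only top-level `required`, `properties`, and `type` are inspected.
function findBreakingChanges(oldSchema, newSchema) {
  const issues = [];
  const oldRequired = new Set(oldSchema.required || []);
  for (const field of newSchema.required || []) {
    if (!oldRequired.has(field)) issues.push(`"${field}" became required`);
  }
  for (const [name, oldProp] of Object.entries(oldSchema.properties || {})) {
    const newProp = (newSchema.properties || {})[name];
    if (!newProp) {
      // A removed property only breaks old data if unknown fields are rejected.
      if (newSchema.additionalProperties === false) {
        issues.push(`"${name}" was removed and is now rejected`);
      }
    } else if (oldProp.type !== newProp.type) {
      issues.push(`"${name}" changed type from ${oldProp.type} to ${newProp.type}`);
    }
  }
  return issues;
}

const v1 = {
  type: "object",
  required: ["email"],
  properties: { email: { type: "string" }, age: { type: "integer" } }
};
const v2 = {
  type: "object",
  required: ["email", "username"],
  properties: { email: { type: "string" }, age: { type: "string" }, username: { type: "string" } }
};

const issues = findBreakingChanges(v1, v2);
// issues[0]: "username" became required
// issues[1]: "age" changed type from integer to string
console.log(issues.length); // 2
```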

API Documentation:

  • Machine readable: Schema is machine-parseable
  • Auto-generated docs: Generate documentation from schema
  • Example values: Include examples in schema
  • Validation rules: Document all validation constraints
  • Error examples: Provide error response examples
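
Because a schema is machine-parseable, a few lines of code can turn it into reference documentation. The describeSchema helper below is a hypothetical, minimal sketch that handles only flat object schemas; real documentation generators cover far more.

```javascript
// Renders one plain-text line per property: name, type, format, required flag, description.
function describeSchema(schema) {
  const required = new Set(schema.required || []);
  return Object.entries(schema.properties || {}).map(([name, prop]) => {
    const flag = required.has(name) ? "required" : "optional";
    const constraint = prop.format ? `, format: ${prop.format}` : "";
    return `${name} (${prop.type}${constraint}, ${flag}): ${prop.description || "no description"}`;
  });
}

const documentedUserSchema = {
  type: "object",
  required: ["email"],
  properties: {
    email: { type: "string", format: "email", description: "User's email address" },
    age: { type: "integer", description: "User's age" }
  }
};

console.log(describeSchema(documentedUserSchema).join("\n"));
// email (string, format: email, required): User's email address
// age (integer, optional): User's age
```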

Best Practices:

  • Use schema registry: Centralize schema management
  • Validate at boundary: Validate at API entry point
  • Default values: Define sensible defaults
  • Required fields: Clearly mark required vs optional
  • Extensibility: Design schema for future extensibility
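
In JSON Schema, default is an annotation: a validator does not fill in missing values unless it is configured to (Ajv, for instance, has a useDefaults option). The applyDefaults helper below is a hypothetical sketch of what applying defaults means for plain object schemas.

```javascript
// Returns a copy of `data` with missing properties filled from `default` keywords.
// Handles nested object schemas only; arrays and $ref are out of scope here.
function applyDefaults(schema, data) {
  const result = { ...data };
  for (const [name, prop] of Object.entries(schema.properties || {})) {
    const value = result[name];
    if (value === undefined && "default" in prop) {
      result[name] = prop.default;
    } else if (prop.type === "object" && value !== null
        && typeof value === "object" && !Array.isArray(value)) {
      result[name] = applyDefaults(prop, value);
    }
  }
  return result;
}

const settingsSchema = {
  type: "object",
  properties: {
    role: { type: "string", enum: ["admin", "moderator", "user"], default: "user" },
    preferences: {
      type: "object",
      properties: {
        newsletter: { type: "boolean", default: false },
        theme: { type: "string", enum: ["light", "dark", "auto"], default: "auto" }
      }
    }
  }
};

const filled = applyDefaults(settingsSchema, { preferences: { theme: "dark" } });
// filled.role is "user"; filled.preferences keeps theme "dark" and gains newsletter: false
console.log(filled.role, filled.preferences.newsletter);
```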

Best Practices

  1. Fail fast with validation - Validate at API gateway before expensive business logic runs
  2. Use strict validation - Set additionalProperties: false to reject unexpected fields
  3. Provide clear error messages - Map validation errors to human-readable messages for developers
  4. Version your schemas - Schemas evolve; version them like APIs (breaking vs. non-breaking changes)
  5. Reuse schema definitions - Use $ref to reference common schemas, avoiding duplication
  6. Document with descriptions - Add description fields to every property for auto-generated docs
  7. Test edge cases - Validate boundary values (min, max), empty arrays, null handling
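  8. Use format validators - Leverage built-in formats (email, uuid, date-time) rather than custom patterns. In Draft 2020-12, format is an annotation by default, so make sure your validator is configured to assert it (for example, ajv-formats with Ajv)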

Common Mistakes

Over-validation: Making every field required with strict patterns when flexibility is needed. Start permissive, tighten based on actual data issues.

Under-validation: Accepting any string when you need emails, or any number when you need positive integers. This pushes validation into business logic where it’s harder to maintain.

Allowing additional properties: Leaving additionalProperties: true (the default) lets clients send junk fields that might cause security issues or break future updates.

Vague error messages: Returning “validation failed” instead of specific field-level errors. Developers need to know exactly what’s wrong.
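
A field-level error response can be built directly from validator output. The sketch below assumes Ajv-style error objects with instancePath and message properties; the response shape (status, error, fields) is hypothetical.

```javascript
// Groups Ajv-style error objects by the field they refer to.
function toFieldErrors(errors) {
  const fields = {};
  for (const err of errors || []) {
    // "/preferences/theme" becomes "preferences.theme"; "" means the root object.
    const field = err.instancePath
      ? err.instancePath.slice(1).replace(/\//g, ".")
      : "(root)";
    (fields[field] = fields[field] || []).push(err.message);
  }
  return { status: 400, error: "validation_failed", fields };
}

const body = toFieldErrors([
  { instancePath: "/email", message: 'must match format "email"' },
  { instancePath: "/age", message: "must be >= 18" },
  { instancePath: "", message: "must have required property 'username'" }
]);
console.log(Object.keys(body.fields)); // [ 'email', 'age', '(root)' ]
```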

Not validating responses: Only validating incoming requests but ignoring response schemas. Response validation catches backend bugs early.
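
Response validation can be added as a thin wrapper around a handler. In this sketch, validate is any function that returns true or false and exposes an errors property (the shape of Ajv's compiled validators), report is a hypothetical logging hook, and the handler is assumed to be synchronous for brevity.

```javascript
// Wraps a handler so every response is checked against a response validator.
function withResponseValidation(handler, validate, report) {
  return function validatedHandler(request) {
    const response = handler(request);
    if (!validate(response)) {
      // Log rather than fail the request: the client did nothing wrong.
      report({ request, errors: validate.errors });
    }
    return response;
  };
}

// Stub validator standing in for a compiled response schema.
function validateUserResponse(body) {
  const ok = typeof body.id === "string";
  validateUserResponse.errors = ok ? null : [{ instancePath: "/id", message: "must be string" }];
  return ok;
}

const violations = [];
const getUser = withResponseValidation(
  () => ({ id: 42 }), // backend bug: id should be a string
  validateUserResponse,
  (violation) => violations.push(violation)
);

getUser({ path: "/users/42" });
console.log(violations.length); // 1
```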

Regex catastrophic backtracking: Using complex regex patterns that cause exponential time complexity on certain inputs (ReDoS attacks).
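
Nested quantifiers are the usual cause. The sketch below contrasts a risky pattern with an equivalent safer one and adds a length cap that runs before any regex does. The patterns and the 256-character limit are illustrative assumptions, and the risky pattern is shown only as a string and never executed.

```javascript
// Nested quantifiers such as (x+)+ can backtrack exponentially on near-matches.
const riskyPattern = "^([a-zA-Z0-9]+)+$"; // avoid: nested quantifier
const saferPattern = "^[a-zA-Z0-9]+$";    // accepts the same strings with a single quantifier

// Hypothetical cap, checked before the regex so input size stays bounded.
const MAX_INPUT_LENGTH = 256;
const saferRegex = new RegExp(saferPattern);

function matchesSafely(value) {
  if (typeof value !== "string" || value.length > MAX_INPUT_LENGTH) return false;
  return saferRegex.test(value);
}

console.log(matchesSafely("abc123"));               // true
console.log(matchesSafely("a".repeat(200) + "!"));  // false
```

In a schema, the same idea means pairing every pattern with a maxLength and keeping an upstream body-size limit, since the order in which a validator evaluates keywords is not something to rely on.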

Mixing validation layers: Validating the same data in schema, business logic, and database constraints with different rules. Pick one source of truth.

Standards & RFCs

  1. JSON Schema Draft 2020-12
  2. JSON Schema Validation
  3. JSON Schema Core
  4. [OpenAPI 3.1.0](https://reference.apios.info/terms/openapi-3/) (uses JSON Schema)