Definition
Error Handling in APIs encompasses strategies for detecting failures, communicating them to clients, logging diagnostic information, and implementing recovery mechanisms. Effective error handling distinguishes between transient errors (temporary network issues) and permanent failures (invalid requests), provides actionable error messages, and maintains system stability during partial outages.
Key components include:
- Detection - Identifying errors at application, network, and infrastructure layers
- Classification - Distinguishing client errors (4xx) from server errors (5xx)
- Communication - Returning structured, actionable error responses
- Logging - Recording errors with context for debugging
- Recovery - Implementing retries, fallbacks, circuit breakers
- Monitoring - Tracking error rates, patterns, and trends
Example
Stripe API Error Handling:
Stripe returns structured error objects with consistent formats:
{
"error": {
"type": "card_error",
"code": "card_declined",
"decline_code": "insufficient_funds",
"message": "Your card has insufficient funds.",
"param": "payment_method",
"request_id": "req_abc123"
}
}
Features:
- type: Error category (card_error, api_error, invalid_request_error)
- code: Machine-readable error code for programmatic handling
- message: Human-readable description
- param: Which field caused the error
- request_id: Unique ID for support inquiries
Client behavior:
- 4xx errors (other than 429) → Don’t retry; fix the request
- 5xx errors → Retry with exponential backoff
- 429 errors → Wait for the interval given in the Retry-After header, then retry
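The three rules above can be sketched as a single decision function. This is a minimal illustration, not Stripe's client logic: the name `retryDelayMs`, the 60-second fallback, and the 1-second base delay are assumptions made for the example.

```javascript
// Hypothetical helper: decide whether to retry and how long to wait first.
// Returns a delay in milliseconds, or null when the request should not be retried.
function retryDelayMs(status, retryAfterHeader, attempt, now = Date.now()) {
  if (status === 429) {
    // Retry-After is either a number of seconds or an HTTP date.
    const seconds = Number(retryAfterHeader);
    if (retryAfterHeader && Number.isFinite(seconds)) {
      return Math.max(0, seconds * 1000);
    }
    const date = Date.parse(retryAfterHeader ?? '');
    if (!Number.isNaN(date)) {
      return Math.max(0, date - now);
    }
    return 60000; // assumed default when the header is missing or unparseable
  }
  if (status >= 500) {
    return 2 ** attempt * 1000; // exponential backoff: 1s, 2s, 4s, ...
  }
  return null; // other 4xx: fix the request instead of retrying
}
```

A caller would sleep for the returned delay and retry, or surface the error immediately when the result is `null`.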
Code Example
// Comprehensive error handling pattern
class APIClient {
  constructor(baseURL, apiKey) {
    this.baseURL = baseURL;
    this.apiKey = apiKey;
  }

  async request(endpoint, options = {}) {
    const maxRetries = 3;
    let lastError;

    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      let response;
      try {
        response = await fetch(`${this.baseURL}${endpoint}`, {
          ...options,
          headers: {
            'Authorization': `Bearer ${this.apiKey}`,
            'Content-Type': 'application/json',
            ...options.headers
          }
        });
      } catch (error) {
        // fetch() rejects with a TypeError on network failure - retry
        if (!(error instanceof TypeError)) {
          throw error;
        }
        lastError = new NetworkError('Network request failed', error);
        if (attempt < maxRetries) {
          const backoff = Math.pow(2, attempt) * 1000;
          await this.sleep(backoff);
          continue;
        }
        // Retries exhausted - surface the wrapped NetworkError
        throw lastError;
      }

      // Success
      if (response.ok) {
        return await response.json();
      }

      // Parse error response; the body may be empty or not valid JSON
      const errorData = await response.json().catch(() => ({}));

      // Client errors (4xx)
      if (response.status >= 400 && response.status < 500) {
        if (response.status === 429 && attempt < maxRetries) {
          // Rate limited - wait for the Retry-After interval, then retry
          const seconds = parseInt(response.headers.get('Retry-After'), 10);
          // Retry-After may be missing or an HTTP date; fall back to 60s
          const delay = Number.isNaN(seconds) ? 60000 : seconds * 1000;
          await this.sleep(delay);
          continue;
        }
        // Other 4xx (or 429 after the last attempt) - don't retry
        throw new APIClientError(
          errorData.error?.message || 'Client error',
          response.status,
          errorData.error?.code,
          errorData.error?.param
        );
      }

      // Server errors (5xx) - retry with backoff
      if (response.status >= 500) {
        lastError = new APIServerError(
          errorData.error?.message || 'Server error',
          response.status,
          errorData.error?.code
        );
        if (attempt < maxRetries) {
          const backoff = Math.pow(2, attempt) * 1000; // Exponential backoff: 1s, 2s, 4s
          await this.sleep(backoff);
          continue;
        }
        throw lastError;
      }

      // Any other non-2xx status (e.g. an unfollowed redirect) is unexpected
      throw new Error(`Unexpected HTTP status ${response.status}`);
    }
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
// Custom error classes
class APIClientError extends Error {
  constructor(message, status, code, param) {
    super(message);
    this.name = 'APIClientError';
    this.status = status;
    this.code = code;
    this.param = param;
    this.retryable = false;
  }
}

class APIServerError extends Error {
  constructor(message, status, code) {
    super(message);
    this.name = 'APIServerError';
    this.status = status;
    this.code = code;
    this.retryable = true;
  }
}

class NetworkError extends Error {
  constructor(message, cause) {
    super(message);
    this.name = 'NetworkError';
    this.cause = cause;
    this.retryable = true;
  }
}
// Usage (top-level await requires an ES module)
const client = new APIClient('https://api.example.com', 'sk_live_...');

try {
  const payment = await client.request('/v1/payments', {
    method: 'POST',
    body: JSON.stringify({
      amount: 1000,
      currency: 'usd'
    })
  });
  console.log('Payment created:', payment.id);
} catch (error) {
  if (error instanceof APIClientError) {
    console.error(`Client error (${error.status}): ${error.message}`);
    if (error.param) {
      console.error(`Invalid parameter: ${error.param}`);
    }
  } else if (error instanceof APIServerError) {
    console.error(`Server error (${error.status}): ${error.message}`);
    // Alert operations team
  } else if (error instanceof NetworkError) {
    console.error('Network error:', error.message);
    // Show offline UI
  }
}
Diagram
graph TB
A[API Request] --> B{Response Status}
B -->|2xx Success| C[Return Data]
B -->|4xx Client Error| D{Error Type}
B -->|5xx Server Error| E[Retry Logic]
B -->|Network Error| E
D -->|400/422| F["Validation Error<br/>Don't Retry"]
D -->|401/403| G["Auth Error<br/>Refresh Token"]
D -->|404| H["Not Found<br/>Don't Retry"]
D -->|429| I["Rate Limit<br/>Wait Retry-After"]
E -->|Attempt 1| J[Backoff 1s]
J -->|Attempt 2| K[Backoff 2s]
K -->|Attempt 3| L[Backoff 4s]
L -->|Max Retries| M[Throw Error]
F --> N[Log Error]
G --> O[Update Auth]
H --> N
I --> P[Sleep]
P --> A
M --> N
N --> Q[Error Monitoring]
Q --> R[Alert if threshold exceeded]
style C fill:#90EE90
style F fill:#FFD700
style G fill:#FFD700
style H fill:#FFD700
style I fill:#FFA500
style M fill:#FF6B6B
style R fill:#FF6B6B
Best Practices
1. Use Standard HTTP Status Codes: Return the status code that matches the failure: 400 (bad request), 401 (unauthorized), 404 (not found), 429 (rate limited), 500 (server error), 503 (service unavailable).
2. Return Structured Error Responses: Always return errors in a consistent JSON format with type, code, message, and contextual fields.
3. Distinguish Retryable vs Non-Retryable Errors: Client errors (4xx except 429) should not be retried. Server errors (5xx) and network errors should be retried with exponential backoff.
4. Include Request IDs: Return a unique request ID in every error response so that it can be correlated with server logs during debugging.
5. Implement Circuit Breakers: After N consecutive failures, stop sending requests to the failing service for a cooldown period to prevent cascading failures.
6. Log Errors with Context: Include request/response data, user context, timestamps, and stack traces in error logs, with sensitive data removed.
7. Respect Rate Limit Headers: When receiving a 429 response, wait for the interval given in the Retry-After header; fall back to exponential backoff when the header is absent.
8. Provide Actionable Error Messages: Error messages should explain what went wrong and how to fix it. Avoid generic “Something went wrong” messages.
9. Monitor Error Rates: Track error rates per endpoint, status code, and error type. Alert when thresholds are exceeded.
10. Implement Graceful Degradation: When dependent services fail, degrade functionality rather than failing entirely (e.g., use cached data, disable non-critical features).
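The circuit breaker from practice 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class name, the default threshold of 5 failures, and the 30-second cooldown are all hypothetical, and a production breaker would usually limit how many trial requests pass through after the cooldown.

```javascript
// Minimal circuit breaker sketch. Threshold and cooldown values are illustrative.
class CircuitBreaker {
  constructor({ failureThreshold = 5, cooldownMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;        // injectable clock, which keeps the class testable
    this.failures = 0;     // consecutive failures seen so far
    this.openedAt = null;  // time the circuit opened, or null while closed
  }

  // True if a request may be sent right now.
  canRequest() {
    if (this.openedAt === null) return true;
    // Once the cooldown has elapsed, allow a trial request ("half-open").
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  // A success closes the circuit and resets the failure count.
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failure at or past the threshold opens (or re-opens) the circuit.
  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `canRequest()` before each call, fail fast or serve a fallback when it returns `false`, and report the outcome with `recordSuccess()` or `recordFailure()`.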
Security Notes
CRITICAL: Never expose internal error details in production. Sanitize all error responses to prevent information disclosure attacks.
Information Disclosure Prevention:
- No stack traces: Never include stack traces in production responses
- No database details: Strip database queries, table names, and column names from error messages
- Hide file paths: Don’t expose server filesystem paths or internal directory structure
- Generic server errors: Use “Internal server error” for 5xx errors; log details internally only
- No internal service names: Don’t reveal microservice architecture or internal service names
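One way to apply these rules is to pass every error through a single function before it reaches the client. The sketch below is illustrative only: `toPublicError`, its `isProduction` flag, and the `requestId` argument are assumptions, and the `type` values are borrowed from the Stripe-style example earlier in this article.

```javascript
// Hypothetical helper: build the client-facing response for an internal error.
// The stack trace is never copied; 5xx messages are replaced in production.
function toPublicError(err, { requestId, isProduction = true } = {}) {
  const status = Number.isInteger(err.status) ? err.status : 500;
  const isServerError = status >= 500;
  return {
    status,
    body: {
      error: {
        type: isServerError ? 'api_error' : 'invalid_request_error',
        // 5xx: fixed generic message; the details belong in the server log only
        message: isServerError && isProduction ? 'Internal server error' : err.message,
        request_id: requestId
      }
    }
  };
}
```

The full error, including its stack trace, would still be written to the server log under the same request ID so that support can correlate the two.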
Error Response Sanitization:
- Validate error input: Escape or encode error messages before rendering them to prevent cross-site scripting (XSS)
- Don’t echo user input: If showing “invalid X”, don’t display the user’s input directly
- Consistent error format: Use standardized error responses to minimize information leakage
- Exclude implementation details: Don’t reveal libraries, versions, or frameworks used
Authentication & Enumeration Prevention:
- Identical responses for credentials: Return the same response for invalid credentials and non-existent users
- Prevent user enumeration: Don’t reveal which emails are registered via error messages
- Rate limit auth failures: Limit failed login attempts to prevent brute force
- No information on failure: Respond with “Login failed”, not “Invalid email” or “Wrong password”
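A minimal sketch of an enumeration-safe login response follows. The `findUser` and `verifyPassword` callbacks are hypothetical stand-ins for the application's own user lookup and password check, and the `login_failed` code is an assumed name.

```javascript
// Sketch: findUser and verifyPassword are hypothetical callbacks supplied by the app.
function login(email, password, { findUser, verifyPassword }) {
  const user = findUser(email);
  const ok = user ? verifyPassword(user, password) : false;
  if (!ok) {
    // Same status, code, and message whether the email is unknown
    // or the password is wrong.
    return {
      status: 401,
      body: { error: { code: 'login_failed', message: 'Login failed' } }
    };
  }
  return { status: 200, body: { userId: user.id } };
}
```

Identical bodies are only part of the picture: response timing can also reveal whether an account exists, so a real implementation would likely perform comparable work on both failure paths.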
Attack Detection & Monitoring:
- Separate security logging: Log authentication failures, injection attempts, enumeration attacks
- Monitor error patterns: Track suspicious error patterns (credential stuffing, path traversal attempts)
- Alert on anomalies: Abnormal error rates could indicate attacks
- Security context: Include user ID, IP address, timestamp in security error logs
Logging Best Practices:
- Log internally only: Store detailed errors in server logs, not client responses
- Sanitize logs: Don’t log passwords, API keys, or PII
- Structured logging: Use consistent format for error logs (timestamp, severity, error code, context)
- Retention policy: Define how long error logs are retained for audit/forensics
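The sanitizing and structured-logging points above might look like the following sketch. The list of sensitive key names and the `buildLogEntry` shape are assumptions for illustration; a real deployment would match them to its own payloads and logging library.

```javascript
// Assumed list of sensitive field names (compared case-insensitively).
const SENSITIVE_KEYS = new Set(['password', 'apikey', 'api_key', 'authorization', 'token', 'ssn']);

// Recursively replace the values of sensitive keys with a placeholder.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, v]) =>
        SENSITIVE_KEYS.has(key.toLowerCase()) ? [key, '[REDACTED]'] : [key, redact(v)]
      )
    );
  }
  return value;
}

// Build one structured log line: timestamp, severity, error code, redacted context.
function buildLogEntry(severity, code, context, now = new Date()) {
  return JSON.stringify({
    timestamp: now.toISOString(),
    severity,
    code,
    context: redact(context)
  });
}
```

Key-based redaction only catches fields with known names; secrets embedded in free-text messages need separate handling.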
Common Mistakes
1. Exposing Stack Traces in Production: Returning full stack traces reveals internal implementation details and file paths, aiding attackers.
2. Retrying Non-Retryable Errors: Retrying 400/404 errors wastes resources and delays failure reporting to users.
3. No Exponential Backoff: Fixed retry intervals can overload recovering services. Use exponential backoff with jitter so that clients do not retry in lockstep.
4. Generic Error Messages: “Error occurred” provides no actionable information. Specify what failed and how to fix it.
5. Not Logging Error Context: Logging only error messages without request IDs, user context, or timestamps makes debugging far harder.
6. Ignoring Rate Limit Headers: Not respecting Retry-After headers leads to aggressive retries that worsen rate limiting.
7. No Circuit Breaker: Continuously retrying a failing service causes cascading failures and resource exhaustion.
8. Inconsistent Error Formats: Different endpoints returning different error structures break client error handling logic.
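Mistake 3 mentions jitter, which the retry code earlier in this article does not apply. A minimal sketch of one common variant, often called "full jitter", is shown here; the function name, the 1-second base, and the 30-second cap are illustrative assumptions.

```javascript
// "Full jitter": pick a random delay between 0 and the capped exponential value.
// baseMs and capMs are illustrative defaults; random is injectable for testing.
function backoffWithJitter(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Randomizing the delay spreads retries from many clients over time, so a recovering service is not hit by synchronized waves of requests.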