Definition
It’s 3 AM and your production system is down. Customers are complaining, your team is scrambling, and you need to figure out what happened - fast. You open your logs and see… nothing useful. Just a wall of unstructured text with no correlation between entries, no context about what was happening when things broke. This nightmare scenario is why logging matters: logs are your system’s memory, and good logging practices can mean the difference between a 5-minute fix and a 5-hour investigation.
Logging is the practice of recording discrete events that happen in your application - requests received, errors caught, state changes, performance data. Unlike metrics (which aggregate data over time) or traces (which follow request paths), logs capture specific moments with rich context. A log entry might say “User user@example.com failed login attempt from IP 192.168.1.1 at 2024-01-15T10:30:00Z - reason: invalid password.”
Modern logging has evolved from simple text files to structured data (JSON), centralized log aggregation (ELK, Splunk, Datadog), and intelligent analysis. Structured logging means each log entry is a parseable object with fields like timestamp, level, service, correlation_id, and message - making logs searchable, filterable, and analyzable at scale. When done right, logs answer the question “what happened?” with precision.
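To make the contrast concrete, here is a minimal sketch in plain TypeScript (the order fields are hypothetical) showing the same event logged both ways:

const orderId = 'ord_123';        // illustrative values
const reason = 'card_declined';
const amountCents = 4999;

// Unstructured: readable by a human, but hard for machines to query at scale
console.log(`Payment failed for order ${orderId}: ${reason}`);

// Structured: every field is independently searchable and filterable
console.log(JSON.stringify({
  timestamp: new Date().toISOString(),
  level: 'error',
  event: 'payment_failed',
  orderId,
  reason,
  amountCents
}));

With the structured form, a query like “all payment_failed events where amountCents > 5000” becomes trivial in any log aggregator.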
Example
Debugging a Production Incident: Netflix experiences slow video starts. Logs show a spike in “connection timeout” entries from a specific region, all with the same upstream service. Cross-referencing correlation IDs reveals the CDN in that region is failing. Total time to diagnosis: 5 minutes, because the logs told the story.
Security Audit Trail: A bank needs to prove what happened during a disputed transaction. Logs show: user authenticated at 10:00 AM, viewed account at 10:01, initiated transfer at 10:02, confirmed with 2FA at 10:02:30, transfer completed at 10:02:35. Each entry has timestamps, user IDs, IP addresses, and transaction details - a complete audit trail.
API Rate Limiting Investigation: A customer complains they’re being rate limited unfairly. Logs show their API key made 10,000 requests in the last minute, all from the same IP, 90% returning the same cached response - suggesting a misconfigured retry loop. The logs proved the rate limiting was correct.
Performance Regression Detection: After a deployment, logs show increased “database query slow” warnings. Filtering by correlation ID reveals a new endpoint is executing N+1 queries. The log entries include query timing, SQL statements, and the request that triggered them - enough to identify and fix the regression.
Third-Party Integration Troubleshooting: Your Stripe webhook isn’t working. Logs show the webhook arriving, signature validation passing, then a timeout calling your inventory service. Without logs, you’d blame Stripe; with logs, you found your own bug.
Analogy
The Flight Recorder (Black Box): Aircraft have flight recorders that capture everything happening during a flight. After an incident, investigators analyze the recordings to understand what happened. Your application logs serve the same purpose - they’re the black box for your software.
The Security Camera System: Security cameras don’t prevent theft, but they let you see exactly what happened, when, and by whom. Logs are your application’s security cameras - recording events for later investigation.
The Ship’s Log: Historically, ships kept detailed logs of their voyages - weather, position, incidents, decisions. If something went wrong, the log told the story. The captain didn’t log “sailed somewhere” - they logged “departed London at 0800, heading SW at 12 knots, crew of 45, cargo: 200 barrels of rum.”
The Medical Record: Doctors don’t just remember your history - they write it down. Every visit, test result, and prescription is logged. Years later, a new doctor can understand your complete health history. Application logs work the same way for your software’s “health.”
Code Example
// Structured logging with correlation and context
import winston from 'winston';
import { Request, Response, NextFunction } from 'express';
import { v4 as uuid } from 'uuid';

// Let TypeScript know about the fields the middleware attaches to each request
declare global {
  namespace Express {
    interface Request {
      correlationId: string;
      logger: winston.Logger;
    }
  }
}

// Assumed data-access layer, declared so the example compiles
declare const db: {
  users: { create(data: { email: string; name: string }): Promise<{ id: string }> };
};

// Configure structured JSON logging
const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  defaultMeta: {
    service: 'user-api',
    version: process.env.APP_VERSION,
    environment: process.env.NODE_ENV
  },
  transports: [
    new winston.transports.Console(),
    new winston.transports.File({ filename: 'logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: 'logs/combined.log' })
  ]
});

// Request logging middleware
function requestLogger(req: Request, res: Response, next: NextFunction) {
  // Reuse the caller's correlation ID if one was sent; otherwise mint a new one
  const correlationId = (req.headers['x-correlation-id'] as string) || uuid();
  const startTime = Date.now();

  // Attach to request for use in handlers
  req.correlationId = correlationId;
  req.logger = logger.child({ correlationId });

  // Log request start
  req.logger.info('Request started', {
    method: req.method,
    path: req.path,
    query: req.query,
    userAgent: req.headers['user-agent'],
    ip: req.ip
  });

  // Log response on finish
  res.on('finish', () => {
    const duration = Date.now() - startTime;
    const level = res.statusCode >= 500 ? 'error' :
                  res.statusCode >= 400 ? 'warn' : 'info';
    req.logger.log(level, 'Request completed', {
      method: req.method,
      path: req.path,
      statusCode: res.statusCode,
      durationMs: duration,
      contentLength: res.get('content-length')
    });
  });

  next();
}

// Application logging with context
async function createUser(req: Request, res: Response) {
  const { email, name } = req.body;
  req.logger.info('Creating user', { email });
  try {
    const user = await db.users.create({ email, name });
    req.logger.info('User created successfully', {
      userId: user.id,
      email
    });
    res.status(201).json(user);
  } catch (error: any) {
    req.logger.error('Failed to create user', {
      email,
      error: error.message,
      stack: error.stack,
      code: error.code
    });
    res.status(500).json({ error: 'Failed to create user' });
  }
}

// Log levels for different purposes (winston's default "npm" levels):
//   logger.debug('Detailed debugging info', { rawData });        // development only
//   logger.info('Normal operations', { action: 'user_login' });  // business events
//   logger.warn('Potential problems', { memoryUsage: '85%' });   // warnings
//   logger.error('Errors occurred', { error, stack });           // errors
// winston has no `fatal` level by default; use `error` for critical failures
// or define custom levels (see the sketch after the output example below)

// Output example (JSON structured log):
// {
//   "timestamp": "2024-01-15T10:30:00.000Z",
//   "level": "info",
//   "service": "user-api",
//   "version": "1.2.3",
//   "environment": "production",
//   "correlationId": "abc-123-xyz",
//   "message": "User created successfully",
//   "userId": "user_456",
//   "email": "user@example.com"
// }
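One caveat from the levels above: winston's default npm levels stop at error, so a fatal level only exists if you define it. A minimal sketch using winston's documented custom levels option:

import winston from 'winston';

// Lower number = higher severity, per winston's convention
const levels = { fatal: 0, error: 1, warn: 2, info: 3, debug: 4 };

const critLogger = winston.createLogger({
  levels,
  level: 'info',  // fatal (0) and error (1) still pass this threshold
  format: winston.format.combine(winston.format.timestamp(), winston.format.json()),
  transports: [new winston.transports.Console()]
});

// The generic log() call accepts any configured level; at runtime winston
// also attaches a critLogger.fatal() convenience method (untyped in TypeScript)
critLogger.log('fatal', 'System critical', { service: 'down' });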
Diagram
flowchart LR
    subgraph Applications["Applications"]
        A1[Web API]
        A2[Worker]
        A3[Mobile Backend]
    end
    subgraph LogPipeline["Log Pipeline"]
        B[Log Shipper<br/>Filebeat/Fluentd]
        C[Message Queue<br/>Kafka]
        D[Log Processor<br/>Logstash]
    end
    subgraph Storage["Storage & Analysis"]
        E[(Elasticsearch<br/>Index)]
        F[Kibana<br/>Dashboard]
        G[Alerts<br/>PagerDuty]
    end
    A1 --> B
    A2 --> B
    A3 --> B
    B --> C
    C --> D
    D --> E
    E --> F
    E --> G
    style A1 fill:#93c5fd
    style A2 fill:#93c5fd
    style A3 fill:#93c5fd
    style E fill:#86efac
    style F fill:#fcd34d
Security Notes
CRITICAL: Logs are a common vector for sensitive-data exposure. Never log passwords, tokens, PII, or API keys.
Sensitive Data Redaction (see the sketch after this list):
- Never log credentials: Exclude passwords, API keys, tokens from logs
- Redact PII: Mask credit cards, SSNs, email addresses, phone numbers
- Redact headers: Exclude Authorization, Cookie, or Set-Cookie headers
- Redact request body: Exclude sensitive fields from request/response logs
- Redact URLs: Exclude sensitive query parameters from logged URLs
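Redaction is easiest to enforce centrally, before entries reach any transport. A minimal sketch using a custom winston format; the key list is an assumption you would extend for your own payloads:

import winston from 'winston';

// Keys to mask wherever they appear in log metadata (assumed list; extend as needed)
const SENSITIVE_KEYS = new Set([
  'password', 'token', 'apiKey', 'authorization', 'cookie', 'ssn', 'cardNumber'
]);

// Rebuild nested values with sensitive fields masked
function redactValue(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redactValue);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) =>
        SENSITIVE_KEYS.has(k) ? [k, '[REDACTED]'] : [k, redactValue(v)]
      )
    );
  }
  return value;
}

// winston.format() wraps a transform that runs on every entry; mutating the
// top-level info in place preserves winston's internal symbol properties
const redact = winston.format((info) => {
  for (const key of Object.keys(info)) {
    info[key] = SENSITIVE_KEYS.has(key) ? '[REDACTED]' : redactValue(info[key]);
  }
  return info;
});

const safeLogger = winston.createLogger({
  format: winston.format.combine(redact(), winston.format.timestamp(), winston.format.json()),
  transports: [new winston.transports.Console()]
});

// The password is masked before it reaches any transport
safeLogger.info('Login attempt', { email: 'user@example.com', password: 'hunter2' });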
Log Injection Prevention:
- Sanitize input: Strip or encode user-supplied values before logging (see the sketch after this list)
- Prevent CRLF injection: Don’t allow carriage-return or line-feed characters in log entries, or attackers can forge fake entries
- Escape special characters: Escape characters that downstream parsers or log viewers might interpret
- Structured logging: Use JSON or structured format to prevent injection
- Log parser security: Keep log collection and indexing tools patched; attacker-controlled input that gets logged can exploit a vulnerable processor (Log4Shell is the canonical example)
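For text-format sinks, a minimal sanitizer sketch (the regex and example input are illustrative); with structured JSON logging, the serializer's escaping already prevents most line-forging:

// Strip control characters (including \r and \n) from user-supplied values
// so malicious input can't forge extra entries in a line-oriented log
function sanitizeForLog(input: string): string {
  return input.replace(/[\x00-\x1f\x7f]/g, ' ');
}

// Without sanitization, this username would inject a convincing fake entry
const username = 'alice\nlevel=info message="admin login ok"';
console.log(`login_failed user=${sanitizeForLog(username)}`);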
Log Access Control:
- Restrict log access: Only authorized personnel can view logs
- Encryption at rest: Encrypt logs on disk
- Encryption in transit: Encrypt logs in transit to log storage
- Access logging: Log who accessed what logs and when
- Multi-factor auth: Require MFA for log access
Audit & Retention:
- Security events: Log authentication failures, authorization denials, anomalies
- Retention policy: Define how long logs are kept and delete them once that period ends
- Immutability: Prevent log tampering or deletion
- Backup: Maintain backups of important logs
- Monitoring: Alert on suspicious log activity
Performance & Scalability:
- Asynchronous logging: Log asynchronously to avoid blocking requests
- Log levels: Use appropriate log levels (INFO, WARN, ERROR)
- Sampling: Sample high-volume logs to avoid overwhelming storage (see the sketch after this list)
- Compression: Compress logs to reduce storage requirements
- Monitoring: Monitor log storage to prevent disk exhaustion
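A minimal sampling sketch implemented as a winston filter format (returning false from a custom format drops the entry); the 1% rate and the exempt levels are assumptions to tune:

import winston from 'winston';

const SAMPLE_RATE = 0.01; // keep ~1% of low-severity entries (tune per volume and budget)

// Always keep warn/error; pass only a random sample of info/debug through
const sample = winston.format((info) => {
  const alwaysKeep = info.level === 'warn' || info.level === 'error';
  return (alwaysKeep || Math.random() < SAMPLE_RATE) ? info : false;
});

const sampledLogger = winston.createLogger({
  level: 'debug',
  format: winston.format.combine(sample(), winston.format.timestamp(), winston.format.json()),
  transports: [new winston.transports.Console()]
});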
Compliance:
- Regulatory requirements: Meet GDPR, HIPAA, PCI-DSS logging requirements
- Log comprehensiveness: Log sufficient detail for forensics and audits
- Tamper evidence: Ensure logs cannot be tampered with
- Data minimization: Only log necessary information
Best Practices
- Use structured logging (JSON) - Makes logs searchable, parseable, and analyzable at scale
- Include correlation IDs - Every request should have an ID that follows it through all services (a propagation sketch follows this list)
- Use appropriate log levels - DEBUG for development, INFO for normal operations, WARN for problems, ERROR for failures
- Add context to every log - Include relevant data like user IDs, request IDs, timing information
- Centralize logs - Aggregate logs from all services into one searchable system (ELK, Splunk, Datadog)
- Redact sensitive data - Never log passwords, tokens, PII, or payment information
- Log at service boundaries - Entry and exit points of your service are critical logging locations
- Include timestamps with timezone - Use ISO 8601 format with UTC for consistency
- Don’t log too much - Excessive logging impacts performance and costs money; log what matters
- Make logs actionable - A log entry should help someone understand what happened and what to do
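Correlation IDs only pay off when they are propagated on outgoing calls. A minimal sketch using Node 18+'s global fetch, reusing the req.correlationId and req.logger attached by the middleware in the Code Example above (INVENTORY_URL and the endpoint are hypothetical):

import { Request } from 'express';

// Forward the incoming correlation ID to a downstream service so both
// services' log entries share one searchable ID
async function reserveInventory(req: Request, sku: string): Promise<void> {
  const response = await fetch(`${process.env.INVENTORY_URL}/reserve`, {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'x-correlation-id': req.correlationId  // same ID the middleware attached
    },
    body: JSON.stringify({ sku })
  });
  if (!response.ok) {
    req.logger.warn('Inventory reservation failed', { sku, statusCode: response.status });
  }
}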
Common Mistakes
Unstructured text logs: console.log("error: " + error) becomes impossible to search at scale. Use structured JSON.
Missing correlation IDs: Without correlation IDs, you can’t trace a request across services. Every log entry needs one.
Logging sensitive data: Passwords, tokens, and PII in logs are security and compliance nightmares.
Wrong log levels: Using ERROR for everything (or DEBUG in production) makes important logs impossible to find.
No context: “Error occurred” tells you nothing. “Failed to process payment for order_id=123, user_id=456, error=card_declined” tells you everything.
Log volume extremes: Logging nothing leaves you blind; logging everything is expensive and noisy. Find the balance.
Not logging errors with stack traces: When you catch an exception, log the full stack trace - you’ll need it for debugging.
Inconsistent log formats: Different formats across services make centralized analysis difficult. Standardize on a format.