How to Design a URL Shortener Like TinyURL: A Complete System Design Guide
Introduction: The Humble Link That Scaled to Billions
Picture this: it’s 2002. You want to share a research paper with a colleague over email. The URL is 180 characters long, wraps across three lines, and breaks when your colleague tries to click it. A computer science student named Kevin Gilbertson had the same frustration, and TinyURL was born.
Fast forward two decades. We have Bitly handling 10 billion clicks per month, Twitter’s t.co redirecting every shared link in real time, and Instagram’s ig.me tracking conversions for advertisers. What started as a convenience tool became one of the most quietly critical pieces of internet infrastructure.
Designing a URL shortener looks deceptively simple (“just store a mapping of short code to long URL”), but when you peel back the layers, you find a system that touches nearly every hard problem in distributed systems: high-throughput reads, collision-free ID generation, cache stampedes, global latency, abuse prevention, and analytics at massive scale.
This guide walks through designing a production-grade URL shortener from a blank whiteboard to a globally distributed system.
What Are We Building?
Before drawing boxes and arrows, we should agree on exactly what the system needs to do.
Functional Requirements
| Feature | Description |
|---|---|
| URL Shortening | Generate a unique short code from a long URL |
| URL Redirection | Redirect users from short URL to original URL |
| Custom Aliases | Allow users to choose their own short code |
| Expiry Support | Expire URLs after date or click threshold |
| Analytics | Track clicks, devices, geo-location, and referrers |
| User Accounts | Authenticated users can manage links |
| Link Deletion | Users can deactivate URLs |
Non-Functional Requirements
| Requirement | Target |
|---|---|
| Availability | 99.99% uptime |
| Latency | < 10ms P99 redirect latency |
| Write Throughput | 10,000 URL writes/second |
| Read/Write Ratio | 1000:1 |
| Durability | Zero data loss |
| Scalability | Handle 10x spikes |
| Security | Prevent abuse and phishing |
Scale Estimation
Every distributed system starts with scale estimation because architecture decisions are meaningless without traffic assumptions.
Traffic Assumptions
Daily Active Users (DAU): 100 million
New URLs/day: 10 million
URL redirects/day: 10 billion
Writes/sec ≈ 115
Reads/sec ≈ 115,000
Storage Estimation
Average record size ≈ 260 bytes
10 million URLs/day ≈ 2.6 GB/day
5 years ≈ 4.75 TB
Bandwidth Estimation
Read traffic ≈ 30 MB/sec
Analytics stream ≈ 57 MB/sec
URL Namespace Estimation
Base62 namespace:
62^7 = 3.5 trillion unique short codes
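The figures above can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check; it assumes that each redirect moves roughly one stored record's worth of bytes and that an analytics event is about 500 bytes (the event size is an assumption, not a number stated in this guide).

```python
SECONDS_PER_DAY = 24 * 60 * 60

new_urls_per_day = 10_000_000
redirects_per_day = 10_000_000_000
record_bytes = 260   # average record size from the storage estimate
event_bytes = 500    # ASSUMPTION: approximate size of one analytics event

# Average request rates (peaks will be higher than these averages).
writes_per_sec = new_urls_per_day / SECONDS_PER_DAY
reads_per_sec = redirects_per_day / SECONDS_PER_DAY

# Storage growth, using decimal units (1 GB = 1e9 bytes).
storage_per_day_gb = new_urls_per_day * record_bytes / 1e9
storage_5y_tb = storage_per_day_gb * 365 * 5 / 1e3

# Bandwidth on the read path and on the analytics stream.
read_bandwidth_mb = reads_per_sec * record_bytes / 1e6
analytics_bandwidth_mb = reads_per_sec * event_bytes / 1e6

# Size of the 7-character Base62 namespace and how long it lasts.
namespace = 62 ** 7
years_to_exhaust = namespace / (new_urls_per_day * 365)

print(f"writes/sec       ~ {writes_per_sec:,.0f}")
print(f"reads/sec        ~ {reads_per_sec:,.0f}")
print(f"storage/day      ~ {storage_per_day_gb:.1f} GB")
print(f"storage/5 years  ~ {storage_5y_tb:.2f} TB")
print(f"read bandwidth   ~ {read_bandwidth_mb:.0f} MB/s")
print(f"analytics stream ~ {analytics_bandwidth_mb:.0f} MB/s")
print(f"7-char namespace = {namespace:,} codes")
print(f"years of codes   ~ {years_to_exhaust:,.0f}")
```

Note that these traffic numbers imply roughly 1,000 redirects for every new URL, and that a 7-character namespace would last for centuries at 10 million new URLs per day.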
High-Level Architecture
Production-Grade URL Shortener Architecture
A globally distributed, low-latency, cache-first architecture designed to handle billions of redirects, massive traffic spikes, real-time analytics, and enterprise-grade reliability.
⚡ Redirect Read Path (Critical Hot Path)
- User / Browser: Users access shortened links from browsers, mobile applications, emails, or social platforms.
- CDN Edge Cache: Cloudflare or Fastly serves globally cached redirects from edge locations nearest to users.
- WAF & DDoS Layer: Blocks malicious traffic, phishing attacks, bot abuse, and traffic floods.
- Global Load Balancer: Routes requests to the nearest healthy region using Anycast DNS and geo-routing.
- Redirect Service: Ultra-low latency service optimized entirely for read-heavy redirect traffic.
- Local Memory Cache: Per-instance in-memory cache for ultra-hot URLs with microsecond lookup speed.
- Redis Cluster: Distributed cache layer serving most redirect lookups without touching databases.
- Read Replica: Handles fallback lookups for cache misses while protecting primary databases.
- 302 Redirect: Returns the redirect response while analytics are processed asynchronously.
✍️ URL Creation Write Path
- Write API Service: Validates long URLs, handles aliases, expiration rules, and request authentication.
- Snowflake ID Generator: Generates globally unique distributed IDs converted into Base62 short codes.
- PostgreSQL Primary: Stores durable short-to-long URL mappings with strong consistency guarantees.
- Redis Population: Preloads Redis immediately after writes for fast future redirects.
📊 Analytics & Async Processing
- Kafka Cluster: Processes click analytics, fraud detection, and real-time event streams asynchronously.
- Analytics Consumers: Aggregate click counts, device data, geo-location, and referrer information.
- ClickHouse: Stores massive clickstream analytics efficiently using columnar storage.
📈 Monitoring & Reliability
- Prometheus + Grafana: Tracks latency, cache hit rates, traffic spikes, SLA metrics, and system health.
- Jaeger: Distributed tracing for debugging latency bottlenecks across services.
- ELK Stack: Centralized logging, operational debugging, and production troubleshooting.
- PagerDuty: Critical incident notifications and on-call escalation workflows.
This architecture is fundamentally cache-first because URL shorteners are extremely read-heavy systems. More than 95% of redirect traffic should ideally be served directly from CDN or Redis cache without hitting databases. The redirect path is optimized separately from the write path to scale reads independently from writes.
Database Design
SQL vs NoSQL
PostgreSQL is an excellent starting point because URL shorteners require strong consistency, reliable transactions, and mature indexing.
At very large scale, systems often move hot lookup paths to Cassandra or DynamoDB.
Schema Design
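CREATE TABLE url_mappings (
    short_code  VARCHAR(12) PRIMARY KEY,      -- Base62 code or custom alias
    long_url    TEXT NOT NULL,                -- original destination URL
    user_id     BIGINT,                       -- owner of the link, if any
    created_at  TIMESTAMPTZ DEFAULT NOW(),
    expires_at  TIMESTAMPTZ,                  -- optional expiry timestamp
    is_active   BOOLEAN DEFAULT TRUE,         -- FALSE once the link is deactivated
    click_count BIGINT DEFAULT 0
);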
Indexing Strategy
The primary index on short_code ensures extremely fast lookups.
Since reads dominate heavily, keeping indexes memory-resident is critical.
API Design
Create Short URL
POST /api/v1/urls
{
"long_url": "https://example.com",
"custom_alias": "my-blog"
}
Redirect API
GET /{short_code}
Using 302 (temporary) redirects allows analytics tracking: browsers do not cache a 302 permanently the way they can cache a 301, so repeat clicks keep reaching the redirect service and can be counted.
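As a rough illustration of the redirect endpoint's decision logic, the sketch below maps a lookup result to an HTTP status and headers. The function name `build_redirect` is hypothetical, the record fields mirror the `url_mappings` columns, and answering 410 Gone for deactivated or expired links is an assumption rather than something this guide prescribes.

```python
from datetime import datetime, timezone


def build_redirect(record, now=None):
    """Return (status_code, headers) for a short-code lookup result.

    `record` is None when the code is unknown, otherwise a dict with the
    keys long_url, expires_at (aware datetime or None) and is_active.
    """
    now = now or datetime.now(timezone.utc)
    if record is None:
        return 404, {}
    if not record["is_active"]:
        return 410, {}  # ASSUMPTION: deactivated links answer 410 Gone
    expires_at = record["expires_at"]
    if expires_at is not None and expires_at <= now:
        return 410, {}  # ASSUMPTION: expired links answer 410 Gone
    # 302 keeps the browser coming back, so every click can be counted.
    return 302, {"Location": record["long_url"]}
```

For example, `build_redirect({"long_url": "https://example.com", "expires_at": None, "is_active": True})` yields a 302 with a `Location` header, while an unknown code yields a 404.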
Detailed Component Design
ID Generator Service
Generating globally unique short codes is one of the hardest problems in the system.
Common Approaches
- UUID + Base62 encoding
- Global auto-increment counter
- Snowflake IDs
- Pre-generated code pool
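To make the Snowflake option concrete, here is a minimal sketch of a Snowflake-style generator plus Base62 encoding. It assumes the commonly used layout of a 41-bit millisecond timestamp, a 10-bit worker ID, and a 12-bit sequence; the custom epoch, the class name, and the alphabet order are illustrative choices, not part of any standard.

```python
import threading
import time

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
EPOCH_MS = 1_577_836_800_000  # ASSUMPTION: custom epoch, 2020-01-01T00:00:00Z


def base62_encode(n: int) -> str:
    """Encode a non-negative integer using the 62-character alphabet."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


class SnowflakeGenerator:
    """IDs laid out as: 41-bit timestamp | 10-bit worker id | 12-bit sequence."""

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    @staticmethod
    def _now_ms() -> int:
        return int(time.time() * 1000)

    def next_id(self) -> int:
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._seq = (self._seq + 1) & 0xFFF
                if self._seq == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._seq = 0
            self._last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.worker_id << 12) | self._seq
```

One design consequence worth noting: a 64-bit Snowflake ID encodes to as many as 11 Base62 characters, longer than the 7-character namespace estimated earlier, though it still fits the `VARCHAR(12)` column. A counter or pre-generated pool produces shorter codes at the cost of central coordination.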
Write API Service
The Write API validates URLs, generates codes, stores mappings, and immediately populates Redis cache.
Read Service
The redirect service is optimized entirely around cache-hit performance.
L1: CDN Edge Cache
L2: Redis Cluster
L3: Database Read Replica
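The tiers inside the redirect service can be sketched as a fall-through lookup: per-instance memory first, then the distributed cache, then the read replica, filling the faster tiers on the way back. In this sketch plain dicts stand in for Redis and the replica, and the names `LRUCache` and `RedirectLookup` are hypothetical; the CDN tier sits in front of the service and is not modeled.

```python
from collections import OrderedDict


class LRUCache:
    """Small per-instance cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the oldest entry


class RedirectLookup:
    """Resolve a short code through local memory, cache, then replica."""

    def __init__(self, local: LRUCache, redis_like: dict, replica: dict):
        self.local = local
        self.redis = redis_like   # stand-in for a Redis client
        self.replica = replica    # stand-in for a read-replica query
        self.hits = {"local": 0, "redis": 0, "replica": 0, "miss": 0}

    def resolve(self, short_code: str):
        url = self.local.get(short_code)
        if url is not None:
            self.hits["local"] += 1
            return url
        url = self.redis.get("url:" + short_code)
        if url is not None:
            self.hits["redis"] += 1
            self.local.put(short_code, url)
            return url
        url = self.replica.get(short_code)
        if url is None:
            self.hits["miss"] += 1
            return None
        self.hits["replica"] += 1
        # Fill the faster tiers so the next lookup stops earlier.
        self.redis["url:" + short_code] = url
        self.local.put(short_code, url)
        return url
```

Keeping a small local cache in front of the shared cache also softens the load a single viral link places on one cache node.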
Caching Strategy
Redis Cluster
Key: url:aB3xKz
Value: https://example.com
TTL: 86400 seconds
Cache Strategies
- Write Through: Populate cache during writes
- Cache Aside: Populate cache on misses
- Sliding TTL: Extend cache for frequently accessed links
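The three strategies combine naturally. The sketch below uses an in-memory stand-in for Redis so it runs without a server: writes populate the cache (write-through), misses fall back to the database and refill the cache (cache-aside), and every hit pushes the expiry forward (sliding TTL). The class and function names are hypothetical; against a real Redis the same idea would use a value set with an expiry and refreshed with a command such as EXPIRE on each hit.

```python
import time


class SlidingTTLCache:
    """In-memory stand-in for a cache whose TTL is extended on every hit."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        now = self._clock()
        if now >= expires_at:
            del self._store[key]  # lazily drop the expired entry
            return None
        self._store[key] = (value, now + self.ttl)  # slide the expiry forward
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self.ttl)


def create_url(cache, db, short_code, long_url):
    """Write-through: store durably, then populate the cache right away."""
    db[short_code] = long_url
    cache.set("url:" + short_code, long_url)


def resolve(cache, db, short_code):
    """Cache-aside: on a miss, read the database and refill the cache."""
    key = "url:" + short_code
    url = cache.get(key)
    if url is None:
        url = db.get(short_code)
        if url is not None:
            cache.set(key, url)
    return url
```

A sliding TTL keeps popular links warm indefinitely, so link deactivation and expiry need an explicit cache delete rather than relying on the TTL to run out.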
Hot Key Problem
A viral URL can overload a single Redis node. Solutions include local LRU caches, key replication, and aggressive CDN caching.
Load Balancing
Layer 7 Load Balancing
Layer 7 load balancing allows routing based on URL paths.
Strategies
- Round Robin
- Least Connections
- Consistent Hashing
Message Queues and Async Processing
Analytics processing should never block redirect responses.
sequenceDiagram
participant Browser
participant ReadService
participant Kafka
Browser->>ReadService: GET /aB3xKz
ReadService->>Browser: 302 Redirect
ReadService->>Kafka: Publish analytics event
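The same flow can be sketched in code. Here a bounded in-process queue and a background thread stand in for the Kafka producer and consumer, which keeps the example runnable without a broker; the function names and the event fields are hypothetical. The key property is that the redirect handler never waits on analytics: if the buffer is full, the event is dropped instead of delaying the response.

```python
import queue
import threading

STOP = object()  # sentinel that tells the consumer to shut down


def handle_redirect(short_code, long_url, events, user_agent=""):
    """Build the 302 response and enqueue a click event without blocking."""
    try:
        events.put_nowait({"short_code": short_code, "user_agent": user_agent})
    except queue.Full:
        pass  # ASSUMPTION: losing an analytics event beats a slow redirect
    return 302, {"Location": long_url}


def consume(events, counts):
    """Aggregate click counts per short code until STOP is received."""
    while True:
        event = events.get()
        if event is STOP:
            break
        code = event["short_code"]
        counts[code] = counts.get(code, 0) + 1


def run_demo():
    """Send three clicks through the queue and return the aggregated counts."""
    events = queue.Queue(maxsize=10_000)
    counts = {}
    worker = threading.Thread(target=consume, args=(events, counts))
    worker.start()
    for _ in range(3):
        handle_redirect("aB3xKz", "https://example.com", events)
    events.put(STOP)
    worker.join()
    return counts
```

In a real deployment the queue would be replaced by a producer that batches events to a topic, and the consumer by a consumer group writing aggregates to the analytics store.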
Why Kafka?
- High throughput
- Event replay support
- Consumer group scalability
- Fault tolerance
Scalability Strategies
Stateless Services
Stateless APIs make horizontal scaling simple.
Microservices
- URL Service
- Redirect Service
- Analytics Service
- User Service
- ID Generator Service
Auto Scaling
Use CPU-based and latency-based auto scaling for redirect services.
Database Scaling
Read Replicas
All redirect reads should go through replicas.
Sharding
- Shard by short code prefix
- Shard using consistent hashing
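The consistent hashing option can be illustrated with a small hash ring. This is a sketch under assumptions: the shard names are made up, SHA-256 is just one reasonable hash choice, and 100 virtual nodes per shard is an arbitrary setting for smoothing the key distribution.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a point on the ring using the first 8 bytes of SHA-256."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = _position(f"{node}#{i}")
            if pos in self._owner:
                continue  # extremely unlikely collision: keep the existing owner
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._positions, _position(key)) % len(self._positions)
        return self._owner[self._positions[idx]]
```

The benefit over prefix or modulo sharding shows up when the shard count changes: adding a shard moves only the keys that land on the new shard's arcs, while every other short code stays where it was.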
Multi-Region Deployment
Deploy databases across multiple regions to reduce latency.
Fault Tolerance and High Availability
Failover
- Redis Sentinel
- PostgreSQL failover
- Kafka replication factor = 3
Thundering Herd Problem
If Redis fails, massive traffic suddenly hits the database. Use request coalescing (collapsing concurrent lookups for the same key into a single database query) and circuit breakers to protect the system.
Security Considerations
- Google Safe Browsing API integration
- Rate limiting
- JWT authentication
- TLS 1.3 encryption
- Phishing prevention
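Of the items above, rate limiting is the easiest to show in code. The sketch below is a token bucket kept per client; it is one common way to rate limit, not necessarily the one a given production system uses, and the class names, the rate, and the burst size are illustrative assumptions. State lives in process memory here, whereas a multi-instance deployment would need shared state.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to the elapsed time, capped at the bucket size.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class RateLimiter:
    """One token bucket per client identifier (API key, user ID, or IP address)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, client_id: str) -> bool:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[client_id] = bucket
        return bucket.allow()
```

A write endpoint such as `POST /api/v1/urls` would call `allow()` with the caller's identifier and answer HTTP 429 when it returns `False`.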
Monitoring and Observability
Monitoring Stack
- Prometheus
- Grafana
- Jaeger
- ELK Stack
Critical Metrics
- P99 latency
- Cache hit rate
- Error rate
- Kafka consumer lag
Bottlenecks and Trade-offs
| Trade-off | Decision | Reason |
|---|---|---|
| 301 vs 302 | 302 | Enable analytics tracking |
| Sync vs Async Analytics | Async | Keep redirects fast |
| Long vs Short TTL | Sliding TTL | Balance freshness and hit rate |
Real-World Technology Stack
| Company | Technology |
|---|---|
| Bitly | Cassandra + Redis |
| | Custom distributed KV store |
| | Kafka + Espresso |
| | TAO + Memcached |
Interview Perspective
What Interviewers Expect
- Requirements clarification
- Scale estimation
- Caching discussion
- Database sharding strategy
- Trade-off analysis
Common Mistakes
- Ignoring cache design
- Making analytics synchronous
- Skipping scale estimation
- Ignoring abuse prevention
Advanced Improvements
- AI-based phishing detection
- Predictive caching
- Edge computing with Cloudflare Workers
- Multi-region active-active architecture
Final Architecture Diagram
graph TD
CDN[CDN Edge]
LB[Load Balancer]
API[Redirect API]
Redis[Redis Cluster]
DB[PostgreSQL]
Kafka[Kafka]
CDN --> LB
LB --> API
API --> Redis
Redis --> DB
API --> Kafka
Conclusion
Key Takeaways
- URL shorteners are fundamentally cache-first systems.
- Read-heavy architectures require aggressive CDN and Redis optimization.
- Analytics should always be asynchronous.
- Snowflake IDs are one of the best distributed ID generation strategies.
- Security and phishing prevention are mandatory from day one.
Frequently Asked Questions
What is the best algorithm for generating short codes?
Snowflake IDs combined with Base62 encoding provide globally unique IDs without coordination.
Why use 302 redirects instead of 301?
302 redirects preserve analytics because browsers do not permanently cache the redirect.
How do URL shorteners scale?
Using CDN edge caching, Redis distributed cache, and stateless services.
Internal Linking Suggestions
- Design a Rate Limiter
- Design an API Gateway
- Consistent Hashing Explained
- Redis Caching Patterns