The master-slave architecture, also known as the primary-replica architecture, is a widely used database replication pattern. It involves a primary server (the master) handling all write operations and one or more secondary servers (the slaves) that replicate data from the master. This design offers many benefits, but also comes with its own set of limitations and challenges. This post will look at the details of this architecture, exploring its advantages, disadvantages, and various implementation aspects.
How Master-Slave Architecture Works
The core principle is simple: the master server is the single source of truth. All write operations – INSERT, UPDATE, DELETE – are directed exclusively to the master. The master then propagates these changes to the slave servers through a replication process. Slave servers, in turn, primarily handle read operations, thereby offloading the read load from the master. This distribution of workload improves performance and scalability, especially for applications with a high read-to-write ratio.
Here’s a visual representation using a Diagram:
graph TD
Client[("Client Applications")]
LB["Load Balancer"]
Master[("Master DB<br/>Primary Node<br/>Handles Writes")]
S1[("Slave DB 1<br/>Read Replica")]
S2[("Slave DB 2<br/>Read Replica")]
S3[("Slave DB 3<br/>Read Replica")]
Client -->|Requests| LB
LB -->|Write Queries| Master
LB -->|Read Queries| S1
LB -->|Read Queries| S2
LB -->|Read Queries| S3
Master -->|Replication| S1
Master -->|Replication| S2
Master -->|Replication| S3
subgraph "Write Operations"
Master
end
subgraph "Read Operations"
S1
S2
S3
end
class Master master
class S1,S2,S3 slave
class Client client
class LB lb
This diagram shows the master server handling all write operations and distributing the data to multiple slave servers. Read operations are then directed to the slaves.
Key components of the master-slave architecture:
Client Applications
Entry point for all database requests
Connect through load balancer
Load Balancer
Directs write operations to master
Distributes read operations across slaves
Ensures high availability
Master Node
Handles all write operations
Maintains data consistency
Replicates changes to slaves
Slave Nodes
Read-only replicas
Receive updates from master
Handle read queries for load distribution
Provide redundancy and fault tolerance
Benefits:
Improved read performance through distribution
High availability
Data redundancy
Scalable read operations
Limitations:
Write operations limited to master capacity
Replication lag possible
Complex failover mechanisms needed
Database Replication Methods
Replication ensures data consistency across distributed database systems. Three primary methods exist for replicating data from primary (master) to secondary (slave) nodes:
1. Statement-Based Replication (SBR)
Primary node sends SQL statements to replicas.
Implementation Example
-- On PrimaryBEGINTRANSACTION;INSERTINTO users (name, email) VALUES ('John', 'john@example.com');UPDATE products SET stock = stock -1WHEREid=100;COMMIT;-- Replicated to secondaries with transaction boundaries
Advantages
Minimal network bandwidth consumption
Human-readable logs for debugging
Maintains stored procedures and triggers
Smaller binary logs
Limitations
Issues with non-deterministic functions (RAND(), UUID(), NOW())
Potential inconsistencies with concurrent transactions
Problems with AUTO_INCREMENT columns
LIMIT operations may produce varying results
2. Row-Based Replication (RBR)
Primary node replicates actual data modifications.
Improved Read Performance: By offloading read operations to the slave servers, the master can focus on write operations, leading to improved read performance.
Increased Scalability: Adding more slave servers allows for handling an increasing number of read requests.
High Availability (with limitations): In some configurations, if the master fails, one of the slaves can be promoted to become the new master (though this requires careful planning and implementation).
Data Backup and Recovery: Slaves can serve as backups of the master database, enabling data recovery in case of master failure.
Disadvantages of Master-Slave Architecture
Single Point of Failure: The master server is a single point of failure. If the master fails, write operations are disrupted until a new master is elected.
Write Bottleneck: All write operations go through the master, which can become a bottleneck if the write load is high.
Replication Lag: There is often a delay (replication lag) between the master and the slaves. This lag can be problematic for applications requiring real-time data consistency.
Complexity: Setting up and maintaining a master-slave configuration can be complex, requiring careful planning and monitoring.
Write limitations on Slaves: Slaves, by design, don’t typically accept write operations. This fundamental limitation needs to be accounted for in application design.