Lokasi ngalangkungan proxy:   [ UP ]  
[Ngawartoskeun bug]   [Panyetelan cookie]                
Skip to content

Commit 5fa2946

Browse files
authored
Providing Detailed on Lease Mechanism
- Cuases for Lease Timeout -Explanation of what it really is
1 parent 9e722cc commit 5fa2946

1 file changed

Lines changed: 2 additions & 2 deletions

File tree

docs/database-engine/availability-groups/windows/availability-group-lease-healthcheck-timeout.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -33,13 +33,13 @@ If health detection fails to report an update to the resource DLL for multiple i
3333

3434
## Lease mechanism
3535

36-
Unlike other failover mechanisms, The SQL server instance plays an active role in the lease mechanism. When bringing the AG online as the primary replica, the SQL Server instance spawns a dedicated lease worker thread for the AG. The lease worker shares a small region of memory with the resource host containing lease renewal and lease stop events. The lease worker and resource host work in a circular fashion, signaling their respective lease renewal event and then sleeping, waiting for the other party to signal its own lease renewal event or the stop event. Both the resource host and the SQL Server lease thread maintain a time-to-live value, which is updated each time the thread wakes up after being signaled by the other thread. If the time-to-live is reached while waiting for the signal, the lease expires and then replica transitions to the resolving state for that specific AG. If the lease stop event is signaled, then the replica transitions to a resolving role.
36+
Unlike other failover mechanisms, the SQL Server instance plays an active role in the lease mechanism. The Lease mechanism is used as a Looks-Alive validation between the Cluster resource host and the SQL Server process. The mechanism is used to ensure that the two sides -Cluster Serice and SQL Server are in frequent contact, checking each other's state and ultimately preventing a split-brain scenario. When bringing the AG online as the primary replica, the SQL Server instance spawns a dedicated lease worker thread for the AG. The lease worker shares a small region of memory with the resource host containing lease renewal and lease stop events. The lease worker and resource host work in a circular fashion, signaling their respective lease renewal event and then sleeping, waiting for the other party to signal its own lease renewal event or the stop event. Both the resource host and the SQL Server lease thread maintain a time-to-live value, which is updated each time the thread wakes up after being signaled by the other thread. If the time-to-live is reached while waiting for the signal, the lease expires and then replica transitions to the resolving state for that specific AG. If the lease stop event is signaled, then the replica transitions to a resolving role.
3737

3838
![image](media/availability-group-lease-healthcheck-timeout/image1.png)
3939

4040
The lease mechanism enforces synchronization between SQL Server and Windows Server Failover Cluster. When a failover command, is issued the cluster service makes an offline call to the resource DLL of the current primary replica. The resource DLL first attempts to take the AG offline using a stored procedure. If this stored procedure fails or times-out, the failure is reported back to the cluster service, which then issues a terminate command. The terminate again attempts to execute the same stored procedure, but the cluster this time does not wait for the resource DLL to report success or failure before bringing the AG online on a new replica. If this second procedure call fails, then the resource host will have to rely on the lease mechanism to take the instance offline. When the resource DLL is called to take the AG offline, the resource DLL signals the lease stop event, waking up the SQL Server lease worker thread to take the AG offline. Even if this stop event is not signaled, the lease will expire, and the replica will transition to the resolving state.
4141

42-
The lease is primarily a synchronization mechanism between the primary instance and the cluster, but it can also create failure conditions where there was otherwise no need to fail over. For example, high CPU or tempdb pressure can starve the lease worker thread, preventing lease renewal from the SQL instance and causing a failover.
42+
The lease is primarily a synchronization mechanism between the primary instance and the cluster, but it can also create failure conditions where there was otherwise no need to fail over. For example, high CPU, out-of-memory conditions, SQL process not responding due to a memory dump being generated, system-wide hang, or tempdb pressure can starve the lease worker thread, preventing lease renewal from the SQL instance and causing a failover.
4343

4444
## Guidelines for cluster timeout values
4545

0 commit comments

Comments
 (0)