In larger environments with thousands of users, you often find multiple Domain Controllers offering authentication and authorization services. For Windows-based endpoints, UCS uses Samba 4 to provide these services. Between the Samba 4 servers, UCS uses the Directory Replication Service (DRS) to keep the server data synchronized. While Samba 4 does a superb job of replicating the data, there are some tweaks you can use to optimize the replication and get better performance in distributed environments. Let us have a look!
Replication Topology
When installing a second or further Domain Controller (DC), you join the server into the domain created by the UCS DC Master. The join process automatically creates the replication relationships between the servers and ensures that the data is synchronized between them. The synchronization of Samba 4 can take the topology of your network into account.
By default, each server prefers as many direct connections as possible, even if that is not the most efficient way to talk to each other.
Let us imagine a scenario with 4 servers: one Master and one Backup in a central office and one Slave at each of two remote locations. Traffic from one remote location to the other is routed through the central office, where the firewall introduces a significant penalty on these connections. Nevertheless, data introduced in the first location still replicates directly from its Slave to the Master, to the Backup, and to the second Slave. Only where no direct connection from one Slave to the other is possible would the server replicate the data indirectly. This replication scenario is called full mesh replication.
In environments with 4 to 6 Domain Controllers, this works out of the box and reasonably well. Once the topology gets more complicated than a simple star around a central office, or the number of controllers grows, the simple default might not produce the best results. In this case, you can switch to sparse network replication. If we go back to the scenario above, the difference would be that the offsite servers would not talk to each other at all. Instead, all communication would run via the central servers. In more complex scenarios, this can significantly reduce the background chatter, which lowers the replication load on both the network and most of the servers. The downsides are that any segmentation of the network has more severe consequences than with the full mesh, and some central servers might experience higher loads.
Further, during topology negotiations, you might also see load spikes while the servers determine the optimal paths and whether they can reach more than half of all servers. The sparse topology therefore only makes sense when the reduction in replication load outweighs the additional topology load. In practice, the positive effects are typically seen only with 7 or more servers or in complex networks.
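To get a feeling for why the number of Domain Controllers matters, the following sketch compares the number of direct links in a full mesh with the minimum number of links that still keeps every server connected (for example, a star around one central server). This is plain arithmetic and a deliberately simplified model: the function names are made up for this illustration, and the topology Samba actually computes will usually lie somewhere between the two values.

```shell
#!/bin/sh
# Number of replication links if every DC connects directly to every other DC.
full_mesh_links() {
  echo $(( $1 * ($1 - 1) / 2 ))
}

# Minimum number of links that still keeps all DCs connected
# (any tree, such as a star around one central server).
min_connected_links() {
  echo $(( $1 - 1 ))
}

for n in 4 7 10 20; do
  echo "$n DCs: full mesh $(full_mesh_links "$n"), minimum $(min_connected_links "$n")"
done
```

With 4 servers the gap is small (6 versus 3 links), while with 20 servers a full mesh would need 190 links compared to a minimum of 19. The full mesh grows quadratically, which is why the savings only start to pay off in larger setups.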
To switch to sparse replication, set one UCR variable on all Domain Controllers.
<pre>ucr set samba4/kccsrv/samba_kcc=yes</pre>
Afterward, you should restart Samba.
<pre>systemctl restart samba-ad-dc.service</pre>
Please note: if you set it on only some of the Domain Controllers, you combine the downsides of both topology methods.
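Because a mixed setup is the worst case, it can help to check the variable on every Domain Controller after the change. The following is a minimal sketch of such a check. The helper name `report_non_sparse` is hypothetical, and the commented example call assumes root SSH access and uses placeholder host names that you would replace with your own servers.

```shell
#!/bin/sh
# Reads lines of the form "<host> <value>" on stdin and prints every host
# whose value is not "yes". Returns 1 if at least one such host was found.
report_non_sparse() {
  status=0
  while read -r host value; do
    # Skip empty lines.
    [ -n "$host" ] || continue
    if [ "$value" != "yes" ]; then
      echo "$host: samba4/kccsrv/samba_kcc is '${value}', expected 'yes'"
      status=1
    fi
  done
  return "$status"
}

# Example call (assumption: root SSH access; host names are placeholders):
# for h in dc1.example.org dc2.example.org; do
#   echo "$h $(ssh "root@$h" ucr get samba4/kccsrv/samba_kcc)"
# done | report_non_sparse
```

The helper only compares the values it is given, so a server where the variable is unset shows up with an empty value. To look at the replication partners a server actually ended up with, `samba-tool drs showrepl` on that server lists its inbound and outbound connections.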