Synchrony versus asynchrony is a fundamental concept in distributed systems.
How we reason about the properties of a distributed system, and the mental model or abstraction we build for it, depends largely on whether the system is synchronous or asynchronous.
Synchronous Distributed System
A synchronous distributed system comes with strong guarantees about the properties and nature of the system. Because it makes strong guarantees, it usually rests on strong assumptions and certain constraints.
Synchrony is multi-faceted, and the following points cover each facet in turn:
Upper Bound on Message Delivery
There is a known upper bound on the message transmission delay from one process to another, or from one machine/node to another. Messages are never delayed for an arbitrary period between any pair of participating nodes.
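To see why a known bound is so useful, here is a minimal sketch of a crash check that relies on it. The constants MAX_DELAY and MAX_PROCESSING and both function names are hypothetical, chosen only for illustration. The bound on processing time is an extra assumption that goes beyond the message-delay bound described above.

```python
# Hypothetical upper bound on one-way message delay, in seconds.
MAX_DELAY = 0.5
# Hypothetical upper bound on how long a node needs to handle a request
# (an extra assumption, not stated in the text above).
MAX_PROCESSING = 0.1


def reply_deadline(sent_at: float) -> float:
    """Latest time a reply can arrive if the remote node is alive."""
    # Request travels there, gets processed, reply travels back.
    return sent_at + 2 * MAX_DELAY + MAX_PROCESSING


def has_crashed(sent_at: float, now: float, reply_received: bool) -> bool:
    """Under the bounds above, a missed deadline implies a crash."""
    return not reply_received and now > reply_deadline(sent_at)
```

Because the bounds are assumed to hold, a missed deadline is treated as proof of a crash rather than as a guess.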
Ordered Message Delivery
The communication channel between two machines is expected to deliver messages in FIFO order, that is, in the order in which they were sent. The network never delivers messages in an arbitrary order that the participating processes cannot predict.
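As a rough illustration, a FIFO channel between two nodes can be modelled as a queue. The class name FifoChannel and its methods are hypothetical; this is an in-memory toy model, not a real network transport.

```python
from collections import deque


class FifoChannel:
    """Toy model of a one-way channel that preserves send order."""

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        # Messages are appended in the order the sender emits them.
        self._queue.append(message)

    def deliver(self):
        # The oldest undelivered message always comes out first.
        return self._queue.popleft() if self._queue else None


channel = FifoChannel()
for message in ["m1", "m2", "m3"]:
    channel.send(message)

delivered = [channel.deliver() for _ in range(3)]
```

The receiver sees the messages in exactly the order they were sent, so it never needs to buffer or reorder them.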
Notion of Globally Synchronized Clocks
Each node has a local clock, and the clocks of all nodes are always in sync with each other. This makes it trivial to establish a global real time ordering of events not only on a particular node, but also across the nodes.
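With perfectly synchronized clocks, a global order falls out of a plain sort on timestamps. The sketch below assumes a hypothetical Event record and a hypothetical global_order helper. Breaking ties on the node name is my own choice, made only to keep the result deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    node: str         # node that recorded the event
    timestamp: float  # reading of that node's local clock
    name: str


def global_order(events):
    """Order events across nodes, assuming all clocks agree exactly."""
    # Ties are broken by node name (an arbitrary, illustrative rule).
    return sorted(events, key=lambda e: (e.timestamp, e.node))


events = [
    Event("B", 10.2, "write y"),
    Event("A", 10.1, "write x"),
    Event("A", 10.3, "read y"),
]
ordered_names = [e.name for e in global_order(events)]
```

The sort is only meaningful because every timestamp is assumed to come from the same notion of time.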
Lock Step Based Execution
The participating processes execute in lock step. An example makes this clearer. Consider a distributed system in which a coordinator node dispatches a message to the follower nodes, and each follower is expected to process the message once it is received. Different followers cannot process that message independently at arbitrary times and so produce their output state at different times; they all complete the step within the same bounded window. This is why we say the processes execute in lock-step synchrony, like soldiers marching in step.
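The coordinator and follower example can be sketched as a simulation that advances in rounds. The function names run_round and run_lock_step and the step callback are hypothetical. A single-threaded loop stands in for what real nodes would do in parallel.

```python
def run_round(states, inbox, step):
    """Run one round: every follower handles this round's messages."""
    return {pid: step(state, inbox.get(pid, [])) for pid, state in states.items()}


def run_lock_step(states, coordinator_messages, step):
    """Deliver each coordinator message to all followers, one round at a time."""
    for message in coordinator_messages:
        # Every follower receives the same message in the same round.
        inbox = {pid: [message] for pid in states}
        # No follower starts the next round until all have finished this one.
        states = run_round(states, inbox, step)
    return states


# Followers keep a running sum of the integers the coordinator sends.
final_states = run_lock_step(
    {"f1": 0, "f2": 0},
    [1, 2, 3],
    lambda state, messages: state + sum(messages),
)
```

At the end of every round all followers hold the same state, which is the property lock-step execution is meant to give.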
The main thing to remember about synchronous systems is that they allow us to make assumptions about time and order of events in a distributed system. This comes from the fact that clocks are in sync and there is a hard upper bound on message transmission delay between nodes.
The problem with synchronous distributed systems is that they are not really practical. Any software system built on strong assumptions tends to be less robust in real-world settings and begins to break under common, practical workloads. For example, relying on the network to deliver every message within a fixed amount of time is not a practical assumption. In the real world, a software system is subject to many kinds of failure.
Asynchronous Distributed System
The most important thing about an asynchronous distributed system is that it is better suited to real-world scenarios, since it makes no strong assumptions about the time and order of events.
Clocks may not be accurate and can be out of sync
Clocks of different nodes in a distributed system can drift apart. Thus it is not at all trivial to reason about the global real-time ordering of events across all the machines in the cluster. Machine-local timestamps no longer help here, since the clocks are no longer assumed to be in sync.
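A small worked example shows how an offset between two clocks can invert the apparent order of events. The helper local_timestamp and the offset values are hypothetical, picked only to make the effect visible.

```python
def local_timestamp(true_time: float, offset: float) -> float:
    """Clock reading on a node whose clock is `offset` seconds off."""
    return true_time + offset


# Node A's clock runs 0.3 s ahead; node B's clock is accurate.
write_on_a = local_timestamp(true_time=10.0, offset=0.3)  # happened first
write_on_b = local_timestamp(true_time=10.1, offset=0.0)  # happened second

# Sorting by local timestamps puts B's write before A's write.
order_by_timestamp = [
    node
    for node, _ in sorted(
        [("A", write_on_a), ("B", write_on_b)], key=lambda pair: pair[1]
    )
]
```

The write on A really happened first, yet the timestamps place it second, so the sorted order is wrong.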
Messages can be delayed for arbitrary periods of time
Unlike in a synchronous distributed system, there is no known upper bound on the message transmission delay between nodes.
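Without a delay bound, a timeout can only raise suspicion; it cannot prove a crash. The sketch below uses a hypothetical Verdict enum and check_peer function, and the timeout value is an arbitrary tuning knob rather than a guarantee.

```python
from enum import Enum


class Verdict(Enum):
    ALIVE = "alive"          # a reply arrived
    WAITING = "waiting"      # no reply yet, timeout not reached
    SUSPECTED = "suspected"  # timeout passed: peer may be slow OR crashed


def check_peer(sent_at: float, now: float, reply_received: bool, timeout: float) -> Verdict:
    """Classify a peer using a timeout that is only a heuristic."""
    if reply_received:
        return Verdict.ALIVE
    if now - sent_at > timeout:
        # We cannot tell a crashed peer from a delayed message.
        return Verdict.SUSPECTED
    return Verdict.WAITING
```

A suspected peer may still reply later, so any algorithm built on this check has to cope with false suspicions.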
An asynchronous distributed system is harder to understand, since it is not based on strong assumptions and imposes no constraints on the time and ordering of events. It is also harder to design and implement, since the algorithms must tolerate many kinds of failure.
Our algorithms can no longer be designed to handle only a subset of failure conditions by ruling out the rest through strong assumptions. The onus of developing robust distributed algorithms is therefore greater in an asynchronous distributed system.