Tuesday, July 16, 2024

BigData - Distributed System Design Patterns

Below is a list of interesting topics and techniques that are used or considered when building Hadoop or any other distributed system.

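Bloom Filters - a space-efficient probabilistic data structure used to check whether a key may be present in a very large dataset. It can return false positives but never false negatives, so a "not present" answer is definitive and saves an expensive lookup.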
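
To make the idea concrete, here is a minimal sketch of a Bloom filter. The class name, the default sizes, and the choice of SHA-256 to derive bit positions are assumptions for illustration only; real systems usually use faster hash functions and a packed bit array.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (illustrative sketch, not a production implementation)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(size_bits)

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))
```

A key that was added always reports True; a key that was never added usually reports False, but can occasionally report True (a false positive).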

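High-water mark index - the last log index that is known to have been written to the followers (all of them, or a quorum, depending on the system). The leader exposes data to clients only up to this index.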

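Lease - a time-bound lock on a resource. The lock is released automatically when the lease expires unless the holder renews it, so a crashed client cannot hold the resource forever.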
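
A lease can be sketched as a lock with an expiry time. The Lease class, its method names, and the injectable clock below are hypothetical and only meant to show the behaviour; they do not come from any particular system.

```python
import time


class Lease:
    """Toy lease: a lock that frees itself once its time-to-live has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the sketch is easy to test
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, client, ttl):
        """Grant or renew the lease for `client` for `ttl` seconds."""
        now = self._clock()
        expired = now >= self.expires_at
        if self.holder is None or expired or self.holder == client:
            self.holder = client
            self.expires_at = now + ttl
            return True
        return False             # someone else holds a lease that is still valid

    def is_held_by(self, client):
        return self.holder == client and self._clock() < self.expires_at
```

Calling acquire again as the current holder acts as a renewal; if the holder stops renewing, another client can take the lease after the expiry time.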

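Heartbeat - Workers periodically send a signal to the master to indicate their availability. If the master does not receive a heartbeat from a worker within a timeout, it treats that worker as failed.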
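
On the master side, heartbeat tracking can be as simple as remembering when each worker was last heard from. The HeartbeatMonitor class and its timeout parameter are assumptions made for this sketch.

```python
import time


class HeartbeatMonitor:
    """Toy master-side tracker of worker heartbeats (illustrative only)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before a worker counts as failed
        self._clock = clock
        self.last_seen = {}

    def record(self, worker_id):
        # Called whenever a heartbeat arrives from a worker.
        self.last_seen[worker_id] = self._clock()

    def alive_workers(self):
        now = self._clock()
        return sorted(w for w, t in self.last_seen.items() if now - t <= self.timeout)

    def failed_workers(self):
        now = self._clock()
        return sorted(w for w, t in self.last_seen.items() if now - t > self.timeout)
```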

Fence - Put a 'fence' around the previously active leader so that it cannot access cluster resources and therefore stops serving read/write requests. This prevents an old leader that still believes it is in charge from doing damage or causing corruption. The following two techniques are used:

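  • Resource fencing: the previous leader is blocked from the specific resources it needs to do its work, for example by revoking its access to shared storage.
  • Node fencing: the previous leader is cut off from all resources, commonly by powering off or resetting the node.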

Examples: HDFS uses fencing to stop the previously active NameNode from accessing cluster resources, thereby stopping it from servicing requests.
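
One common way to do resource fencing is to give every leader an increasing epoch number and have the shared resource reject requests from older epochs. The sketch below is a generic illustration of that idea; the FencedStore class and its fields are hypothetical and this is not how HDFS itself is implemented.

```python
class FencedStore:
    """Toy shared store that fences off leaders with a stale epoch number."""

    def __init__(self):
        self.highest_epoch = 0   # newest leader epoch seen so far
        self.data = {}

    def write(self, epoch, key, value):
        if epoch < self.highest_epoch:
            # The request comes from a previous leader: refuse it.
            return False
        self.highest_epoch = epoch
        self.data[key] = value
        return True
```

Once a new leader has written with a higher epoch, any late write from the old leader is refused, so it can no longer change the data.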

High-water mark index

Distributed systems keep multiple copies of data for fault tolerance and higher availability. To achieve strong consistency, one option is a leader-follower setup, where the leader handles all writes and the followers replicate data from the leader. The high-water mark records how far that replication has safely progressed, so that clients never read an entry that could be lost if the leader fails.
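
As a rough sketch, the high-water mark can be computed from the last log index stored on each replica. The function name and the required_copies parameter are assumptions for illustration; whether a system waits for all replicas or only a quorum depends on its design.

```python
def high_water_mark(last_indexes, required_copies):
    """Highest log index that is stored on at least `required_copies` replicas.

    `last_indexes` holds the last log index written on each replica,
    leader included.
    """
    if not 1 <= required_copies <= len(last_indexes):
        raise ValueError("required_copies must be between 1 and the replica count")
    # After sorting in descending order, the entry at position
    # required_copies - 1 is reached by exactly that many replicas.
    ordered = sorted(last_indexes, reverse=True)
    return ordered[required_copies - 1]
```

For example, with replicas at indexes 7, 5 and 3, requiring all three copies gives a high-water mark of 3, while requiring two copies gives 5.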

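Quorum - used for high availability (HA). A quorum is the minimum number of nodes, typically a majority, that must acknowledge an operation before it is considered successful.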
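
For a majority quorum the arithmetic is short enough to show directly. The function names are made up for this sketch, and some systems use other quorum sizes than a simple majority.

```python
def majority_quorum(cluster_size):
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def is_committed(acks, cluster_size):
    """True once enough nodes have acknowledged the operation."""
    return acks >= majority_quorum(cluster_size)


def tolerated_failures(cluster_size):
    """How many nodes can be down while a majority is still reachable."""
    return cluster_size - majority_quorum(cluster_size)
```

With 5 nodes the quorum is 3, so the cluster keeps working with up to 2 nodes down.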

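Write-ahead log (WAL) - an append-only log file to which the master node writes each change before applying it, so that the state can be rebuilt by replaying the log after a crash.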
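
Here is a minimal sketch of the write-ahead idea using a small key-value store. The WalStore class, the JSON-lines record format, and the file path are assumptions for illustration; real systems add checksums, log segments, and cleanup of old entries.

```python
import json
import os


class WalStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state from the log, e.g. after a restart.
        if not os.path.exists(self.path):
            return
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:
                    record = json.loads(line)
                    self.state[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only then apply it to the in-memory state.
        self.state[key] = value
```

Creating a new WalStore on the same file replays the log and ends up with the same state as before the restart.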
