Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop is a software framework used for storing and processing large datasets across clusters of computers. It is designed to handle big data, which refers to extremely large and complex datasets that traditional databases cannot manage efficiently. Hadoop Distributed File System (HDFS) − This is a storage system that breaks large files into smaller pieces and distributes them across multiple computers in a cluster. It ensures data reliability and enables parallel processing of data across the cluster. MapReduce − This is a programming model used for processing and analyzing large datasets in parallel across the cluster. It consists of two main tasks: Map, which processes and transforms input data into intermediate key-value pairs, and Reduce, which aggregates and summarizes the intermediate data to produce the final output. YARN (Yet Another Resource Negotiator) − YARN is a resource management and job scheduling component of Hadoop. It allocates resources (CPU, memory) to various applications running on the cluster and manages their execution efficiently. Hadoop Common − This includes libraries and utilities used by other Hadoop components. It provides tools and infrastructure for the entire Hadoop ecosystem, such as authentication, configuration, and logging. ADVANTAGES: The biggest advantage of Hadoop is its ability to handle and process large volumes of data efficiently. Hadoop is designed to distribute data and processing tasks across multiple computers in a cluster, allowing it to scale easily to handle massive datasets that traditional databases or processing systems struggle to manage. This enables organizations to store, process, and analyze huge amounts of data, gaining valuable insights and making informed decisions that would not be possible with conventional technologies.