
Apache Hadoop HDFS

HDFS (Hadoop Distributed File System) is a highly available, distributed file system for storing large amounts of data. Data is stored on several computers (nodes) within a cluster. This is done by dividing the files into data blocks of fixed length and distributing them redundantly across the nodes.

The HDFS architecture is composed of master and worker nodes. The master node, called the NameNode, processes all incoming requests and organizes the storage of files and their associated metadata on the worker nodes, called DataNodes. HDFS is one of the main components of the Hadoop framework.
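The two ideas above — splitting files into fixed-length blocks and replicating each block across several DataNodes — can be illustrated with a short sketch. This is a hypothetical, simplified model, not the real HDFS API: the block size, replication factor, node names, and round-robin placement policy below are all illustrative assumptions (real HDFS uses 128 MB blocks, a default replication factor of 3, and rack-aware placement).

```python
# Illustrative sketch only -- NOT the real HDFS API.
# Models how a NameNode might split a file into fixed-size blocks
# and assign each block to several DataNodes for redundancy.
from itertools import cycle

BLOCK_SIZE = 4           # bytes per block (real HDFS default: 128 MB)
REPLICATION = 2          # copies of each block (real HDFS default: 3)
DATANODES = ["dn1", "dn2", "dn3"]   # hypothetical worker nodes

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Divide file content into fixed-length blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes=DATANODES, replication=REPLICATION):
    """Round-robin placement: each block is stored on `replication` distinct nodes."""
    placement = {}
    ring = cycle(range(len(nodes)))
    for block_id, _ in enumerate(blocks):
        start = next(ring)
        placement[block_id] = [nodes[(start + r) % len(nodes)]
                               for r in range(replication)]
    return placement

blocks = split_into_blocks(b"hello world!")
print(len(blocks))            # 12 bytes / 4-byte blocks -> 3 blocks
print(place_replicas(blocks)) # each block mapped to 2 distinct DataNodes
```

In this model, if one DataNode fails, every block it held still exists on at least one other node; the real NameNode reacts to such a failure by re-replicating the affected blocks to restore the target replication factor.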


