Cloudera Distributed Hadoop (CDH)
CDH is an open source distribution of Hadoop and complementary components made by Cloudera. CDH and HDP, the latter packaged by Hortonworks, are the most common, complete, tested, and widely deployed distributions of Apache Hadoop.
Related articles

Introducing Trunk Data Platform: the Open-Source Big Data Distribution Curated by TOSIT
Categories: Big Data, DevOps & SRE, Infrastructure | Tags: Ansible, Ranger, DevOps, Hortonworks, Hadoop, HBase, Knox, Spark, Cloudera, CDP, CDH, Open source, TDP
Ever since Cloudera and Hortonworks merged, the choice of commercial Hadoop distributions for on-prem workloads essentially boils down to CDP Private Cloud. CDP can be seen as the “best of both worlds…
Apr 14, 2022

An overview of Cloudera Data Platform (CDP)
Categories: Big Data, Cloud Computing, Data Engineering | Tags: SDX, Data Analytics, Big Data, Cloud, Cloudera, CDP, CDH, Data Hub, Data Lake, Data Warehouse
Cloudera Data Platform (CDP) is a cloud computing platform for businesses. It provides integrated and multifunctional self-service tools in order to analyze and centralize data. It brings security and…
Jul 19, 2021

Notes on the Cloudera Open Source licensing model
Categories: Big Data | Tags: CDSW, License, Cloudera Manager, Open source
Following the publication of its Open Source licensing strategy on July 10, 2019 in an article called “our Commitment to Open Source Software”, Cloudera broadcast a webinar yesterday, October 2…
By David WORMS
Oct 25, 2019

Present and future of Hadoop workflow scheduling: Oozie 5.x
Categories: Big Data, DataWorks Summit 2018 | Tags: Hive, Sqoop, HDP, REST, Hadoop, Oozie, CDH
During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covers the new features released in Oozie 5.0, including future features of…
May 23, 2018

Ambari - How to blueprint
Categories: Big Data, DevOps & SRE | Tags: Ambari, Ranger, Automation, DevOps, Operation, REST
As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let’s see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…
Jan 17, 2018

Cloudera Sessions Paris 2017
Categories: Big Data, Events | Tags: EC2, Altus, CDSW, SDX, PaaS, Azure, Cloudera, CDH, Data Science
Adaltas was at the Cloudera Sessions on October 5, where Cloudera showcased their new products and offerings. Below you’ll find a summary of what we witnessed. Note: the information was aggregated in…
Oct 16, 2017

Exposing Kafka on two different networks
Categories: Infrastructure | Tags: Cyber Security, VLAN, Kafka, Cloudera, CDH, Network
A Big Data setup usually requires you to have multiple network interfaces. Let’s see how to set up Kafka on more than one of them. Kafka is an open-source stream processing software platform system…
Jul 22, 2017

Managing authorizations with Apache Sentry
Categories: Data Governance | Tags: Ansible, Hue, Database, LDAP, Nikita, Sentry, CDH, Deployment
Apache Sentry is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster. With this article, we will show you how we are using Apache Sentry at…
By Axel JACQIN
Jul 24, 2017

Components for CDH and HDP
Categories: Big Data | Tags: Flume, Hive, Sqoop, Zookeeper, Hortonworks, HDP, Hadoop, Oozie, Cloudera, CDH
I was interested in comparing the different components distributed by Cloudera and Hortonworks. This also gives us an idea of the versions packaged by the two distributions. At the time of this writing…
By David WORMS
Sep 22, 2013

Testing the Oracle SQL Connector for Hadoop HDFS
Categories: Data Engineering | Tags: Database, File system, Oracle, HDFS, CDH, SQL
Using Oracle SQL Connector for HDFS, you can use Oracle Database to access and analyze data residing in HDFS files or a Hive table. You can also query and join data in HDFS or a Hive table with other…
By David WORMS
Jul 15, 2013