Cloudera Distributed Hadoop (CDH)

CDH is an open source distribution of Hadoop and complementary components made by Cloudera. Together with HDP, packaged by Hortonworks, it is one of the most common, complete, tested, and widely deployed distributions of Apache Hadoop.

Related articles

Notes on the Cloudera Open Source licensing model

Categories: Big Data | Tags: CDSW, License, Open source, Cloudera Manager

Following the publication of its Open Source licensing strategy on July 10, 2019 in an article called “Our Commitment to Open Source Software”, Cloudera broadcast a webinar yesterday, October 2…

By David WORMS

Oct 25, 2019

Present and future of Hadoop workflow scheduling: Oozie 5.x

Categories: Big Data, DataWorks Summit 2018 | Tags: Hive, Oozie, Sqoop, HDP, REST, Hadoop, CDH

During the DataWorks Summit Europe 2018 in Berlin, I had the opportunity to attend a breakout session on Apache Oozie. It covered the new features released in Oozie 5.0, including future features of…

By Leo SCHOUKROUN

May 23, 2018

Ambari - How to blueprint

Categories: Big Data, DevOps & SRE | Tags: Ambari, Ranger, Automation, DevOps, Operation, REST

As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let’s see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…

By Joris RUMMENS

Jan 17, 2018

Cloudera Sessions Paris 2017

Categories: Big Data, Events | Tags: EC2, Cloudera, Altus, CDSW, SDX, PaaS, CDH, Azure

Adaltas was at the Cloudera Sessions on October 5, where Cloudera showcased its new products and offerings. Below you’ll find a summary of what we witnessed. Note: the information was aggregated in…

By César BEREZOWSKI

Oct 16, 2017

Managing authorizations with Apache Sentry

Categories: Data Governance | Tags: Ansible, Hue, Database, LDAP, Nikita, Sentry, CDH, Deployment

Apache Sentry is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster. In this article, we will show you how we are using Apache Sentry at…

By Axel JACQIN

Jul 24, 2017

Exposing Kafka on two different networks

Categories: Infrastructure | Tags: Kafka, Cloudera, Cyber Security, Network, VLAN, CDH

A Big Data setup usually requires multiple network interfaces. Let’s see how to set up Kafka on more than one of them. Kafka is an open-source stream processing software platform system…

By César BEREZOWSKI

Jul 22, 2017

Components for CDH and HDP

Categories: Big Data | Tags: Flume, Hive, Oozie, Sqoop, Zookeeper, Cloudera, Hortonworks, HDP, Hadoop, CDH

I was interested in comparing the different components distributed by Cloudera and Hortonworks. This also gives us an idea of the versions packaged by the two distributions. At the time of this writing…

By David WORMS

Sep 22, 2013

Testing the Oracle SQL Connector for Hadoop HDFS

Categories: Data Engineering | Tags: HDFS, Database, File system, Oracle, CDH, SQL

Using Oracle SQL Connector for HDFS, you can use Oracle Database to access and analyze data residing in HDFS files or a Hive table. You can also query and join data in HDFS or a Hive table with other…

By David WORMS

Jul 15, 2013

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into production projects, reduce their costs, and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.