Cloudera

Related articles

Cloudera CDP and Cloud migration of your Data Warehouse

Categories: Big Data, Cloud Computing | Tags: Cloudera, Data Hub, Data Lake, Data Warehouse, Azure

While one of our customer is anticipating a move to the Cloud and with the recent announcement of Cloudera CDP availability mi-september during the Strata conference, it seems like the appropriate…

David WORMS

By David WORMS

Dec 16, 2019

Notes on the Cloudera Open Source licensing model

Categories: Big Data | Tags: CDSW, License, Open source, Cloudera Manager

Following the publication of its Open Source licensing strategy on July 10, 2019 in an article called “our Commitment to Open Source Software”, Cloudera broadcasted a webinar yesterday October 2…

David WORMS

By David WORMS

Oct 25, 2019

Running Apache Hive 3, new features and tips and tricks

Categories: Big Data, Business Intelligence, DataWorks Summit 2019 | Tags: Druid, Hive, Kafka, JDBC, LLAP, Release and features, Hadoop

Apache Hive 3 brings a bunch of new and nice features to the data warehouse. Unfortunately, like many major FOSS releases, it comes with a few bugs and not much documentation. It is available since…

Gauthier LEONARD

By Gauthier LEONARD

Jul 25, 2019

Introduction to Cloudera Data Science Workbench

Categories: Data Science | Tags: Cloudera, Docker, Git, Kubernetes, Machine Learning, Azure, Notebook

Cloudera Data Science Workbench is a platform that allows Data Scientists to create, manage, run and schedule data science workflows from their browser. Thus it enables them to focus on their main…

Mehdi ELALAMI

By Mehdi ELALAMI

Feb 28, 2019

Apache Hadoop YARN 3.0 – State of the union

Categories: Big Data, DataWorks Summit 2018 | Tags: HDFS, MapReduce, YARN, Cloudera, Docker, GPU, Hortonworks, Release and features, Hadoop

This article covers the ”Apache Hadoop YARN: state of the union” talk held by Wangda Tan from Hortonworks during the Dataworks Summit 2018. What is Apache YARN? As a reminder, YARN is one of the two…

Lucas BAKALIAN

By Lucas BAKALIAN

May 31, 2018

Cloudera Sessions Paris 2017

Categories: Big Data, Events | Tags: EC2, Cloudera, Altus, CDSW, SDX, PaaS, CDH, Azure

Adaltas was at the Cloudera Sessions on October 5, where Cloudera showcased their new products and offerings. Below you’ll find a summary of what we witnessed. Note: the information were aggregated in…

César BEREZOWSKI

By César BEREZOWSKI

Oct 16, 2017

Exposing Kafka on two different networks

Categories: Infrastructure | Tags: Kafka, Cloudera, Cyber Security, Network, VLAN, CDH

A Big Data setup usually requires you to have multiple networking interface, let’s see how to set up Kafka on more than one of them. Kafka is a open-source stream processing software platform system…

César BEREZOWSKI

By César BEREZOWSKI

Jul 22, 2017

MiNiFi: Data at Scales & the Values of Starting Small

Categories: Big Data, DevOps & SRE, Infrastructure | Tags: MiNiFi, NiFi, Cloudera, C++, HDP, HDF, IOT

This conference presented rapidly Apache NiFi and explained where MiNiFi came from: basically it’s a NiFi minimal agent to deploy on small devices to bring data to a cluster’s NiFi pipeline (ex: IoT…

César BEREZOWSKI

By César BEREZOWSKI

Jul 8, 2017

Composants for CDH and HDP

Categories: Big Data | Tags: Flume, Hive, Oozie, Sqoop, Zookeeper, Cloudera, Hortonworks, HDP, Hadoop, CDH

I was interested to compare the different components distributed by Cloudera and HortonWorks. This also gives us an idea of the versions packaged by the two distributions. At the time of this writting…

David WORMS

By David WORMS

Sep 22, 2013

The state of Hadoop distributions

Categories: Big Data | Tags: Cloudera, Hortonworks, Intel, Oracle, Hadoop

Apache Hadoop is of course made available for download on its official webpage. However, downloading and installing the several components that make a Hadoop cluster is not an easy task and is a…

David WORMS

By David WORMS

May 11, 2013

Hadoop development cluster of virtual machines with static IP using VirtualBox

Categories: Infrastructure | Tags: Ambari, Cloudera, Hortonworks, Network, Red Hat, VirtualBox, VM, VMware

A few days ago, I explained how to set up a cluster of virtual machine with static IPsand Internet access suitable to host your Hadoop cluster locally for development. At the time I made use of VMWare…

David WORMS

By David WORMS

Mar 14, 2013

Virtual machines with static IP for your Hadoop development cluster

Categories: Infrastructure | Tags: Ambari, Cloudera, Hortonworks, Network, Red Hat, VirtualBox, VM, VMware

While I am about to install and test Ambari, this article is the occasion to illustrate how I set up my development environment with multiple virtual machines. Ambari, the deployment and monitoring…

David WORMS

By David WORMS

Feb 27, 2013

Storage and massive processing with Hadoop

Categories: Big Data | Tags: HDFS, Hadoop, Storage

Apache Hadoop is a system for building shared storage and processing infrastructures for large volumes of data (multiple terabytes or petabytes). Hadoop clusters are used by a wide range of projects…

David WORMS

By David WORMS

Nov 26, 2010

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabbat
Canada

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.