Cloudera

Introduction to Cloudera Data Science Workbench

Cloudera Data Science Workbench (CDSW) is a platform that lets data scientists create, manage, run, and schedule data science workflows from their browser. This lets them focus on their main task, deriving insights from data, without having to think about the underlying complexity. CDSW was released after Cloudera’s acquisition of [...]

Storage and massive processing with Hadoop

Apache Hadoop is a system for building shared storage and processing infrastructure for large volumes of data (multiple terabytes or petabytes). Hadoop clusters are used by a wide range of projects at a growing number of web companies (Yahoo!, eBay, Facebook, LinkedIn, Twitter), and their size continues to increase. Yahoo! has 45,000 machines with the [...]

November 26th, 2010 | Categories: Big Data