Cloud computing

Gaining agility, efficiency, cost control, and better analytics by deploying a big data infrastructure in the cloud, while accounting for security requirements and legacy systems, is no small task. Neither is managing an elastic pool of resources in a multi-tenant environment while meeting SLAs, preserving data integrity, and keeping the budget under control.

We design, deploy, and operate hybrid, public, and private cloud solutions built on multiple offerings every day. We have taken part in different approaches to cloud migration, from "Lift & Shift" to complete re-platforming. This experience gives our consultants the depth and breadth of skills needed to help you navigate, customize, and operate the new normal.

Our consultants work across the entire project lifecycle, from the feasibility study through to production release.

Cloud migration

Gathering and documenting requirements (functional and non-functional)
Solution architecture based on the requirements
Roadmap definition and project planning
Testing, optimization, and cutover procedures
Comparison of public cloud services and offerings

Operations and optimization

Infrastructure, process, and cost audits
Infrastructure deployment automation
Definition of and compliance with objectives (SLOs, SLAs)
Infrastructure, network, and service operations
Cost analysis, calculation, and optimization (Total Cost of Ownership, TCO)

Integration and development in the Cloud

Qualification and validation of technologies and services
Data pipeline ingestion and preparation
Data loading and system connectivity
Machine Learning (ML) algorithms
Processing on stream and batch architectures

Cloud-related articles

CDP part 5: user permissions management on CDP Public Cloud

Categories: Big Data, Cloud Computing, Data Governance | Tags: Ranger, Cloudera, CDP, Data Warehouse

When you create a user or a group in CDP, it requires permissions to access resources and use the Data Services. This article is the fifth in a series of six: CDP part 1: introduction to end-to-end…

By Tobias CHAVARRIA

July 18, 2023

CDP part 4: user management on CDP Public Cloud with Keycloak

Categories: Big Data, Cloud Computing, Data Governance | Tags: EC2, Big Data, CDP, Docker Compose, Keycloak, SSO

Previous articles in the series cover the deployment of a CDP Public Cloud environment. All the components are ready for use and it is time to make the environment available to other users to explore…

By Tobias CHAVARRIA

July 4, 2023

CDP part 3: Data Services activation on CDP Public Cloud environment

Categories: Big Data, Cloud Computing, Infrastructure | Tags: Infrastructure, AWS, Big Data, Cloudera, CDP

One of the big selling points of Cloudera Data Platform (CDP) is their mature managed service offering. These are easy to deploy on-premises, in the public cloud or as part of a hybrid solution. The…

By Albert KONRAD

June 27, 2023

CDP part 2: CDP Public Cloud deployment on AWS

Categories: Big Data, Cloud Computing, Infrastructure | Tags: Infrastructure, AWS, Big Data, Cloud, Cloudera, CDP, Cloudera Manager

The Cloudera Data Platform (CDP) Public Cloud provides the foundation upon which full featured data lakes are created. In a previous article, we introduced the CDP platform. This article is the second…

By Albert KONRAD

June 19, 2023

CDP part 1: introduction to end-to-end data lakehouse architecture with CDP

Categories: Cloud Computing, Data Engineering, Infrastructure | Tags: Data Engineering, Hortonworks, Iceberg, AWS, Azure, Big Data, Cloud, Cloudera, CDP, Cloudera Manager, Data Warehouse

Cloudera Data Platform (CDP) is a hybrid data platform for big data transformation, machine learning and data analytics. In this series we describe how to build and use an end-to-end big data…

By Stephan BAUM

June 8, 2023

Keycloak deployment in EC2

Categories: Cloud Computing, Data Engineering, Infrastructure | Tags: Security, EC2, Authentication, AWS, Docker, Keycloak, SSL/TLS, SSO

Why use Keycloak? Keycloak is an open-source identity provider (IdP) using single sign-on (SSO). An IdP is a tool to create, maintain, and manage identity information for principals and to provide…

By Stephan BAUM

March 14, 2023

Databricks logs collection with Azure Monitor at a Workspace Scale

Categories: Cloud Computing, Data Engineering, Adaltas Summit 2021 | Tags: Metrics, Monitoring, Spark, Azure, Databricks, Log4j

Databricks is an optimized data analytics platform based on Apache Spark. Monitoring the Databricks platform is crucial to ensure data quality, job performance, and security issues by limiting access to…

By Claire PLAYE

May 10, 2022

Using Cloudera Deploy to install Cloudera Data Platform (CDP) Private Cloud

Categories: Big Data, Cloud Computing | Tags: Ansible, Cloudera, CDP, Cluster, Data Warehouse, Vagrant, IaC

Following our recent Cloudera Data Platform (CDP) overview, we cover how to deploy CDP Private Cloud on your local infrastructure. It is entirely automated with the Ansible cookbooks published by…

By Alexander HOFFMANN

July 23, 2021

An overview of Cloudera Data Platform (CDP)

Categories: Big Data, Cloud Computing, Data Engineering | Tags: SDX, Big Data, Cloud, Cloudera, CDP, CDH, Data Analytics, Data Hub, Data Lake, Data lakehouse, Data Warehouse

Cloudera Data Platform (CDP) is a cloud computing platform for businesses. It provides integrated and multifunctional self-service tools in order to analyze and centralize data. It brings security and…

By Alexander HOFFMANN

July 19, 2021

Find your way into data related Microsoft Azure certifications

Categories: Cloud Computing, Data Engineering | Tags: Data Governance, Azure, Data Science

Microsoft Azure has certification paths for many technical job roles such as developer, Data Engineer, Data Scientist and solution architect among others. Each of these certifications consists of…

By Barthelemy NGOM

April 14, 2021

Connecting to ADLS Gen2 from Hadoop (HDP) and Nifi (HDF)

Categories: Big Data, Cloud Computing, Data Engineering | Tags: Hadoop, HDFS, NiFi, Authentication, Authorization, Azure, Azure Data Lake Storage (ADLS), OAuth2

As data projects built in the Cloud are becoming more and more frequent, a common use case is to interact with Cloud storage from an existing on-premises Big Data platform. Microsoft Azure recently…

By Gauthier LEONARD

November 5, 2020

Automate a Spark routine workflow from GitLab to GCP

Categories: Big Data, Cloud Computing, Containers Orchestration | Tags: Learning and tutorial, Airflow, Spark, CI/CD, GitLab, GitOps, GCP, Terraform

A workflow consists of automating a succession of tasks to be carried out without human intervention. It is an important and widespread concept which particularly applies to operational environments…

By Ferdinand DE BAECQUE

June 16, 2020

Introducing Apache Airflow on AWS

Categories: Big Data, Cloud Computing, Containers Orchestration | Tags: PySpark, Learning and tutorial, Airflow, Oozie, Spark, AWS, Docker, Python

Apache Airflow offers a potential solution to the growing challenge of managing an increasingly complex landscape of data management tools, scripts and analytics processes. It is an open-source…

By Aargan COINTEPAS

May 5, 2020

Snowflake, the Data Warehouse for the Cloud, introduction and tutorial

Categories: Business Intelligence, Cloud Computing | Tags: Cloud, Data Lake, Data Science, Data Warehouse, Snowflake

Snowflake is a SaaS-based data-warehousing platform that centralizes, in the cloud, the storage and processing of structured and semi-structured data. The increasing generation of data produced over…

By Jules HAMELIN-BOYER

April 7, 2020

Cloudera CDP and Cloud migration of your Data Warehouse

Categories: Big Data, Cloud Computing | Tags: Azure, Cloudera, Data Hub, Data Lake, Data Warehouse

While one of our customers is anticipating a move to the Cloud and with the recent announcement of Cloudera CDP availability mid-September during the Strata conference, it seems like the appropriate…

By David WORMS

December 16, 2019

Should you move your Big Data and Data Lake to the Cloud

Categories: Big Data, Cloud Computing | Tags: DevOps, AWS, Azure, Cloud, CDP, Databricks, GCP

Should you follow the trend and migrate your data, workflows and infrastructure to GCP, AWS and Azure? During the Strata Data Conference in New York, a general focus was put on moving customer’s Big…

By Joris RUMMENS

December 9, 2019

Insert rows in BigQuery tables with complex columns

Categories: Cloud Computing, Data Engineering | Tags: GCP, BigQuery, Schema, SQL

Google’s BigQuery is a cloud data warehousing system designed to process enormous volumes of data with several features available. Out of all those features, let’s talk about the support of Struct…

By César BEREZOWSKI

November 22, 2019

Running Enterprise Workloads in the Cloud with Cloudbreak

Categories: Big Data, Cloud Computing, DataWorks Summit 2018 | Tags: Cloudbreak, Operation, Hadoop, AWS, Azure, GCP, HDP, OpenStack

This article is based on Peter Darvasi and Richard Doktorics’ talk Running Enterprise Workloads in the Cloud at the DataWorks Summit 2018 in Berlin. It presents Hortonworks’ automated deployment tool…

By Joris RUMMENS

May 28, 2018

Micro Services

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Mesos, DNS, Encryption, gRPC, Linkerd, Micro Services, MITM, Service Mesh, CNCF, Istio, Kubernetes, Proxy, SPOF, SSL/TLS

Back in the day, applications were monolithic and we could use an IP address to access a service. With virtual machines (VM), multiple hosts started to appear on the same machine with multiple apps…

By David WORMS

November 14, 2017

Kubernetes Storage Primitives for Stateful Workloads

Categories: Cloud Computing, Containers Orchestration, Open Source Summit Europe 2017 | Tags: Container Storage Interface (CSI), PVC, Azure, Docker, GCE, Kubernetes, Storage

This article is based on the presentation “Introduction to Kubernetes Storage Primitives for Stateful Workloads” from the OSS Convention Prague 2017 by the {Code} team. So, let’s start, what is…

By Pierre SAUVAGE

October 28, 2017

Multi-Repo, Multi-Node Gating at Massive Scale

Categories: Cloud Computing, DevOps & SRE, Open Source Summit Europe 2017 | Tags: Infrastructure, Jenkins, Red Hat, Zuul, Ansible, CI/CD, OpenStack

This is a recap and personal review of Monty Taylor’s presentation of OpenStack’s Continuous Integration tool Zuul at the Open Source Summit 2017 in Prague (not to be confused with Netflix’s Zuul project…

By Joris RUMMENS

October 28, 2017

Node.js is now integrated to the Microsoft Azure platform

Categories: Cloud Computing, Tech Radar | Tags: Linux, Azure, Cloud, Node.js

Node is now a first-class citizen in the Microsoft Azure cloud environment alongside .NET, Java and PHP. This integration is the logical consequence of Microsoft’s involvement in the development of…

By David WORMS

December 11, 2011