Automation

Related articles

H2O in practice: a protocol combining AutoML with traditional modeling approaches

H2O in practice: a protocol combining AutoML with traditional modeling approaches

Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python, XGBoost

H20 comes with a lot of functionalities. The second part of the series H2O in practice proposes a protocol to combine AutoML modeling with traditional modeling and optimization approach. The objective…

H2O in practice: a Data Scientist feedback

H2O in practice: a Data Scientist feedback

Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python

Automated machine learning (AutoML) platforms are gaining popularity and becoming a new important tool in the data scientists’ toolbox. A few months ago, I introduced H2O, an open-source platform for…

Storage size and generation time in popular file formats

Storage size and generation time in popular file formats

Categories: Data Engineering, Data Science | Tags: Avro, HDFS, Hive, ORC, Parquet, Big Data, Data Lake, File Format, JavaScript Object Notation (JSON)

Choosing an appropriate file format is essential, whether your data transits on the wire or is stored at rest. Each file format comes with its own advantages and disadvantages. We covered them in a…

Barthelemy NGOM

By Barthelemy NGOM

Mar 22, 2021

JS monorepos in prod 4: unit testing with Mocha and Should.js

JS monorepos in prod 4: unit testing with Mocha and Should.js

Categories: DevOps & SRE, Front End | Tags: Automation, CI/CD, Git, GitOps, Monorepo, Node.js, Unit tests

Unit testing is essential for every long-term project and allows you to pull down functionalities of your code into isolated testable units. Indeed the main goal of a unit test is to verify if an…

David WORMS

By David WORMS

Feb 25, 2021

Faster model development with H2O AutoML and Flow

Faster model development with H2O AutoML and Flow

Categories: Data Science, Learning | Tags: Automation, Cloud, H2O, Machine Learning, MLOps, On-premises, Open source, Python

Building Machine Learning (ML) models is a time-consuming process. It requires expertise in statistics, ML algorithms, and programming. On top of that, it also requires the ability to translate a…

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS

InfraOps & DevOps Internship - build a Big Data & Kubernetes PaaS

Categories: Big Data, Containers Orchestration | Tags: DevOps, LXD, NoSQL, Hadoop, Kafka, Spark, Ceph, Internship, Kubernetes

Context The acquisition of a high-capacity cluster is in line with Adaltas’ desire to build a PAAS-type offering to use and to provide Big Data and container orchestration platforms. The platforms are…

David WORMS

By David WORMS

Nov 26, 2019

Internship Data Science & Data Engineer - ML in production and streaming data ingestion

Internship Data Science & Data Engineer - ML in production and streaming data ingestion

Categories: Data Engineering, Data Science | Tags: Flink, DevOps, Hadoop, HBase, Kafka, Spark, Internship, Kubernetes, Python

Context The exponential evolution of data has turned the industry upside down by redefining data storage, processing and data ingestion pipelines. Mastering these methods considerably facilitates…

David WORMS

By David WORMS

Nov 26, 2019

Jumbo, the Hadoop cluster bootstrapper

Jumbo, the Hadoop cluster bootstrapper

Categories: Infrastructure | Tags: Ambari, Automation, HDP, REST, Ansible, Cluster, Vagrant

Introducing Jumbo, a Hadoop cluster bootstrapper for developers. Jumbo helps you deploy development environments for Big Data technologies. It takes a few minutes to get a custom virtualized Hadoop…

Gauthier LEONARD

By Gauthier LEONARD

Nov 29, 2018

Hadoop cluster takeover with Apache Ambari

Hadoop cluster takeover with Apache Ambari

Categories: Big Data, DevOps & SRE, Adaltas Summit 2018 | Tags: Ambari, Automation, HDP, iptables, Kerberos, Nikita, REST, Systemd, Cluster, Node, Node.js

We recently migrated a large production Hadoop cluster from a “manual” automated install to Apache Ambari, we called this the Ambari Takeover. This is a risky process and we will detail why this…

Leo SCHOUKROUN

By Leo SCHOUKROUN

Nov 15, 2018

Ambari - How to blueprint

Ambari - How to blueprint

Categories: Big Data, DevOps & SRE | Tags: Ambari, Ranger, Automation, DevOps, Operation, REST

As infrastructure engineers at Adaltas, we deploy Hadoop clusters. A lot of them. Let’s see how to automate this process with REST requests. While really handy for deploying one or two clusters, the…

Joris RUMMENS

By Joris RUMMENS

Jan 17, 2018

Remote connection with SSH

Remote connection with SSH

Categories: Cyber Security | Tags: Automation, HTTP, SSH

While teaching Big Data and Hadoop, a student asks me about SSH and how to use. I’ll discuss about the protocol and the tools to benefit from it. Lately, I automate the deployment of Hadoop clusters…

David WORMS

By David WORMS

Oct 2, 2013

Announcing Mecano, a set of functions for system deployment

Announcing Mecano, a set of functions for system deployment

Categories: DevOps & SRE, Node.js | Tags: Automation, Infrastructure, CoffeeScript, JavaScript, Open source

Update July 2016, Mecano is now renamed Nikita. We are releasing Node Mecano on GitHub which gather common functions used while deploying systems. The idea was to group those functions into a…

David WORMS

By David WORMS

Feb 12, 2012

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.

Support Ukrain