Internship in Data Engineering

By David WORMS

Oct 25, 2021

Job Description

Data is a valuable business asset. Some call it the new oil. The data engineer collects, transforms, and refines raw data into information that business analysts and data scientists can use.

As part of your internship, you will be trained in the different aspects of the data engineer's activities. You will build a real-time, end-to-end data streaming ingestion pipeline combining metrics collection, data cleansing and aggregation, storage in multiple data warehouses, (near) real-time analysis by exposing key metrics in a dashboard, and the use of machine learning models for the prediction and detection of weak signals.

You will participate in the application architecture and the implementation of the pipeline with the goal of going into production. You will join an agile team led by a Big Data expert.

In addition, at the end of the internship, you will obtain a certification from a Cloud provider as well as a Databricks certification.

Company presentation

Adaltas specializes in the processing and storage of data. We work on-premises and in the cloud to operate Big Data platforms and strengthen our clients’ teams in the areas of architecture, operations, data engineering, data science and DevOps. A partner of Cloudera and Databricks, we are also open source contributors. We invite you to browse our site and our many technical publications to learn more about Adaltas.

Responsibilities

  • Collecting system and application metrics
  • Supplying a distributed data warehouse with OLAP-type column storage
  • Cleansing, enrichment, aggregation of data flows
  • Real-time analysis in SQL
  • Dashboard creation
  • Putting machine learning models into production in an MLOps cycle
  • Deployment in an Azure cloud infrastructure and on-premise
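To give a flavor of the cleansing and aggregation work listed above, here is a minimal, illustrative Python sketch (plain Python rather than the actual pipeline stack, which will be defined during the internship): malformed metric samples are dropped, then CPU usage is averaged per host and per time window.

```python
from collections import defaultdict

def clean(records):
    """Cleansing: drop malformed samples (missing fields or non-numeric values)."""
    for r in records:
        if {"ts", "host", "cpu"} <= r.keys() and isinstance(r["cpu"], (int, float)):
            yield r

def aggregate(records, window=60):
    """Aggregation: average CPU usage per (host, time window in seconds)."""
    buckets = defaultdict(list)
    for r in records:
        buckets[(r["host"], r["ts"] // window)].append(r["cpu"])
    return {k: sum(v) / len(v) for k, v in buckets.items()}

samples = [
    {"ts": 0, "host": "a", "cpu": 0.5},
    {"ts": 30, "host": "a", "cpu": 0.7},
    {"ts": 10, "host": "b"},             # malformed: no "cpu" field, dropped
    {"ts": 65, "host": "a", "cpu": 0.9},
]
result = aggregate(clean(samples))       # averages keyed by (host, window index)
```

In the actual pipeline, the same two steps would typically run continuously on a stream rather than on a static list.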

Expected qualifications

  • Engineering school, end of studies internship
  • Analytical and structured
  • Autonomous and curious
  • You are an open-minded person who enjoys sharing, communicating and learning from others
  • Good knowledge of Python, Spark and Linux systems

You will be in charge of designing the technical architecture. We are looking for a person who masters, or is willing to develop, skills in the following tools and solutions:

All complementary experiences are valuable.

Additional information

  • Location: Boulogne Billancourt, France
  • Languages: French or English
  • Start: February 2022
  • Duration: 6 months
  • Teleworking: possibility of working 2 days a week remotely

Available hardware

A laptop with the following characteristics:

  • 32GB RAM
  • 1TB SSD
  • 8c/16t CPU

A cluster made up of:

  • 3x 28c/56t Intel Xeon Scalable Gold 6132
  • 3x 192GB RAM DDR4 ECC 2666MHz
  • 3x 14 SSD 480GB SATA Intel S4500 6Gbps

Platforms, components, tools

A Kubernetes cluster and a Hadoop cluster.

Remuneration

  • Salary 1200 € / month
  • Restaurant tickets
  • Transportation pass
  • Participation in one international conference

In the past, the conferences we attended include KubeCon, organized by the CNCF, the Open Source Summit from the Linux Foundation, and FOSDEM.

Contact

For any request for additional information and to submit your application, please contact David Worms:

Canada - Morocco - France

International locations

10 rue de la Kasbah
2393 Rabat
Morocco

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases into production projects, reduce their costs, and speed up their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.