Internship in Big Data infrastructure with TDP

By Daniel HARTY

Oct 25, 2021

Categories: Infrastructure, Learning | Tags: Cyber Security, DevOps, Java, Hadoop, IaC, TDP

Job Description

Big Data and distributed computing are at Adaltas’ core. We support our partners in the deployment, maintenance and optimisation of some of France’s largest clusters. Adaltas is also an advocate of, and active contributor to, Open Source. Our latest focus is a new, fully open source Hadoop distribution: the TOSIT Data Platform (TDP).

During this internship, you will join the TDP project team and contribute to its development. You will deploy and test production-ready Hadoop TDP clusters, contribute iterative improvements to the existing codebase, and share your knowledge of TDP through customer-ready support resources. You will also gain experience with core Hadoop components such as HDFS, YARN, Ranger, Spark, Hive, and ZooKeeper.

This will be a serious challenge, with a large number of new technologies and development practices for you to tackle from day one. In return for your dedication, you will finish your internship fully equipped to take on a role in the domain of Big Data.

Company presentation

Adaltas specialises in Big Data, Open Source and DevOps. We operate both on-premise and in the cloud. We are proud of our Open Source culture and our contributions have aided users and companies across the world. Adaltas is built on an open culture. Our articles share our knowledge on Big Data, DevOps and multiple complementary topics.

Skills required and to be acquired

The development of the TDP platform requires an understanding of Hadoop’s distributed computation model and of how its core components (HDFS, YARN, etc.) work together to solve Big Data problems. A working knowledge of Linux and the command line is required.

During the course of the internship you will learn:

  • Hadoop cluster governance
  • Hadoop cluster security including Kerberos and SSL/TLS certificates
  • High availability (HA) of services
  • Scalability in Hadoop clusters
  • Monitoring and health assessment of services and jobs
  • Fault-tolerant Hadoop clusters that can recover lost data after an infrastructure failure
  • Infrastructure as Code (IaC) via DevOps tools such as Ansible and Vagrant
  • Code collaboration using Git on both GitLab and GitHub

Responsibilities

  • Become familiar with the TDP distribution’s architecture and configuration methods
  • Deploy and test secure and fault-tolerant TDP clusters
  • Contribute to the TDP knowledge-base with troubleshooting guides, FAQs and articles
  • Participate in discussions about the TDP project’s objectives and roadmap strategy
  • Actively contribute ideas and code to make iterative improvements to the TDP ecosystem
  • Research and analyse the differences between the major Hadoop distributions

Additional information

  • Location: Boulogne-Billancourt, France
  • Languages: French or English
  • Starting date: March 2022
  • Duration: 6 months

Much of the digital world runs on Open Source software and the Big Data industry is booming. This internship is an opportunity to gain valuable experience in both domains. TDP is now the only truly Open Source Hadoop distribution, and the project has strong momentum. As part of the TDP team, you will learn one of the core Big Data processing models and take part in the development and future roadmap of TDP. We believe this is an exciting opportunity and that, on completing the internship, you will be ready for a successful career in Big Data.

Equipment available

A laptop with the following characteristics:

  • 32GB RAM
  • 1TB SSD
  • 8c/16t CPU

A cluster made up of:

  • 3x 28c/56t Intel Xeon Scalable Gold 6132
  • 3x 192GB RAM DDR4 ECC 2666MHz
  • 3x 14 SSD 480GB SATA Intel S4500 6Gbps

Platforms, components, tools

A Kubernetes cluster and a Hadoop cluster.

Remuneration

  • Salary 1200 € / month
  • Restaurant tickets
  • Transportation pass
  • Participation in one international conference

Conferences we have attended in the past include KubeCon, organized by the CNCF, the Open Source Summit from the Linux Foundation, and FOSDEM.

Contact

For any request for additional information and to submit your application, please contact David Worms:
