Installation Guide to TDP, the 100% open source big data platform

The Trunk Data Platform (TDP) is a 100% open source big data distribution, based on Apache Hadoop and compatible with HDP 3.1. Initiated in 2021 by EDF, the DGFiP and Adaltas, the project is governed by TOSIT, an association under the French law of 1901 whose objective is to promote open source among major companies and institutions.

Version 1.1, whose release is expected during the fourth quarter of 2023, adds features necessary for managing a production cluster (see #308). Support and training offers are already available from consulting firms such as Adaltas, through its Alliage offer.

TDP is aimed at anyone wishing to:

  • Create their data platform (Data Lake, Data Hub, Data Warehouse, Data Science Platform, etc.).
  • Migrate their current solution to a 100% open source (and free) solution.
  • Develop on big data services (HDFS, Hive, Spark, etc.).
  • Explore Hadoop technologies.

Architecture

TDP can be broken down into two main parts:

  • A stack, based on Apache Hadoop and compatible with HDP 3.1.
  • A cluster manager, based on Ansible, that allows deploying and managing a TDP cluster via a library, a REST API, or a graphical interface (see tdp-lib, tdp-server and tdp-ui).

[Figure: TDP architecture]

The project is modular by design, in both the stack and the manager. It is thus possible, for example, to add components or to run a cluster without the UI.

Try TDP

Adaltas, through its Alliage offer, provides support and expertise on TDP. Its website publishes a guide for deploying a TDP cluster locally using Vagrant and VirtualBox. The guide is intended to help you discover the platform’s functionalities.

This guide sets up a development environment. It does not apply to production deployments, for which the documentation is currently being written (see PR #88).

Build the data platform that suits you

Adaltas is a consulting company specializing in big data and open source technologies. We are partners with Cloudera, Dremio, and Databricks. Our clients trust our consultants to contribute to the development of TDP.

We can therefore assist you in setting up your data platform, from design to production. Do not hesitate to contact us for more information.
