Batch processing
Related articles
Comparison of different file formats in Big Data
Categories: Big Data, Data Engineering | Tags: Analytics, Avro, HDFS, Hive, Kafka, MapReduce, ORC, Spark, Batch processing, Big Data, CSV, Data Analytics, Data structures, Database, JSON, Protocol Buffers, Hadoop, Parquet, Kubernetes, XML
In data processing, there are different file formats for storing data sets. Each format has its own pros and cons depending on the use case and exists to serve one or several purposes…
By Aida NGOM
Jul 23, 2020
Apache Flink: past, present and future
Categories: Data Engineering | Tags: Flink, Pipeline, Streaming, Kubernetes, Machine Learning, SQL
Apache Flink is a little gem which deserves a lot more attention. Let’s dive into Flink’s past, its current state and the future it is heading to by following the keynotes and presentations at Flink…
Nov 5, 2018
Apache Beam: a unified programming model for data processing pipelines
Categories: Data Engineering, DataWorks Summit 2018 | Tags: Apex, Beam, Flink, Spark, Pipeline
In this article, we review the concepts, history and future of Apache Beam, which may well become the new standard for defining data processing pipelines. At DataWorks Summit 2018 in…
May 24, 2018