File Format

Related articles

Introduction to Ludwig and how to deploy a Deep Learning model via Flask

Categories: Data Science, Tech Radar | Tags: Deep Learning, Learning and tutorial, Ludwig Deep Learning Toolbox, Machine Learning, Python

Over the past decade, machine learning and deep learning models have proven to be very effective in performing a wide variety of tasks such as fraud detection, product recommendation, autonomous…

By Robert Walid SOARES

Mar 2, 2020

Spark Streaming part 2: run Spark Structured Streaming pipelines in Hadoop

Categories: Data Engineering, Learning | Tags: Spark, Apache Spark Streaming, Streaming, Python

Spark can process streaming data on a multi-node Hadoop cluster relying on HDFS for the storage and YARN for the scheduling of jobs. Thus, Spark Structured Streaming integrates well with Big Data…

By Oskar RYNKIEWICZ

May 28, 2019

Data Lake ingestion best practices

Categories: Big Data, Data Engineering | Tags: Avro, Hive, NiFi, ORC, Spark, File Format, Data Governance, HDF, Operation, Protocol Buffers, Registry, Schema, Data Lake

Creating a Data Lake requires rigor and experience. Here are some good practices around data ingestion both for batch and stream architectures that we recommend and implement with our customers…

By David WORMS

Jun 18, 2018

State of the Hadoop open-source ecosystem in early 2013

Categories: Big Data | Tags: Flume, Kafka, Mesos, Phoenix, Pig, Hadoop, Mahout

Hadoop is already a large ecosystem and my guess is that 2013 will be the year when it grows even larger. There are some pieces that we no longer need to present. ZooKeeper, HBase, Hive, Pig, Flume…

By David WORMS

Jul 8, 2013

Apache Hive Essentials How-to by Darren Lee

Categories: Business Intelligence, Learning | Tags: Hive, File Format, UDF, Hadoop, SQL

Recently, I’ve been asked to review a new book on Apache Hive called “Apache Hive Essentials How-to” written by Darren Lee and published by Packt Publishing. In short, I sincerely recommend it. I…

By David WORMS

Apr 23, 2013

Convert .flac music files to .mp3 on osx

Categories: Hack | Tags: File Format, OS X

Anyone who has used OS X for years knows by now that iTunes doesn’t support the FLAC format. It is now 2012, and I’ve been waiting for this to happen for years. Losing patience, dark…

By David WORMS

Jul 20, 2012

HDFS and Hive storage - comparing file formats and compression methods

Categories: Big Data | Tags: Analytics, Hive, ORC, Parquet, File Format

A few days ago, we conducted a test to compare various Hive file formats and compression methods. Among those file formats, some are native to HDFS and apply to all Hadoop users. The…

By David WORMS

Mar 13, 2012

Two Hive UDAF to convert an aggregation to a map

Categories: Data Engineering | Tags: Hive, File Format, Java, HBase

I am publishing two new Hive UDAFs to help with maps in Apache Hive. The source code is available on GitHub in two Java classes, “UDAFToMap” and “UDAFToOrderedMap”, or you can download the jar file. The…

By David WORMS

Mar 6, 2012

Timeseries storage in Hadoop and Hive

Categories: Data Engineering | Tags: HDFS, Hive, CRM, File Format, timeseries, Tuning, Hadoop

In the next few weeks, we will be exploring the storage and analytics of a large generated dataset. This dataset is composed of CRM tables associated with one time series table of about 7,000 billiard rows…

By David WORMS

Jan 10, 2012

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to turn their use cases into production projects, reduce their costs, and shorten their time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.