Oskar RYNKIEWICZ

Data Engineer and Data Scientist

Published articles

Machine Learning model deployment

Machine Learning model deployment

Categories: Big Data, Data Engineering, Data Science, DevOps & SRE | Tags: DevOps, Operation, AI, Cloud, Machine Learning, MLOps, On-premises, Schema

“Enterprise Machine Learning requires looking at the big picture […] from a data engineering and a data platform perspective,” lectured Justin Norman during the talk on the deployment of Machine…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Sep 30, 2019

Spark Streaming part 4: clustering with Spark MLlib

Spark Streaming part 4: clustering with Spark MLlib

Categories: Data Engineering, Data Science, Learning | Tags: Spark, Apache Spark Streaming, Big Data, Clustering, Machine Learning, Scala, Streaming

Spark MLlib is an Apache’s Spark library offering scalable implementations of various supervised and unsupervised Machine Learning algorithms. Thus, Spark framework can serve as a platform for…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Jun 27, 2019

Spark Streaming part 3: DevOps, tools and tests for Spark applications

Spark Streaming part 3: DevOps, tools and tests for Spark applications

Categories: Big Data, Data Engineering, DevOps & SRE | Tags: DevOps, Learning and tutorial, Spark, Apache Spark Streaming

Whenever services are unavailable, businesses experience large financial losses. Spark Streaming applications can break, like any other software application. A streaming application operates on data…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

May 31, 2019

Spark Streaming part 2: run Spark Structured Streaming pipelines in Hadoop

Spark Streaming part 2: run Spark Structured Streaming pipelines in Hadoop

Categories: Data Engineering, Learning | Tags: Spark, Apache Spark Streaming, Python, Streaming

Spark can process streaming data on a multi-node Hadoop cluster relying on HDFS for the storage and YARN for the scheduling of jobs. Thus, Spark Structured Streaming integrates well with Big Data…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

May 28, 2019

Spark Streaming part 1: build data pipelines with Spark Structured Streaming

Spark Streaming part 1: build data pipelines with Spark Structured Streaming

Categories: Data Engineering, Learning | Tags: Kafka, Spark, Apache Spark Streaming, Big Data, Streaming

Spark Structured Streaming is a new engine introduced with Apache Spark 2 used for processing streaming data. It is built on top of the existing Spark SQL engine and the Spark DataFrame. The…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Apr 18, 2019

Publish Spark SQL DataFrame and RDD with Spark Thrift Server

Publish Spark SQL DataFrame and RDD with Spark Thrift Server

Categories: Data Engineering | Tags: Thrift, JDBC, Hadoop, Hive, Spark, SQL

The distributed and in-memory nature of the Spark engine makes it an excellent candidate to expose data to clients which expect low latencies. Dashboards, notebooks, BI studios, KPIs-based reports…

Oskar RYNKIEWICZ

By Oskar RYNKIEWICZ

Mar 25, 2019

Canada - Morocco - France

We are a team of Open Source enthusiasts doing consulting in Big Data, Cloud, DevOps, Data Engineering, Data Science…

We provide our customers with accurate insights on how to leverage technologies to convert their use cases to projects in production, how to reduce their costs and increase the time to market.

If you enjoy reading our publications and have an interest in what we do, contact us and we will be thrilled to cooperate with you.

Support Ukrain