Published articles
![Automate a Spark routine workflow from GitLab to GCP](/static/43afa468240e83392130929739374fc6/0fd76/gcp_gitlab_ci_spark.png)
Automate a Spark routine workflow from GitLab to GCP
Categories: Big Data, Cloud Computing, Containers Orchestration | Tags: Learning and tutorial, Airflow, Spark, CI/CD, GitLab, GitOps, GCP, Terraform
A workflow automates a succession of tasks so that they run without human intervention. It is an important and widespread concept that applies particularly to operational environments…
Jun 16, 2020
![Optimization of Spark applications in Hadoop YARN](/static/71af0e123ea67549f0f4aa67cf421599/0fd76/spark-resources.png)
Optimization of Spark applications in Hadoop YARN
Categories: Data Engineering, Learning | Tags: Tuning, Hadoop, Spark, Python
Apache Spark is an in-memory data processing tool widely used in companies to deal with Big Data issues. Running a Spark application in production requires user-defined resources. This article…
Mar 30, 2020