Data analytics is the process of examining raw data to identify trends and patterns and draw conclusions from them.
The data analyst is responsible for interpreting the data and preparing reports with visual presentations that communicate the trends and patterns found in the raw data, in order to support managers in their decision making.
Data analysts work alongside data scientists and data engineers, and the three disciplines are tightly linked. Since the boundaries between them are not always clearly defined and vary across organizations, the tasks of a data analyst might include data mining, database management, modeling, and prediction, which are mostly attributed to the other two disciplines.
Data analytics is applied in various fields where quantitative methods are required, such as market research, financial analysis, marketing analysis, and sales analysis.
Tools used by data analysts include database management systems such as Oracle, statistical analysis software such as SAS or R, and business analysis tools such as Microsoft Power BI.
In this hands-on lab session we demonstrate how to build an end-to-end big data solution with Cloudera Data Platform (CDP) Public Cloud, using the infrastructure we have deployed and configured over…
Jul 24, 2023
A big data platform is a complex and sophisticated system that enables organizations to store, process, and analyze large volumes of data from a variety of sources. It is composed of several…
By David WORMS
Mar 23, 2023
Cloudera Data Platform (CDP) is a cloud computing platform for businesses. It provides integrated and multifunctional self-service tools in order to analyze and centralize data. It brings security and…
Jul 19, 2021
Categories: Big Data, Data Engineering | Tags: Business intelligence, Data Engineering, Data structures, Database, Hadoop, HDFS, Hive, Big Data, Data Analytics, Data Lake, Data lakehouse, Data Warehouse
Nowadays, the analysis of large amounts of data is becoming more and more possible thanks to Big Data technology (Hadoop, Spark,…). This explains the explosion of the data volume and the…
By Aida NGOM
Jul 31, 2020
In data processing, there are different types of file formats to store your data sets. Each format has its own pros and cons depending upon the use cases and exists to serve one or several purposes…
By Aida NGOM
Jul 23, 2020
Categories: Big Data, Business Intelligence, Containers Orchestration | Tags: CNCF, Helm, Metrics, OLAP, Operation, Container Orchestration, EC2, Druid, Cloud, Data Analytics, Kubernetes, Prometheus, Python
Apache Druid is an open-source analytics data store which could leverage the auto-scaling abilities of Kubernetes due to its distributed nature and its reliance on memory. I was inspired by the talk…
Jul 16, 2019
This article covers the integration between Hive Interactive (LDAP) and Druid. One can see it as a complement to the Ultra-fast OLAP Analytics with Apache Hive and Druid article. Tools description…
Jun 17, 2019
RHadoop is a bridge between R, a language and environment to statistically explore data sets, and Hadoop, a framework that allows for the distributed processing of large data sets across clusters of…
By David WORMS
Jul 19, 2012