Azure Analytics Workloads – OLAP vs OLTP

Data Warehouse

Azure Analytics is a comprehensive set of solutions which turn your data into actionable insights

In the context of a cloud adoption, a workload is a collection of IT assets (servers, VMs, applications, data, or appliances) that collectively support a defined process. Workloads can support more than one process.

Types of Workloads

There are three main (relational) database workloads:

● As a place to store business transactions as they occur (OLTP)
●Place to hold data for complex analysis (OLAP)
● As a centralized repository for data from different sources (data warehouse)

1 . OLTP – Online Transaction Processing

● Most business applications require a place to store and retrieve data
● As when business transactions occur, they are recorded to the database
● Existing rows can be updated
● Data can be retrieved by SQL queries
● It is Optimized for general use

Traits of OLTP

● Database normalization
● Schema heavily enforced, data integrity
● Strong consistency
● Heavy writes, moderate reads
● Update-able
● Data size MBs (Mega Bytes) to TBs (Tera Bytes)

Azure OLTP
● Azure SQL Database
● SQL Server in a VM(Virtual Machine)
● Azure Database for MySQL
● Azure Database for PostgreSQL

2 . OLAP – Online Analytical Processing

● Data stored in a transactional database was not designed for complex analysis
● The Data present in a transactional database can change at any time
● Running complex reports can slow down a transactional database
● Can take time to prepare the data for analysis
● Cubes, dimensions, measures

Traits of OLAP
● No locking
● No updates
● Heavy reads, read-only
● Multi-dimensional indexing
● Data size GBs (Giga Bytes)

Azure OLAP
● SQL Server with Columnstore indexes
● Azure Analysis Services
● SQL Server Analysis Services

3. Data Warehousing

● Central repository of data from one or more different sources
● Current and historical data used for reporting and analysis
● Can rename or reformat columns to make it easier for users to create reports
● Users can run reports without affecting the day-to-day business data systems

Azure Data Warehousing

Symmetric Multiprocessing (SMP):
● Azure SQL Database
● SQL Server in a VM


Massively Parallel Processing (MPP):
● Azure Synapse Analytics (SQL Data Warehouse)
● Apache Hive on HDInsight
● Interactive Query (Hive LLAP) on HDInsight

When to Use a Data Warehouse
  • When queries are long running or affect day-to-day operations
  • When data needs further processing (ETL -Extract Transform Load or ELT-Extract Load Transfer) before it can be analyzed
  • When you want to remove historical data from your day-to-day systems (archiving)
  • When you need to integrate data from several sources
  • When users are confused by the data structures, table names or column names
  • When building reports in PowerBI (Power Business Intelligence)

Components of a modern data Warehouse :

Components of a
modern data
warehouse
Data Sources
All data comes from somewhere else
One or more data sources
It can be structured data - existing SQL databases
Also , It can be unstructured data - CSVs, JSON, log files
Azure Data Factory
Azure Data Factory is a hybrid data integration
service that allows you to create, schedule and
orchestrate your ETL / ELT workflows
Azure Data Factory moves the data from outside of
Azure to inside of Azure
Azure Blob Storage / Data Lake
Azure Blob Storage is a massively scalable object
storage for any type of unstructured data - images,
videos, audio, documents and other
Azure Data Lake is a type of blob storage that is
designed to handle even larger amounts of data
The unprocessed data is stored here
Azure Databricks
The Azure Databricks is a fast, easy and
collaborative Apache Spark-based analytics
platform
Databricks allows you to manipulate data at
large scales, by linking to one data source (like
Azure Data Lake), modifying the data, and
storing it in another data source (like Azure
Synapse Analytics / SQL DW)
Azure Synapse Analytics
Azure Synapse Analytics is the fast, flexible
and trusted cloud data warehouse that lets
you scale, compute and store elastically and
independently, with a massively parallel
processing architecture
Data is optimized for read-only queries
Azure Analysis Services
Azure Analysis Services is an enterprise grade
analytics as a service that lets you govern, deploy,
test, and deliver your BI solution with confidence
Data is optimized for complex queries with cubes
and dimensions
Power BI
Power BI is a suite of business intelligence
tools that make is easier for users to look at
data, analyze it, and create reports. They can
then publish them to the organization to
consume on the web or on mobile devices.

Learn more about azure : Core Azure architectural components : Regions, Availability zones and Resource groups

Learn Basics of Cloud Computing : What is Cloud Computing ? Everything you need to know is explained.

Check official website of Microsoft azure : Click Here

Leave a Reply

Your email address will not be published. Required fields are marked *