Data Architecture · February 2026 · 11 min read

Azure Cloud Cost Optimisation in 2026: How to Cut Your Data Platform Costs by 40–60%

Cloud costs for data platforms have spiralled for many organisations. This practical guide covers the architecture patterns, Databricks optimisations, and governance changes that can reduce Azure data platform costs by 40–60% without sacrificing capability.

The Cloud Cost Problem

Cloud data platform costs have become a significant concern for European enterprises in 2025 and 2026. The combination of growing data volumes, increasingly complex analytics workloads, and the proliferation of AI and ML experiments has driven cloud costs far beyond initial projections for many organisations.

Our experience with Vesting Finance's Azure migration illustrates both the problem and the solution. Their on-premise data infrastructure was expensive to maintain and lacked the scalability required for their growing analytics needs. But their initial cloud architecture — a direct lift-and-shift of on-premise patterns to Azure — produced cloud costs that were actually higher than their previous on-premise spend, without the expected performance improvements.

Through a combination of architecture redesign, Databricks optimisation, and governance improvements, we achieved a 50% reduction in their cloud costs while simultaneously improving performance and governance. This article shares the specific techniques that delivered those savings.

Architecture-Level Optimisations

Right-size your compute: The most common source of cloud cost waste is over-provisioned compute. Many organisations provision compute clusters based on peak demand and leave them running continuously. In a modern data platform, compute should be ephemeral — provisioned for specific workloads and terminated when the workload completes.

Databricks clusters should be configured with auto-termination (typically 30–60 minutes of inactivity) and auto-scaling. For interactive analytics workloads, serverless compute (now generally available in Databricks in 2026) eliminates cluster management overhead and provides more granular cost control.
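
As a rough sketch, a minimal cluster specification with both controls enabled might look like the following, submitted to the Databricks Clusters REST API. The workspace URL, token, runtime version, and node type are placeholders, not recommendations:

```python
import requests

# Placeholders: substitute your own workspace URL and access token.
WORKSPACE = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "dapi-example-token"

# Cluster spec: scales between 2 and 8 workers and shuts itself
# down after 30 minutes of inactivity.
cluster_spec = {
    "cluster_name": "analytics-autoscaling",
    "spark_version": "15.4.x-scala2.12",  # example LTS runtime
    "node_type_id": "Standard_DS3_v2",
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "autotermination_minutes": 30,
}

resp = requests.post(
    f"{WORKSPACE}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
print("Created cluster:", resp.json()["cluster_id"])
```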

Separate storage from compute: The fundamental advantage of cloud data platforms over on-premise architectures is the ability to scale storage and compute independently. Organisations that have not fully embraced this principle — for example, by using compute-attached storage rather than Azure Data Lake Storage — are paying for compute capacity they do not need.
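
To illustrate the principle, the sketch below (run from a Databricks notebook, where spark is predefined) creates a Delta table whose data lives in ADLS Gen2 rather than on cluster-attached disks; the storage account, container, and table names are hypothetical:

```python
# Table data lives in ADLS Gen2 (abfss://), so the cluster that wrote
# it can be terminated, or replaced with a smaller one, at any time.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales.transactions
    USING DELTA
    LOCATION 'abfss://lake@examplestorage.dfs.core.windows.net/sales/transactions'
""")
```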

Implement storage tiering: Azure Data Lake Storage supports hot, cool, and archive tiers with significantly different pricing. Data that is accessed infrequently (historical data older than 90 days, for example) should be automatically tiered to cool or archive storage. Implementing lifecycle management policies that automate this tiering can reduce storage costs by 60–70% for organisations with large historical datasets.
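
A minimal sketch of such a policy, assuming historical data sits under a lake/history/ prefix (the rule structure follows Azure Storage lifecycle management; the account, container, and path names are placeholders):

```python
import json

# Lifecycle rule: move blobs under 'lake/history/' to cool after 90 days
# and to archive after 365 days without modification.
policy = {
    "rules": [
        {
            "name": "tier-historical-data",
            "enabled": True,
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": ["lake/history/"],  # container/path placeholder
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {"daysAfterModificationGreaterThan": 90},
                        "tierToArchive": {"daysAfterModificationGreaterThan": 365},
                    }
                },
            },
        }
    ]
}

with open("policy.json", "w") as f:
    json.dump(policy, f, indent=2)

# Apply with the Azure CLI, e.g.:
#   az storage account management-policy create \
#     --account-name <storage-account> --resource-group <rg> --policy @policy.json
```

Keep in mind that archive-tier blobs must be rehydrated before they can be read, so the archive rule should only cover data with no interactive query requirement.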

Optimise Delta Lake table formats: Poorly optimised Delta tables are a significant source of both performance problems and cost waste. Regular OPTIMIZE and VACUUM operations reduce file counts, improve query performance, and reduce storage costs. Delta Live Tables pipelines handle this maintenance automatically when optimisation is enabled.
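
For tables maintained outside Delta Live Tables, the same maintenance can be run on a schedule; a minimal sketch, assuming a Databricks notebook (where spark is predefined) and a hypothetical sales.transactions table:

```python
# Compact small files and co-locate rows on a commonly filtered column.
spark.sql("OPTIMIZE sales.transactions ZORDER BY (customer_id)")

# Delete data files no longer referenced by the Delta transaction log.
# 168 hours (7 days) is the default retention; shortening it limits
# time travel and can break concurrent readers.
spark.sql("VACUUM sales.transactions RETAIN 168 HOURS")
```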

Databricks-Specific Optimisations

Photon Engine: Databricks Photon, the C++ vectorised query engine, provides 2–5x performance improvements for SQL workloads compared to the standard Spark engine. For organisations with significant SQL analytics workloads, enabling Photon typically lowers net compute costs: workloads complete in less cluster time, which more than offsets Photon's higher DBU rate.
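
Photon is a per-cluster setting; in a cluster specification like the one sketched earlier it is a single field (the names and sizes below are again illustrative):

```python
# Cluster spec with Photon enabled via the runtime_engine field.
photon_cluster_spec = {
    "cluster_name": "sql-analytics-photon",
    "spark_version": "15.4.x-scala2.12",  # example LTS runtime
    "node_type_id": "Standard_E8ds_v4",   # memory-optimised example
    "autoscale": {"min_workers": 2, "max_workers": 6},
    "autotermination_minutes": 30,
    "runtime_engine": "PHOTON",  # default is STANDARD
}
```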

Cluster Policies: Databricks cluster policies allow administrators to define constraints on cluster configurations — preventing users from provisioning oversized clusters, requiring auto-termination, and enforcing cost tagging. Implementing cluster policies is one of the most effective governance controls for managing Databricks costs.
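
A sketch of a policy definition, following the Databricks Cluster Policies API (the definition is submitted as a JSON string; the names and limits are illustrative, not recommendations):

```python
import json
import requests

WORKSPACE = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi-example-token"                                      # placeholder

policy_definition = {
    # Force auto-termination and cap it at 60 minutes.
    "autotermination_minutes": {"type": "range", "maxValue": 60, "defaultValue": 30},
    # Prevent oversized clusters.
    "autoscale.max_workers": {"type": "range", "maxValue": 10},
    # Only approved node types.
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
    },
    # Require a cost-centre tag on every cluster created under this policy.
    "custom_tags.cost_center": {"type": "unlimited", "isOptional": False},
}

requests.post(
    f"{WORKSPACE}/api/2.0/policies/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"name": "standard-analytics", "definition": json.dumps(policy_definition)},
).raise_for_status()
```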

Spot Instances: Azure Spot VMs (equivalent to AWS Spot Instances) provide significant cost savings (60–80% compared to on-demand pricing) for fault-tolerant workloads. Databricks supports spot instances natively, and most batch data processing workloads are suitable for spot execution.
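
In a Databricks cluster specification this is expressed through azure_attributes; a sketch of one common pattern (field values per the Clusters API):

```python
# azure_attributes fragment for a cluster or job cluster spec.
azure_attributes = {
    "first_on_demand": 1,                        # keep the driver on an on-demand VM
    "availability": "SPOT_WITH_FALLBACK_AZURE",  # workers on spot, fall back if unavailable
    "spot_bid_max_price": -1,                    # cap the spot price at the on-demand rate
}
```

Keeping the driver on an on-demand VM means a worker eviction degrades the job rather than killing it; Spark re-schedules the lost tasks on replacement nodes.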

Workflow Scheduling: Moving from always-on clusters to scheduled job clusters — where clusters are provisioned specifically for each job run and terminated on completion — can reduce compute costs by 70–80% for batch workloads. The trade-off is slightly longer job startup times (typically 3–5 minutes for cluster provisioning).
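
A sketch of a scheduled job definition against the Databricks Jobs 2.1 API, combining a job cluster with the spot settings above (the notebook path, schedule, and cluster size are placeholders):

```python
import requests

WORKSPACE = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi-example-token"                                      # placeholder

job_spec = {
    "name": "nightly-batch-etl",
    "tasks": [{
        "task_key": "etl",
        "notebook_task": {"notebook_path": "/Repos/data-platform/etl"},
        # Job cluster: created for this run, terminated when it finishes.
        "new_cluster": {
            "spark_version": "15.4.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 4,
            "azure_attributes": {
                "first_on_demand": 1,
                "availability": "SPOT_WITH_FALLBACK_AZURE",
                "spot_bid_max_price": -1,
            },
        },
    }],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # 02:00 every night
        "timezone_id": "Europe/Amsterdam",
    },
}

requests.post(
    f"{WORKSPACE}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
).raise_for_status()
```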

Governance-Driven Cost Reduction

One of the less obvious sources of cloud cost savings is improved data governance. Organisations with poor data governance tend to accumulate redundant data copies, maintain unnecessary compute resources, and run inefficient queries on poorly structured data.

Data deduplication: A governance-driven data audit frequently reveals significant data duplication — the same data stored in multiple formats, in multiple locations, for different teams. Eliminating this duplication reduces storage costs and simplifies data management.

Query optimisation through governance: Data quality issues — null values, duplicate records, incorrect data types — force downstream queries to perform additional filtering and deduplication work. Improving data quality at the source reduces the compute required for downstream analytics.
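
One way to push enforcement to the source is Delta table constraints, so bad writes fail fast instead of every downstream query filtering defensively; a sketch with hypothetical table and column names:

```python
# Reject rows with non-positive amounts at write time.
spark.sql("""
    ALTER TABLE sales.transactions
    ADD CONSTRAINT amount_positive CHECK (amount > 0)
""")

# Disallow null customer IDs (existing data must already satisfy this).
spark.sql("""
    ALTER TABLE sales.transactions
    ALTER COLUMN customer_id SET NOT NULL
""")
```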

Tagging and chargeback: Implementing comprehensive resource tagging and cost allocation to business domains creates accountability for cloud costs and incentivises teams to optimise their workloads. Organisations that implement chargeback models consistently achieve better cost discipline than those that treat cloud costs as a shared overhead.
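
On Databricks, tagging starts with custom_tags in the cluster specification; these tags propagate to the underlying Azure VMs and surface in Azure Cost Management, where they can drive per-domain chargeback reports. The tag keys and values below are illustrative:

```python
# Cluster spec fragment: tags for cost allocation and chargeback.
custom_tags = {
    "cost_center": "1234",   # placeholder values
    "domain": "collections",
    "environment": "prod",
}
```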

The Cost Optimisation Roadmap

Based on our experience, the following sequence delivers the fastest cost reduction with the least operational disruption:

  1. Implement auto-termination and auto-scaling on all Databricks clusters (immediate, low risk, typically 20–30% cost reduction)
  2. Migrate batch workloads to scheduled job clusters with spot instances (2–4 weeks, typically 15–25% additional reduction)
  3. Implement storage lifecycle management for historical data (2–3 weeks, typically 10–20% additional reduction)
  4. Conduct data deduplication and governance audit (4–8 weeks, variable savings)
  5. Implement cluster policies and cost tagging (2–3 weeks, governance improvement with ongoing cost discipline benefits)

The combination of these measures typically delivers 40–60% cost reduction within 3–6 months, with the majority of savings achievable in the first 4–6 weeks.

Azure · Databricks · Cloud Cost Optimisation · Data Architecture · 2026

Key Topics

  • Azure cloud cost optimisation strategies
  • Databricks compute right-sizing
  • Storage tiering and lifecycle management
  • Governance-driven cost reduction
  • Spot instance usage patterns

