Azure Databricks Training

Azure Databricks is a big data service built on Apache Spark for preparing and exploring data and for building and training models in the cloud.

Duration

Ask about the program.

Who it's for

Ideal for teams that…

1 Want to leverage data to optimize processes.

2 Wish to deepen their understanding of Apache Spark.

3 Have basic knowledge of data analysis.

4 Include developers, data engineers, and data scientists.

Outcomes after the program

Hands-on AI and data analytics workshops — built around your team's real cases.

✓ Fundamentals of the Azure Databricks platform.

✓ Data processing and preparation techniques.

✓ Data analysis using Databricks SQL.

✓ Utilization of Apache Spark for data processing.

Program · 15 modules

What we actually do

M01

What is the Databricks Lakehouse Platform?

· Describe what the Databricks Lakehouse Platform is.
· Explain the origin of the Lakehouse data management paradigm.
· Outline fundamental challenges related to managing and using data.
· Describe security features of the Databricks Lakehouse Platform.
· Provide examples of organizations that have benefited from using the Databricks Lakehouse Platform.

M02

What is Databricks SQL?

· Summarize fundamental concepts for using Databricks SQL effectively.
· Identify tools and features in Databricks SQL for querying data and sharing insights.
· Explain how Databricks SQL supports data analysis workflows that allow users to extract and share business insights.

M03

What is Databricks Machine Learning?

· Give a basic overview of Databricks Machine Learning.
· Identify how using Databricks Machine Learning benefits data science and machine learning teams.
· Summarize the fundamental components and functionalities of Databricks Machine Learning.
· Provide examples of how real Databricks customers have used Databricks Machine Learning successfully.

M04

What is the Databricks Data Science and Engineering Workspace?

· Give a basic overview of the Databricks Data Science and Engineering Workspace.
· Identify assets provided by the workspace.
· Describe a simple development workflow that queries and aggregates data.

M05

Databricks Workspaces and Services

· Databricks Architecture and Services.
· Data Science and Engineering Workspace.
· Create and Manage Interactive Clusters.
· Notebook Basics.
· Git Versioning with Databricks Repos.
· Using Databricks Repos.
· Getting Started with the Databricks Platform.

M06

Delta Lakehouse

· What is Delta Lake.
· Managing Delta Tables.
· Manipulating Tables with Delta Lake.
· Advanced Delta.

M07

Relational Entities on Databricks

· Databases and Views.
· Views and CTEs.

M08

ETL with Spark SQL

· Query Files Directly.
· Providing Options.
· Creating Delta Tables.
· Writing to Tables.
· Cleaning Data.
· Advanced SQL Transformations.
· UDFs.

M09

Getting Started with Databricks SQL

· Navigating Databricks SQL.
· Unity Catalog on Databricks SQL.
· Schemas, Tables, and Views on Databricks SQL.

M10

Basic SQL on Databricks SQL

· Ingesting Data for Databricks SQL.
· Joins.
· Delta Commands in Databricks SQL.

M11

Presenting Data Visually

· Data Visualization.
· Data Visualizations on Databricks SQL.
· Dashboards on Databricks SQL.
· Notifying Stakeholders.

M12

Apache Spark Programming – DataFrames

· Databricks Platform.
· Databricks Ecosystem.
· Spark SQL.
· DataFrames.
· SparkSession.
· Reader and Writer.
· Data Sources.
· DataFrame and Column.
· Column and Expression.
· Transformations, Actions, and Rows.

M13

Apache Spark Programming – Transformations

· Aggregation.
· Aggregation Functions.
· Datetimes.
· Dates and Timestamps.
· Complex Types.
· Additional Functions.
· UDFs.
· Vectorized UDFs.

M14

Apache Spark Programming – Spark Internals

· Spark Architecture.
· Spark Cluster, Spark Execution.
· Shuffling and Caching.
· Query Optimization.
· Partitioning.

M15

Apache Spark Programming – Structured Streaming

· Apache Spark Programming.
· Streaming.

Every module is adapted to your stack and context. The above is a starting point — not a fixed agenda.

How we work

From brief to retro in 30 days.

Brief & diagnosis

A call with the team lead plus a short survey for participants. We define goals, gaps, and context.

Program customization

We adapt modules, case studies and code examples to your stack. Approval in 5 days.

Workshop

Trainer-led, hands-on sessions with code review. A mentor is also available between sessions.

Retro + report

Outcome report for the team and lead. 30 days of consulting included.

Inquiry

Send a brief. We'll reply within 1 day.

After a short brief we'll prepare a program and a quote. No obligations — it's just a starting point.

✓ Quote within 48h of the brief

✓ First session within 30 days

✓ Pilot before the full decision

✓ VAT invoice, payment in instalments possible
