AI & Data

Azure Databricks Training

Azure Databricks is a cloud big data service built on Apache Spark for preparing and exploring data and for building and training machine learning models.

Duration
6h
Who it's for

Ideal for:

1 Individuals who want to leverage data to optimize processes.
2 Those who wish to deepen their understanding of Apache Spark.
3 Individuals with basic knowledge of data analysis.
4 Developers, Data Engineers, and Data Scientists.
Outcomes after the program

Hands-on AI and data analytics workshops built around your team's real cases.

Fundamentals of the Azure Databricks platform.

Data processing and preparation techniques.

Data analysis using Databricks SQL.

Using Apache Spark for data processing.

Program · 15 modules

What we actually do

M01
What is the Databricks Lakehouse Platform?
  • Describe what the Databricks Lakehouse Platform is.
  • Explain the origin of the Lakehouse data management paradigm.
  • Outline fundamental challenges related to managing and using data.
  • Describe security features of the Databricks Lakehouse Platform.
  • Provide examples of organizations that have benefited from using the Databricks Lakehouse Platform.
M02
What is Databricks SQL?
  • Summarize fundamental concepts for using Databricks SQL effectively.
  • Identify tools and features in Databricks SQL for querying data and sharing insights.
  • Explain how Databricks SQL supports data analysis workflows that let users extract and share business insights.
M03
What is Databricks Machine Learning?
  • Give a basic overview of Databricks Machine Learning.
  • Identify how Databricks Machine Learning benefits data science and machine learning teams.
  • Summarize the fundamental components and functionalities of Databricks Machine Learning.
  • Provide examples of successful use cases of Databricks Machine Learning by real Databricks customers.
M04
What is the Databricks Data Science and Engineering Workspace?
  • Give a basic overview of the Databricks Data Science and Engineering Workspace.
  • Identify assets provided by the workspace.
  • Describe a simple development workflow that queries and aggregates data.
M05
Databricks Workspaces and Services
  • Databricks Architecture and Services.
  • Data Science and Engineering Workspace.
  • Create and Manage Interactive Clusters.
  • Notebook Basics.
  • Git Versioning with Databricks Repos.
  • Using Databricks Repos.
  • Getting Started with the Databricks Platform.
M06
Delta Lakehouse
  • What is Delta Lake.
  • Managing Delta Tables.
  • Manipulating Tables with Delta Lake.
  • Advanced Delta.
M07
Relational Entities on Databricks
  • Databases and Views.
  • Views and CTEs.
M08
ETL with Spark SQL
  • Query Files Directly.
  • Providing Options.
  • Creating Delta Tables.
  • Writing to Tables.
  • Cleaning Data.
  • Advanced SQL Transformations.
  • UDFs.
M09
Getting Started with Databricks SQL
  • Navigating Databricks SQL.
  • Unity Catalog on Databricks SQL.
  • Schemas, Tables, and Views on Databricks SQL.
M10
Basic SQL on Databricks SQL
  • Ingesting Data for Databricks SQL.
  • Joins.
  • Delta Commands in Databricks SQL.
M11
Presenting Data Visually
  • Data Visualization.
  • Data Visualizations on Databricks SQL.
  • Dashboards on Databricks SQL.
  • Notifying Stakeholders.
M12
Apache Spark Programming – DataFrames
  • Databricks Platform.
  • Databricks Ecosystem.
  • Spark SQL.
  • DataFrames.
  • SparkSession.
  • Reader and Writer.
  • Data Sources.
  • DataFrame and Column.
  • Column and Expression.
  • Transformations, Actions, and Rows.
M13
Apache Spark Programming – Transformations
  • Aggregation.
  • Aggregation Functions.
  • Datetimes.
  • Dates and Timestamps.
  • Complex Types.
  • Additional Functions.
  • UDFs.
  • Vectorized UDFs.
M14
Apache Spark Programming – Spark Internals
  • Spark Architecture.
  • Spark Cluster and Spark Execution.
  • Shuffling and Caching.
  • Query Optimization.
  • Partitioning.
M15
Apache Spark Programming – Structured Streaming
  • Apache Spark Programming.
  • Streaming.
Every module is adapted to your stack and context. The outline above is a starting point, not a fixed agenda.
How we work

From brief to retro in 30 days.

01

Brief & diagnosis

A call with the team lead plus a short survey for participants. We define goals, skill gaps, and context.

02

Program customization

We adapt modules, case studies and code examples to your stack. Approval in 5 days.

03

Workshop

Trainer-led, hands-on sessions with code review. A mentor is also available between sessions.

04

Retro + report

Outcome report for the team and lead. 30 days of consulting included.

Inquiry

Send a brief. We'll reply within 1 day.

After a short brief we'll prepare a program and a quote. No obligation: it's just a starting point.

Quote within 48h of the brief
First session within 30 days
Pilot before the full decision
VAT invoice, payment in instalments possible
