Azure Databricks - Build data engineering and AI/ML pipeline - CouponED

Learn anomaly detection, Data Factory, Azure Functions, Spark, Delta Lake, and machine learning pipelines with Databricks and Azure ML

What you'll learn

  • What anomaly detection is
  • How to apply unsupervised learning algorithms (Isolation Forest, KNN, and clustering-based approaches) to detect anomalies
  • Step-by-step guide to performing ETL operations using Azure Databricks
  • Understand the data lakehouse architecture
  • Build a data pipeline using the Azure tech stack
  • Make machine learning models interpretable using Shapley values
  • Spark Structured Streaming with Kafka
  • Spark Structured Streaming with Azure Event Hubs
  • Use MLflow to manage the end-to-end machine learning lifecycle
  • Anomaly detection on time series data
  • Build a CI/CD pipeline using Azure DevOps
  • Build a data pipeline using Azure Data Factory
  • Productionize a model using Azure Functions and Docker

Description

This course is designed to help you develop the skills necessary to perform ETL operations in Databricks, build unsupervised anomaly detection models, learn MLOps, set up CI/CD for Databricks, and deploy machine learning models into production.

Big Data engineering:

  • Big data engineers interact with massive data processing systems and databases in large-scale computing environments. Big data engineers provide organizations with analyses that help them assess their performance, identify market demographics, and predict upcoming changes and market trends.

Azure Databricks:

  • Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. It offers three environments for developing data-intensive applications: Databricks SQL, Databricks Data Science & Engineering, and Databricks Machine Learning.

Anomaly detection:

  • Anomaly detection (aka outlier analysis) is a step in data mining that identifies data points, events, and/or observations that deviate from a dataset’s normal behavior. Anomalous data can indicate critical incidents, such as a technical glitch, or potential opportunities, for instance a change in consumer behavior. Machine learning is progressively being used to automate anomaly detection.
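As a rough illustration of the idea, the sketch below scores each point by its mean distance to its k nearest neighbours, one simple form of the KNN approach named in the course outline. It is not the course's own implementation: the function names, the sample readings, and the "3 times the median score" cut-off are all assumptions chosen for the example.

```python
import math
import statistics


def knn_anomaly_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_anomalies(points, k=2, threshold_factor=3.0):
    """Flag points whose score exceeds threshold_factor times the median score.

    The threshold rule is an assumption made for this sketch, not a standard.
    """
    scores = knn_anomaly_scores(points, k)
    cutoff = threshold_factor * statistics.median(scores)
    return [score > cutoff for score in scores]


# Hypothetical sensor readings: four similar points and one far away.
readings = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0), (8.0, 8.5)]
scores = knn_anomaly_scores(readings)
flags = flag_anomalies(readings)
print(flags)
```

With these readings only the last point is flagged, because its nearest neighbours are far away compared with the rest of the data. On real data, k and the cut-off would need tuning.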

Data lakehouse:

  • A data lakehouse is a data solution concept that combines elements of the data warehouse with those of the data lake. Data lakehouses implement data warehouses' data structures and management features for data lakes, which are typically more cost-effective for data storage.

Explainable AI:

  • Explainable AI is artificial intelligence in which the results of the solution can be understood by humans. It contrasts with the concept of the "black box" in machine learning where even its designers cannot explain why an AI arrived at a specific decision.
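One common way to explain a single prediction is with Shapley values, which the course outline lists. The sketch below computes them exactly by averaging each feature's marginal contribution over every feature ordering. The toy model, the instance, and the all-zero baseline are assumptions made for illustration. Exact enumeration is only feasible for a handful of features, so practical tools use approximations instead.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Features are switched from their baseline value to their actual value
    one at a time, in every possible order, and the resulting change in the
    prediction is averaged per feature.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in totals]


def toy_model(x):
    """Hypothetical model: two linear terms plus one interaction term."""
    return 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term between the first and third features is split equally between them.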

Spark Structured Streaming:

  • Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. In short, Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.

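CI/CD operations:

  • CI and CD stand for continuous integration and continuous delivery/continuous deployment. In very simple terms, CI is a modern software development practice in which incremental code changes are made frequently and reliably, with automated build and test steps checking each change before it is merged. CD extends this by automating the release of the validated changes.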

