Introduction to Azure Synapse Analytics

https://learn.microsoft.com/en-us/training/paths/get-started-data-engineering/

What is Azure Synapse Analytics

  • Descriptive analytics, “What is happening in my business?”
  • Diagnostic analytics, “Why is it happening?”
  • Predictive analytics, “What is likely to happen in the future based on previous trends and patterns?”
  • Prescriptive analytics, decision making based on real-time or near real-time analysis of data.

Azure Synapse Analytics provides
[…] support for multiple data storage, processing, and analysis technologies in a single, integrated solution.
[…] including SQL, Apache Spark, […] a single, consistent user interface.

Azure Synapse Analytics workspace
A Synapse Analytics workspace defines an instance of the Synapse Analytics service in which you can manage the services and data resources needed for your analytics solution.

Working with files in a data lake
One of the core resources in a Synapse Analytics workspace is a data lake, in which data files can be stored and processed at scale.

Ingesting and transforming data with pipelines
[…] built-in support for creating, running, and managing pipelines that orchestrate the activities necessary to retrieve data from a range of sources, transform the data as required, and load the resulting transformed data into an analytical store.

Note:
Pipelines in Azure Synapse Analytics are based on the same underlying technology as Azure Data Factory.
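A conceptual sketch of the retrieve, transform, load flow that a pipeline orchestrates. This is plain Python, not the Synapse pipeline API: the function names (extract, transform, load, run_pipeline) and the sample records are made up for illustration, and a real pipeline would define these steps as activities in Synapse Studio.

```python
# Conceptual sketch only: each function stands in for a pipeline activity.
# None of these names come from Azure Synapse; they are hypothetical.

def extract() -> list[dict]:
    """Stand-in for retrieving raw data from a source system."""
    return [
        {"product": "widget", "qty": "3"},
        {"product": "gadget", "qty": "5"},
    ]


def transform(records: list[dict]) -> list[dict]:
    """Clean the raw records: normalize names and convert quantities to int."""
    return [
        {"product": r["product"].upper(), "qty": int(r["qty"])}
        for r in records
    ]


def load(records: list[dict], store: list[dict]) -> int:
    """Append the transformed records to an in-memory 'analytical store'."""
    store.extend(records)
    return len(records)


def run_pipeline(store: list[dict]) -> int:
    """Orchestrate the three steps in order and return the rows loaded."""
    return load(transform(extract()), store)
```

Calling run_pipeline with an empty list fills it with the transformed records and returns how many were loaded.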

Querying and manipulating data with SQL

  • A built-in serverless pool that is optimized for using relational SQL semantics to query file-based data in a data lake.
  • Custom dedicated SQL pools that host relational data warehouses.
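A minimal sketch of what querying file-based data with the serverless pool can look like. The helper below only builds the T-SQL text (using OPENROWSET with a BULK path and FORMAT = 'PARQUET'); the function name, storage account, container, and file path are assumptions for illustration, and the query would still need to be run from Synapse Studio or a SQL client connected to the serverless endpoint.

```python
def build_openrowset_query(storage_account: str, container: str,
                           path: str, top: int = 10) -> str:
    """Return T-SQL that reads Parquet files from a data lake via OPENROWSET.

    The account, container, and path are hypothetical placeholders.
    Values are interpolated directly, so use this for illustration only,
    never with untrusted input.
    """
    url = f"https://{storage_account}.dfs.core.windows.net/{container}/{path}"
    return (
        f"SELECT TOP {top} *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{url}',\n"
        f"    FORMAT = 'PARQUET'\n"
        f") AS [result];"
    )


# Example: build a query over all Parquet files in a hypothetical sales folder.
query = build_openrowset_query("mydatalake", "files", "sales/*.parquet")
print(query)
```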

Processing and analyzing data with Apache Spark
Apache Spark is an open source platform for big data analytics. Spark performs distributed processing of files in a data lake by running jobs that can be implemented using any of a range of supported programming languages. Languages supported in Spark include Python, Scala, Java, SQL, and C#.

In Azure Synapse Analytics, you can create one or more Spark pools and use interactive notebooks to combine code and notes as you build solutions for data analytics, machine learning, and data visualization.
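A short sketch of the kind of code a notebook cell attached to a Spark pool might contain. It assumes the `spark` session object that Synapse notebooks provide, so nothing is imported here; the storage account, container, file path, and the "Category" column are hypothetical.

```python
def abfss_uri(container: str, account: str, path: str) -> str:
    """Build an abfss:// URI for a file in Azure Data Lake Storage Gen2."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def count_by_category(spark, uri: str):
    """Load a CSV file from the data lake and count rows per category.

    `spark` is expected to be a SparkSession (predefined in a Synapse
    notebook). "Category" is a hypothetical column name.
    """
    df = spark.read.load(uri, format="csv", header=True, inferSchema=True)
    return df.groupBy("Category").count()


# In a notebook cell, usage might look like this (names are placeholders):
# uri = abfss_uri("files", "mydatalake", "sales/2024.csv")
# count_by_category(spark, uri).show()
```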

Exploring data with Data Explorer
Azure Synapse Data Explorer is a data processing engine in Azure Synapse Analytics that is based on the Azure Data Explorer service. Data Explorer uses an intuitive query syntax named Kusto Query Language (KQL) to enable high-performance, low-latency analysis of batch and streaming data.
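To give a feel for KQL's pipe-based syntax, here is a small sketch that assembles a query as text. The table name (DeviceTelemetry) and its columns (Timestamp, DeviceId) are invented for illustration; the operators used (where, summarize, top) and the ago() function are standard KQL.

```python
def top_devices_query(table: str, hours: int, limit: int) -> str:
    """Return a KQL query listing the devices with the most recent events.

    The table and column names are hypothetical; only the KQL operators
    themselves are real.
    """
    return "\n".join([
        table,
        f"| where Timestamp > ago({hours}h)",           # keep recent rows only
        "| summarize EventCount = count() by DeviceId",  # events per device
        f"| top {limit} by EventCount desc",             # busiest devices first
    ])


print(top_devices_query("DeviceTelemetry", hours=1, limit=5))
```

Each line after the table name pipes the previous result into the next operator, which is what makes KQL queries read top to bottom.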

Integrating with other Azure data services

  • Azure Synapse Link enables near real-time synchronization between operational data in Azure Cosmos DB, Azure SQL Database, SQL Server, and Microsoft Power Platform Dataverse and analytical data storage that can be queried in Azure Synapse Analytics.
  • Microsoft Power BI integration
  • Microsoft Purview integration enables organizations to catalog data assets in Azure Synapse Analytics, and makes it easier for data engineers to find data assets and track data lineage when implementing data pipelines that ingest data into Azure Synapse Analytics.
  • Azure Machine Learning integration

When to use Azure Synapse Analytics

  • Large-scale data warehousing
  • Advanced analytics
  • Data exploration and discovery
  • Real-time analytics
  • Data integration: Azure Synapse Pipelines enables you to ingest, prepare, model, and serve the data to be used by downstream systems. This can be used by components of Azure Synapse Analytics exclusively.
  • Integrated analytics