Today at Microsoft Connect(); we introduced Azure Databricks, an exciting new service in preview that brings together the best of the Apache Spark analytics platform and the Azure cloud. As a close partnership between Databricks and Microsoft, Azure Databricks brings unique benefits not present in other cloud platforms. This blog post introduces the technology and new capabilities available for data scientists, data engineers, and business decision-makers using the power of Databricks on Azure.
Azure Databricks Preview
Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Designed with the founders of Apache Spark, Databricks is integrated with Azure to provide one-click setup, streamlined workflows, and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.
Quickstart: Get started with Azure Databricks using the Azure portal
This quickstart shows how to create an Azure Databricks workspace and an Apache Spark cluster within that workspace. Finally, you learn how to run a Spark job on the Databricks cluster.
In Databricks, you can create two different types of resources:
Standard Clusters: Databricks’ standard clusters have a lot of configuration options to customize and fine-tune your Spark jobs. You can learn more about standard clusters below.
Serverless Pools (BETA): With serverless pools, Databricks auto-manages all the resources, and you only need to provide the range of instances required for the pool. Serverless pools support only Python and SQL. Serverless pools also auto-configure the resources with the right Spark configuration. Visit Serverless Pools to learn more about them.
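To make the "range of instances" idea concrete, here is a minimal Python sketch. It is not the Databricks API: `PoolRange`, `min_instances`, `max_instances`, and `clamp` are hypothetical names used only for illustration. It assumes the user supplies a lower and an upper bound and that whatever the service decides to run stays within those bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolRange:
    """Hypothetical stand-in for the instance range a user supplies for a pool."""
    min_instances: int
    max_instances: int

    def __post_init__(self):
        # Reject ranges that could never be satisfied.
        if self.min_instances < 1:
            raise ValueError("min_instances must be at least 1")
        if self.max_instances < self.min_instances:
            raise ValueError("max_instances must be >= min_instances")

    def clamp(self, wanted: int) -> int:
        """Return the instance count closest to `wanted` that stays in range."""
        return max(self.min_instances, min(wanted, self.max_instances))


# Illustrative values only; real sizing decisions are made by the service.
pool = PoolRange(min_instances=2, max_instances=8)
for demand in (1, 5, 20):
    print(demand, "->", pool.clamp(demand))
```

The point of the sketch is the division of responsibility: the user states only the bounds, and everything inside them (how many instances to run at a given moment, and how Spark is configured on them) is left to the managed service.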
Read more on the Microsoft Azure Blog here: A Technical Overview of Azure Databricks after Microsoft Connect() 2017.