BI & Data Science Analyst at 10 Senses
Microsoft Fabric is a new unified data analytics platform. Its announcement in the second quarter of 2023 gained extensive coverage in the world of business intelligence, data engineering, and data science.
We have already introduced the Microsoft Fabric architecture, key features, seamlessly integrated analytical tools, and potential here.
Even though Microsoft Fabric, as a complete analytics platform, still evokes emotions among business and data professionals, Microsoft hasn’t rested on its laurels and has been improving the service based on users’ feedback.
The first wave of Microsoft Fabric updates came in July. We have decided to look closer into them and provide you with a condensed overview of the most important component updates that Microsoft Fabric offers.
Core Microsoft Fabric updates
Enhanced Microsoft Fabric home tab
Firstly, Microsoft added two helpful features to the Fabric home tab. They make it easier for users to access relevant documentation or reach the community in search of answers.
As a result, you can now find “Read documentation” and “Explore community” buttons in the main interface of the platform.
Moreover, from now on, when you press the question mark button on the top right, you get a feature-aware help pane. It presents a list of recommended topics, resources relevant to your use case, and links to other resources.
In the help pane, you can find:
- feature-aware documents with valuable insights,
- forum topics from the Community related to the features on the current screen,
- other resources that may help you use Microsoft Fabric with ease,
- a search box where you can enter a keyword to find relevant information and resources in Microsoft articles or Community topics (for example, data science, data lake, data analysis, business intelligence, or Power BI).
Monitoring Hub with a new feature
Apart from the main interface, the monitoring hub has a new feature, “Column options”.
You can use it to customize the columns in the job list to get the information that is relevant for you and other data analysts or your data teams.
Once you choose the relevant columns, your selection is saved automatically. Therefore, you can navigate to other places, and when you come back, the monitoring hub will show the same columns and filters.
Admin Microsoft Fabric Updates
Another major update concerns the default value of the Fabric public preview setting, effective July 5, 2023. From that date, the Fabric public preview is set to ON by default unless you, as an admin in your organization, explicitly opt out.
If you don’t change this setting, all users in your company can create Fabric artifacts in workspaces connected to Fabric or Power BI Premium.
Therefore, if you are fine with your organization members using Microsoft Fabric preview features, you don’t need to take any additional actions or enable Microsoft Fabric new features.
If you need to restrict the use of the Fabric preview in your Premium and Fabric capacities, though, you have two options:
- use the security group configuration and capacity levels to limit Fabric to a smaller group of business users,
- turn off the setting at both the tenant and capacity levels to prevent all users from accessing the Fabric preview features.
Microsoft Fabric OneLake updates
Support for switching organizational accounts
Microsoft has also released a new version of its data lake file explorer, OneLake file explorer v1.0.9.0. It enables users to switch easily between different Microsoft Azure Active Directory (AAD) accounts.
As a result, all users who work with multiple AAD tenants across different companies for multiple vendors and multiple systems, or even within one, can now work more efficiently.
Item sharing in the Microsoft Fabric data lake
Another major Microsoft Fabric OneLake update is the possibility to share Fabric items with selected users or groups. It can help to eliminate data silos. These items include:
- data warehouses,
- lakehouses,
- Spark job definitions,
- Kusto databases,
- KQL querysets.
This data lake update empowers users who do not have a workspace role to collaborate more effectively. Admins, members, and users granted reshare permission can share specific items with additional users outside of the workspace. All they need to do is click Share in the item list or within the item.
Once you share certain items, recipients can explore them in Data Hub or get a link via email. Moreover, users who share can decide on the level of access that recipient(s) get (for example, Read, Share, Edit etc.).
Microsoft Fabric Synapse Updates
Microsoft Fabric Data Warehouse Updates
Automatic updates for statistics
In Synapse Data Warehouse, from now on, SQL statistics are automatically updated by the query engine.
When the engine executes a query, it collects the existing statistics for the relevant columns and uses this information to choose the optimal execution plan. As a result, this update can increase query efficiency.
In the past, data engineers or data scientists had to update statistics manually to keep query plans accurate. With this update, the query engine assesses the status of statistics and refreshes them itself.
Therefore, your queries are more likely to use an accurate execution plan, and your workload is positioned for good performance with no manual intervention.
UPDATE FROM and DELETE FROM T-SQL
Another great update in the Data Warehouse experience for SQL users is support for the FROM argument in UPDATE and DELETE statements in Fabric SQL. With it, you can update or delete existing data in your target table based on raw data from a source table.
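To give a rough idea of what such scripts can look like, here is a minimal sketch. The T-SQL is kept in Python strings so it can be passed to whichever SQL client you use. All table and column names (dbo.Customers, dbo.Customers_Staging, CustomerID, Email, IsDeleted) are hypothetical, and the run_statements helper is only an illustration; check Microsoft's T-SQL documentation for what Fabric supports exactly.

```python
# T-SQL kept as plain strings; all table and column names are hypothetical.

# Update rows in the target table with values from a staging table.
UPDATE_FROM_SQL = """
UPDATE t
SET t.Email = s.Email
FROM dbo.Customers AS t
INNER JOIN dbo.Customers_Staging AS s
    ON t.CustomerID = s.CustomerID;
"""

# Delete target rows that the staging table marks as removed.
DELETE_FROM_SQL = """
DELETE t
FROM dbo.Customers AS t
INNER JOIN dbo.Customers_Staging AS s
    ON t.CustomerID = s.CustomerID
WHERE s.IsDeleted = 1;
"""


def run_statements(cursor, statements):
    """Execute each statement on a DB-API style cursor (any object with .execute)."""
    for sql in statements:
        cursor.execute(sql)
    return len(statements)
```

In both statements, the FROM clause joins the target table to the staging table, so only the rows that have a match in the staging data are changed or removed.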
Optimization of data movements during query execution
When you run a query, the Fabric SQL engine creates a query plan that includes execution steps and data movement operations. The latter ensure that the data needed for each step is ready at the location of its execution.
To improve this process, the Fabric SQL engine has been updated to make data movement more balanced. The improvement is applied automatically, and you don’t need to change code in your Synapse Data Warehousing artifacts to use this optimization.
Dataflow Gen2 from the warehouse
Another great update is the possibility to directly create Dataflow Gen2 from your warehouse for data ingestion. All you need to do is click on Get Data and New Dataflow Gen2.
Zero copy table clones (public preview)
Zero copy table clones are instant metadata-only operations that allow you to create copies of your warehouse tables in a quick and easy manner with no extra costs and minimal overhead.
Clones have references to the source table, but there is no data duplication of underlying parquet files. Instead, there is a fork created, and the clone table behaves like an independent table. Therefore, any modifications you make in the source table won’t be reflected in the clone, and the other way around.
You can create zero copy table clones for a wide range of development, testing, production, and experimentation use cases.
For example, a client might wish to stage a production release in a zero copy table clone before going fully live. Once they are happy with the data transformation changes, they can merge them into the production table.
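To make the fork behavior easier to picture, here is a toy Python model of a metadata-only clone. It is not how Fabric is implemented: the Table class and the file names are hypothetical, and the model only shows that a clone starts from the same file references and then changes independently of its source.

```python
class Table:
    """Toy table: its metadata is just a list of references to immutable data files."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = list(files)  # references only, never the file contents

    def clone(self, new_name):
        # Metadata-only copy: the new table points at the same files.
        return Table(new_name, self.files)

    def append(self, file_ref):
        # A write adds a new file reference to this table only.
        self.files.append(file_ref)


sales = Table("sales", ["part-0001.parquet", "part-0002.parquet"])
sales_clone = sales.clone("sales_clone")

sales.append("part-0003.parquet")        # change the source after cloning
sales_clone.append("part-9001.parquet")  # change the clone independently
```

After these two writes, each table keeps the two shared files plus its own new one, and neither table sees the other's change. In the warehouse itself, a clone is created with a T-SQL statement; see Microsoft's documentation for the exact syntax.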
Synapse data warehousing sharing (public preview)
Another update that has been released for public preview is warehouse sharing. It allows you to easily grant read access so that downstream users (for example, business analysts) can use organizational data.
As a result, if you are an admin or member within a Fabric workspace, you can share a warehouse with another recipient within your organization by granting them:
- Connect permission,
- Build reports on the default dataset that is connected to your Synapse Data Warehouse,
- Read all data using SQL, to read all tables and views within the warehouse,
- Read all data using Apache Spark, to read all files in OneLake through Spark.
Synapse Data Engineering updates
Load to table capability update
As for Data Engineering, the “Load to Table” feature, which loads a single file into a new table with a no-code experience, has gained new functionalities, such as:
- Folder-level load to load all files under a folder and its subfolders at once (it automatically traverses all files and loads them to a Delta table),
- Load to existing table to load files and folders to a new or an existing table of your choice,
- Source file options to specify whether the first row of your source file contains column headers and which separator the file uses.
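The folder-level load and the source file options can be pictured with a small Python sketch. It uses only the standard library and does not touch Fabric or Delta tables: the load_folder function, its parameters, and the restriction to .csv files are assumptions made for illustration.

```python
import csv
import os


def load_folder(folder, has_header=True, separator=","):
    """Read every .csv file under `folder` and its subfolders into one list of rows."""
    rows = []
    for current_dir, subdirs, file_names in os.walk(folder):
        subdirs.sort()  # make the traversal order deterministic
        for file_name in sorted(file_names):
            if not file_name.lower().endswith(".csv"):
                continue
            path = os.path.join(current_dir, file_name)
            with open(path, newline="", encoding="utf-8") as handle:
                records = list(csv.reader(handle, delimiter=separator))
            if has_header and records:
                # The first row holds the column names; the rest is data.
                header, records = records[0], records[1:]
                rows.extend(dict(zip(header, record)) for record in records)
            else:
                # No header row: keep each record as a plain list of values.
                rows.extend(records)
    return rows
```

Calling load_folder("landing", has_header=True, separator=";") would walk a hypothetical "landing" folder and return one combined list of rows, much like a folder-level load ends up in a single table.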
Lakehouse sharing
Another Data Engineering update is the possibility for Admins and Members of workspaces to share an individual lakehouse with users without granting them access to the whole workspace. Users can access the lakehouse directly from Data Hub or through the link in the sharing notification message.
This way users can get access to the SQL endpoint and default dataset, which enables them to query data using T-SQL or build Power BI reports on top of the lakehouse data.
What is more, users with Viewer roles can get extra permissions to access lakehouse data through Spark (thanks to permission management).
Notebook improvements
There have been a few useful updates in Notebooks. They improve the user experience in Synapse Data Engineering.
These are:
- Adding Notebook Resources on Fabric Notebook, providing data engineers with a writeable file system space to store small-sized files (code modules, datasets, or images),
- Displaying the running cell output after reconnecting to the original session, which lets you easily recover your work after accidentally closing the browser or leaving the live session,
- Notebook sharing without granting workspace permissions,
- Updated status bar with new styling and functional enhancements.
Synapse Data Science updates
New capabilities for Data Science Large Language Models (LLMs)
When it comes to Data Science updates, there has been an integration of the Azure OpenAI service with the distributed machine learning library SynapseML.
As a result, data scientists can easily use the Spark distributed computing framework to process numerous prompts with the OpenAI service on Microsoft Fabric. This update can help you build more accurate machine learning models and generate better data insights.
If you want to dive deep into this new Synapse Data Science feature, read the documentation provided by Microsoft here.
Synapse Real-time Analytics updates
Creating a continuous data connection from Event Hubs to the KQL database
The real-time analytics experience and KQL Database have been enhanced with the possibility to stream data from Azure Event Hubs, a data streaming platform and event ingestion service, directly into your Fabric KQL Database.
Such improvements can help transform data more effectively (reduced latency), unify data management, increase data security, and facilitate data migration and data integration for comprehensive data analytics.
To take advantage of this enhancement, you need to:
- Create a Microsoft Fabric platform-based data connection to a certain event hub instance,
- Connect it to a KQL database.
Remember, though, that to create a cloud connection, you need to start by setting a shared access policy (SAS) on the event hub and gathering the details to be used later when setting up the connection.
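One convenient way to gather those details is to read them from the event hub's connection string, which combines the namespace endpoint, the policy name, the key, and the event hub name. The sketch below is a minimal Python parser. The parse_connection_string helper and all values in the example are placeholders, and which fields the Fabric connection dialog actually asks for is an assumption.

```python
def parse_connection_string(connection_string):
    """Split 'Key=Value;Key=Value' pairs into a dict, splitting on the first '=' only."""
    parts = {}
    for segment in connection_string.split(";"):
        if not segment.strip():
            continue
        # Keys are often base64 and may end with '=', so split only once.
        key, _, value = segment.partition("=")
        parts[key.strip()] = value.strip()
    return parts


# Placeholder values only; a real key comes from the event hub's shared access policy.
example = (
    "Endpoint=sb://my-namespace.servicebus.windows.net/;"
    "SharedAccessKeyName=fabric-listen;"
    "SharedAccessKey=PLACEHOLDERKEY=;"
    "EntityPath=my-event-hub"
)
details = parse_connection_string(example)
```

Here, details["SharedAccessKeyName"] holds the policy name and details["EntityPath"] holds the event hub name. Treat the key itself as a secret and avoid storing it in notebooks or source control.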
Querying the KQL database from MATLAB
Another useful update in real-time analytics and KQL databases is the possibility to query KQL database data directly from MATLAB.
Therefore, if you are familiar with MATLAB, you no longer need to learn KQL to query the database. Instead, you can leverage your MATLAB skills to query the data and get started with Microsoft Fabric.
Microsoft Fabric Data Factory Updates
Power Query Editor
There have also been enhancements to the Power Query Editor, familiar to all Power BI report creators, such as:
- Auto-fix for column names and data types in the column mapping dialog (for columns from data sources),
- Possibility to rename a dataflow inside the Power Query Editor.
Data pipeline activities
Within Microsoft Fabric Data Factory, the data pipelines also have new features, such as:
- Teams activity, available in Data Factory pipelines, to send a customized message to a Teams channel or a group chat,
- Possibility to parametrize your Fabric artifacts in your data pipelines to use expressions, functions, parameters, and variables,
- Enhanced performance when copying parquet files to your Lakehouse,
- A “Save as” option for data pipelines, available directly from your Data Factory workspace, which allows you to easily build upon or edit existing pipelines without completely rebuilding them,
- Support for column mapping for the Lakehouse destination in the Copy activity.
Microsoft Fabric Community Updates
Implementing a Lakehouse with Microsoft Fabric Courses
A new course has appeared in the Microsoft Learn space. It can help you build your foundational skills in data engineering and data analytics, as it focuses on the Lakehouse concept.
This course will help you familiarize yourself with the capabilities of Apache Spark for data processing and data integration, and with the essential techniques for data versioning and reliability by working with Delta Lake tables. It also covers data ingestion and orchestration using Dataflow Gen2 and Data Factory pipelines.
You can check it out here.
Filtering by product for User Groups and Events
Another major Community change is the possibility to filter by product for both Fabric User Groups and Events on the Fabric Community site. It allows you to easily find User Groups and Events that match your interests.
All in all, there have been quite a lot of useful enhancements to Microsoft Fabric in Synapse Data Warehouse, Synapse Data Science, Synapse Data Engineering, Synapse Real-time Analytics, and the Microsoft Fabric Community.
With such improvements to the components of Microsoft Fabric, it becomes even more useful for data professionals. All of them can help to increase the efficiency of data analytics services of the platform, govern organization-wide data mesh more effectively, facilitate data integration, enhance user experience, and take a step forward in the direction of data democratization.
It is also worth mentioning that Microsoft Fabric offers updates in Power BI as well, which you can check here. There have also been improvements in the brand-new Microsoft Fabric Data Activator.
Data Activator is a data-event trigger service that is currently available only in limited public preview. If you are eager to explore it, you can sign up here and help Microsoft enhance this exciting new feature.