Configuring the Celery Kubernetes Executor for Airflow 2.0
Share this post

Introduction to Power BI

Power BI is a complete reporting solution that is used in many organisations to produce visually stunning reports. Once reports are published to Power BI service, they are available to business users and help them to perform business analysis on many crucial business problems. Power BI allows users to easily perform descriptive analytics, e.g., analyse historical data to understand trends and patterns.

Building a report involves a few steps which complexity is case dependent. First, a developer should determine what data to ingest and build a model from it. Once a model is ready, a developer can start telling a story with data by means of a great choice of visuals. One has to point out that during report development one usually iterates between building model and building visuals until satisfactory report responsiveness is achieved.

Model design

A developer can perform data integration from various sources using Power Query. It supports many sources like SharePoint, Azure SQL server or Databricks. The full list can be found here. Data from different sources can be used within one report as one or more tables. If additional transformations are required on the top of loaded data they can be performed with Power Query in low code/no code way:

 

On top menu possible transformations are shown (tabs Home, Transform, Add Column). On the left one can see list of all Power Query queries in a report. Section Applied Steps lists all steps for query Product. Each query result is a scalar or a table. In the following example query Products is a table and its preview can be seen in the middle of the figure.

In the most simple scenario a developer creates a model as one table. This approach will work for very small reports with very low volume of data. However, in most cases a developer has to build a star model with one or more fact tables like this:

 

In this example table Sales is a fact table and stores business information which will be later aggregated in report’s visuals. The fact table is surrounded by dimension tables which are used in report as filters and define aggregation in visuals. Most common connection between tables is 1-* which is also called ‘one to many’. Symbol 1 is on the side of dimension table and * in fact table which means that tables are linked by columns for which there are only unique values in dimension table but they can repeat in a fact table. Almost all connections are defined by solid lines which means that they are active and will be used in the visuals (unless it is altered by a measure). In the following example table date is a role playing dimension meaning that its Date column can be ship date or order date depending on the situation. Dashed line in diagram indicates that by default connection is inactive and has to be enabled on the level of a measure.

In the example above only one fact table is present in the model. In a more general case there can be more than one fact table connected with each other via one or more dimension tables.

Most often a developer creates tables in Import mode which guarantees the best performance. In case volume of data is large. e.g. few hundred million records or more or a business requirement is to access in near real time the most optimal storage mode is direct query. With the recent release of MS Fabric platform new powerful storage mode has come to play: Direct Lake. In Direct Lake a report connects directly to Parquet columnar based files in Lakehouse giving the performance of import and data freshness of direct query.

Data Security

When working with data in the corporate world security is usually one of the most important business requirements. Power BI developers can secure their reports by sharing them with only required users. Usually access to reports is implemented by security groups and Azure Active Directory. In some cases an access should still be narrowed down and users should see only part of a report. This can be achieved by row level security (RLS) roles. For example, once RLS is set up one user can see only data for Europe and another can see only data for USA. Roles can be defined in Power BI desktop using manage roles:

 

Data Visualisation

Once model is built and secured a developer can start to fill visuals with data.

Report

Report is the most common way a developer visualizes data. A developer can choose amongst many powerful visuals:

One can use line and bar charts to show categorical information or time series information. In almost every report developer uses filter visual which makes reports dynamic. Tables can be shown by table or matrix visuals. There are also many specialised visuals used on various situations. For example, for KPI management card visual can be used:

Once a developer spends reasonable amount of time a report page can look like this:

Within reports one can create many pages. In order to share report with users a developer publishes it to Power BI service.

Dashboard

In order to summarize reports in one-pager one can use interactive dashboard in Power BI service :

All visuals can come from one or more reports. Visuals pinned to dashboard show the current state of visuals from parent reports.

Mobile BI

Usually users consume reports on big screens, e.g. 14”>=. However, there is also a possibility to consume them  using smartphones:

One has to keep in mind that mobile view is not created automatically and a developer must create it beforehand from selected report visuals. Also, special app should be installed in order to allow consumption of reports on a smartphone.

Power BI App

Usually users need to have access to many reports. Each report has its own link. In order to reduce a number of links a user has to handle a developer can group reports into Power BI app:

 

Another advantage of Power BI App is that it separates development and production environments for reports: changes in report visuals are visible for users only after app is updated. This can be quite useful if a developer wants to discuss with selected business user a new visual in a report but don’t want all users to see it in such an early stage.

CI/CD for Power BI reports

Power Bi developers can use versioning systems like git and CI/CD approach to improve quality of reports.

Recently a new feature has been added to Power BI desktop which allows to save a report as individual plain text files  in intuitive structure folder. In order to do so a developer should choose Power BI project (PBIP) option when saving:

Once a report is saved as PBIP it is stored as hierarchical folder structure which contains both semantic model and report:

A developer can create Azure DEV OPS repo and commit every change of a report to it. In Fabric enabled workspace one can sync report item with repo

A developer can push local changes in PBIP to Azure DEV OPS git and then visit Fabric workspace and sync it with repo. A developer can switch between git branches in Fabric workspace if necessary.

In order to follow continues integration and continues deployment (CI/CD) approach a developer can create deployment pipeline with three environments like this:

For each environment a different Power BI premium or Fabric workspace is used. The following picture shows example of such pipeline:

Usually for development environment only sample data is loaded to reports and all data to test and production environments. When a new feature is added to development environment it can be deployed to Test environment and once it is tested it can be deployed to Production environment. In case of Production environment a developer can use Power BI app to share reports with users.

Author

  • Krzysztof Dymkowski

    Krzysztof is a seasoned BI developer with over 5 years of experience. Mostly using Microsoft stack like Power Bi and Azure but also has some experience with Databricks and Python. Focused on solving performance problems in reports and building scalable data models for reporting. He enjoys running in forest and playing Total War Warhammer III.

    View all posts
Share this post

Krzysztof Dymkowski

Krzysztof is a seasoned BI developer with over 5 years of experience. Mostly using Microsoft stack like Power Bi and Azure but also has some experience with Databricks and Python. Focused on solving performance problems in reports and building scalable data models for reporting. He enjoys running in forest and playing Total War Warhammer III.