A Best Practice Guide and Checklist for Power BI Projects

Paul Turley
Principal Consultant, Pragmatic Works
Microsoft Data Platform MVP
linkedin.com/in/pturley
SqlServerBi.blog
@Paul_Turley

Objective
sqlserverbi.blog/2019/08/24/power-bi-project-good-and-best-practices
• Set of guidelines that address many aspects of new projects.
• Recommendations are applicable in most use cases.
• Living document with ongoing revisions, with the goal of providing a comprehensive best practices guide as the platform continues to mature and as experts continue to use it.
This Session is: …about options and choices
…it is:
• NOT a 100-level introduction to Power BI
• NOT a technical “how-to” deep dive, but we will discuss some technical topics
• Guidelines for experienced Business Intelligence & Power BI practitioners
• Good knowledge for less-experienced Power BI developers
Solution Architecture
All Business Intelligence projects involve the same essential components, including:
• Source queries
• Data transformation steps
• Semantic data model
• Calculations (typically measures)
• Data visualizations

[Figure: solution scale options]
Composite models | Aggregations
Embedding service | B2B Sharing
Premium capacity | All employees
Certified datasets | Self-service reporting
Pro license | Share with small group
Desktop/Free service | Web sharing

Checklist: Identify Your Audience
Categorize the solution by identifying the author & user roles related to the project
Author roles:
☐ Author role: Business Data Analyst
☐ Author role: Skilled Data Modeler, Analyst, Data Scientist
☐ Author role: IT BI Developer

User roles:
☐ Users’ role: Report/Dashboard Consumer
☐ Users’ role: Self-service Report Author
☐ Users’ role: Advanced Data Analyst
Checklist: Solution Type for the project
Identify the Project Type & related Solution Architecture:
☐ Formal projects
   are scoped, funded, staffed and executed with the collaboration of a business champion and stakeholders, and of IT Business Intelligence developers and data managers. These projects promote business and IT-governed datasets and certified reports.
☐ Informal projects
   are executed by business users and are considered ad hoc in nature. Datasets are generally not IT governed, and reports are typically not certified.
☐ Hybrid projects
   can be anything in-between. They might be a user-authored report that uses a published, certified dataset for self-service reporting. Informal, self-service datasets can be migrated to governed datasets in collaborative IT/business projects.
Join The Separatist Movement
Checklist: Dataset & Report Architecture
Choose dataset architecture:
☐ Single PBIX file
   For a small group, departmental project authored by one developer for a limited group of users.
☐ Separate dataset and report PBIX
   Design & deploy a separate dataset PBIX file – from report file(s) – when the dataset should be branded as a Certified dataset.
   For formal projects with more than one dataset & report developer, to coordinate work.
☐ SSAS/AAS as a data modeling option
   when those databases exist or where IT operations insist on managing development and maintenance through integrated source control (e.g. Visual Studio Team Services & Azure DevOps).
Operational & Paginated Reports
• Power BI is not a replacement for paginated, operational reporting
• For static, multi-page, printable reports, use SQL Server Reporting Services (SSRS), aka “Paginated Reports”, instead of Power BI
• Paginated Reports/SSRS is integrated into the Power BI service with Premium capacity licensing and can be integrated with interactive Power BI reports and Power BI datasets
• To a limited degree, some operational reports can be reproduced using Power BI reports, and SSRS can be used, to a limited degree, to create interactive reports
If Users Need Excel, Give them Excel
• Teach analyst users how to use Excel with Power BI
• Don’t “export”, …“connect”
• “Analyze In Excel” allows Excel to connect, live, to a published Power BI dataset
• Available to Power BI Pro licensed users
• Now available to “free” licensed users in a Premium capacity
Checklist: Report Types
Dashboard & Scorecard style reporting
☐ Infographics
☐ KPIs & scorecards
☐ Segmented comparisons
☐ Time-series trends

Statistical & Scientific analysis
☐ Deviations & percentiles
☐ Forecast trends & predictions
☐ Scatter plots
☐ Population analysis

Financial balances & worksheets
☐ Cost accounting & balance sheets
☐ General ledger
☐ Accounts receivable & payable
☐ Invoices
☐ Forms & lists
Checklist: Query Optimization
Query source object (table, view, file) → Power Query query (“M” code produces folded source query) → Data model table (calculations performed in DAX)

☐ Decide: Perform column transformations in ETL, database view, Power Query or DAX?
☐ Decide: How is the process managed & governed? Who maintains the query?
☐ Avoid using SQL statements in Power Query queries. Use database views.
   Views and tables support query folding. SQL statements generally do not.
☐ Remove unnecessary columns & filter rows early in the query
☐ Consolidate field renaming, removing fields and data type changes
☐ Add custom columns in Power Query instead of calculated columns in DAX, where possible
☐ Use friendly field names for all fields that won’t be hidden in the data model
☐ Rename steps and add annotations in M script
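The friendly-name guideline can be prototyped outside Power BI before the renames are applied in Power Query. The following is only a rough Python sketch and is not part of the original slides; the function name `friendly_name` and the sample source column names are hypothetical.

```python
import re


def friendly_name(column: str) -> str:
    """Turn a technical column name into a short label with mixed case and spaces."""
    # Replace underscores and hyphens with spaces.
    text = re.sub(r"[_\-]+", " ", column)
    # Insert a space where a lowercase letter or digit is followed by an uppercase letter,
    # e.g. "CustomerKey" becomes "Customer Key".
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    # Capitalise each word; words that are already all upper case (acronyms) are kept.
    words = text.split()
    return " ".join(w if w.isupper() else w.capitalize() for w in words)


if __name__ == "__main__":
    for source_name in ("order_date", "CustomerKey", "sales_amount_USD"):
        print(source_name, "->", friendly_name(source_name))
```

A mapping produced this way could be reviewed with business users first, then applied in a single consolidated rename step in the query.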
Workspace and App Management
For a formal project, create the following workspaces:
☐ DEV Workspace - Only development team members need Contributor access to this workspace. This workspace does not need to have Premium capacity unless developers need to unit test incremental refresh or other Premium features.
☐ QA Workspace - All testers must have View access for testing and Contributor access for report authoring. Should be in Premium capacity to test incremental refresh.
☐ PROD Workspace - Omit the “PROD” designation in the name. The workspace name becomes the name of the published app that users will see in their Apps, Home and Favorite pages, so use a name that is simple and sensible. Must have Premium capacity to share the app with non-Pro licensed users.

Deployment Options:
• A PowerShell script may be used to publish datasets and reports, and to change dataset bindings. It is possible either to publish to a production workspace or to effectively move assets from one workspace to another. This approach is discussed briefly in the Power BI Enterprise Deployment Guide. Other approaches are discussed here: Power BI release management
• OneDrive folder sync for development workspaces (later slide)
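The DEV/QA/PROD layout described above can be written down as a small plan before any workspaces are created. This is a minimal Python sketch under stated assumptions: the `Workspace` class, the `plan_workspaces` function, the space-separated "DEV"/"QA" postfix and the example app name are all hypothetical, and nothing here calls the Power BI service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workspace:
    name: str      # name shown in the Power BI service
    stage: str     # "DEV", "QA" or "PROD"
    premium: str   # "optional", "recommended" or "required"


def plan_workspaces(app_name: str) -> list:
    """Return the three workspaces for a formal project, in promotion order."""
    return [
        # DEV: Premium only needed to unit test incremental refresh or other Premium features.
        Workspace(f"{app_name} DEV", "DEV", "optional"),
        # QA: should be in Premium capacity to test incremental refresh.
        Workspace(f"{app_name} QA", "QA", "recommended"),
        # PROD: no stage designation, because the workspace name becomes the app name.
        # Premium is required when the app is shared with non-Pro licensed users.
        Workspace(app_name, "PROD", "required"),
    ]


if __name__ == "__main__":
    for ws in plan_workspaces("Sales Analysis"):
        print(f"{ws.stage:<5} {ws.name:<20} Premium: {ws.premium}")
```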
Promote Self-service Reporting
Non-governed Data
• Teach & support analyst users to use Power BI to acquire, mashup & model data
• “make mistakes, get messy” – Lily Tomlin, Ms. Frizzle
• Deploy to “user” designated workspaces
• User-authored solutions can be used to prototype & pattern governed data models

Governed Data
• Separate datasets from reports
• Publish to a secured & managed workspace
• Promote & Certify datasets
• Use dataflows for standardized common data models
• Enable users to connect to published datasets & create their own reports
To The Cloud
Warning: Explicit Measures
Implicit and Explicit Measure Guidelines
Implicit measure = numeric field with default summarization
Explicit measure = defined using a DAX expression
• Implicit measures are typically OK in informal projects
• Measures should be explicitly defined in formal data models
• Implicit measures don’t work in some client tools
Certified & Shared Datasets
• Use dataset endorsement & certification in the service
• Certification can be managed by security group
• Access to datasets can be restricted to certified datasets
• Organization defines certification policy & provides documentation
Enterprise Scale Options
In many ways, Power BI has now surpassed the capabilities of SQL Server Analysis Services. Microsoft is investing in the enterprise capabilities of the Power BI platform by enhancing Power BI Premium capacity and adding Paginated Reports and features to support massive-scale, specialized use cases. Consider the present and planned capabilities of the Power BI platform before choosing another data modeling tool such as SSAS.
Resources:
https://sqlserverbi.blog/2018/07/27/power-bi-for-grownups
https://sqlserverbi.blog/2018/12/13/data-model-options-for-power-bi-solutions
Power BI Licensing Plan Checklist
Capacity and platform:
On-premises server:
☐ SQL Server Enterprise + SA, or:
☐ Premium license
Shared capacity service:
☐ Assign user licenses
Dedicated capacity:
☐ Are Premium features required?
☐ Is dedicated capacity needed?
☐ Is Premium more cost-effective than licensing all users?

Assign user licenses and access:
☐ Assign Pro licenses to all developers, admins and report author users
☐ If Premium, use app deployment & assign Free licenses to all users
☐ Assign membership and access to workspaces
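The cost-effectiveness question in this checklist comes down to simple arithmetic: with dedicated capacity, report consumers can use Free licenses, so the capacity cost is weighed against the Pro licenses those consumers would otherwise need. A minimal Python sketch follows. The function names are hypothetical, and the prices in the example are round placeholders rather than actual Microsoft pricing; substitute figures from a current price list.

```python
import math


def premium_is_cheaper(viewers: int, capacity_cost: float, pro_cost_per_user: float) -> bool:
    """True when dedicated capacity costs less than giving every report consumer a Pro license.

    Authors and admins need Pro licenses either way, so they cancel out of the comparison.
    """
    return capacity_cost < viewers * pro_cost_per_user


def breakeven_viewers(capacity_cost: float, pro_cost_per_user: float) -> int:
    """Number of report consumers at which both options cost the same, rounded up."""
    if capacity_cost < 0 or pro_cost_per_user <= 0:
        raise ValueError("costs must be positive")
    return math.ceil(capacity_cost / pro_cost_per_user)


if __name__ == "__main__":
    # Placeholder monthly prices, for illustration only.
    capacity, pro = 1000.0, 10.0
    print("Break-even consumers:", breakeven_viewers(capacity, pro))
    print("Premium cheaper for 150 consumers:", premium_is_cheaper(150, capacity, pro))
```

Cost is only one of the three questions in the checklist; Premium features or dedicated capacity may be required regardless of the break-even point.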
Managing Power BI Desktop Files
• Store in a centrally managed, network-accessible folder
   The storage folder should support automatic backup and recovery in the case of storage loss.
• Report and dataset developers must open files from the Windows file system
   Files must either reside in or be synchronized with the Windows file system.
• Files containing imported data typically range in size from 100 to 600 MB. Any shared folder synchronization or disaster recovery system should be designed to effectively handle multiple files of this size.

Options:
• OneDrive For Business (shared by team, with folder synchronization).
• SharePoint or SharePoint Online (with folder synchronization).
• GitHub and/or VSTS with local repository & folder synchronization. If used, Git must be configured for large file storage (LFS) if PBIX files are to be stored in the repository.
Folder & Workspace Synchronization
1. Create team site in Office 365, add developers
2. Create development folder in team site & synchronize with desktop
3. Create workspace(s) & set OneDrive group
4. Add PBIX files to workspace using Get Data from team OneDrive folder
5. Edit & save PBIX files. Deployment is automatic.

https://sqlserverbi.blog/2019/11/24/setting-up-power-bi-project-team-collaboration-version-control
File & Workspace Management Checklist
☐ Create storage locations and folder structure for Development file management:
   ☐ Development file storage
   ☐ Team member collaboration environment & processes
   ☐ Folder synchronization
☐ Define File naming standards
☐ Decide on dataset and report names
☐ Define the Version Control & Lifecycle Management:
   ☐ Postfix files with 3-part version number
   ☐ Remove version number from published files in QA and PROD
   ☐ Create Version History table in Power Query
   ☐ Increment version numbers in data model
   ☐ Backup PBIT files for archive
   ☐ Create measures: Last Refresh Date/Time
   ☐ Create measure: Current Version
   ☐ Add data model info page to report
☐ Decide on Workspace and App Management, workspace & app name, etc.:
   ☐ Create PROD workspace (omit PROD from name), assign dedicated capacity if available.
   ☐ Create QA workspace (postfix name with QA), assign dedicated capacity
   ☐ (optionally) Create DEV workspace (postfix name with DEV), dedicated capacity not required (or combine with QA workspace).
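The file-versioning items above (postfix a 3-part version number, then remove it from files published to QA and PROD) can be sketched as two small helpers. This is an illustrative Python sketch only: the slides do not specify a naming pattern, so the ` v1.2.3` style suffix, the part names `major`/`minor`/`build` and both function names are assumptions.

```python
import os
import re

# Assumed convention: "<name> v<major>.<minor>.<build>.pbix" (separator may be space, "_" or "-").
VERSION_SUFFIX = re.compile(r"[ _-]v?(\d+)\.(\d+)\.(\d+)$")


def strip_version(filename: str) -> str:
    """Return the file name to publish to QA/PROD, without the version postfix."""
    stem, ext = os.path.splitext(filename)
    return VERSION_SUFFIX.sub("", stem) + ext


def bump_version(filename: str, part: str = "build") -> str:
    """Return the next development file name, incrementing one part of the version."""
    stem, ext = os.path.splitext(filename)
    match = VERSION_SUFFIX.search(stem)
    if match is None:
        raise ValueError(f"no 3-part version number found in {filename!r}")
    major, minor, build = (int(group) for group in match.groups())
    if part == "major":
        major, minor, build = major + 1, 0, 0
    elif part == "minor":
        minor, build = minor + 1, 0
    elif part == "build":
        build += 1
    else:
        raise ValueError(f"unknown version part {part!r}")
    # Keep whatever separator (and optional "v") preceded the digits.
    prefix = stem[: match.start()]
    separator = match.group(0)[: match.start(1) - match.start()]
    return f"{prefix}{separator}{major}.{minor}.{build}{ext}"


if __name__ == "__main__":
    dev_file = "Sales Analysis v1.2.3.pbix"
    print(strip_version(dev_file))
    print(bump_version(dev_file, "minor"))
```

The same version string would typically also be recorded in the Version History table so that the Current Version measure matches the file name.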
Model Design Guidelines
• Dimensional design concepts haven’t changed in 20 years & are as true as ever
• Dimensional modeling “rules” should be followed but can be relaxed for Power BI in certain cases, such as:
   • Leaving some dimensional attributes in fact tables
   • Using natural keys rather than generating surrogate keys
• The art of dimensional modeling ranges from simple to complex. Start with the basics.
• Flattened “spreadsheet” models are OK for small, informal projects but have significant limitations
• As models grow in size & complexity, data quality challenges will surface that can be solved by implementing proper governance controls

The Kimball Method: https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques


Lawrence Corr, Model Storming Agile method: https://modelstorming.com/hierarchy-map
Model Design Checklist
☐ Model for the user experience, not for developer maintainability.
☐ Build star schemas
   Wherever possible, reshape data into fact and dimension tables with single key, one-to-many relationships from dimensions to fact.
☐ Enforce dimension key uniqueness
   Just because a key value “should” be unique, there is no guarantee that it will be unless enforced at the data source. Perform grouping and duplicate reduction in the data source views or Power Query queries to guarantee uniqueness. Duplicate record count checks and other mechanisms can be applied to audit source data for integrity, but do not allow the data model to violate these rules.
☐ Avoid bi-directional filters & unnecessary bridging tables
   These data modelling patterns adversely affect performance.
☐ Consider using DAX measures rather than complex & inefficient relationships
☐ Create custom columns in Power Query
   Rather than DAX calculated columns, wherever possible, for row-level derived columns. This maintains a consistent design pattern.
☐ Annotate code
   Use in-line comments and annotations in all code, including SQL, M and DAX, to explain calculation logic and provide author and revision information.
☐ Remove all unused fields – if in doubt, take it out
☐ Hide all fields not used directly by users
   Primary and foreign key columns, numeric columns used to create measures, and columns used to specify the sort order of other fields.
☐ Use friendly field names
   Rename all visible columns (in Power Query) to short but user-friendly names with mixed case and spaces.
☐ Set to Do Not Summarize
   Any non-hidden numeric columns that are not intended to roll up or summarize values. Columns set to summarize are indicated with a Sigma icon.
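The duplicate record count check mentioned under dimension key uniqueness can be illustrated with a short audit routine. This is a hedged Python sketch, not the method used in the slides: in practice the check would more likely live in a source view or a Power Query step, and the function names and the sample `Customer Key` column are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Audit step: return {key value: occurrence count} for every key that appears more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: count for value, count in counts.items() if count > 1}


def deduplicate(rows, key):
    """Duplicate reduction: keep the first row seen for each key value, preserving order."""
    first_seen = {}
    for row in rows:
        first_seen.setdefault(row[key], row)
    return list(first_seen.values())


if __name__ == "__main__":
    customers = [
        {"Customer Key": 1, "Customer": "Contoso"},
        {"Customer Key": 2, "Customer": "Fabrikam"},
        {"Customer Key": 2, "Customer": "Fabrikam Inc."},
    ]
    print("Duplicates:", duplicate_keys(customers, "Customer Key"))
    print("Rows after reduction:", len(deduplicate(customers, "Customer Key")))
```

Keeping the first row is an arbitrary choice made for the sketch; which duplicate to keep is a business rule that should be agreed with the data owner.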
Managing Dataset Size with Parameters
• Use parameters whether implementing incremental refresh or not
• RangeStart & RangeEnd parameters must be date/time type
• Apply range filter on date/time column in Power Query

*Incremental Refresh is a Premium feature
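The range filter is expressed in M inside Power Query; the Python sketch below only illustrates the logic. It assumes a half-open interval (on or after RangeStart, before RangeEnd) so that a row on a boundary belongs to exactly one range, which is an assumption of this sketch rather than something stated in the slides. The function name and the sample `Order Date` column are hypothetical.

```python
from datetime import datetime


def filter_range(rows, column, range_start, range_end):
    """Keep rows whose date/time value is >= range_start and < range_end."""
    # Mirror the guideline that both parameters must be date/time values.
    if not isinstance(range_start, datetime) or not isinstance(range_end, datetime):
        raise TypeError("RangeStart and RangeEnd must be date/time values")
    return [row for row in rows if range_start <= row[column] < range_end]


if __name__ == "__main__":
    facts = [
        {"Order Date": datetime(2019, 12, 31, 23, 59), "Amount": 10},
        {"Order Date": datetime(2020, 1, 1, 0, 0), "Amount": 20},
        {"Order Date": datetime(2021, 1, 1, 0, 0), "Amount": 30},
    ]
    kept = filter_range(facts, "Order Date", datetime(2020, 1, 1), datetime(2021, 1, 1))
    print([row["Amount"] for row in kept])
```

During development the two parameters can be set to a narrow window to keep the PBIX file small, then widened after the dataset is published.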


Training & Usability Support Plan Checklist
Training Guidelines:
For general best practice training, don’t reinvent the wheel. There are many good books and training programs
available that took several years to develop. Best practices continue to evolve quickly.
Promote and teach “your way” within your organization. Don’t just turn users loose with the tools and expect
them to make good decisions.

Training and Usability Support:


Develop & document a support & training plan for users:
☐ Usability training for read-only report/app users
☐ Self-service reporting for Novice Report Authors & Data Analysts
☐ Training for advanced analysts & developers

Choose or develop training platform & curriculum:
☐ Third-party training courses for developer orientation
☐ Use internal training & support to direct users to your solution
☐ Teach users to use governed datasets, standard or self-service reports
Master Project Preparation Checklist
Solution Audience:
☐ Categorize the solution by identifying the author & user roles related to the project:
   ☐ Author role: Business Data Analyst
   ☐ Author role: Skilled Data Modeler, Analyst, Data Scientist
   ☐ Author role: IT BI Developer
   ☐ Users’ role: Report/Dashboard Consumer
   ☐ Users’ role: Self-service Report Author
   ☐ Users’ role: Advanced Data Analyst

Training and Usability Support:
☐ Develop & document a support & training plan for users:
   ☐ Usability training for read-only report/app users
   ☐ Self-service reporting for Novice Report Authors & Data Analysts

Solution Type & Architecture:
☐ Identify the Solution Type for the project. This will guide other project management designs:
☐ Design single PBIX file for small group, departmental project authored by one developer for a limited group of users
☐ Design & deploy a separate dataset PBIX file – from report file(s) – when the dataset should be branded as a Certified dataset
☐ Design separate dataset and report PBIX files for formal projects with more than one dataset & report developer, to coordinate work
☐ Use SSAS/AAS as a data modeling option when those databases exist or where IT operations insist on managing development and maintenance through integrated source control (e.g. Visual Studio Team Services & Azure DevOps)
☐ Identify the Project Type & related Solution Architecture:
   ☐ Project type: Formal project
   ☐ Project type: Informal project
   ☐ Project type: Hybrid project
   ☐ Architectural approach: Single PBIX
   ☐ Architectural approach: Separate dataset and report PBIX
   ☐ Architectural approach: Report PBIX connected to SSAS or AAS
☐ Understand DirectQuery model trade-offs and special use cases. Avoid if possible.
☐ Define your Release Management, DevOps & Automation strategy (if any; it might be OK to deploy files manually, to automate or not to automate)

File & Workspace Management:
☐ Create storage locations and folder structure for Development file management:
   ☐ Development file storage
   ☐ Team member collaboration environment & processes
   ☐ Folder synchronization
☐ Define File naming standards
☐ Decide on dataset and report names
☐ Define the Version Control & Lifecycle Management:
   ☐ Postfix files with 3-part version number
   ☐ Remove version number from published files in QA and PROD
   ☐ Create Version History table in Power Query
   ☐ Increment version numbers in data model
   ☐ Backup PBIT files for archive
   ☐ Create measures: Last Refresh Date/Time
   ☐ Create measure: Current Version
   ☐ Add data model info page to report
☐ Decide on Workspace and App Management, workspace & app name, etc.:
   ☐ Create PROD workspace (omit PROD from name), assign dedicated capacity if available.
   ☐ Create QA workspace (postfix name with QA), assign dedicated capacity
   ☐ (optionally) Create DEV workspace (postfix name with DEV), dedicated capacity not required (or combine with QA workspace).

Assign licenses and access:
☐ Assign Pro licenses to all developers, admins and report author users (QA?)
☐ Assign Free licenses to all users if Premium/app deployment will be used
☐ Assign membership and access to workspaces

Query Design:
☐ Create fact date range filter parameters: RangeStart & RangeEnd, to reduce data volume and keep the PBIX file under 400 MB.
☐ Filter large fact tables with range filters; consider incremental refresh policies if slow and/or over 800 MB compressed.
☐ Design source queries (T-SQL?) to reshape source data into conformed dimension & fact tables
☐ Create views in database for each dimension and fact
☐ Enforce key uniqueness to remove all duplicate keys from all dimension tables
☐ Query Date dim/lookup table at source if it exists
☐ If not available, generate Date dim/lookup table in Power Query

Data modeling:
☐ Build star schemas
☐ Enforce dimension key uniqueness
☐ Avoid bi-directional filters & unnecessary bridging tables
☐ Consider using DAX measures rather than complex & inefficient relationships
☐ Create custom columns in Power Query
☐ Annotate code
☐ Hide all fields not used directly by users
☐ Use friendly field names
☐ Set to Do Not Summarize
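Where no Date dimension exists at the source, the checklist calls for generating one in Power Query. The shape of such a table can be sketched as follows. This is an illustrative Python version, not the M query itself, and the function name and the chosen columns (`Year`, `Quarter`, `Month Number`, `Year Month`) are assumptions. The date itself serves as the natural key, with one row per calendar day and no gaps.

```python
from datetime import date, timedelta


def build_date_dimension(start: date, end: date) -> list:
    """Return one row per calendar day from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    rows = []
    current = start
    while current <= end:
        rows.append({
            "Date": current,                                   # natural key, unique per row
            "Year": current.year,
            "Quarter": (current.month - 1) // 3 + 1,           # 1 to 4
            "Month Number": current.month,
            "Year Month": f"{current.year}-{current.month:02d}",
        })
        current += timedelta(days=1)
    return rows


if __name__ == "__main__":
    dim = build_date_dimension(date(2020, 1, 1), date(2020, 12, 31))
    print(len(dim), "rows; first:", dim[0]["Date"], "last:", dim[-1]["Date"])
```

Covering whole calendar years that span the fact data keeps time-intelligence calculations predictable.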
linkedin.com/in/pturley

SqlServerBi.blog
Paul Turley
@Paul_Turley

Please connect with me using one of these mediums
