Power BI metadata contains information on the underlying datasources referenced along with all relevant report design and instructions used to generate report contents. This metadata also contains source data refresh rates, automated report distribution schedules and much more. It is the glue that binds the entire Power BI reporting solution together.
Overview of Power BI system and features
For a Power BI developer or administrator, arguably considered the most important metadata is the inter-relationships between individual Power BI reports and the underlying datasources they are built on. Whenever a change is planned in these datasource (file) structures or database models, this metadata answers the critical question “which of my Power BI reports may be affected by these (datasource) changes?”
A SQL Server database contains this metadata in a table that was originally designed for use with SSRS, and extended to support Power BI. The following SELECT statement can be used to extract this information from the Microsoft Reporting Services SQL database:
WHEN 2 THEN 'SSRS Report'
WHEN 5 THEN 'DataSource'
WHEN 7 THEN 'Report Component'
WHEN 8 THEN 'Shared Dataset'
When 13 Then 'Power BI Report'
ELSE 'Other - not applicable'
END AS TypeDescription
,CONVERT(varbinary(max),Content) AS Content
WHERE Type in (2,5,7,8,13)
However, this same method does not work for Power BI metadata stored in the same catalog.content field. Reason being, the encoded Power BI metadata is actually the entire .pbit archive comprised of several compressed XML documents and folders. Essentially it is an encoded binary file.
To extract this encoded data for Power BI content, you must use a programming language such as .net to detect and decode the Bit Order Marker and stream the ( essentially zip file ) contents into memory or write to disk for processing. Below are C#.net code snippets to accomplish this:
, convert(varbinary(max), c.content) as RDL
FROM Catalog c
WHERE c.Content IS NOT NULL
AND c.Type IN (2,13,5,8,7)
We stream this information into a DataTable object. Then for each row we write the content of the report out to a file with this C#.net code.
string extension = ( see explanation below. use c.type to derive )
string filename = System.IO.Path.GetTempPath() + System.Guid.NewGuid().ToString("N") + extension;
using (var stream = System.IO.File.Create(filename))
byte content = (byte)row["RDL"];
stream.Write(content, 0, content.Length);
This also works for SSRS types stored in the SQL database Report Server catalog as well (reports, report components, datasource and shared dataset definitions). When writing the files, simply append the appropriate filename extension for each file based on the catalog.type value. For example, for “SSRS reports” catalog.type = 2, append extension = “.rdl” to the filename. For “Power BI reports” catalog.type = 13, append extension = “.pbix” to the filename, and for the remainder of types, use extension = “.xml”.
For the Power BI report, once written to disk as a decoded .pbix file, you can then open the archive file ( if renamed to .zip for example ) to easily extract the Power BI metadata file contents. This will include:
/report/layout – json Power BI report contents
datamashup – the raw data used by the report, still compressed and encoded
… plus a number of other files used by Power BI
The embeded layout file contains Power BI metadata in a json text file. Note that some of the other .pbix file contents are themselves archive binaries and would need to be decoded. For example the datamashup file is itself an archive with an encoding wrapper that contains some block header & footer information that would need to be stripped off in order to extract the archive file contents.
Members of our R&D team, Gordon, Matteo, Ibrahim and Madel, are adding integrated Power BI search, documentation and change impact analysis support to our Dev Surge 365 Enterprise edition software, with an anticipated availability in early 2018. Find it EZ, change made simple.™
Three inter-related layers or processes in a robust BI analytics solution can include:
a Symantic (meta-data) layer;
Extract, Transform, Load (ETL) jobs.
Let’s take a closer look at each of these and how small changes within any one of them can affect your critical business reports at the top of your BI stack.
(1) Semantic layer – an abstracted ( meta-data ) layer that contains simplified business views mapped to underlying relational database models and data warehouses. This layer is a pre-requisite for some WSYWIG, drag and drop design reporting tools.
A semantic layer is a business representation of corporate data that helps non-technical end users access data using common business terms. The semantic layer is configured by a person who has knowledge of both the data store and the reporting needs of the business. The person creating the semantic layer chooses to expose appropriate fields in the data store as “business fields,” and to hide any fields that are not relevant. Each business field is given a friendly, meaningful name, and the business fields are organized in a way that will make sense to business users.
By using common business terms, rather than data language, to access, manipulate, and organize information, a semantic layer simplifies the complexity of business data. This is claimed to be core business intelligence (BI) technology that frees users from IT while ensuring correct results when creating and analysing their own reports & dashboards. In reality, however, IT staff are typically needed to create and maintain the semantic layer ( business view definitions ) which maps tables to classes and columns to objects.
(2) OLAP cubes – the arrangement of data into Cubes overcomes a limitation of relational databases, which are not well suited for near instantaneous analysis and display of large amounts of data. OLAP data is typically stored in a star schema or snowflake schema within a relational data warehouse or in a special-purpose data management system.
An OLAP cube is a term that typically refers to multi-dimensional array of data. OLAP is an acronym for online analytical processing, which is a computer-based technique of analyzing data to look for insights. A cube can be considered a multi-dimensional generalization of a two- or three-dimensional spreadsheet. Each cell of the cube holds a number that represents some measure of the business, such as sales, profits, expenses, budget and forecast. A Slice represents two-dimensional view of an OLAP Cube that arranges data in a grid, similar to a spreadsheet; a Slice functions much like a report or a query in an RDBMS in that it returns data based on a request for what to see
Although many report-writing tools exist for relational databases, these are slow when the whole database must be summarized, and present great difficulties when users wish to re-orient reports or analyses according to different, multidimensional perspectives, aka, Slices. The use of Cubes facilitate this kind of fast end-user interaction with data. Some reporting tools like Crystal Reports include the ability to define simple two-dimentional OLAP cubes for generating graphs for example.
(3) ETL Software – import clean data from virtually any source into permanent, structured relational database tables.
ALSO CALLED: Data Extraction Software, Database Extraction Software
DEFINITION: In managing databases, ETL refers to three separate functions combined into a single programming tool: Extract, Transform, Load
First, the extract function reads data from a specified source database and extracts a desired subset of data.
Next, the transform function works with the acquired data – using rules or lookup tables, or creating combinations with other data – to convert it to the desired state.
Finally, the load function is used to write the resulting data (either all of the subset or just the changes) to a target database, which may or may not previously exist.
ETL can be used to acquire a temporary subset of data from virtually any file for reports or other purposes, or a more permanent data set may be acquired for other purposes such as: the population of a data mart or data warehouse; conversion from one database type to another; and the migration of data from one database or platform to another.
A change to the underlying source database, or at any one of these BI layers can affect the hundreds or in some organizations, thousands of reports that are dependent on the underlying data model. Something as simple as renaming a field or variable name can essentially break the chain, and stop reports from working or producing accurate results the business depends on.
Nowimagine a change impact analysis tool that can search your entire BI process flow to show you all affected references throughout your Business Intelligence stack.
Find it EZ – change made simple. Download your free trial today @ http://www.finditez.com
Inexperienced and/or over-eager technology workers often unintentionally increase the risk of software project failures due to three common mistakes:
Lack of transparency
One of the biggest challenge with each of these classic mistakes is they often go unnoticed, potentially even rewarded or encouraged by project leaders and project managers … until it is too late and the damage is done. But why and how does this happen so often in our industry?
Lack of transparency. This comes down to one thing. Communication. In over 25 years of experience, I have observed that this is often a skill that many software developers simply have to work on as it does not come naturally. Most software engineers are logical, introspective thinkers and very much focused on getting right to work problem solving. Generally, people of action and few words. The problem is, software developers often have a tendency to jump right in and start “doing” without sharing or vetting their ideas, or taking the time to research all viable options before starting to change code. Communication, and full transparency are critical for cost-effective design selection and productive development phases.
Over engineering. This is a personal pet peeve of mine because I have been burned so many times by this mistake, both as a developer and project manager! WAY TOO MUCH time can be spent on creating an elegant, clever, over-engineered solution, especially when a simple direct approach is available, can be a huge time waster … and is often very costly!
As problem solvers, we enjoy technical challenges and like to avoid having to revisit code by (over) engineering a solution trying to anticipate when end users will change their mind … um I mean requirements. But in doing so, we may fail to realize or consider the unnecessary cost and effort involved. Don’t fix what aint broke, yet!
Often, complex solutions that may be considered by the creator to be “brilliant” and “ultimately flexible” can be poorly documented and extremely difficult to understand or maintain. Worse still, after hours of attempting to make these over engineered solutions work, the effort can prove fruitless and abandoned as unfeasible. Use the KISS principle …
Shiny objects. In a word, distractions. In a software project, this typically manifests itself in scope creep. This, however, is scope creep that originates from within the development team … “hey, let’s do this too while we are in here”, or “wouldn’t this be cool?” These suggestions can often be encouraged and accepted by the Project Lead or Manager as a client “bonus” … as it comes from an enthusiatic developer team member who swears they can do it without affecting the delivery timeline. The PM may also accept these to get buy-in from the developer and to avoid discouraging this type of proactive behaviour for team building. However, these types of side-tracks, often combined with lack of transparency, can be a project killer. Stay on track, stay focused, don’t get tempted by shiny objects.
Change impact analysis tools by FinditEZ provide a full 360 degree view of your entire code stack. Produce more accurate estimates, identify the right resource skillsets required, quickly run “what if” change impact cost scenarios to confidently select the most cost effective option. Transparent, easily distributed change lists for clear and simple communication. Perfectly engineered solutions. Change made simple @ https://www.finditez.com
Are you an Epic Clarity change manager responsible for verifying, planning and preparing your healthcare organization for new release upgrades? Ever run into situations where the distributed Nova Release Notes were incomplete or inaccurate, resulting in a release going live with some unexpected issues you had not been prepared for?
Keep calm and finditez to fix this
To be fair, the Epic software is very complex and has a large team working hard to continually enhance their offering. It is not uncommon with delivery deadlines and last minute changes within a release that some updates don’t make it all the way back into the documentation that is distributed, despite all the best intentions and attention to detail.
Remember beaker from the muppets? He had that expression and expectation that every experiment ( Epic release ) would have some pain. It is not IF, but WHEN the pain would come. Just look at that hair and those eyes clearly expressing the stress and fear. From experience, it was inevitable that it would likely happen, again.
keep calm and trust @finditez to carry on!
But it doesn’t have to be that way. You don’t need your client group cringing everytime an Epic release change is implemented. They can, in fact, experience the positive emotion of “hey, that was seamless” … and these new features are fantastic!
It goes beyond just the Epic release and what the vendor can alert you to. What about all of your own custom programs, interfaces and reports built on top of the Epic SQL Server or Oracle data warehouse? You need a way to search not only the vendor code within the release, but also where and how changes to their underlying data model may affect your customized reports and programs.
Healthcare organizations like Upstate Univerisity, Aspirus, Stony Book Medicine, and Honor Health rely on our change impact analysis software to show them just that. How to avoid an Epic release failure as described our FinditEZ productivity software users:
With each Epic upgrade we are able to extract something called a compass comparison. This is a generated file that lists all database related changes such as deprecated tables, deprecated columns and data type changes, etc. We then import the search string into finditez and search the necessary locations. The locations we search may contain Epic supplied reports, custom reports and/or ETL programs. The automatically generated change impact lists are then exported to Excel or MS Project as tasks and distributed to our developers to continue processing.
Also, when an upgrade occurs we are given a new reporting content which contains all our reports, modified or not. Since we’ve found that there seems to be missing Nova Notes for some report changes, we use the FinditEZ database schema and file compare wizard to compare the new release data model, SQL code and reporting content to the production system. This allows us to verify the release notes with complete accuracy, as well as extend the analysis to our own custom code stack.
Aside from Epic upgrade, we also modify our own data warehouse, which we then have to search to find what reports, etc would be impacted by the changes made. Find it EZ currently allows searching of our Oracle databases, SQL Server databases, Crystal Reports, SQR, and C#, HTML and SAP Business Objects report schedules and documents.