Thursday, 31 October 2019

Databases in DevOps - Build Pipeline

Case
Before we can deploy our database project through the DTAP environment, we need to build it to check whether the merged code of all developers still works. The result of the build is a dacpac file, which can then be used for the deployment. But what is a dacpac file and how do we create it in Azure DevOps?
Build Pipeline

Solution
First, a dacpac (Data-tier Application Component Package) is a single file containing a database model that matches the database project in Visual Studio. It contains all the information needed to create the database. You can rename the file from .dacpac to .zip to see its content.


1) Create empty build pipeline
There are several pipeline templates that you can use, but for this example we will start with an empty build pipeline.
  • Go to Pipelines (Builds) in the left menu
  • Click on New to create a New Build Pipeline
  • Use the classic editor link at the bottom
  • Select Azure Repos Git as the source
  • Then select the Team project, Repository and the default branch
  • Next choose an Empty job
Create empty Build Pipeline

Now give the new build pipeline a useful name. Next, choose the Agent pool. For this example we will use the default Microsoft-hosted pool: Azure Pipelines. The agent for this pipeline will be 'vs2017-win2016'.

2) Add trigger
To 'Enable continuous integration' for the database project we need to add a trigger. For this example we will use a branch filter on 'master' and a path filter on the path of this database project in Git. This trigger will automatically start a build of the database project when someone changes code in the master branch for this particular project (ideally via a pull request from a feature branch to the master branch).
  • Go to the Triggers tab
  • Enable continuous integration by checking the checkbox
  • First set the branch filter to: Include and master (or the branch you want to use)
  • Click on + Add below Path filters
  • Set the path filter to: include and the path of your project in Git: DB/HST/HST_BB/*
Add trigger

3) Builds tasks: Build solution
Now we need to add tasks to the Agent job generated in step 1. The first task is MSBuild, which will build the Visual Studio database project. Once it succeeds, it generates a dacpac file that will be used by the next tasks.
  • Click on Agent job 1 and optionally change its name
  • Click on the + icon to add a task to the Agent job
  • Search for msbuild in the search box
  • Select MSBuild (Build with MSBuild) and click Add
  • The only property you need to set is the file path of the .sqlproj file in Git: DB/HST/HST_BB/HST_BB.sqlproj
Add MSBuild task

4) Builds tasks: Copy dacpac and publish files
With this task we copy the result of the build task (the dacpac file) to the Artifact Staging Directory. We need to specify which files we want to copy and to which folder. For the target folder we use a predefined variable pointing to the right folder, and for the files we use a wildcard filter. Note that we copy not only the dacpac file, but also the publish profile XML file.

  • Click on Agent job 1 (or the new name you provided in the previous step)
  • Click on the + icon to add a second task to the Agent job
  • Search for 'copy files' in the search box
  • Select Copy files (Copy files from .... folder paths) and click Add
  • In the contents field add two rows:
    • **\*.dacpac
    • **\*.publish.xml
  • As target folder add: $(Build.ArtifactStagingDirectory)
Copy dacpac to Artifact Staging folder

5) Builds tasks: Publish build artifact
The last step of the build pipeline is to publish all files in the Artifact Staging Directory and give them an Artifact Name. This Artifact Name will be used in the Release pipeline which we will explain in a subsequent post.
  • Click on Agent job 1 (or the new name)
  • Click on the + icon to add a third task to the Agent job
  • Search for 'Publish build artifact' in the search box
  • Select Publish build artifact (Publish build .... file share) and click Add
  • Give the artifact a new name (instead of drop)
Publish the artifacts

Note: in this case the artifact is a dacpac file, but it could also be an ispac file for SSIS or an assembly for C#.

6) Test the pipeline
Now it's time to test your build pipeline by committing some changes to your master branch via Visual Studio (or by creating a pull request to pull changes into your master branch). This will trigger an automatic build. If you don't want to make any changes at this moment, you can just click on (Save &) Queue in DevOps to start a build manually.
Dry run your build pipeline

Summary
In this post you saw how to build your database project automatically when someone changes something in the master branch (for example via a pull request). You don't have to check the result manually if you set up notifications in DevOps; you will then receive an email when a build succeeds or fails.

Besides the master branch you could also build other branches, like feature or personal branches, by using different filters. Or you could schedule your build to run each night. After setting up this basic example, explore and try out all the other options of the build pipeline. In a follow-up post we will show you how to deploy this built artifact through your DTAP environment using Continuous Deployment.

Monday, 21 October 2019

Flow Snack: Refresh Power BI

Case
Exciting news from the Microsoft Power Platform: you can now use Microsoft Flow to refresh a Power BI dataset. How does it work and why would you want to use it?
Refresh a Power BI dataset

Solution
Until now you could schedule your refresh or do a manual refresh in Power BI, but now you can also use Flow to trigger a refresh when new data arrives, for example when a new source file arrives in OneDrive, DropBox, SharePoint or an Azure Blob Storage container. The list of triggers in Flow is nearly endless and the number of possible sources in Power BI is growing steadily, so most likely one of these sources fits your needs.


1) Choose your trigger
First choose a trigger to refresh your dataset. In this example the trigger is 'When a blob is added or modified' in the Azure Blob Storage container.
Trigger when new data arrives in a Blob Storage container

2) Search for Power BI
Add a new action and search for Power BI and select Refresh a dataset. Note that it is still a Preview version and could change.
Refresh a Power BI dataset

3) Choose Workspace and Dataset
Now you must select your workspace and then the dataset in that workspace. After that, save your Flow, check whether it is turned on, and then start testing it by adding new files.
Select Workspace and Dataset that needs to be refreshed

Summary
An exciting new feature where you can use all kinds of triggers to refresh your Power BI dataset. Note that the refresh limitations still apply: for Pro you have a maximum of 8 refreshes a day, and that includes any scheduled refreshes.


Thursday, 12 September 2019

Databases in DevOps - Pre-Deployment scripts

Case
I'm using database projects combined with Azure DevOps CI/CD and I want to deploy data for some reference tables through my DTAP environment. Another reason could be that you want to do some data preparations before deploying all database changes. How do I do that in Visual Studio database projects?
Adding Pre- and Post-Deployment scripts

Solution
This is where Pre-Deployment and Post-Deployment scripts can help you out. It's all in the name: a Pre-Deployment script is executed before deploying the database project and a Post-Deployment script is executed after that deployment. Both are just regular .sql files containing T-SQL code.

For example, a Pre-Deployment script can be used to preserve data in a temporary table before a table change, and a Post-Deployment script can then be used to copy the data back to the appropriate tables and clean up the temporary table afterwards.
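
As an illustration, here is a minimal sketch of that pattern (the [dbo].[Customer] table, its columns and the [dbo].[Customer_Backup] holding table are made-up names, so adjust them to your own model):

-- Pre-Deployment script: preserve the data before the table change
-- (section 2 below shows how to make this repeatable)
SELECT [CustomerId], [Name]
INTO   [dbo].[Customer_Backup]
FROM   [dbo].[Customer]
GO

-- Post-Deployment script: copy the data back and clean up the holding table
INSERT INTO [dbo].[Customer] ([CustomerId], [Name])
SELECT [CustomerId], [Name]
FROM   [dbo].[Customer_Backup]
GO
DROP TABLE [dbo].[Customer_Backup]
GO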

Deploying data through the DTAP environments is nothing more than creating a delete statement and some subsequent insert queries. If the reference table already exists you could use a Pre-Deployment script; otherwise you should use a Post-Deployment script.

1) Creating a deployment script
To keep our database project clean we will first create a folder called 'Scripts' to store the Pre-Deployment and Post-Deployment scripts. If you have a lot of scripts you could also create two separate folders.

To add scripts to your project, right-click the new folder in your project and choose: Add, New Item..., User Scripts and then Pre-Deployment Script or Post-Deployment Script. Then give the file a suitable name.
Adding Pre- or Post-Deployment scripts

Note: check the Build Action file property of the deployment script. This is where the purpose and execution moment of a regular .sql file is determined. However, you can have only one Pre-Deployment and one Post-Deployment file per project. There is a workaround below if you need multiple files.

2) Adding code
Now you can add your T-SQL code, but make sure it is repeatable (in case of a crash or when deploying multiple times). So if you want to create a temporary table, make sure it doesn't already exist by adding a drop-if-exists construction before the create statement.
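
Applied to the Pre-Deployment sketch from the solution above, such a guard could look like this (again with made-up object names):

-- Drop the holding table first if it still exists from a previous (failed) deployment
IF OBJECT_ID('[dbo].[Customer_Backup]', 'U') IS NOT NULL
    DROP TABLE [dbo].[Customer_Backup]
GO

-- Now the script can be executed multiple times without errors
SELECT [CustomerId], [Name]
INTO   [dbo].[Customer_Backup]
FROM   [dbo].[Customer]
GO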

Below is a script that pushes data through the DTAP environment for a reference table.
/*
Post-Deployment Script Template       
--------------------------------------------------------------------------------------
 This file contains SQL statements that will be appended to the build script.  
 Use SQLCMD syntax to include a file in the post-deployment script.   
 Example:      :r .\myfile.sql        
 Use SQLCMD syntax to reference a variable in the post-deployment script.  
 Example:      :setvar TableName MyTable       
               SELECT * FROM [$(TableName)]     
--------------------------------------------------------------------------------------
*/

DELETE FROM [ref].[myTable]
GO
INSERT [ref].[myTable] ([Column1], [Column2], [Column3]) VALUES ('bla 1', 'bla 2', 'bla 3')
GO
INSERT [ref].[myTable] ([Column1], [Column2], [Column3]) VALUES ('bla a', 'bla b', 'bla c')
GO

If you have a big deployment script and you want to divide the code over multiple files, you have to create one 'master' script that calls the other 'child' scripts, because you can only have one Pre-Deployment and one Post-Deployment script in a database project. For this example we added multiple child scripts in the same folder as the parent script and called them by adding ":r .\" in front of the file names.
/*
Post-Deployment Script Template       
--------------------------------------------------------------------------------------
 This file contains SQL statements that will be appended to the build script.  
 Use SQLCMD syntax to include a file in the post-deployment script.   
 Example:      :r .\myfile.sql        
 Use SQLCMD syntax to reference a variable in the post-deployment script.  
 Example:      :setvar TableName MyTable       
               SELECT * FROM [$(TableName)] 
      
There can be only one post-deployment script per database! (One script to rule them all...)
--------------------------------------------------------------------------------------
*/

:r .\PostDeploymentFile1.sql
:r .\PostDeploymentFile2.sql
:r .\PostDeploymentFile3.sql

Conclusion
In this post you learned how to add scripts that run before and after the general deployment of the database project; a good way to either add some reference data or to do some cleanup before you deploy your changes. Next, take a look at the publish profile file for your SQL database project.






Saturday, 31 August 2019

Databases in DevOps - Introduction

Case
I want my Data Warehouse databases in Azure DevOps just like my ETL projects, and use Continuous Integration (CI) and Continuous Delivery (CD) for multiple environments. How do you do that?
SQL Database in DevOps

Solution
If you read a couple of forums and blogs about this subject then you will probably notice that 'all roads lead to Rome'. In this blog series we will explain our approach, but feel free to deviate.

Just for the record: with Continuous Integration (CI) we mean merging the code of all developers and building the project to check whether everything still works. Continuous Delivery (CD) orchestrates the deployment of the package created by CI throughout the DTAP environment in an automated manner, to reduce costs and speed up the release process.

For this example we work with four environments, each with its own Azure SQL database: Development, Test, Acceptance and Production. The Development database is a shared database where all developers do their coding.

The database changes since the last compare are pushed to a Visual Studio database project, which in turn is committed to Git. After that the project is built in DevOps and then deployed through all environments.
DTAP environments

Note: This approach with a single shared development database only works for smaller teams or when the database work can be separated to avoid conflicts. The alternative is to give each developer their own local development database. This will take a little more syncing and merging effort though.

1) Visual Studio Database project
First, you need either Visual Studio 2017 or 2019. You have three options:
  1. SSDT standalone installer. This is Visual Studio 2017 with only the BI and DB project templates.
  2. Visual Studio 2017 (Community Edition is free). During installation locate Data storage and processing and select SQL Server Data Tools. Finding a download without an MSDN license is a bit tricky since 2019 is the current version.
  3. Visual Studio 2019 (Community Edition is free). During installation locate Data storage and processing and select SQL Server Data Tools.

Data Storage and processing - SQL Server Data Tools

After installing you will find the SQL Server Database Project under SQL Server when you create a new project.
Create new Database Project

2) Import or Schema compare
Now that we have an empty database project we need to add the database items from our development database to this new project. You can either do an import of the database or do a Schema Compare between the database and the project.

When importing a new database you only have a couple of options, like not importing referenced logins or choosing the folder structure. I would suggest not importing any logins because they are probably different in each environment. As folder structure I recommend Schema\Object Type because I think this is the clearest structure. Note that you can do an import only once.
Import Database

The second option is to do a schema compare, which gives you many more options for tuning the synchronization. The best feature is that you can repeat it any time you like to update the Visual Studio project.

Right-click the project and choose Schema Compare... Then make sure the source database is on the left side and the Visual Studio project on the right side. Click on the options/settings icon to tune the comparison. In this case we will ignore users and role memberships.

Now press the Compare button and review all changes, especially when you work in a team, because you probably only want your own changes. Uncheck changes that are not yours or not ready to go to the next environment. By clicking on a change you can see the actual differences. The last step is to press the Update button to move all selected changes to the Visual Studio project. Before committing any changes to Git you should build your project to check whether everything works.
Sql Schema Compare

Note: When working with multiple developers in a single development database you will probably encounter a couple of database changes that were not made by you. This is the point where you have to be selective about which changes you want to push to the database project.

Tip: When you have a lot of 'garbage' in your development database, you could do a cleanup by switching the source and target in the Sql Schema Compare and then pushing the Update button. This will throw away everything that is not in your database project. It is probably a good idea to create a backup first.

3) Add Schema Compare to project
A great feature of the Sql Schema Compare file is that you can save it to your PC and then add it to your database project. The next time you only have to open it and click on the compare button.
Add Compare file to db project

4) Dacpac to create or update
When you build the database project in Visual Studio or DevOps, it generates a .dacpac file which can be used to create or update the next database in our DTAP environment. By default it can only update tables when they get bigger (extra columns or larger data types); this is to prevent loss of data. In a next post we will explain how to overcome this with publish profiles or a Pre-Deployment script.
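
To give an idea of which changes are allowed, here is an illustrative sketch (the table and column are made-up names): widening a column is deployed without problems, while shrinking one is blocked by default because existing data could be truncated.

-- Allowed by default: the column only gets bigger, so no data can be lost
ALTER TABLE [dbo].[Customer] ALTER COLUMN [Name] NVARCHAR(100) NOT NULL
GO

-- Blocked by default when deploying the dacpac: the column gets smaller
-- and existing data could be truncated (possible data loss)
ALTER TABLE [dbo].[Customer] ALTER COLUMN [Name] NVARCHAR(50) NOT NULL
GO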

Conclusion
In this introduction post you learned how to create a database project in Visual Studio and how to synchronize a database and a database project. In the next posts we will create a Build pipeline in DevOps to validate your committed database changes and after that we will create a Release pipeline to release your committed database changes through the DTAP environment.

A downside of database projects in Visual Studio is that they can become very slow when you have thousands and thousands of database objects. For those very large databases (not in size, but in structure) you could also take a look at commercial database tools such as those of Redgate or SentryOne. However, the majority of Data Warehouse databases will probably fit easily in Visual Studio database projects.