Programming well-structured JavaScript stored procedures for DocumentDB with TypeScript and SystemJS

If you are using DocumentDB you may have had to write your own stored procedures. A stored procedure is a function written in JavaScript that runs on the DocumentDB cloud infrastructure. It can reduce performance problems or let you run queries that are not yet supported through the REST API, such as aggregate functions.

A stored procedure should be registered as a single function of the form:

function myStoredProcedure(arg1, arg2){ /* you can place all the arguments you want here */
    // The body of the function goes here

    // You set the response like this:
    var response = getContext().getResponse();
    response.setBody(myResult);
}

You can create functions inside the myStoredProcedure function body. However, you cannot declare other top-level functions in the file, otherwise DocumentDB will complain. This is quite annoying because you probably want to create independent and reusable pieces of code, for testability or simply for the sake of readability. The problem comes from the fact that you cannot really use out-of-the-box third-party module management libraries such as RequireJS, CommonJS or SystemJS. This sounds no good. Does that mean we are forced to inline all our code in one big, non-modular and untestable JavaScript file?

The answer is no. In this blog post I will show you the solution we implemented at Keluro to overcome this problem. This solution is based on SystemJs and SystemJs-Builder: we create a standalone function where all module/class dependencies are embedded in the single stored procedure function, which acts as our entry point. The code snippets presented below are extracted from a companion git repository.

In this article we will use Visual Studio as an IDE, but it is not mandatory: it only simplifies setting the options for compiling the Typescript files, and you can invoke the compiler manually with exactly the same set of options.

In the following, we will take an example where the stored procedure computes the sum of its input arguments. Therefore, we create a Typescript class for the core of our stored procedure, called Utilitary, that contains a method called sumValues.

A Typescript class used in the DocumentDB stored procedure
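
The original screenshot is not reproduced here; a minimal sketch of what such a class could look like (the class and method names come from the text above, the body is an illustrative assumption):

// Utilitary.ts: a reusable, testable Typescript class compiled as a SystemJs module.
export class Utilitary {
    // Sums the numeric values passed to the stored procedure.
    public static sumValues(values: number[]): number {
        return values.reduce((acc, v) => acc + v, 0);
    }
}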

Then we create the entry point of the stored procedure in a Typescript file that uses the current DocumentDB context. We benefit from the typings in documentdb-server.d.ts, which defines the IContext interface.

The MyStoredProcedure Typescript file executed by DocumentDB
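
Again, in place of the screenshot, here is a sketch of what this entry point could look like; it assumes the getContext/getResponse declarations from documentdb-server.d.ts and the storedProcedureArgs global provided by the wrapping function described later:

// myStoredProcedure.ts: entry point, compiled as a SystemJs module.
import {Utilitary} from "Utilitary";

declare var storedProcedureArgs: number[]; // set by the generated wrapper function (see below)

function run(): void {
    var context: IContext = getContext();
    var sum = Utilitary.sumValues(storedProcedureArgs);
    context.getResponse().setBody(sum);
}

run();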

We compile the Typescript files as SystemJs modules and we redirect all the generated JavaScript to the out-js directory. As I told you, there is no need for Visual Studio to do this: you can achieve the same result by invoking the Typescript compiler with similar options.

Compiling options set in Visual Studio
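
For readers compiling without Visual Studio, the equivalent manual invocation of the Typescript compiler would look roughly like this (standard tsc flags, the exact target version is an assumption):

tsc --module system --target es5 --outDir out-js Utilitary.ts myStoredProcedure.ts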

If you look at the generated JavaScript you will get something that looks like

System.register(["Utilitary"], function(exports_1, context_1) {
    "use strict";
    var __moduleName = context_1 && context_1.id;
    
     /* ETC */
});
//# sourceMappingURL=myStoredProcedure.js.map

If you try to run such JavaScript source code directly with DocumentDB you will get an error. These modules are meant to be loaded by System.js, and in the case of a stored procedure you cannot reference any third-party library such as System.js. This is where SystemJs-Builder comes into play: SystemJs-Builder packages everything so that your modules are independent of System.js and can be used as a standalone file.

To invoke SystemJs-Builder you need to use Node.js. In the sample git repository, you will have to run ‘npm install’ to restore the SystemJs-Builder Node package. When the Typescript files are compiled, invoke the Node script “documentdb_storedprocedure_builder.js” at the root of the repository. This script basically executes two tasks. First, it generates a standalone JavaScript file with SystemJs-Builder. Secondly, because DocumentDB explicitly needs a file with one function and not an executable JavaScript script, we wrap this code inside a function that will act as our entry point for the stored procedure.

We also retrieve the arguments passed to the stored procedure function in a variable called storedProcedureArgs that is kept in the global namespace. The resulting file is generated and put in the generated-procedure directory. Finally, this file contains what we needed: a standalone JavaScript function executable by DocumentDB. With this approach all Typescript classes can be reused for other stored procedures, for unit tests or anywhere else in your Typescript code base.
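
As an illustration, the builder script could look like the following sketch; buildStatic is the documented SystemJs-Builder API, while the file names and the wrapping details are assumptions based on the description above:

// documentdb_storedprocedure_builder.js: run with Node.js once the Typescript compilation is done.
var Builder = require('systemjs-builder');
var fs = require('fs');

var builder = new Builder('./out-js');

// 1) Bundle the entry module and all its dependencies into a single self-contained script.
builder.buildStatic('out-js/myStoredProcedure.js', 'generated-procedure/bundle.js').then(function () {
    var bundle = fs.readFileSync('generated-procedure/bundle.js', 'utf8');

    // 2) Wrap the bundle inside the single function expected by DocumentDB and
    //    expose the arguments through the storedProcedureArgs global.
    var storedProcedure =
        'function storedProcedure() {\n' +
        '  storedProcedureArgs = Array.prototype.slice.call(arguments);\n' +
        bundle + '\n' +
        '}\n';

    fs.writeFileSync('generated-procedure/storedProcedure.js', storedProcedure);
});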

There are probably thousands of alternatives to create testable and decoupled stored procedures. We liked this approach because it reuses the same tools that we were already using for our single page applications: Typescript and SystemJs. To conclude let me thank Olivier Guimbal who showed us some months ago how Typescript, SystemJs and SystemJs-Builder worked well together.

My TransientFaultHandling utility classes for DocumentDB

Keluro uses DocumentDB extensively for data persistence. However, its extensive scaling capabilities come with a price. Indeed, with your queries or your commands you may exceed the amount of request units you are granted. In that case you will receive a 429 error “Request rate too large”, surfaced as a DocumentClientException if you use the .NET SDK. It is then your responsibility to implement the retry policies to avoid such a failure and to wait the proper amount of time before retrying.

Edit: see the comment below. Release v1.8.0 of the .NET SDK proposes some settings for these retry policies.
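
For reference, with that SDK version (or a later one) the built-in retry policy can be configured directly on the client. A minimal sketch, assuming the RetryOptions settings exposed by the SDK (check the exact names against the version you use):

ConnectionPolicy connectionPolicy = new ConnectionPolicy();
// Retry up to 10 times on 429 responses and wait at most 30 seconds overall.
connectionPolicy.RetryOptions.MaxRetryAttemptsOnThrottledRequests = 10;
connectionPolicy.RetryOptions.MaxRetryWaitTimeInSeconds = 30;
DocumentClient client = new DocumentClient(new Uri(endpointUrl), authorizationKey, connectionPolicy);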

Microsoft provides some samples on how to handle this 429 “Request rate too large” error, but they concern only commands, such as inserting or deleting a document; there is no sample on how to implement the retry policies for common queries. A Nuget package is also available, “Microsoft.Azure.Documents.Client.TransientFaultHandling”, but even if integrating it is as quick as an eye blink, it has no logging capabilities. In my case it did not really resolve my exceeding-RU problem, I even doubt that I made it work, and the code is not open source. So I decided to integrate the ideas from the samples into my own utility classes on top of the DocumentDB .NET SDK.

The idea is similar to the “TransientFaultHandling” package: wrap the DocumentClient inside another class exposed only through an interface. By all accounts, it is a good thing to abstract the DocumentClient behind an interface for testability purposes. In our case this interface is named IDocumentClientWrapped.

public interface IDocumentClientWrapped : IDisposable
{
    /* Some command like methods*/
    Task DeleteDocumentAsync(Uri uri);

    Task CreateDocumentAsync(Uri uri, object obj);

    /* Some query like methods*/
    IRetryQueryable<T> CreateDocumentQuery<T>(Uri documentUri, FeedOptions feedOptions = null);

    IRetryQueryable<T> CreateDocumentQuery<T>(Uri documentCollectionUri, SqlQuerySpec sqlSpec);

    /* Copy here all the other public method signatures from DocumentClient, but return IRetryQueryable instead of IOrderedQueryable or IQueryable */
}

Instead of returning an IQueryable<T> instance as DocumentClient would do, we return an IRetryQueryable<T>. This latter type, whose definition follows, is also a wrapper on the IQueryable<T> instance returned by the DocumentDB client. However, this wrapper explicitly retries when the enumeration fails because of a 429 “Request rate too large” exception raised by the database engine, DocumentDB in our case.

public interface IRetryQueryable<T>
{
    IRetryQueryable<T> Where(Expression<Func<T, bool>> predicate);

    IDocumentQuery<T> AsDocumentQuery();

    IEnumerable<T> AsRetryEnumerable();

    IRetryQueryable<TResult> SelectMany<TResult>(Expression<Func<T, IEnumerable<TResult>>> predicate);

    IRetryQueryable<TResult> Select<TResult>(Expression<Func<T, TResult>> predicate);
}

In this interface we only expose the extension methods that are actually supported by the “real” IQueryable<T> instance returned by DocumentDB: Select, SelectMany, Where, etc. For example, at the time of writing, GroupBy is not supported. You would get a runtime exception if you used it directly on the IQueryable<T> instance returned by DocumentClient.

Now look at how we use these interfaces in the calling code.

IDocumentClientWrapped client = /* instantiation */;
IRetryQueryable<MyType> query = client.CreateDocumentQuery<MyType>(/* get the usual uri using the UriFactory */);
IEnumerable<string> results = query.Where(c => c.Member1 == "SomeValue").Select(c => c.StringMember2).AsRetryEnumerable();
/* manipulate your in-memory enumeration as you wish */

Independently of the retry policy classes and the “Request rate too large” errors, let me emphasize that LINQ can be tricky here. Indeed, the piece of code above is completely different from this one:

IDocumentClientWrapped client = /* instantiation */;
IRetryQueryable<MyType> query = client.CreateDocumentQuery<MyType>(/* get the usual uri using the UriFactory */);
IEnumerable<string> results = query.AsRetryEnumerable().Where(c => c.Member1 == "SomeValue").Select(c => c.StringMember2);
/* manipulate your in-memory enumeration as you wish */

In this case the Where constraint is performed “in memory” by your .NET application server, which means that you have fetched all the data from DocumentDB to your app server. If MyType contains a lot of data, then all of it has been transferred from DocumentDB to your application server; and if the Where constraint filters out a lot of documents, you will probably have a bottleneck.

Let us get back to our problem. Now that we have seen that having a retry policy for a query only means calling AsRetryEnumerable() instead of AsEnumerable(), let us jump to the implementation of those classes.

The idea is to use an IEnumerator that “retries” and two utility methods: ExecuteWithRetry and ExecuteWithRetriesAsync. The former is for basic synchronous calls while the latter is for async/await contexts. Most of this code is verbose because it is only a wrapping implementation. I hope it will be helpful for others.

public class DocumentClientWrapped : IDocumentClientWrapped
{
    private class RetriableEnumerable<T> : IEnumerable<T>
    {
        private readonly IEnumerable<T> _t;
        public RetriableEnumerable(IEnumerable<T> t)
        {
            _t = t;
        }

        public IEnumerator<T> GetEnumerator()
        {
            return new RetriableEnumerator<T>(ExecuteWithRetry(() => _t.GetEnumerator()));
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }
    }

    private class RetriableEnumerator<T> : IEnumerator<T>
    {
        private IEnumerator<T> _t;
        public RetriableEnumerator(IEnumerator<T> t)
        {
            _t = t;
        }

        public T Current
        {
            get
            {
                return ExecuteWithRetry(()=> _t.Current);
            }
        }

        object IEnumerator.Current
        {
            get
            {
                return ExecuteWithRetry(() => _t.Current);
            }
        }

        public void Dispose()
        {
            _t.Dispose();
        }

        public bool MoveNext()
        {
            return ExecuteWithRetry(() => _t.MoveNext());
        }

        public void Reset()
        {
            _t.Reset();
        }
    }

    private class RetryQueryable<T> : IRetryQueryable<T>
    {
        private readonly IQueryable<T> _queryable;
        public RetryQueryable(IQueryable<T> queryable)
        {
            _queryable = queryable;
        }

        public IDocumentQuery<T> AsDocumentQuery()
        {
            return this._queryable.AsDocumentQuery();
        }

        public IEnumerable<T> AsRetryEnumerable()
        {
            return new RetriableEnumerable<T>(this._queryable.AsEnumerable());
        }

        public IRetryQueryable<T> Where(Expression<Func<T, bool>> predicate)
        {
            var queryable = this._queryable.Where(predicate);
            return new RetryQueryable<T>(queryable);
        }

        public IRetryQueryable<TResult> SelectMany<TResult>(Expression<Func<T, IEnumerable<TResult>>> predicate)
        {
            var queryable = this._queryable.SelectMany(predicate);
            return new RetryQueryable<TResult>(queryable);
        }

        public IRetryQueryable<TResult> Select<TResult>(Expression<Func<T, TResult>> predicate)
        {
            var queryable = this._queryable.Select(predicate);
            return new RetryQueryable<TResult>(queryable);
        }
    }


    private const int MaxRetryCount = 10;

    private static async Task<V> ExecuteWithRetriesAsync<V>(Func<Task<V>> function)
    {
        TimeSpan sleepTime = TimeSpan.FromSeconds(1.0);
        int count = 0;
        while (true)
        {

            try
            {
                return await function();
            }
            catch (DocumentClientException de)
            {
                if ((int)de.StatusCode != 429)
                {
                    throw;
                }
                if (++count > MaxRetryCount)
                {
                    throw new MaxRetryException(de, count);
                }
                sleepTime = de.RetryAfter;
                Trace.TraceInformation("DocumentDB async retry count: {0} - retry in: {1} - RUs: {2}", count, sleepTime.TotalMilliseconds, de.RequestCharge);
            }
            catch (AggregateException ae)
            {
                if (!(ae.InnerException is DocumentClientException))
                {
                    throw;
                }

                DocumentClientException de = (DocumentClientException)ae.InnerException;
                if ((int)de.StatusCode != 429)
                {
                    throw;
                }
                if (++count > MaxRetryCount)
                {
                    throw new MaxRetryException(de, count);
                }
                sleepTime = de.RetryAfter;
                Trace.TraceInformation("DocumentDB async retry count: {0} - retry in: {1} - RUs: {2}", count, sleepTime.TotalMilliseconds, de.RequestCharge);
            }
            await Task.Delay(sleepTime);
        }
    }

    private static V ExecuteWithRetry<V>(Func<V> function)
    {
        TimeSpan sleepTime = TimeSpan.FromSeconds(1.0);
        int count = 0;
        while (true)
        {
            try
            {
                return function();
            }
            catch (DocumentClientException de)
            {
                if ((int)de.StatusCode != 429)
                {
                    throw;
                }
                if (++count > MaxRetryCount)
                {
                    throw new MaxRetryException(de, count);
                }
                sleepTime = de.RetryAfter;
                Trace.TraceInformation("DocumentDB sync retry count: {0} - retry in: {1} - RUs: {2}", count, sleepTime.TotalMilliseconds, de.RequestCharge);
            }
            catch (AggregateException ae)
            {
                if (!(ae.InnerException is DocumentClientException))
                {
                    throw;
                }

                DocumentClientException de = (DocumentClientException)ae.InnerException;
                if ((int)de.StatusCode != 429)
                {
                    throw;
                }
                if (++count > MaxRetryCount)
                {
                    throw new MaxRetryException(de, count);
                }
                sleepTime = de.RetryAfter;
                Trace.TraceInformation("DocumentDB sync retry count: {0} - retry in: {1} - RUs: {2}", count, sleepTime.TotalMilliseconds, de.RequestCharge);
            }
            Thread.Sleep(sleepTime);
        }
    }

    private readonly DocumentClient _client;
    public DocumentClientWrapped(string endpointUrl, string authorizationKey)
    {
        _client = new DocumentClient(new Uri(endpointUrl), authorizationKey);
    }

    public async Task CreateDocumentAsync(Uri documentCollectionUri, object obj)
    {
        ResourceResponse<Document> response = await ExecuteWithRetriesAsync(()=> _client.CreateDocumentAsync(documentCollectionUri, obj));
        LogResponse<Document>("CreateDocumentAsync", response);/* Do something with the response if you want to use it */
    }

    public async Task DeleteDocumentAsync(Uri documentUri)
    {
        ResourceResponse<Document> response = await ExecuteWithRetriesAsync( () => _client.DeleteDocumentAsync(documentUri));
        LogResponse<Document>("DeleteDocumentAsync", response); /* Do something with the response if you want to use it */
    }

    public async Task ReplaceDocumentAsync(Uri documentCollectionUri, object document)
    {
        var response = await ExecuteWithRetriesAsync(()=>  _client.ReplaceDocumentAsync(documentCollectionUri, document));
        LogResponse<Document>("ReplaceDocumentAsync", response);
    }

    public IRetryQueryable<T> CreateDocumentQuery<T>(Uri documentCollectionUri, SqlQuerySpec sqlSpec)
    {
        var queryable = _client.CreateDocumentQuery<T>(documentCollectionUri, sqlSpec);
        return new RetryQueryable<T>(queryable);
    }
}
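
For completeness, the code above references two small helpers that are not shown, MaxRetryException and LogResponse; their implementation is up to you. A minimal sketch could be:

public class MaxRetryException : Exception
{
    public MaxRetryException(DocumentClientException inner, int retryCount)
        : base(string.Format("Request still throttled (429) after {0} retries.", retryCount), inner)
    {
    }
}

// Inside DocumentClientWrapped, LogResponse can simply trace the consumed request units.
private static void LogResponse<T>(string operation, ResourceResponse<T> response) where T : Resource, new()
{
    Trace.TraceInformation("{0} - RUs consumed: {1}", operation, response.RequestCharge);
}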

Hosting a Jekyll website on an Azure Web App using TeamCity automated deployment

The public website of my company Keluro is built with Jekyll. If you are a Jekyll user you are probably aware of Github pages. To me, the best feature of Github pages is the automated deployment of your website when pushing to a specific branch (namely ‘gh-pages’). We hosted the website on Github pages for almost a year. However, we suffered many inconveniences with it:

  • Https is not supported by Github pages. As Google announced, http over SSL is now a bonus in terms of rankings. At Keluro we do have a wildcard SSL certificate; it’s a shame that we could not use it for our public corporate website!
  • We could not tune some server caching configuration (ETag, Cache-Control etc.), resulting in poor results from Google Page Speed Insights.
  • With Github pages you cannot use custom advanced gems. They are Ruby extensions, and Ruby is the technology Jekyll is built on. I have already blogged about our solution to support a multi-lang website. Even if it works, I am more and more convinced that a Jekyll gem would do a better job…
  • We had problems with Facebook scraping our Open Graph meta tags and there was nothing we could do about it: see the issue here.
  • You do not control the version of Jekyll run by Github pages. We found out that some breaking changes were introduced when migrating from Jekyll 2 to Jekyll 3. We do not want our website to go down because of a silent release of a new Jekyll revision.
  • At Keluro, due to our business model, we are Windows users. However, the servers running Github pages are Linux servers, so you face the technicalities that come with switching between Linux and Windows: CRLF vs LF, etc. Even if there are solutions such as a .gitattributes file, these are extra technicalities that our non-tech teammates working on the website are not aware of, and we do not want them to spend time on this.
  • We use a project page rather than a personal page, and it was complicated to configure DNS names etc. There are always exceptions when it comes to project pages with Github pages.

For all these reasons I wanted to quit Github pages. As the CTO of Keluro, I did not have a lot of time to investigate all the alternatives and wanted a simple solution. We are already a BizSpark member; our web apps, APIs, VMs etc. are hosted on Azure and we are familiar with configuring IIS servers. Consequently, it was a reasonable solution to host the website on Azure with all our other resources. On the other hand, we already had an automated deployment solution: our continuous integration server, TeamCity, which is also hosted on Azure.

The solution proposed here is then very simple. Similarly to the automated deployment provided by Github pages, we use TeamCity to trigger a build on changes pushed to a given branch of the Git repository. Then, the website is built by Jekyll on the integration server virtual machine. Finally, it is synchronized with the Azure web app over FTP using the WinSCP sync library. In the end, our static HTML pages are served by the Azure web app.

Once the TeamCity build configuration is created, you just have to write two Powershell build steps (see code below). You can also use two Psake build targets and chain them as I wrote here.

The prerequisite is to install Jekyll on the Windows server VM with jekyll.exe available in the $PATH environment variable. You can add WinSCP.exe and its .dll in a folder within your source code under the ‘ignored\build’ location. Make sure that ‘ignored’ is indeed a folder ignored by Jekyll (you do not want Jekyll to output these files to your deployed _site folder).

In the TeamCity build configuration you can set up environment variables that will be consumed by the PowerShell scripts ($env:VARNAME). It is an acceptable solution for avoiding hardcoded passwords, paths, etc. in the sources. For example, the variable $env:RepoDir is set to %system.teamcity.build.checkoutDir%. You use such environment variables to store the FTP settings. To recover the FTP settings of an Azure Web App, see this stackoverflow question.

REMARK: We did not manage to redirect the WinSCP output of the sync logs in real time to TeamCity; we log the results once the syncing is completed. If someone has a solution we will be glad to hear it.
We tried the WinSCP PowerShell cmdlets but they seemed heavily bugged at the time of writing.

Build the website with Jekyll

$repoDir = $env:RepoDir
cd $repoDir # go to the repository checkout directory
Exec { Invoke-Expression "& jekyll build" } # Exec comes from Psake: it fails the build step if jekyll returns a non-zero exit code

Sync the website with WinSCP

$hostName = $env:FtpHostName
$repoDir = $env:RepoDir
$userName = $env:FtpUserName
$pwd = $env:FtpUserPwd

$sitePath = Join-Path $repoDir -ChildPath "_site"
$winscpPath = Join-Path $repoDir -ChildPath "ignored\build\WinSCP"
$logPath = Join-Path $repoDir -ChildPath "winscp-sync.log" # WinSCP session log file (any writable path will do)

$dllPath = Join-Path $winscpPath -ChildPath "WinSCPnet.dll"
$exePath = Join-Path $winscpPath -ChildPath "WinSCP.exe"
if(!(Test-Path $dllPath)){
    Write-Error "No dll path found: $dllPath"
}
if(!(Test-Path $exePath)){
    Write-Error "No exe path found: $exePath"
}

Add-Type -Path $dllPath

$sessionOptions = New-Object WinSCP.SessionOptions
$sessionOptions.Protocol = [WinSCP.Protocol]::Ftp
$sessionOptions.HostName = $hostName
$sessionOptions.UserName = $userName
$sessionOptions.Password = $pwd
$sessionOptions.FtpSecure = [WinSCP.FtpSecure]::Explicit

$session = New-Object WinSCP.Session
$session.ExecutablePath = $exePath
$session.SessionLogPath = $logPath

try
{
    $session.Open($sessionOptions)

    Write-Host "Start syncing..."

    $transferResult = $session.SynchronizeDirectories([WinSCP.SynchronizationMode]::Remote, $sitePath, "/site/wwwroot", $True, $False,[WinSCP.SynchronizationCriteria]::Size)

    $transferResult.Check()

    foreach ($transfer in $transferResult.Downloads)
    {
        Write-Host ("Download of {0} succeeded" -f $transfer.FileName)
    }
    foreach ($transfer in $transferResult.Uploads)
    {
        Write-Host ("Upload of {0} succeeded" -f $transfer.FileName)
    }
    foreach ($transfer in $transferResult.Removals)
    {
        Write-Host ("Removal of {0} succeeded" -f $transfer.FileName)
    }
    Write-Host "... finish syncing."
}
finally
{
    # Disconnect, clean up
    $session.Dispose()
}

A rigorous management of Entity Framework migrations adapted to multiple app deployments

This post follows a question that I asked on stackoverflow several months ago. I did not receive a satisfactory answer at that time, so I will present here the solution that we found at Keluro for this problem.

At Keluro, our client app products (for example the VSTO add-ins KMailAssistant and KBilling, and the SPA web apps) communicate with a REST web api. This web api uses Entity Framework 6.0 on top of SQL Server for the persistence. However, some of our web api deployments are not multitenant. Indeed, we do have multiple clients who do not want to share their infrastructure; they demand an isolated deployment, mainly for security and confidentiality reasons. For us, in order to keep things simple, it was important to be sure that all of our client deployments share the same database schema, even if they went into production at different stages of the development of the products. Consequently, we have N deployments of our web api with as many database catalogs. We also want all our web api deployments to be up-to-date with respect to a stable revision of the source code. To this aim, a continuous build is in charge of updating all these web apis. Necessarily, the associated databases also need to be updated automatically. Entity Framework is able, when the web app starts, to handle the migration process (if needed); the topic of this post is to propose a rigorous methodology to manage these migrations.

Entity Framework Code First supports migrations (see the documentation here). When reading the documentation, it is not clear how we should use migrations to answer the problem explained above. Indeed, we are told to call Update-Database from Visual Studio or to enable automatic migrations. Let us explain how to use the set of features proposed by Entity Framework to handle multiple deployments in a clean and rigorous way. Note also that this approach works well for local databases that are deployed with your ‘rich client‘ application, for example with SQL Server CE, which we use for our VSTO addins KMailAssistant and KBilling.

TLDR; Determine a “stable production schema”: the database schema corresponding to the web app code on a stable branch/tag in your source control. Avoid the so-called automatic migrations and always create new code-based migrations using Add-Migration against an empty database that has been updated to the “stable production schema” by applying all existing migrations. Do not use the Update-Database command to update your databases in production; let the framework do it for you at startup using the MigrateDatabaseToLatestVersion initializer. Then, when you release a new environment, start the application with an empty database. You will also have to take care of version control when working with feature branches.

Generating clean code based migrations

For the following, I will assume that you have read the Entity Framework documentation. It is very important that you decide what the “stable database schema” is. It corresponds to the schema determined by the source code (remember that we use EF Code First) for the selected revision on your stable branch or tag. We advise you to avoid automatic migrations. Actually, automatic migrations do not mean that the created migrations will be applied automatically (we will discuss how to do that later). They mean that the needed migration, i.e. the piece of SQL needed to bring the database from its actual state to what it should be, will be generated and applied on the fly. This is dangerous in our situation: think of our multiple deployments, which have not all been started at the same time. With automatic migrations, some migrations could have been generated and applied for older client environments while you have to push a new environment right now. Consequently, the history of automatically generated migrations could differ even for the same revision of the source code.

The best solution to avoid this situation is that all deployments share the same series of code-based migrations. On a stable source code revision, the succession of existing code migrations applied to an empty database produces a database with the so-called “stable database schema” introduced above. Then, if a new client is deployed, an empty database (no tables, no data) is created and all existing migrations are applied automatically by Entity Framework when the web app starts for the first time. For example, suppose that we have the following list of migrations: 201401050000000_MigrationA (January 5th 2014), 201503000000000_MigrationB, 201504120000000_MigrationC, 201511150000000_MigrationD. This means that, if a client web app and its database are put in production in March 2015, all existing migrations are applied (including MigrationA and MigrationB).

When using code-based migrations, it is important to keep in mind that migrations are represented as C# files describing the schema changes to be applied (e.g. dropping a table, adding a column, etc.). In addition, a given database also has a table __MigrationHistory which keeps track of all migrations that have been applied to it. Consequently, if all your web apps are up-to-date with respect to the same web app source code, then all your databases will contain exactly the same rows in the __MigrationHistory table.

To automatically migrate the associated database to the latest migration when an application starts, you have to run this code at startup (e.g. in Global.asax.cs for an ASP.NET web app).

Database.SetInitializer(new MigrateDatabaseToLatestVersion<MyDbContext, Configuration>());
Code based list of migrations in Visual Studio


Keep a clean list of code based migrations

Let us explain how to keep a clean list of code-based migrations. I suggest creating a clean code-based migration any time a database schema change is required. To do so, you will need to use the Add-Migration command in the Package Manager Console in Visual Studio. Remember that if you do not specify a database connection string in the PowerShell command, the connection string used will be the first one found in your app.config or web.config file. That selected database may not have the proper “production schema”, which is error prone. My advice is to create, only for the generation of this new code migration, a database with no data but with the actual “production schema”. This is extremely simple and also acts as a sanity check of your existing migrations: create a new empty database, in one click, in Visual Studio with SQL Server Express (see picture below).

Create empty database from Visual Studio

To update this database schema, take its connection string (right click on your database > Properties), then update the database by targeting the last migration, e.g. 201511150000000_MigrationD, with the following command:

Update-Database -TargetMigration MigrationD -ConnectionString "<yourConnectionString>" -ConnectionProviderName "System.Data.SqlClient" -Verbose

Now this local database is “up-to-date” and you can generate your new migration, named MigrationE (choose something more meaningful in your case), with the command:

Add-Migration MigrationE -ConnectionString "<yourConnectionString>" -ConnectionProviderName "System.Data.SqlClient" -Verbose

The migration files are then generated; it is recommended to read them and make sure they correspond to the changes you intended to introduce. Now they are ready to be committed in a single, clean commit. As we have seen, the migration name is prefixed with a number corresponding to its generation date (e.g. 20150501000000_MigrationE). This number is effectively used by the MigrateDatabaseToLatestVersion database initializer, and it can be a problem when not carefully handled with version control.
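
To give an idea of their content, a generated migration is a C# class deriving from DbMigration. A hypothetical example (table and column names are purely illustrative):

using System.Data.Entity.Migrations;

public partial class MigrationE : DbMigration
{
    public override void Up()
    {
        // e.g. the model change was adding a nullable string column to an existing table
        AddColumn("dbo.MyEntities", "NewMember", c => c.String());
    }

    public override void Down()
    {
        DropColumn("dbo.MyEntities", "NewMember");
    }
}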

Migrations and version control

There may be trouble when branching. To see this, let us explain how Entity Framework applies the code-based migrations. Have a look at the __MigrationHistory table: the rows are the applied migrations, and the date when each migration was generated is also there, because it is included in the migration name. Entity Framework takes the date of the most recent migration applied in the __MigrationHistory table and applies all migrations in the web app code that have been generated later.

The __MigrationHistory table generated by Entity Framework

Do you see the potential problem? Say that you have created two feature branches, X and Y, and that you have generated a migration for each of these branches, first for X then for Y. If, for some reason, you merge Y into your stable branch before X, the migration of X will not be applied: it is older than the latest migration already recorded in __MigrationHistory.

To avoid this as much as possible, I suggest that you generate a minimum of migrations and, for each newly generated migration, put it in a dedicated commit with nothing but the code of the migration and with a clear indicator in the commit log (e.g. a “[MIGRATION]” tag). Note that git’s interactive rebase can be useful here (take care when rebasing pushed commits!). For example, you can remove all intermediate [MIGRATION] commits and regenerate a single one, or, if you decide that a migration (not yet deployed!) is no longer needed, you can drop the commit, etc. I think it is a wise thing to name a “database master” in your dev team. This person should be the one responsible for merging branches involving database migrations; he will be aware of the potential problem with migration generation dates and will know how to fix it.

TeamCity on Windows Azure VM part 2/2 – enabling SSL with reverse proxy

In the previous post we explained how to install a TeamCity server on a Windows Azure virtual machine. We used an external SQL Azure database as the internal database used by TeamCity. The first post ended with a functional TeamCity web app that was not visible from the outside. The objective of this second post is to show how to secure the web app by serving the pages over SSL. As in the previous post, I will detail the whole procedure from scratch so this tutorial remains accessible to an IIS beginner. Indeed, we are going to use a reverse proxy technique on IIS (Internet Information Services, the Microsoft web server). I would like to thank my friend Gabriel Boya who suggested the reverse proxy trick to me. Let us remark that it could also be a good solution for serving over SSL any other non-IIS website running on Windows.

If you have followed the steps of the previous post, you should have a TeamCity server listening on port 8080. You should be able to view it with the IE browser inside a remote desktop session on the VM at localhost:8080. Let us start this post by answering the following question:

What is a reverse proxy and why use it?

If you try to enable SSL directly on the Tomcat server that serves TeamCity, you are going to suffer for a mediocre result. See for instance this tutorial and all the questions it brought up. Even if you manage to import your certificate into the Java certificate store, you will have problems with WebSockets…

So I suggest you implement a reverse proxy. The name sounds complicated but it is a very basic concept: it is simply another website that acts as an intermediary to communicate with your first and primary website (in our case the Tomcat server). Here, we are going to create an empty IIS website, visible from the outside on ports 80 and 443. It will forward the incoming requests to our TeamCity server, which is listening on port 8080.

Representation of the reverse proxy for our situation


Install IIS and the required components

First we have to install and set up IIS with the following features on our Windows Server 2012. It is a very easy thing to do. Search for the Server Manager in Windows, then choose to configure this local server (this virtual machine) with a role-based installation.


Add roles and features

Then, check the “Web Server (IIS)” as a role to install.


Install IIS

Keep the default features.


Keep default installation features

Recent versions of TeamCity use WebSockets. To make them work, our reverse proxy server will need them too: check WebSocket Protocol.


Check the WebSocket Protocol

Check that everything is right there… and press Install.


Check that everything is prepared for installation

Now that we have installed IIS with our required feature (WebSocket), it is accessible from the menu. I suggest you pin it to the quick launch if you do not want to search for it each time you need it.


IIS well installed

Install URL rewrite module

The simplest way to set up the reverse proxy is to have the IIS URL Rewrite module installed. Any Microsoft web module should be installed using the Microsoft Web Platform Installer. If you do not have it yet, install it from there.

Then, in the Web Platform Installer, look for URL Rewrite 2.0 and install it.


URL Rewrite 2.0 installation with Web Platform Installer

The Reverse proxy website

OK, now we are going to create our proxy website. IIS has kindly created a default website and its associated pool. Delete them both without any mercy and create a new one (called TeamcityFront) with the following parameters. Remark: there is no website content under C:\inetpub\wwwroot, this is just the default IIS website directory.


New TeamCityFront IIS website that points to the inetpub/wwwroot folder

Create the rewrite rule

We are going to create a rule that basically rewrites all incoming HTTP requests into requests targeting localhost:8080 locally. Open the URL Rewrite menu, which should be visible in the window when you click on your site, and create a blank rule with the following parameters.


URL Rewrite for our TeamcityFront website

Second part of the rewrite rule
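
For reference, the rule created through the UI is stored in the site's web.config; it should look roughly like the sketch below (the rule name and attributes are illustrative, and note that rewriting to another host/port usually also requires the Application Request Routing module with its proxy mode enabled):

<system.webServer>
  <rewrite>
    <rules>
      <rule name="ReverseProxy" stopProcessing="true">
        <!-- match every incoming URL -->
        <match url="(.*)" />
        <!-- forward it to the TeamCity (Tomcat) server listening locally on port 8080 -->
        <action type="Rewrite" url="http://localhost:8080/{R:1}" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>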

Now let us go back to the Azure management portal and add two new endpoints for our VM: HTTP on port 80 and HTTPS on port 443.


Add HTTP on port 80 and HTTPS on port 443 with the Azure Management Portal for our VM

Now check that you are able to browse your site at testteamcity.cloudapp.net from the outside. But you could object: what was the point? Indeed, we could have set up TeamCity on port 80, added the HTTP endpoint on Azure, and the result would have been the same. You would be right but remember, our goal is to serve the site securely with SSL!

Enabling SSL


Certificate installation

To enable SSL you need a certificate; you can buy one at Gandi.net for example. When you get the .pfx file, install it on your VM by double-clicking it and put the certificate in the Personal store.

An SSL certificate is bound to a domain, and you do not own cloudapp.net, so you cannot use your VM's testteamcity.cloudapp.net subdomain. You will have to create an alias, for example build.keluro.com, and create a CNAME record that points to the VM.

Here is the procedure if you manage your domains in Office365.

Creating a CNAME subdomain in Office365 that points to the *.cloudapp.net address of your VM

Now in IIS, click on your site and edit the bindings: add an HTTPS binding for the newly created subdomain build.keluro.com and use the SSL certificate, which should be recognized automatically by IIS.


Create an HTTPS binding for the proxy server under IIS

At this stage, you should be able to browse your site on https from the outside with a clean green lock sign.

Browsing securely over https

Redirecting HTTP to HTTPS

You do not want your users to keep accessing your web app insecurely over plain HTTP. So a good practice is to redirect all HTTP traffic to the secure HTTPS endpoint. Once again, we benefit from the reverse proxy on IIS: simply create a new URL rewrite rule.


An HTTP to HTTPS redirect rewrite rule
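
In web.config terms, the redirect rule could look like the following sketch (attribute values are illustrative):

<rule name="HttpsRedirect" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <!-- only applies when the request did not come in over HTTPS -->
    <add input="{HTTPS}" pattern="^OFF$" />
  </conditions>
  <action type="Redirect" url="https://{HTTP_HOST}/{R:1}" redirectType="Permanent" />
</rule>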

Then place your HTTPS redirection rule before the ReverseProxy rule.


Place the HTTPS redirection rule before the ReverseProxy rewrite rule

Now, when you type http://build.keluro.com or simply build.keluro.com, you will be automatically redirected to HTTPS and the job is done!


A working website that redirects automatically to https