Windows with WSL2, a good configuration for a developer team?

In the Talentoday tech team we have regular discussions about the best Operating System (OS) configuration for the team's workstations. While experienced engineers are free to choose what they prefer, it is reasonable to recommend and maintain a kind of preferred configuration for interns and juniors to facilitate setup and onboarding. The Talentoday tech stack being *Nix based, for a long time most of the developers were using macOS on MacBooks. This is definitely an acceptable solution, but some of us (me included) questioned it and started to use Linux distributions. However, I have not yet found an ideal setup. I will share here some criticisms of various OS configurations, then discuss the new Microsoft WSL2 and see how it could be the best of all worlds, although this remains to be confirmed.

Usual OS configurations, main flaws

First, here are my main criticisms. As you will notice, most of them concern not the technologies themselves but rather their daily usage within a team and across a potentially large fleet of devices.

Windows OS. Unless you develop solely .NET applications (the regular .NET Framework, not the new .NET Core) that can only run on Windows, Windows is probably not the ideal development OS. While Python interpreters work well on Windows, some other technologies, like Ruby for instance, are really a pain there. It is a fact that most developer tools and technologies are not designed for Windows (or at least not as first-class citizens). On the positive side, administering the team's machines would be easier than with the other options. Indeed, Windows is well suited to maintaining a large fleet of machines, and most everyday applications run perfectly on it. Unfortunately, this latter argument does not counterbalance the poor development experience with many technologies.

Linux standalone OS setup. In this case, you have a real Linux OS, probably the closest thing to what hosts and runs your applications in production. Using Linux, you benefit from a great package manager such as apt, and good code editors like VSCode or Sublime work perfectly. Yet, in the real world, other problems arise. First come those caused by Linux desktops and window managers themselves. For example, having different resolutions and scaling factors across your screens is not well supported unless you go with Wayland. But with Wayland you will stumble on trouble when sharing your screens. In our team, we work remotely and need to share things when huddling. We present to clients and teammates around the world every day. You need a comfortable setup with multiple monitors and the ability to jump on a call when clients invite you to GoToMeeting, Zoom.us or whatever videoconferencing system, and this needs to work. Similarly, it happens frequently that we receive *.xlsx or *.docx files with macros that only desktop Office can process. Therefore, even if you manage to get everything working on your Linux desktop workstation, there are things, independent of your will, that can sometimes be problematic.

Dual boots. Honestly, I was never really convinced by dual-boot setups. They are quite cumbersome. In addition, most of the time you need to disable Secure Boot, which, for a company setup, is probably not a good recommendation.

Leveraging a virtualized OS. That's an option, and there are two possibilities. In the first one, the host OS is Windows/macOS and you develop in a Linux guest. However, if you are a developer, development is your primary work, and even though you can make your VM as seamless as possible, it is always preferable to do your everyday work in the host OS. The other way is running a Windows guest (you cannot run macOS as a guest) on a Linux host. Again, some problems, like sharing your Linux screen to show your local dev work, will not go away. Let me also point out that you depend a lot on the hypervisor and its ability to handle displays well. My experience has shown me that this is often buggy, and fixing this kind of bug does not seem to be a top priority for hypervisor vendors. Take for example Oracle and VirtualBox, which has been ignoring this bug on HiDPI screens for years.

macOS. It seems to have become a very popular solution for developers in the past decade. This is the configuration I have had for almost two years now, and it works! Nevertheless, there are some criticisms that cannot be ignored. Even if macOS is Unix based and POSIX compliant, it is not Linux. This makes a bit of a difference: you cannot use apt-get and have to rely on Homebrew instead. One cannot ignore that Docker for Mac also has some serious flaws. In addition, it is clear that maintaining a fleet of MacBook Pros is quite a pain: there are no native group policies, and for device maintenance you have to go through Apple support, etc. Finally, the cost of purchasing a MacBook Pro compared to its Dell equivalent is higher (even if the difference is not as large as is often claimed in discussions).

The best of all worlds? Windows 10 with WSL2?

Ubuntu on Windows

WSL stands for Windows Subsystem for Linux. At the time of writing, its first version, WSL1, ships with all up-to-date Windows 10 builds; to get access to WSL2 you need to register for the Windows Insider Program and retrieve a preview build.

With WSL, Microsoft brought a Linux environment inside Windows 10, with WSL2 even shipping a real Linux kernel. It can be used directly from within the Windows host through a Bash shell. In its first version, it was really useful for many small tasks, such as SSHing easily from Windows (without relying on PuTTY). Yet, some limitations on networking and the filesystem did not make it a viable solution for an everyday developer workstation. WSL2 is now a much more complete solution that makes it usable for development (it involves a full VM on top of Hyper-V, which some may even see as a regression).
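
As an illustration, once a preview build with WSL2 is available, converting an existing distribution only takes a couple of commands (a sketch; the distribution name Ubuntu is an assumption, adapt it to yours):

wsl --list --verbose
wsl --set-version Ubuntu 2
wsl --set-default-version 2

The first command lists the installed distributions and their WSL version, the second converts one of them to WSL2, and the last makes WSL2 the default for future installations.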

In addition, some really important news came along. First, the popular code editor VSCode now fully supports WSL2 through dedicated remote extensions. Second, and probably more important, the latest tech preview edition of Docker Desktop lets you use WSL2 to run your containers.
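
For instance, with the Remote - WSL extension installed, opening a project folder from within a WSL2 shell connects VSCode directly to the Linux environment (a minimal sketch; the folder name is only an example):

cd ~/my-project
code .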

If you are working on existing projects, there is a strong possibility that some (if not all) of your services leverage Docker. Therefore, you must have a way to run and access Docker containers from your new WSL2 environment. This was our situation at Talentoday where we rely a lot on Docker.

I tried to install our *Nix based Talentoday stack on Windows 10 with WSL2 and it worked like a charm.

As a broad overview, at Talentoday the application server services are built on top of Ruby on Rails (with jobs, queues, etc.) accessing Solr indexers. We also have a caching mechanism on top of Redis, a PostgreSQL database and a Flask-based Python web API. For the developer experience, we rely a lot on Docker containers (including for the database) for these various components. With the Docker Tech Preview, I had no problem building and starting my images, reaching them from within WSL2, and accessing them from both WSL2 and the Windows host.
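
A quick way to verify that the Docker Desktop WSL2 backend is reachable from the distribution is to run a few standard commands in the WSL2 shell (a minimal sketch, assuming the tech preview is installed and its WSL2 integration is enabled):

docker version
docker run --rm hello-world
docker ps

If the hello-world container runs, the same daemon should be usable both from WSL2 and from the Windows host.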

We will need to take a step back and see if WSL2 is the right way to go in the long run. We will have to pursue our investigation and check whether there are other problems (performance, networking, etc.) that I could not spot during this one-day trial. In any case, this early attempt looks extremely promising and could be a good solution for developer teams who want to keep the benefits of a commonly deployed OS together with a real developer experience leveraging *Nix based technologies.

Talentoday stack running on WSL2

Remote debugging Python with VSCode

I truly think that no matter what your platform is, you must have access to a comfortable development environment, and a working debugger is one of the most important parts of it. Remote debugging can be very helpful: you can execute code on a remote machine and still benefit from a nice debugging experience locally, in your favorite code editor.

In this blog post I propose to review the setup of Python remote debugging with the portable and popular code editor VSCode. The VSCode documentation actually provides only very short instructions; here we will give more detailed explanations.

Use remote debugging capabilities of VSCode with Python

Prerequisite

We will assume that we do not have any security constraints. More precisely, we do not care about man-in-the-middle (MITM) interception between our client and the remote server. We discuss at the end of this post how this could be addressed using SSH port forwarding.

We assume that the reader is familiar with the usage of a debugger in VSCode. In addition, we assume that the reader knows how to log on to a remote machine using SSH.

Our example

In this blog post we used an Ubuntu Azure Virtual Machine. Its configuration (RAM, GPU, etc.) does not matter here, so you can basically choose anything.

We assume now that the reader has an Azure Ubuntu server running and is able to log on through SSH. Note that the VSCode documentation mentions SSH port forwarding, but we will ignore it for now.
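
For reference, logging on simply looks like the following (the user name and host are placeholders for your own VM):

ssh <user>@<vm-public-ip>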

Let us state precisely what remote debugging is.
In this post, the name remote stands for our Ubuntu VM on Azure, while the client is our local computer, e.g. running macOS. With remote debugging, a single Python process is executed on the remote VM; then, on the client computer, VSCode "attaches itself" to this remote process so you can match the remote code execution with your local files. Therefore, it is important to keep exactly the same .py files on the client and on the remote machine so that the debugger is able to match the two versions line by line.

The magic lies in a library called ptvsd, which bridges the local VSCode to the remotely executed process. The remotely executed Python process waits until the client debugging agent is attached.

Obviously, network communication is involved here, and that is actually the major pitfall when configuring remote debugging. The VSCode documentation is fuzzy about whether to use an IP address or localhost, which port to set, etc. We will try to simplify things so the debugging experience becomes crystal clear.

Networking

To make things simpler, we decided to show an example where the Python process is executed on a remote machine whose IP address is 234.56.45.89 (I chose this address randomly). We use the good old port 80 for the communication (the usual port for HTTP).

Before doing anything else we need to make sure that our remote VM network configuration is ok. We will make sure that machine 234.56.45.89 can be contacted from the outside world on port 80.

First, using an SSH session on the remote machine, we start a web server with the following Python 3 command. You may need elevated privileges to listen on port 80 (for real production usage, grant this privilege to the current user; do not sudo the process).

sudo python3 -m http.server 80

Second, from a client terminal, you should be able to request your machine using wget (spider mode avoids downloading the file). In this command the target machine is accessed as IP:PORT.

wget --spider 234.56.45.89:80

You should get a response from the server. If you see errors, you may need to open port 80 in the firewall configuration; see the instructions here for Azure.
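
For example, with the Azure CLI, opening the port can be done with a single command (a sketch; the resource group and VM names are placeholders):

az vm open-port --resource-group MyResourceGroup --name MyUbuntuVm --port 80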

Make sure you can contact your machine on port 80 by running a one line Python server

At this stage your network configuration is OK. You can stop the Python command that runs the web server.

Configuring VSCode

Make sure that you have the VSCode Python extension installed. Follow the instructions here to add a new Debug configuration in your launch.json containing the following JSON configuration.

{
    "name": "Attach (Remote Debug)",
    "type": "python",
    "request": "attach",
    "localRoot": "${workspaceRoot}",
    "remoteRoot": "/home/benoitpatra",
    "port": 80,
    "secret": "my_secret",
    "host":"234.56.45.89"
}

It is important to understand that this configuration is only for VSCode. The host corresponds to the machine where the remote Python process runs. The port corresponds to the port that will be used by the remote process to communicate with the client debugging agent; in our case it is 80.

You must specify the root folders, on both local environment and on the remote machine.

That’s it for VSCode configuration.

The code under debugging

Let us debug the following Python script

import os
import ptvsd
import socket
ptvsd.enable_attach("my_secret", address = ('0.0.0.0', 80))

# Enable the line of source code below only if you want the application to wait until the debugger has attached to it
#ptvsd.wait_for_attach()

ptvsd.break_into_debugger()

cwd = os.getcwd()

print("Hello world you are here %s" % cwd )
print("On machine %s" % socket.gethostname())

As explained in the introduction, the Python file must be the same on the client and on the remote machine. There is one exception though: the line ptvsd.wait_for_attach() must be executed by the remote Python process only. Indeed, it tells the Python process to pause and wait until the client is attached before continuing.

Of course, in order to execute it on the remote machine you may need to install its dependencies there (for example using pip).

REMARK: it looks like, at the time of writing, versions of ptvsd newer than 3.0.0 suffer from some problems. I suggest that you force the installation of version 3.0.0; see this issue.
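
On the remote machine, this pinned installation can be done with pip (a sketch, assuming Python 3 and pip are available there):

sudo python3 -m pip install ptvsd==3.0.0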

It is important to understand that enable_attach, wait_for_attach and break_into_debugger are instructions for the remote Python process. The first line, ptvsd.enable_attach("my_secret", address = ('0.0.0.0', 80)), basically instructs the remote Python process to listen on all network interfaces, on port 80, for any client debugger that would like to attach. The client agent must provide the right secret (here my_secret).

The line ptvsd.break_into_debugger() is important: it is what allows you to break and navigate through the code from the client VSCode.

Putting things together

Now you are almost ready. Make sure your Python file is duplicated at the root locations on both the local and the remote machine. Make sure the ptvsd.wait_for_attach() line is uncommented in the copy that executes on the remote environment.

Now, using an SSH session on the remote machine, start the Python process with elevated privileges:
sudo python3 your_file_here.py

This should not return anything right now; the process hangs, waiting for your VSCode to attach to it.

Set a VSCode breakpoint just after ptvsd.break_into_debugger(), and make sure that the selected debugging configuration in VSCode is Attach (Remote Debug). Hit F5 and you should be attached and breaking in the code!

What a relief, efficient work ahead!

Breaking in VSCode

Going further

The debugging procedure described above is simplified and suffers from some flaws.

Security constraints

Here anybody can intercept your traffic; it is plain, unencrypted HTTP traffic. A recommended yet simple option to secure the communication is SSH port forwarding (tunnelling). It basically creates an encrypted channel between your localhost client and the remote machine. When an SSH tunnel is set up, you talk to your local machine on a given port and the remote machine receives the calls on another port (magic, isn't it?). Therefore, the launch.json configuration should be modified so that the host value is localhost. Note also that the port in the Python code and the one in launch.json may no longer be the same: you now have two different ports.
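
For example, a tunnel could be opened as follows (a sketch; the local port 8765 and the user name azureuser are arbitrary choices). With this tunnel in place, launch.json would use "host": "localhost" and "port": 8765, while the remote Python process keeps listening on port 80.

ssh -L 8765:localhost:80 azureuser@234.56.45.89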

Copying files

We pointed out that the files must be the same in the local environment and on the remote machine. We advise grouping, in a single shell script, the file mirroring logic (using scp) and the execution of the Python process on the remote machine, as in the sketch below.
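
A minimal sketch of such a script could look like this (the user name azureuser is a placeholder; the file name and remote path reuse the examples of this post):

#!/bin/sh
# mirror the local file to the remote machine, then execute it there
scp your_file_here.py azureuser@234.56.45.89:/home/benoitpatra/
ssh azureuser@234.56.45.89 "sudo python3 /home/benoitpatra/your_file_here.py"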

Handling differences between local and remote files

We said that the files must be the same in the local environment and on the remote machine, but we need at least one difference so that ptvsd.wait_for_attach executes on the remote only.
This is definitely something that can be handled in an elegant manner using an environment variable.

if "REMOTE" in os.environ:
    ptvsd.wait_for_attach()

Of course, you now need to pass the environment variable to your remote process over SSH; see this StackExchange post to learn how to do that.
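
One possibility, among those discussed in that post, is to set the variable inline in the remote command (a sketch; sudo may need to be configured to preserve the environment):

ssh azureuser@234.56.45.89 "REMOTE=1 sudo -E python3 /home/benoitpatra/your_file_here.py"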

Using Analytics in Application Insights to monitor CosmosDB Requests

According to Wikipedia, DocumentDB (now CosmosDB) is

Microsoft’s multi-tenant distributed database service for managing JSON documents at Internet scale.

The throughput of the database is billed and measured in request units per second (RUs). Therefore, when building an application on top of DocumentDB, this is a very important dimension that you should pay attention to and monitor carefully.

Unfortunately, at the time of writing, the Azure portal tools for measuring your RU usage are very poor and not really usable. You only have access to tiny charts whose granularity cannot really be changed.

These are the only monitoring charts available in the Azure Portal

In this blog post, I show how Application Insights Analytics can be used to monitor RU consumption efficiently. This is how we now monitor our collections at Keluro.

Let us start by presenting Application Insights. It defines itself here as

an extensible Application Performance Management (APM) service for web developers on multiple platforms. Use it to monitor your live web application. It will automatically detect performance anomalies. It includes powerful analytics tools to help you diagnose issues and to understand what users actually do with your app.

Let us show how to use it in a C# application that is using the DocumentDB .NET SDK.

First you need to install the Application Insights NuGet package. Then, you need to track the queries using a TelemetryClient object; see the sample code below.

public static async Task<FeedResponse<T>> LoggedFeedResponseAsync<T>(this IQueryable<T> queryable, string infoLog, string operationId)
{
	var docQuery = queryable.AsDocumentQuery();
	var now = DateTimeOffset.UtcNow;
	var watch = Stopwatch.StartNew();
	var feedResponse = await docQuery.ExecuteNextAsync<T>();
	watch.Stop();
	TrackQuery(now, watch.Elapsed, feedResponse.RequestCharge, "read", new TelemetryClient(), infoLog, operationId, feedResponse.ContentLocation);
	return feedResponse;
}

public static void TrackQuery(DateTimeOffset start, TimeSpan duration, double requestCharge, string kind, TelemetryClient tc, string infolog, string operationId, string contentLocation)
{
	var dependency = new DependencyTelemetry(
			"DOCDB",
			"",
			"DOCDB",
			"",
			start,
			duration,
			"0", // Result code : we can't capture 429 here anyway
			true // We assume this call is successful, otherwise an exception would be thrown before.
			);
	dependency.Metrics["request-charge"] = requestCharge;
	dependency.Properties["kind"] = kind;
	dependency.Properties["infolog"] = infolog;
	dependency.Properties["contentLocation"] = contentLocation ?? "";
	if (operationId != null)
	{
		dependency.Context.Operation.Id = operationId;
	}
	tc.TrackDependency(dependency);
}

The good news is that you can now effectively keep records of all requests made to DocumentDB. Thanks to a great component of Application Insights named Analytics, you can browse the queries and see their precise request-charges (the amount of RUs consumed).

You can also add identifiers (with variables such as kind and infolog in the sample above) from your calling code for better identification of the requests. Keep in mind that the request payload is not saved by Application Insights.

In the screenshot below, you can see how to list and filter the requests tracked against DocumentDB in Application Insights Analytics, thanks to its powerful query language.

Getting all requests to DocumentDB in a timeframe using Application Insights Analytics

One problem with this approach is that, for now, using this technique with the DocumentDB .NET SDK, we do not have access to the number of retries (the 429 responses). This is an open issue on GitHub.

Finally, Analytics allows us to create a very important chart: the accumulated RUs per second over a specific time range.
The query looks like the following.

dependencies
| where timestamp > ago(10h)
| where type == "DOCDB"
| extend requestCharge = todouble(customMeasurements["request-charge"])
| extend docdbkind = customDimensions["kind"]
| extend infolog = customDimensions["infolog"]
| order by timestamp desc
| project  timestamp, target, data, resultCode , duration, customDimensions, requestCharge, infolog, docdbkind , operation_Id 
| summarize sum(requestCharge) by bin(timestamp, 1s)
| render timechart 

And the rendered chart looks as follows.

Accumulated Request-Charge per second (RUs)

Powershell srcset image generator

If you have a website and SEO matters to you, then you have probably had to optimize images. To this end, you may want to use responsive images. As explained here,

a responsive image is an image which is displayed in its best form on a web page, depending on the device your website is being viewed from.

One of the modern ways to serve responsive images quickly is to use the srcset HTML attribute. In short, depending on various parameters and your viewport (i.e. browser window), the srcset attribute tells the browser to download the most appropriate image for the current display.

For example, if you put the following HTML element

<img src="images/fcnantes-champions-95.jpg"
     srcset="images/fcnantes-champions-95.jpg 200w,
             images/fcnantes-champions-95-400.jpg 400w,
             images/fcnantes-champions-95-600.jpg 600w,
             images/fcnantes-champions-95-800.jpg 800w">

Your server can serve up to four different images representing the same picture.

You may guess that creating all these resized pictures manually can be painful. In this blog post we propose the following PowerShell script to help you automate this task.

Param ( [Parameter(Mandatory=$True)] [ValidateNotNull()] $imageSource, [Parameter(Mandatory=$true)][ValidateNotNull()] $quality )

if (!(Test-Path $imageSource)){throw( "Cannot find the source image")}
if ($quality -lt 0 -or $quality -gt 100){throw( "quality must be between 0 and 100.")}

[void][System.Reflection.Assembly]::LoadWithPartialName("System.Drawing")
$resolvedPath = Join-Path $PWD -ChildPath $imageSource
$bmp = [System.Drawing.Image]::FromFile($resolvedPath)

#hardcoded canvas size...
$canvasWidths = @(200, 400, 600, 800)

foreach($canvasWidth in $canvasWidths){
    #Encoder parameter for image quality
    $myEncoder = [System.Drawing.Imaging.Encoder]::Quality
    $encoderParams = New-Object System.Drawing.Imaging.EncoderParameters(1)
    $encoderParams.Param[0] = New-Object System.Drawing.Imaging.EncoderParameter($myEncoder, $quality)
    # get codec
    $myImageCodecInfo = [System.Drawing.Imaging.ImageCodecInfo]::GetImageEncoders()|where {$_.MimeType -eq 'image/jpeg'}

    #compute the final ratio to use
    $ratioX = $canvasWidth / $bmp.Width;
    $ratioY = $canvasWidth / $bmp.Height;
    $ratio = $ratioY
    if($ratioX -le $ratioY){
        $ratio = $ratioX
    }

    #create resized bitmap
    $newWidth = [int] ($bmp.Width*$ratio)
    $newHeight = [int] ($bmp.Height*$ratio)
    $bmpResized = New-Object System.Drawing.Bitmap($newWidth, $newHeight)
    $graph = [System.Drawing.Graphics]::FromImage($bmpResized)

    $graph.Clear([System.Drawing.Color]::White)
    $graph.DrawImage($bmp,0,0 , $newWidth, $newHeight)

    $targetFileName = [System.IO.Path]::GetFileNameWithoutExtension($imageSource) + "-" + $canvasWidth + ".jpg"
    $dir = [System.IO.Path]::GetDirectoryName($resolvedPath)
    $targetFilePath = Join-Path $dir -ChildPath $targetFileName
    Write-Host "Saving file" $targetFilePath
    #save to file
    $bmpResized.Save($targetFilePath,$myImageCodecInfo, $($encoderParams))
    $graph.Dispose()
    $bmpResized.Dispose()
}
$bmp.Dispose()

Now you can simply invoke the script like this: .\SrcsetBuilder.ps1 "..\images\MyImage.jpg" 85. Then all the generated images (MyImage-200.jpg, MyImage-400.jpg, MyImage-600.jpg, MyImage-800.jpg) are located next to MyImage.jpg.
You can modify the widths of the generated images by changing the values in the array $canvasWidths (line 11 of the script).

Debugging locally REST API webhooks with Visual Studio

Modern REST APIs such as the Outlook REST API, Microsoft Graph or the Facebook Graph API expose a very powerful capability called webhooks, which enables push notifications. After subscription, when something changes, these APIs send notifications to your service by calling the URL you provided. For example, with the Outlook REST API, the push notification service will send a request when something has been modified in the user's mailbox, such as a mail being received or an email being marked as read.

I am not going to explain how you register subscriptions to a particular webhook. In this blog post, we provide a solution that lets you "break" with your Visual Studio debugger inside the webhook callback you subscribed to. The approach is not Windows/.NET specific; the mechanism exposed here is actually generic, but these are the tools I am using at the moment, so they will serve as the example in this post.

Problem: when you subscribe to a webhook, you specify your notification URL (see the Outlook REST API example). This URL must use HTTPS and be visible from the 'outside' internet. Therefore, you cannot set a URL such as https://localhost:44301/api/MyNotificationCallBack, where https://localhost:44301 is the URL of your local development website. However, that would be convenient in order to 'break' directly in the server-side code responsible for handling the request. In addition, if you are using Visual Studio and IIS Express for development, you cannot simply expose a website with a custom domain and SSL to the outside internet.

Solution: take a (sub)domain name you own (e.g. superdebug.keluro.com), then create an A record pointing to your public IP. If you are on a home network, this IP is the one of your ISP box. Configure this box to redirect incoming traffic for superdebug.keluro.com on port 443 to your personal developer machine (still on port 443). On your machine, configure an IIS web server with a binding for https://superdebug.keluro.com on port 443 that acts as a reverse proxy and redirects incoming traffic to your IIS Express local development server (e.g. https://localhost:44301). Finally, set a valid SSL certificate on the reverse proxy IIS server for superdebug.keluro.com. You can now use https://superdebug.keluro.com/api/MyNotificationCallBack as the notification URL, and the routing logic will redirect incoming push notification requests to https://localhost:44301/api/MyNotificationCallBack, where you can debug locally.

Debug locally IIS Express website visible from the outside internet

Pitfalls:

  • Unfortunately, in the case of a home network, I cannot give precise instructions on how to configure your ISP box to reroute incoming traffic. Also make sure that the box's public IP does not change, i.e. that it is static.
  • Take care of your own firewall rules: make sure that port 443 is open for both Inbound and Outbound rules.
  • In IIS Application Request Routing (ARR), the module that can be used to create the reverse proxy, an option enabled by default rewrites the 'Location' header of responses. It may break your application, which probably uses an OAuth flow. See this StackOverflow response.
  • If you have never set up IIS to work as a reverse proxy, that is quite simple now with the ARR or URL Rewrite modules. In a previous blog post we explained how to set up a reverse proxy with IIS.