Generate HTML based PDF reports from Azure App Service

Wouter Huysentruit
7 min read · Jan 21, 2021

Imagine having an ASP.NET Core based application deployed to Azure App Service when suddenly the customer wants you to add PDF reporting to the application. Now what?

There are a lot of libraries, tools and even SaaS solutions to reach that goal, so we’ll have to pick a feasible solution for the given use case. In this article I’ll give a short overview of the options I came across and when to pick which.

After selecting the right solution, we’ll have a deep dive into the implementation. Spoiler: we will be implementing server-side PDF generation using a wkhtmltopdf-based approach.

Options

Most of us are already familiar with HTML and CSS, so we want to reuse that knowledge as an easy way to design our templates.

The available options can be categorized under:

  • SaaS
  • Client-side generated PDF
  • Server-side generated PDF

SaaS

Cloud-based solutions are the fastest way to get started: buy a plan and start sending it your HTML/CSS + data in exchange for a nice PDF report. A great SaaS product for generating PDF reports, which I have used in the past, is jsreport.

When to pick SaaS?

  • You don’t have time to get things up and running.
  • You don’t want to maintain library versions or related infrastructure.

When to avoid SaaS?

  • You don’t want to depend on the availability of a third-party service.
  • You don’t want to share sensitive data with an untrusted party.
  • Your application needs to be able to generate PDF reports while working offline.

Client-side generated PDF

Your customer will use your application through a web browser, so why not generate the PDF with JavaScript on the client side (‘in the browser’)?

A library like jsPDF can be used to generate a PDF from JavaScript but requires us to script the generation of the PDF in a procedural manner; we can’t use HTML/CSS to style our report templates. This is useful for simple reports, but not so for more complex designs. An additional library like html2canvas comes into play when we want to be able to start from HTML/CSS. Here’s a small example that combines both:

A downside of this approach is that each page in the PDF will be a single image, so text selection and copy/paste will not work. Then there’s also this from_html plugin for jsPDF to convert HTML to jsPDF commands, but it looks very restrictive.

There’s another, rather simplistic, idea to go from HTML/CSS to PDF using JavaScript: create your HTML/CSS, render it off-screen and use window.print() to open the browser’s print dialog. On most operating systems, the user will have the option to select Print as PDF in the print dialog, but you can never be sure. Here’s some sample code for this idea:

When to pick client-side?

  • Web applications that need to be able to generate PDF reports while being offline (e.g. applications backed by a service worker).
  • You don’t want to send sensitive data to a third-party provider.

When to avoid client-side?

  • When you don’t want full-page images in the PDF and don’t want to rely on the user having a Print-to-PDF printer installed.
  • When you also need to generate reports outside of the browser (e.g. through an API call).

Server-side generated PDF

Most server-side solutions I have found are based on either a headless Chrome instance, PhantomJS or wkhtmltopdf (PhantomJS and wkhtmltopdf both use QtWebKit, a Qt port of WebKit that makes it run cross-platform).

jsreport can also be installed on-premises or on our own VM. I also found Puppeteer (pptr.dev; v5.5.0 at the time of writing). Since both are based on Node.js, they are not feasible to run next to your ASP.NET Core application inside the same Azure App Service. You’d need to set up a separate App Service for Node.js and take additional measures to prevent access to the report service from the outside world (configure firewall and private networking).

wkhtmltopdf on the other hand, is a low-level C library which we can include in our ASP.NET Core deployment as .NET can call C library functions using a technique called P/Invoke. To use wkhtmltopdf in our ASP.NET Core applications, we need to provide the correct binary for Windows or Linux (depending on the Azure App Service plan we have created) and a .NET wrapper, so we can easily access the low-level C library from C#/.NET.
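To make the P/Invoke idea a bit more concrete, here is a minimal sketch of calling a single function from the wkhtmltopdf C library directly. It assumes a wkhtmltox binary (wkhtmltox.dll on Windows, libwkhtmltox.so on Linux) sits next to the application so the runtime can resolve it under the name "wkhtmltox"; the class and method names are made up for illustration. A .NET wrapper takes care of this plumbing for every function we need, so we normally never write this ourselves.

```csharp
using System;
using System.Runtime.InteropServices;

Console.WriteLine($"wkhtmltopdf version: {WkHtmlToPdfNative.GetVersion()}");

// Hypothetical helper class, only meant to illustrate P/Invoke.
internal static class WkHtmlToPdfNative
{
    // Assumption: the native library can be found under this name
    // (the runtime adds the platform-specific prefix and extension).
    private const string LibraryName = "wkhtmltox";

    // C declaration in the wkhtmltox header: const char *wkhtmltopdf_version();
    // The returned pointer refers to a string owned by the library,
    // so we receive it as IntPtr and copy it instead of freeing it.
    [DllImport(LibraryName)]
    private static extern IntPtr wkhtmltopdf_version();

    public static string GetVersion() =>
        Marshal.PtrToStringAnsi(wkhtmltopdf_version());
}
```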

When to pick server-side?

  • You want to be able to cache or store generated PDF reports.
  • You want full control over the PDF report generation.
  • You don’t want to send sensitive data to a third-party provider.

When not to pick server-side?

  • Your application needs to be able to generate PDF reports while working offline.
  • You don’t want to invest time building and maintaining this yourself.

There’s also the commercial product PSPDFKit, which seems to have solutions for both client-side and server-side PDF generation. I am in no way affiliated with PSPDFKit but wanted to mention it here for completeness.

Selection

Since I’m building a SaaS product that can contain sensitive information, we don’t want to depend on a third-party provider. Text in the PDF files should be selectable, and we have no control over the OS or the availability of a PDF printer on the user’s system. The application also doesn’t require offline support.

Since the application I’m building is still a PoC and I like to keep things simple, I ended up selecting server-side PDF report generation based on wkhtmltopdf. DinkToPdf is a popular wrapper for wkhtmltopdf, but we still have to provide and deploy the wkhtmltopdf C libraries ourselves. Apart from that, DinkToPdf seems to be no longer maintained. This is where another library comes in: Haukcode.WkHtmlToPdfDotNet. It’s a fork of DinkToPdf that also includes the wkhtmltopdf libraries and decides at runtime which platform-specific library needs to be used (macOS, Linux or Windows, x86 or x64).

Implementation

The first thing we’ll do is add the required NuGet package to our project:

dotnet add package Haukcode.WkHtmlToPdfDotNet

In Startup.cs, add this line to the ConfigureServices method:

services.AddSingleton(typeof(IConverter), new SynchronizedConverter(new PdfTools()));

Now we can create an API controller with an endpoint that generates a PDF file. For brevity, I have included all code inside the controller; in your project, I strongly suggest moving this code out of the controller class, as you’ll probably also need to fetch data and do tons of other stuff before actually generating the report. Anyway, here’s a quick example:
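The embedded snippet did not survive in this copy of the article, so what follows is a minimal sketch rather than the original code. It assumes the IConverter registration shown above; the controller name, route, HTML and page settings are placeholders chosen for illustration.

```csharp
using Microsoft.AspNetCore.Mvc;
using WkHtmlToPdfDotNet;
using WkHtmlToPdfDotNet.Contracts;

[ApiController]
[Route("api/[controller]")]
public class ReportsController : ControllerBase
{
    private readonly IConverter _converter;

    // The singleton registered in ConfigureServices gets injected here.
    public ReportsController(IConverter converter)
    {
        _converter = converter;
    }

    // GET api/reports/sample (hypothetical route)
    [HttpGet("sample")]
    public IActionResult GetSampleReport()
    {
        // Placeholder HTML; a real report would come from a template.
        const string html = @"<!DOCTYPE html>
<html>
  <head><meta charset='utf-8'><style>h1 { color: #336699; }</style></head>
  <body><h1>Sample report</h1><p>Generated on the server.</p></body>
</html>";

        var document = new HtmlToPdfDocument
        {
            GlobalSettings =
            {
                ColorMode = ColorMode.Color,
                Orientation = Orientation.Portrait,
                PaperSize = PaperKind.A4,
                Margins = new MarginSettings { Top = 15, Bottom = 15, Left = 10, Right = 10 }
            },
            Objects =
            {
                new ObjectSettings
                {
                    HtmlContent = html,
                    WebSettings = { DefaultEncoding = "utf-8" }
                }
            }
        };

        // Convert returns the PDF as a byte array.
        byte[] pdf = _converter.Convert(document);

        return File(pdf, "application/pdf", "report.pdf");
    }
}
```

Browsing to the endpoint should then offer report.pdf as a download, with selectable text since wkhtmltopdf renders real text instead of page images.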

We can run our project locally or deploy it to Azure App Service; the WkHtmlToPdfDotNet package will make sure the wkhtmltopdf library is placed in the correct location.

Because WkHtmlToPdfDotNet ships native libraries for several runtimes, the generated build artifact will grow by around 80MB. To avoid that, we can add an additional step in our build pipeline to clean up the runtimes we do not need. For example, when the Azure App Service is running on the Windows operating system, only the win-x86 version is required (by default, App Service uses the x86 runtime) and we can remove the other runtimes with these glob patterns:

**/runtimes/linux-x64
**/runtimes/linux-x86
**/runtimes/osx-x64
**/runtimes/unix
**/runtimes/win-arm
**/runtimes/win-arm64
**/runtimes/win-x64
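Before pruning, it may be worth confirming which runtime folder the deployed process actually needs, since the App Service platform setting (32-bit or 64-bit) can be changed. The sketch below is one way to check this, assuming the folder names follow the "os-architecture" pattern seen in the list above; DetectRuntimeFolder is a hypothetical helper whose output you could log once at startup.

```csharp
using System;
using System.Runtime.InteropServices;

// Maps the current process to a runtime folder name such as "win-x86" or "linux-x64".
// Assumption: folder names follow the "<os>-<architecture>" pattern used in the list above.
static string DetectRuntimeFolder()
{
    string os =
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? "win" :
        RuntimeInformation.IsOSPlatform(OSPlatform.OSX) ? "osx" :
        "linux";

    // ProcessArchitecture reflects the bitness of the running process
    // (X86, X64, Arm, Arm64), which decides which native library can be loaded.
    string architecture = RuntimeInformation.ProcessArchitecture.ToString().ToLowerInvariant();

    return $"{os}-{architecture}";
}

Console.WriteLine($"Native runtime folder needed by this process: {DetectRuntimeFolder()}");
```

If the logged value is not win-x86, adjust the list of patterns accordingly before removing anything.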

Template engine

In the real world, you’ll want to inject some data coming from a database or other kind of data source into the report. This is where a template engine comes into play. Since we’re using ASP.NET Core, a logical choice would be to use Razor syntax/templates. A well-known library for parsing Razor templates is RazorLight.

Personally, I lean more towards Handlebars templates than Razor templates because they’re more portable. Especially when building a JavaScript-based SPA (Angular, React, Vue, …), Handlebars seems to be a better fit (e.g. if you also want to show client-side template previews). Also, if we want to switch to a SaaS product in the future, that will be a very easy switch, as most of them support Handlebars but rarely, if ever, Razor syntax.

A great library for parsing Handlebars templates in .NET is, conveniently named, Handlebars.NET.
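As a rough idea of what that could look like, here is a minimal sketch that compiles a Handlebars template and fills it with data. It assumes the Handlebars.Net NuGet package is installed; the template, the property names and the sample values are invented for illustration.

```csharp
using System;
using HandlebarsDotNet;

// Hypothetical report template; in a real project this would live in its own file.
const string source = @"<h1>{{title}}</h1>
<table>
  {{#each lines}}
  <tr><td>{{description}}</td><td>{{amount}}</td></tr>
  {{/each}}
</table>";

// Compile once and reuse the compiled template for every report.
var template = Handlebars.Compile(source);

// Sample data; this would normally be loaded from a database or another data source.
var data = new
{
    title = "Monthly report",
    lines = new[]
    {
        new { description = "Consulting", amount = 1200 },
        new { description = "Hosting", amount = 80 }
    }
};

// The rendered HTML is what gets handed to the PDF converter.
string html = template(data);
Console.WriteLine(html);
```

The resulting string can be assigned to the HtmlContent property of the ObjectSettings passed to the converter.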

Wrapping up

Now that we have covered the basics and have a solution that we can deploy to Azure App Service, the next step would be to create a richer, more dynamic PDF report by loading data from a data source and running it through a template engine before generating the final report. But I’ll leave that as an exercise for you 🙂

That’s all I wanted to write for now. If you think there are parts missing, wrong or unclear, feel free to contact me.

Have a nice day!

Wouter Huysentruit

Software Engineer and Architect focusing on .NET and Microsoft technologies. Microsoft MVP. Practitioner of clean code. #solid #tdd #ddd #cqrs #es #graphql