17/07/2018

EPPlus Excel Template Report Engine

EPPlus Excel Template Report Engine

Introduction

As anyone who as worked in corporate environments knows, is that they love their Excel reports with logos, formatting and formulas, which is in all honesty probably the best way to present data to “higher uppers”.  

Recently I was helping a client upgrade an old .net 2 code base and in it there was an ActiveX component which created Excel reports from templates. Since this component was out of support and forced the app to be x86 (which limits the available memory it can use) and we wanted to become “any cpu” and use more memory, so we needed a replacement.

This library is based on the great EPPlus found here : https://github.com/JanKallman/EPPlus

How it works

The basic idea is simple in that we take an existing Excel file with all the formatting and formulas it has and fill it with data. While Excel does this as a feature, it does impose that the user have read access to the the corporate database which may not be ideal in most corporations for security and performance reasons (anyone can run none optimized queries and slow the database server for example or have unsanctioned access to the data) and can be a real IT headache managing it all.

The data

For this to work you need a dataset with tables, in the example code there are 2 tables called table1 and table2.

EPPlus Excel Template Report Engine

The template file

You take your template file and for the places you want to insert data you put placeholders like %table1.col1% where the table1 is the reference to the dataset table name and the col1 is the column name for the data you want.

As you can see the default for the library is using % character (which you can change if you want).

EPPlus Excel Template Report Engine

The image above is using Excel’s table formatting (the green region) for the rows and column6 is a formula =column3*2.

When referencing non rows of data like %table1.col1% or %table2.col2% the library will use the data in first row of the table in question. 

Rows of data

For rows of data you need to first create an Excel named region by selecting the rows your data should appear in and clicking toolbar->Formulas->Define Name and setting the name to the dataset table name. You can define workbook or worksheet specific regions and the library will use either.

EPPlus Excel Template Report Engine

In the row you defined you can use %column1% names and omit the table name since it is implied by the region.

How to use the library

Once you have your Excel template and your Dataset with as many DataTables you need, you can call the library to create a new filled Excel file for you.

// uses % as the default deliminators -> %table1.column1%
EPPlus.Template.FillReport("new.xlsx", "template.xlsx", dataset);
// user defined deliminators -> {table1.column1}
EPPlus.Template.FillReport("new.xlsx", "template.xlsx", ds, new string[] {"{" , "}" });

The Code

The code is really simple as follows:

using OfficeOpenXml;
using System.Data;
using System.IO;

namespace EPPlus
{
    public class Template
    {
        public static void FillReport(string filename, string templatefilename, DataSet data)
        {
            FillReport(filename, templatefilename, data, new string[] { "%", "%" });
        }

        public static void FillReport(string filename, string templatefilename, DataSet data, string[] deliminator)
        {
            if (File.Exists(filename))
                File.Delete(filename);

            using (var file = new FileStream(filename, FileMode.CreateNew))
            {
                using (var temp = new FileStream(templatefilename, FileMode.Open))
                {
                    using (var xls = new ExcelPackage(file, temp))
                    {
                        // process workbook regions
                        foreach (var n in xls.Workbook.Names)
                        {
                            FillWorksheetData(data, n.Worksheet, n, deliminator);
                        }

                        // process worksheet regions
                        foreach (var ws in xls.Workbook.Worksheets)
                        {
                            foreach (var n in ws.Names)
                            {
                                FillWorksheetData(data, ws, n, deliminator);
                            }
                        }

                        // process single cells
                        foreach (var ws in xls.Workbook.Worksheets)
                        {
                            foreach (var c in ws.Cells)
                            {
                                var s = "" + c.Value;
                                if (s.StartsWith(deliminator[0]) == false &&
                                    s.EndsWith(deliminator[1]) == false)
                                    continue;
                                s = s.Replace(deliminator[0], "").Replace(deliminator[1],"");
                                var ss = s.Split('.');
                                try
                                {
                                    c.Value = data.Tables[ss[0]].Rows[0][ss[1]];
                                }
                                catch { }
                            }
                        }

                        xls.Save();
                    }
                }
            }
        }

        private static void FillWorksheetData(DataSet data, ExcelWorksheet ws, ExcelNamedRange n, string[] deliminator)
        {
            // region exists in data
            if (data.Tables.Contains(n.Name) == false)
                return;

            var dt = data.Tables[n.Name];

            int row = n.Start.Row;

            var cn = new string[n.Columns];
            var st = new int[n.Columns];
            for (int i = 0; i < n.Columns; i++)
            {
                cn[i] = (n.Value as object[,])[0, i].ToString().Replace(deliminator[0], "").Replace(deliminator[1],"");
                if (cn[i].Contains("."))
                    cn[i] = cn[i].Split('.')[1];
                st[i] = ws.Cells[row, n.Start.Column + i].StyleID;
            }

            foreach (DataRow r in dt.Rows)
            {
                for (int col = 0; col < n.Columns; col++)
                {
                    if (dt.Columns.Contains(cn[col]))
                        ws.Cells[row, n.Start.Column + col].Value = r[cn[col]]; // set cell data
                    ws.Cells[row, n.Start.Column + col].StyleID = st[col]; // set cell style
                }
                row++;
            }

            // extend table formatting range to all rows
            foreach (var t in ws.Tables)
            {
                var a = t.Address;
                if (n.Start.Row.Between(a.Start.Row, a.End.Row) &&
                    n.Start.Column.Between(a.Start.Column, a.End.Column))
                    t.ExtendRows(dt.Rows.Count - 1);
            }
        }
    }

    public static class int_between
    {
        public static bool Between(this int v, int a, int b)
        {
            return v >= a && v <= b;
        }
    }
}

The code goes though Workbook regions then WorkSheet regions and finally processes all single cells within all the worksheets.

All the work is done in the FillWorksheetData() method which replaces the column placeholder with the associated data in the DataTable and sets the style for the row based on what is defined. The last part goes through formatted Excel table definitions and extends the formatting to all the rows added.

The library does require an addition to the EPPlus library which I have submitted a pull request for, so until the original EPPlus approves this addition or somehow allows me to do what is needed, you need to use the changed version here. The added code is as follows in the ExcelTable.cs file:

public void ExtendRows(int count)
{
     var ad = new ExcelAddress(Address.Start.Row,
                               Address.Start.Column,
                               Address.End.Row + count,
                               Address.End.Column);
     Address = ad;
}

The version of EPPlus which is used here is v4.5.2.1 as of writing this article.

To help simplify the code I have added an extension method at the end to handle Between() for int values which makes writing the range checking logic easier and more readable.

History

  • Initial version v1.0 : 15th July 2018