GIT DIFF - Creating an HTML file showing the differences between two text files

How to create an HTML file showing the differences between two versions of the text of a clinic letter. Additions are shown in green and deletions in red.


Introduction

One of the features required by the clinic letters system I’m currently developing is the ability to see the differences between any two versions of a given letter. The finished report needed to be a web page showing deleted words in red and additions in green. Comparing two versions of a piece of text is something that software developers do all the time within their version control systems. I knew that Git could do the heavy lifting of comparing the letters' content with the following command:

git diff --color-words --no-index v1.txt v2.txt

This outputs the colourised differences to the terminal like this:

screenshot showing the coloured output of Git's diff command

Sample taken from two revisions of a Wikipedia article.

So how do we get this into an HTML file? The first step was to redirect the output to a new file and see which control codes were being used to colourise the text within the terminal. Using that information, I wrote a simple PHP script which reads the output from the Git command and substitutes the escape codes with HTML span tags to depict additions and deletions. The PHP script is shown below:

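<?php
  // Read the whole colourised diff from standard input.
  $data = stream_get_contents(STDIN);

  // Escape <, > and & so that text from the letters cannot break the page markup.
  // The ANSI escape sequences contain none of these characters, so they pass through intact.
  $data = htmlspecialchars($data, ENT_NOQUOTES | ENT_SUBSTITUTE);

  echo convertControlCodes($data);

  // Replace the ANSI colour codes emitted by Git with HTML span tags.
  function convertControlCodes($data) {
    $controlCodes = array(
      "\e[m"   => '</span>',                 // reset
      "\e[1m"  => '<span class="diffdim">',  // bold (file header lines)
      "\e[36m" => '<span class="diffdim">',  // cyan (hunk headers)
      "\e[32m" => '<span class="diffadd">',  // green (additions)
      "\e[31m" => '<span class="diffdel">',  // red (deletions)
    );

    return str_replace(array_keys($controlCodes), array_values($controlCodes), $data);
  }
?>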

The command used to pass the Git output through our newly created PHP script is shown below. The resulting HTML is redirected into a temporary file, as we need to combine it with a header and footer file to generate a complete web page.

git diff --color-words --no-index v1.txt v2.txt | php colour-diff-html.php > tmp.html
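If stray characters turn up in the generated HTML, it can help to look at the raw escape codes directly. As a rough sketch, the snippet below builds a hypothetical sample line shaped like Git's --color-words output (the sample text and the variable names are illustrative only) and passes it through cat -v, which prints the escape character as ^[ so that each code can be matched against the lookup table in the PHP script.

```shell
# Hypothetical sample: one line shaped like Git's --color-words output,
# with a deletion in red followed by an addition in green.
sample=$(printf 'The \033[31mold\033[m\033[32mnew\033[m text')

# cat -v shows the ESC byte as ^[ so the colour codes become readable.
visible=$(printf '%s\n' "$sample" | cat -v)
printf '%s\n' "$visible"
# prints: The ^[[31mold^[[m^[[32mnew^[[m text

# Against the real files the equivalent would be:
#   git diff --color-words --no-index v1.txt v2.txt | cat -v
```

Any sequence that shows up here but is missing from the lookup table would pass through to the browser unchanged.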

To create the complete web page I have standard diff header and footer files, which contain the HTML needed to wrap the output in a pre tag. The header file also contains a simple style sheet which colourises the span tags for additions and deletions, and it pulls in the Inconsolata font from Google to style the results. The command below demonstrates how to combine the files to make a final results page.

cat diff-head.html tmp.html diff-foot.html > result.html
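The header and footer files themselves are not reproduced here. A minimal sketch of what they might contain, written out with shell heredocs, is given below. The class names come from the PHP script; the page title, the Google Fonts URL and the grey chosen for diffdim are assumptions rather than the actual files.

```shell
# Header: opens the page, defines the colours and opens the <pre> block.
# The title, font URL and the grey used for .diffdim are assumptions.
cat > diff-head.html <<'EOF'
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Letter differences</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Inconsolata">
<style>
  pre      { font-family: Inconsolata, monospace; white-space: pre-wrap; }
  .diffadd { color: green; }
  .diffdel { color: red; }
  .diffdim { color: grey; }
</style>
</head>
<body>
<pre>
EOF

# Footer: closes the tags that the header left open.
cat > diff-foot.html <<'EOF'
</pre>
</body>
</html>
EOF
```

With white-space set to pre-wrap, long paragraphs of a letter wrap inside the pre block instead of running off the side of the page.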

Here is how the final result looks in the browser:

screenshot showing HTML coloured diff file