Essential Visual Basic (VB.NET)
A collection of useful code snippets & examples


A collection of real world examples and code snippets, mostly documented for my own use. You’re free to reuse them within your own Visual Basic projects if you find any of them useful.


Embedding Word into a form using Windows API calls

One of the techniques I developed while writing the first version of my clinical letters system, back in the autumn of 2009, was making Microsoft Word appear embedded in a VB.NET form. After some developers saw the system recently, they asked how I had achieved the effect, so I am posting the code here for anybody else who might be interested in learning the technique.

The form has a picture control on it which becomes the canvas that the Microsoft Word instance is attached to. Make the picture control the size you want Word to appear on your Visual Basic form. The technique uses three Windows API calls: FindWindowA (the ANSI entry point for FindWindow), SetParent and SetWindowPos. Here are the key steps:

  1. Start by loading a new instance of Microsoft Word, keeping it invisible until the set-up processing is finished and you are ready to display Word to the user. I usually set a unique window title for the new Word instance, which makes it easier to find the window handle for that instance.
  2. Use the FindWindow API call to find the handle for the main Word window, using the unique title just assigned to it.
  3. Use the SetParent API call to make the picture control the parent window for Word. As a child window of the picture control, Word is bound by the control's dimensions.
  4. Use the SetWindowPos API call to resize Word to the same dimensions as the picture control and make it the topmost child window.
  5. Finally, we set Word’s window to visible so that it’s displayed to the user.

The Visual Basic code

' load word and make invisible
ObjWord = New Word.ApplicationClass
ObjWord.Visible = False

' set a custom window title to make it
' easier to find the window handle
ObjWord.Caption = "ALMA-ENHANCED-WORD"
WordWND = FindWindow(vbNullString, "ALMA-ENHANCED-WORD")

' make Word a child window of the picture control
SetParent(WordWND, Me.WordPicture.Handle.ToInt32())

ObjWord.WindowState = Word.WdWindowState.wdWindowStateNormal

' position Word to the coordinates of the picture control
' (the final argument is the SWP_ flags value; 0 means no flags)
SetWindowPos(WordWND, HWND_TOPMOST, _
    0, 0, Me.WordPicture.Width, Me.WordPicture.Height, 0)

' add a document and make Word visible
ObjWord.Documents.Add()
ObjWord.Visible = True
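
The snippet above relies on API declarations and constants that are not shown. The following is a minimal sketch of what they could look like, assuming DllImport with IntPtr handles; the module name NativeMethods is hypothetical, and my original declarations may well have differed (the ToInt32() call above suggests Integer handles).

```vbnet
Imports System
Imports System.Runtime.InteropServices

' Hypothetical module name; the signatures follow the documented user32.dll functions.
Module NativeMethods
    ' hWndInsertAfter value that places the window above all non-topmost windows.
    Public ReadOnly HWND_TOPMOST As New IntPtr(-1)

    ' SetWindowPos flag that shows the window as part of the call.
    Public Const SWP_SHOWWINDOW As UInteger = &H40UI

    ' Finds a top-level window by class name and/or title; pass Nothing to ignore the class.
    <DllImport("user32.dll", CharSet:=CharSet.Auto, SetLastError:=True)> _
    Public Function FindWindow(ByVal lpClassName As String, _
                               ByVal lpWindowName As String) As IntPtr
    End Function

    ' Re-parents hWndChild and returns the handle of the previous parent.
    <DllImport("user32.dll", SetLastError:=True)> _
    Public Function SetParent(ByVal hWndChild As IntPtr, _
                              ByVal hWndNewParent As IntPtr) As IntPtr
    End Function

    ' Moves, resizes and re-orders a window in one call.
    <DllImport("user32.dll", SetLastError:=True)> _
    Public Function SetWindowPos(ByVal hWnd As IntPtr, _
                                 ByVal hWndInsertAfter As IntPtr, _
                                 ByVal X As Integer, ByVal Y As Integer, _
                                 ByVal cx As Integer, ByVal cy As Integer, _
                                 ByVal uFlags As UInteger) As Boolean
    End Function
End Module
```

With declarations like these, WordWND would be declared As IntPtr and the picture control's Handle passed to SetParent directly rather than through ToInt32(). FindWindow returns IntPtr.Zero when no window has the requested title, so it is worth checking the result before calling SetParent.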

Spell checking text using Hunspell

I recently wrote a VB.NET utility that takes a piece of text piped in via the command line and checks it for spelling mistakes. The spelling mistakes had to be wrapped in an HTML span tag so that they could be highlighted to the users of the web application using CSS styling. The utility also had to return the number of spelling mistakes in the original text so that it could be displayed in an error message.

I chose to use the free and open source spell checking library Hunspell, which is used in projects like LibreOffice, Mozilla Firefox and Google Chrome. Hunspell comes with dictionaries for many different languages; in this example I’m using English GB. Strictly speaking I’m using NHunspell, a C# library which wraps the native Hunspell libraries for use from the .NET Framework.

The function below illustrates how the original text is split into a list of words, which are then passed to the spell function. If a word is not recognised as being spelt correctly then the variable countOfMistakes, which is returned as the final result, is incremented. The incorrectly spelt word is then replaced in the original text by searching for it and enclosing it in the desired span tag.

Many tutorials showing you how to count/split words simply demonstrate splitting the text on a space character. In our example this would result in the last word before a full stop reaching the spell checker with the full stop still attached. Likewise, words followed by other punctuation, like commas, would carry the comma into the entry to check, which would trigger a false misspelling. To get around this I first convert the common punctuation marks into spaces. When the function iterates over the wordsToCheck array, it only checks entries which actually contain a word, by testing that the length is greater than zero.

Private Function highlightSpellingErrors _
    (ByRef originalText As String) As Integer

    ' Hun is assumed to be an NHunspell.Hunspell object declared elsewhere
    Dim wordsToCheck    As String()
    Dim spellingText    As String
    Dim currentWord     As String
    Dim countOfMistakes As Integer
    Dim arrayIndex      As Integer

    countOfMistakes = 0

    spellingText = originalText
    spellingText = Replace(spellingText, ",", " ")
    spellingText = Replace(spellingText, ";", " ")
    spellingText = Replace(spellingText, "(", " ")
    spellingText = Replace(spellingText, ")", " ")
    spellingText = Replace(spellingText, ".", " ")

    wordsToCheck = Split(spellingText, " ", -1, CompareMethod.Text)

    Hun.Load _
        ("C:\path\to\spelling\dictionaries\en_GB.aff", _
         "C:\path\to\spelling\dictionaries\en_GB.dic")

    For arrayIndex = LBound(wordsToCheck) To UBound(wordsToCheck)
        currentWord = wordsToCheck(arrayIndex).ToString

        ' check if array entry has a word
        If Len(currentWord) > 0 Then
            If Hun.Spell(currentWord) = False Then
                countOfMistakes = countOfMistakes + 1
                originalText = Replace(originalText, currentWord, _
                    "<span class=""spelling"">" + _
                    currentWord + "</span>")
            End If
        End If
    Next

    Hun.Dispose()

    Return countOfMistakes
End Function
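
One limitation of the function above is that Replace works on the whole text, so a misspelling that occurs twice is wrapped twice, and a misspelling that also appears inside a longer word is wrapped there too. The sketch below is one possible alternative, not the code my utility uses: it lets Regex.Replace visit each word in place. The names HighlightWords and isCorrect are hypothetical; isCorrect stands in for a call such as Hun.Spell, and the multi-line lambda assumes Visual Basic 2010 or later.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SpellHighlightSketch
    ' isCorrect is a hypothetical stand-in for a spell check call such as Hun.Spell(word).
    Function HighlightWords(ByRef sourceText As String, _
                            ByVal isCorrect As Predicate(Of String)) As Integer
        Dim mistakes As Integer = 0

        ' \p{L} matches any Unicode letter; the apostrophe keeps words like "don't" whole.
        ' Each match is replaced where it stands, so punctuation is left untouched.
        sourceText = Regex.Replace(sourceText, "[\p{L}']+", _
            Function(m As Match) As String
                If isCorrect(m.Value) Then Return m.Value
                mistakes += 1
                Return "<span class=""spelling"">" & m.Value & "</span>"
            End Function)

        Return mistakes
    End Function
End Module
```

Because every occurrence is counted separately, a word misspelt twice adds two to the total, which matches the behaviour of the original loop.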

Setting file attributes using VB.NET

I had a requirement to set the file attributes on the PDF files that my clinic letters system generates. Each letter is published as a PDF file when the consultant has checked and signed the letter. The system then merges any newly created letters into a combined PDF file, one for each specialty. The combined PDF allows the consultants to scroll back through the history of the patient’s care.

All the patients’ PDF files are kept in a partitioned folder structure based on their medical record number. We use Microsoft’s Indexing Service to index the combined PDFs to generate a catalogue which is used to offer a full text search facility across all letters.

As the single letter PDFs are stored alongside the combined versions, I needed to exclude the single letter PDFs from the indexing service by setting their file attributes appropriately. The file attribute to exclude a given file from the indexing service is called NotContentIndexed.

Below is the snippet of code I use. I first ensure that the file exists and then set its Normal and NotContentIndexed file attributes.

Private Sub setFileAttributes(ByVal filename As String)
    Try

        If File.Exists(filename) Then
            Dim fileAttribs As FileAttributes = _
                FileAttributes.Normal Or _
                FileAttributes.NotContentIndexed

            File.SetAttributes(filename, fileAttribs)
        End If

    Catch exn As Exception
        ErrorHandler(exn)
    End Try
End Sub
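
A variation worth considering: the routine above overwrites whatever attributes the file already has, so a flag such as ReadOnly would be lost. The sketch below reads the current attributes first and only adds NotContentIndexed. The names FileAttributeSketch, WithoutIndexing and ExcludeFromIndexing are hypothetical, and dropping Normal before combining is my reading of the documentation, which describes Normal as valid only when used alone.

```vbnet
Imports System.IO

Module FileAttributeSketch
    ' Pure helper: add NotContentIndexed while keeping other flags such as ReadOnly.
    Function WithoutIndexing(ByVal current As FileAttributes) As FileAttributes
        ' Normal is documented as valid only on its own, so drop it before combining.
        Dim kept As FileAttributes = current And Not FileAttributes.Normal
        Return kept Or FileAttributes.NotContentIndexed
    End Function

    ' Applies the helper to a file on disk, if it exists.
    Sub ExcludeFromIndexing(ByVal filename As String)
        If File.Exists(filename) Then
            File.SetAttributes(filename, WithoutIndexing(File.GetAttributes(filename)))
        End If
    End Sub
End Module
```

Keeping the flag arithmetic in a separate function means it can be checked without touching the file system.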

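In this example I’m not checking for any particular exceptions, but the File.SetAttributes method is documented as throwing the following: ArgumentException (the path is empty or contains invalid characters, or the attribute value is invalid), PathTooLongException, NotSupportedException (the path is in an invalid format), DirectoryNotFoundException, FileNotFoundException and UnauthorizedAccessException (the file is read-only, the path is a directory, or the caller lacks the required permission).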


Regular expression to find ordinal suffixes

I’m starting to use LaTeX to typeset clinical letters into PDF files. The letters are stored in a database as plain text, with the odd markdown formatting command for bold and italic text. However, when the letter is signed by the consultant, that text needs to be converted to LaTeX commands to ensure correct typesetting.

Consider this extract from a clinical letter:

Biopsy booked for 9th December 2013. This patients CT scan of the thorax does demonstrate mediastinal / hilar adenopathy

The goal is to create a VB.NET RegEx pattern which will find the th on the 9th of December and substitute it with the LaTeX superscript command while leaving the th in the word adenopathy untouched. The regular expression needs to encompass finding all the ordinal variations st, nd, rd and th preceded by one or more consecutive digits.

The regular expression I settled on is shown below. It matches on a word boundary \b at the start and end of the expression. After the first word boundary it matches one or more digits (\d+) followed directly by either st, nd, rd or th.

Dim re As Regex = New Regex("\b(\d+)(st|nd|rd|th)\b")

Any matches found return a three element array, where the first element is the text found, the second element is just the number part and the final element is the ordinal suffix.

[0] => Array([0] => 9th)
[1] => Array([0] => 9)
[2] => Array([0] => th)

The complete function is shown below. It applies the regular expression to the text supplied in the parameter and iterates over any/all matches. As we iterate over each match we replace the value in the first element of the array with the value of the second element followed by the string \textsuperscript{ before adding the value of the last element and then closing the curly brackets. The processed text is then returned.

Private Function formatOrdinalSuffix(ByVal InputText As String) As String
    Dim re As Regex = New Regex("\b(\d+)(st|nd|rd|th)\b")
    Dim mc As MatchCollection = re.Matches(InputText)

    For Each m As Match In mc
        ' Groups(0) is the whole match, Groups(1) the digits
        ' and Groups(2) the ordinal suffix
        InputText = Replace( _
            InputText, _
            m.Groups(0).Value, _
            m.Groups(1).Value _
              + "\textsuperscript{" + m.Groups(2).Value + "}")
    Next

    Return InputText
End Function

While this function demonstrates how to replace the ordinal with a LaTeX command, it could be easily modified to return RTF formatting codes or wrap the ordinal in a <sup> HTML tag.
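
As a more compact alternative, the same substitution can be sketched with Regex.Replace and a replacement pattern, which rewrites each match where it stands instead of searching the text again. The module and function names here are hypothetical; $1 and $2 refer to the digit and suffix groups, and the backslash needs no escaping in a .NET replacement pattern.

```vbnet
Imports System.Text.RegularExpressions

Module OrdinalSketch
    Function FormatOrdinalSuffixCompact(ByVal inputText As String) As String
        ' $1 is the number and $2 the suffix captured by the pattern.
        Return Regex.Replace(inputText, "\b(\d+)(st|nd|rd|th)\b", _
                             "$1\textsuperscript{$2}")
    End Function
End Module
```

Swapping the replacement pattern for "$1<sup>$2</sup>" would produce the HTML variant mentioned above.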


Loading piped files in a command line program

After recently publishing a software manual written in LaTeX, I needed to create a plain text version of the file with all the TeX/LaTeX commands removed so that I could perform a realistic word count on the document. I also wanted to provide a plain text version of the document to the customer so that the manual could easily be read on a Kindle.

Naturally I first tried the detex command, which went a long way towards removing all the LaTeX formatting commands. Unfortunately, it didn’t recognise the custom commands I use for my chapter headings, which give nice background coloured pages, and it didn’t fully remove the index commands.

So I decided to write a quick VB.NET command line tool which would read a file piped in via the command line, handle my special requirements using regular expressions and then pipe the results onto the normal detex command. I used VB.NET with version 2 of the .NET framework so that the command line utility could be run under Linux using the Mono framework.

The command is called pre-detex, and it’s used like this:

mono pre-detex.exe < input.tex | detex > output.tex

Loading the contents of a file piped into a VB.NET program

The process of reading in a text file piped into the program is shown below. We first initialise PipedInput as an empty string. We use console peek to check for the end of the file, and while there is no end of file marker found we continue to read to the end of the file, effectively loading the entire contents of the input file into the PipedInput variable.

Imports System.Console
Imports System.Text
Imports System.Text.RegularExpressions

Module PREDETEX
    Dim PipedInput As String

    Sub Main()
        PipedInput  = ""

        If IsPipedInput() = False Then
            While (Console.In.Peek() <> -1)
                PipedInput = Console.In.ReadToEnd()
            End While

            DetexChapterHeadings()
            DetexIndexes()

            Console.WriteLine(PipedInput)
            End
        End If
    End Sub

    Function IsPipedInput() As Boolean
        If Console.KeyAvailable = False Then
            Return False
        Else
            Return True
        End If
    End Function

After reading in the whole file we then process the chapter headings and indexes, before writing the new content out to the console so that it can be piped into other commands or redirected into a new file. Finally we terminate the program with the End statement.
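
A note on detecting piped input: Console.KeyAvailable is documented as throwing InvalidOperationException when standard input is redirected, so its behaviour varies between runtimes. On .NET Framework 4.5 or later there is a dedicated Console.IsInputRedirected property. The sketch below assumes that newer framework, so it does not apply to the version 2 target used for this tool; the names PipedInputSketch and ReadPipedInput are hypothetical.

```vbnet
Imports System

Module PipedInputSketch
    ' Returns everything on standard input, or an empty string when nothing is piped in.
    Function ReadPipedInput() As String
        ' Console.IsInputRedirected requires .NET Framework 4.5 or later (an assumption here).
        If Not Console.IsInputRedirected Then Return ""

        ' ReadToEnd already reads up to the end of the stream, so no loop is needed.
        Return Console.In.ReadToEnd()
    End Function
End Module
```

Sub Main would then assign the result of ReadPipedInput() to PipedInput and carry on with the chapter heading and index processing as before.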

DeTeX the chapter titles

This is how my chapter titles are formatted within the LaTeX source file:

\SetBookChapterTitle{AutoLisp}{Sepia}{White}

The normal detex program was munging the results like:

AutoLispSepiaWhite

I needed to drop the second and third colour parameters, just outputting the chapter title prefixed with a hash so that it matched the markdown formatting for titles. So let’s take a look at the VB.NET regular expression used to achieve this.

Dim re As Regex = New Regex("\\SetBookChapterTitle\{(.*)\}{(.*)\}{(.*)\}")
Dim mc As MatchCollection = re.Matches(PipedInput)

For Each m As Match In mc
    PipedInput = Replace(PipedInput, m.Groups(0).Value, "# " & m.Groups(1).Value)
Next

We first set up the regular expression to grab all the chapter headings following this format, then replace each string found within PipedInput with a hash (#) followed by the text captured by the first group.
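
One thing to be aware of is that (.*) is greedy, so two chapter commands on the same line would be swallowed as a single match. A minimal sketch of a stricter variant is shown below, assuming the titles and colour names never contain a closing brace; the names ChapterTitleSketch and DetexChapterTitles are hypothetical.

```vbnet
Imports System.Text.RegularExpressions

Module ChapterTitleSketch
    Function DetexChapterTitles(ByVal source As String) As String
        ' [^}]* stops at the first closing brace, so each argument is matched on its own.
        ' Only the first argument (the title) is captured and kept.
        Return Regex.Replace(source, _
            "\\SetBookChapterTitle\{([^}]*)\}\{[^}]*\}\{[^}]*\}", "# $1")
    End Function
End Module
```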

DeTeX the Indexes

This is how index entries are formatted in the LaTeX source file:

\index{AutoLisp}

The normal detex program was removing the index keyword and the curly brackets, but was leaving the word AutoLisp in the file. For a large document with lots of index entries this would skew the word count totals making them less accurate.

The resulting routine is a little simpler. It finds all the \index commands that sit at the end of a line in the source content and replaces them, together with the line ending, with an empty string, which removes them completely from the file.

' \r?\n matches both Windows (CRLF) and Unix (LF) line endings
Dim re As Regex = New Regex("\\index\{(.*)\}\r?\n")
Dim mc As MatchCollection = re.Matches(PipedInput)

For Each m As Match In mc
    PipedInput = Replace(PipedInput, m.Groups(0).Value, "")
Next

Determine if the current time of day is between two specified times

One of the clinical systems I’m responsible for uploads documents to a private cloud supplier. The supplier does a failover test every morning at 2am. During this time frame they’d prefer not to receive any documents. I had to make our clinical system check whether the current time of day falls within a given time frame and not process any files between these times.

' read the clock once so both comparisons use the same value
Dim timeNow As TimeSpan = DateTime.Now.TimeOfDay

If timeNow >= New TimeSpan(1, 45, 0) AndAlso _
   timeNow <= New TimeSpan(2, 15, 0) Then

   tmrTrigger.Enabled = True
   Exit Sub
End If

The example above demonstrates how to check if the current time of day falls between 1.45am and 2.15am. If the current time does fall between these times then it just exits the subroutine after re-enabling the trigger timer. If the current time of day is not within the specified time frame then it continues on to process any pending files.
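
The same check can be sketched as a reusable function. This version also copes with a window that spans midnight (for example 23:30 to 00:30), which the inline comparison above would never match. The names TimeWindowSketch and IsWithinWindow are hypothetical.

```vbnet
Imports System

Module TimeWindowSketch
    ' True when current lies between windowStart and windowEnd (inclusive),
    ' including windows that wrap past midnight.
    Function IsWithinWindow(ByVal current As TimeSpan, _
                            ByVal windowStart As TimeSpan, _
                            ByVal windowEnd As TimeSpan) As Boolean
        If windowStart <= windowEnd Then
            Return current >= windowStart AndAlso current <= windowEnd
        End If

        ' The window wraps past midnight, so either side of midnight counts.
        Return current >= windowStart OrElse current <= windowEnd
    End Function
End Module
```

The check in the upload routine would then read IsWithinWindow(DateTime.Now.TimeOfDay, New TimeSpan(1, 45, 0), New TimeSpan(2, 15, 0)).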


Compressing & base64 encoding data

Sometimes it can be useful to compress a large body of text and then base64 encode the results before storing the resulting value in a database varchar field. One of my clinical systems stores a large blob of RTF text into a document commit table every time a given document is edited. While base64 encoding expands the compressed data stream by roughly a third, the original RTF text is so verbose that it compresses very well before being base64 encoded.

Even with the overhead of base64 encoding I was still making a saving of around 42% on the original data size.

The function below takes the text you want to compress and passes it through a stream writer using the GZip compression algorithm before converting the compressed data to a base64 string and returning the value.

Private Function compressAndBase64(ByVal originalText As String) As String
    Dim mem As New IO.MemoryStream

    Dim gz As New System.IO.Compression.GZipStream(mem, _
                         IO.Compression.CompressionMode.Compress)

    ' Write (rather than WriteLine) so that no extra line break
    ' is appended to the stored text
    Dim sw As New IO.StreamWriter(gz)
    sw.Write(originalText)
    sw.Close()

    Dim compressed As String = Convert.ToBase64String(mem.ToArray())
    Return compressed
End Function

To get the original text back you can use the function below passing in the compressed text. This function expands the base64 encoded text back into an array of compressed bytes. The function uses a stream reader object to decompress the text before returning the original text.

Private Function decompressFromBase64(ByVal compressedText As String) As String

    Dim strAsBytes() As Byte = Convert.FromBase64String(compressedText)

    Dim ms As New System.IO.MemoryStream(strAsBytes)

    Dim gz As New System.IO.Compression.GZipStream(ms, _
                    IO.Compression.CompressionMode.Decompress)

    Dim sr As New IO.StreamReader(gz)
    Dim decompressed As String = sr.ReadToEnd()
    sr.Close()

    Return decompressed
End Function
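
For comparison, here is a self-contained sketch of the same round trip written with Using blocks, so the streams are closed even if an exception is thrown, and with the text encoding stated explicitly. The module and function names are hypothetical, and UTF-8 is an assumption; it is simply what StreamWriter and StreamReader use by default.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module CompressionSketch
    Function CompressToBase64(ByVal plainText As String) As String
        Dim raw() As Byte = Encoding.UTF8.GetBytes(plainText)

        Using mem As New MemoryStream()
            ' Disposing the GZipStream flushes the final compressed block into mem;
            ' leaveOpen:=True keeps mem usable afterwards.
            Using gz As New GZipStream(mem, CompressionMode.Compress, True)
                gz.Write(raw, 0, raw.Length)
            End Using

            Return Convert.ToBase64String(mem.ToArray())
        End Using
    End Function

    Function DecompressFromBase64(ByVal encoded As String) As String
        Using mem As New MemoryStream(Convert.FromBase64String(encoded))
            Using gz As New GZipStream(mem, CompressionMode.Decompress)
                Using reader As New StreamReader(gz, Encoding.UTF8)
                    Return reader.ReadToEnd()
                End Using
            End Using
        End Using
    End Function
End Module
```

Passing a document through CompressToBase64 and then DecompressFromBase64 should return exactly the text that went in, which makes a quick sanity check before committing anything to the database.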