I’ll go through a simple tutorial for using regular expressions for VBA. This is a great quick start guide for people who have used regular expressions before in other languages. Just as a review, regular expressions are used to match patterns in strings. This is a powerful tool that can help you data cleanse and data mine. The directions are the same for Microsoft Access as well.
First thing first, let’s enable the reference in Excel so that we can use this functionality.
Go to Tools > References
Then find Microsoft VBScript Regular Expressions 5.5
Click Ok, and now you’ll be able to create a RegExp object to use in your VBA.
Here’s a basic function that finds 5 letter words in a string.
Here’s a copy of the Excel file that I am using in the example: RegEx Excel Tutorial XLS File
Function findFiveLetter(sentence)
Dim regEx As New VBScript_RegExp_55.RegExp
Dim matches, s
regEx.Pattern = "\W\w{5}\W"
regEx.IgnoreCase = True 'True to ignore case
regEx.Global = True 'True matches all occurances, False matches the first occurance
s = ""
If regEx.Test(sentence) Then
Set matches = regEx.Execute(sentence)
For Each Match In matches
s = s & " Position: " & Match.FirstIndex
s = s & " Word: " & Match.Value & " "
s = s & Chr(10)
Next
findFiveLetter = s
Else
findFiveLetter = ""
End If
End Function
Now let’s break it down.
First you need to declare your regEx obejct with:
Dim regEx As New VBScript_RegExp_55.RegExp
Next you give your object some parameters.
The most important is .Pattern, which is the pattern of characters that you want your function to find. Then there are a couple of settings like Global, if you want it to search for all occurrences (TRUE) or just the first occurrence (FALSE), and also IgnoreCase.
Here’s a simple guide to get you started on writing regex patterns.
regEx.Pattern = "\b\w{5}\b"
regEx.IgnoreCase = True 'True to ignore case
regEx.Global = True 'True matches all occurances, False matches the first occurance
Now to test if the pattern has been found in a string you can use
regEx.Test(sentence)
To get an array of matches from your regex you simply have the regex execute.
Set matches = regEx.Execute(sentence)
To get the results from this array object, you can reference matches(n) or use a For Each loop. There are two important pieces of information returned in the array, the index of the match and the actual match.
I used a For Each:
For Each Match In matches
s = s & " Position: " & Match.FirstIndex 'Position of the match
s = s & " Word: " & Match.Value 'The actual match
s = s & Chr(10) 'Prints a new line
Next
Here’s the example output:
Line | Matches |
What began as a small group of protesters expressing their grievances | Position: 5 Word: began Position: 16 Word: small Position: 22 Word: group Position: 53 Word: their |
about economic inequities last month from a park in New York City | Position: 0 Word: about Position: 31 Word: month |
has evolved into an online conversation that is spreading across the harry | Position: 69 Word: harry |
country on social media platforms. | Position: 18 Word: media |
Here’s a copy of the Excel file that I used in the example: RegEx Excel Tutorial XLS File
The regex function is called findFiveLetter(“put a string in here”)