You can often learn a lot from a simple little exercise. In this article, let's consider three ways to split a string in AVR for .NET. Our need is to split a full name field into two fields, one for the first name and one for the last name. Initially, let's use the string 'Neil Young'.

This article uses both Rank and Dim arrays. If you're not familiar with the differences between the two array types, read this article before continuing.

To make it clear that we're using a single space as the delimiter, we'll use this constant in the examples:

DclConst SINGLE_SPACE Value(' ')

Using brute force to split the string

An old-school way to split a string is to use a combination of the String class's IndexOf and SubString methods.

DclFld CustomerName Type(*String)
DclFld FirstName Type(*String)
DclFld LastName Type(*String)
DclFld DelimiterPosition Type(*Integer4)

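CustomerName = 'Neil Young'
// Trim once, up front, so IndexOf and Substring work on the same string.
CustomerName = CustomerName.Trim()
DelimiterPosition = CustomerName.IndexOf(SINGLE_SPACE)
FirstName = CustomerName.Substring(0, DelimiterPosition)
LastName = CustomerName.Substring(DelimiterPosition + 1)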

This approach is less declarative than the Split method. It uses the Trim method to remove any leading and trailing blanks, but it returns the wrong result if more than one space separates the first name from the last name: the extra blanks end up at the front of the last name.
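To see that failure concretely, here is a minimal sketch that reuses the declarations above with two blanks between the names. The guard for a missing delimiter is my addition, not part of the original technique: IndexOf returns -1 when it finds no match, and Substring throws an exception if it is handed that value as a length.

```
CustomerName = 'Neil  Young'
CustomerName = CustomerName.Trim()
DelimiterPosition = CustomerName.IndexOf(SINGLE_SPACE)

If (DelimiterPosition >= 0)
    // FirstName is 'Neil'.
    FirstName = CustomerName.Substring(0, DelimiterPosition)
    // LastName is ' Young'; the second blank comes along for the ride.
    LastName = CustomerName.Substring(DelimiterPosition + 1)
Else
    // No blank at all (a single-word name); this fallback is an assumption.
    FirstName = CustomerName
    LastName = String.Empty
EndIf
```

You could patch the stray blank with another Trim call on LastName, but every patch like that is one more reason to prefer the methods that follow.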

This was a popular, and necessary, method a long time ago. Ditch it today; there are much better ways.

Using the String class's Split method

A much better way to split strings is with the String class's Split method. It is, however, a little cumbersome--it has six overloads, four of which use arrays of *OneChar separators and two of which use arrays of string separators. Of the six ways to use the Split method, the signature below is probably the one most frequently used by AVR programmers:

Split(String[], StringSplitOptions)

With this overload, you need to provide an array of strings as delimiters and a string splitting option value. That value can be:

  • StringSplitOptions.None or
  • StringSplitOptions.RemoveEmptyEntries

The first includes array elements that would contain an empty string and the second omits any array elements that contain an empty string.

For example:

DclFld CustomerName Type(*String)
DclArray Delimiters Type(*String) Dim(1)
DclArray Result Type(*String) Rank(1)

CustomerName = 'Neil Young'
Delimiters[0] = SINGLE_SPACE
Result = CustomerName.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)

This provides our desired result where Result[0] is 'Neil' and Result[1] is 'Young'. Using a ranked array, it looks like this:

DclFld CustomerName Type(*String)
DclArray Delimiters Type(*String) Rank(1)
DclArray Result Type(*String) Rank(1)

CustomerName = 'Neil Young'
Delimiters = *New System.String[] {SINGLE_SPACE}
Result = CustomerName.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)

The difference here is how the Delimiters array is declared and how it is populated. Using a ranked array adds flexibility because you can easily add a delimiter without adding a single line of code:

Delimiters = *New System.String[] {SINGLE_SPACE, ','}

The use of StringSplitOptions.RemoveEmptyEntries avoids problems with spurious blanks in the input (i.e., leading, trailing, and multiple blanks as delimiters).
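To make the difference between the two options visible, here is a small sketch that splits the same messy input both ways. The field names WithEmpties and WithoutEmpties are mine, chosen only for illustration, and the sketch assumes the SINGLE_SPACE constant declared earlier.

```
DclFld CustomerName Type(*String)
DclArray Delimiters Type(*String) Rank(1)
DclArray WithEmpties Type(*String) Rank(1)
DclArray WithoutEmpties Type(*String) Rank(1)

// One leading blank, two blanks between the names, and one trailing blank.
CustomerName = ' Neil  Young '
Delimiters = *New System.String[] {SINGLE_SPACE}

// Five elements: '', 'Neil', '', 'Young', ''
WithEmpties = CustomerName.Split(Delimiters, StringSplitOptions.None)

// Two elements: 'Neil', 'Young'
WithoutEmpties = CustomerName.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)
```

With StringSplitOptions.None, every blank produces a boundary, so the names land in WithEmpties[1] and WithEmpties[3] rather than in the first two elements.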

For basic string splitting, this Split method overload works pretty well.

The most flexible way to split a string

The most flexible way to split a string is with the Regex.Split method, found in the System.Text.RegularExpressions namespace. It uses a regular expression to split the string.

I know that regular expressions aren't in every programmer's kit bag. But they should be! Regular expressions have many uses in many places and a broad knowledge of them will serve you well. At the very least, consider this way to split a string an entry point into regular expressions.

This method needs this Using statement at the top of your code:

Using System.Text.RegularExpressions

DclFld CustomerName Type(*String)
DclFld Pattern Type(*String)
DclArray Result Type(*String) Rank(1)

CustomerName = 'Neil Young'
Pattern = '\s*[ ]\s*'
Result = Regex.Split(CustomerName.Trim(), Pattern)  

This produces the desired first and last names in the Result array. It works with one or more blanks separating the first and last names. The magical part of this solution is this line, which provides the regular expression:

Pattern = '\s*[ ]\s*'

Its parts are:

Pattern   Description
\s        match any white space character
*         match zero or more of the previous pattern
[ ]       a set of characters to look for--in this case, a single space
\s        match any white space character
*         match zero or more of the previous pattern

The Pattern field provides the pattern on which to split the string. In narrative form, the expression says, "Look for zero or more white space characters, followed by one of the characters inside the brackets [ ], followed by zero or more white space characters."
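Here is a short sketch of how that pattern behaves on less tidy input. It assumes Using System.Text.RegularExpressions is at the top of the source file; the two inputs are made up for illustration.

```
DclFld CustomerName Type(*String)
DclFld Pattern Type(*String)
DclArray Result Type(*String) Rank(1)

Pattern = '\s*[ ]\s*'

// Four blanks between the names are consumed as a single delimiter.
// Result has two elements: 'Neil' and 'Young'.
CustomerName = 'Neil    Young'
Result = Regex.Split(CustomerName, Pattern)

// Without Trim(), the leading blank matches the pattern at position 0,
// so Result has three elements: '', 'Neil', and 'Young'.
CustomerName = ' Neil Young'
Result = Regex.Split(CustomerName, Pattern)
```

The second case is why the earlier example calls Trim() before splitting. Unlike String.Split, Regex.Split has no RemoveEmptyEntries option to clean up after you.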

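We could change the pattern to look for a space or comma separator like this:

Pattern = '\s*[ ,]\s*'

Inside brackets, each listed character is an alternative, so this says, "Look for either a space or a comma." No pipe character (|) is needed there; a pipe inside brackets is treated as a literal pipe and would itself become a delimiter.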

You can learn a lot about regular expressions with this online regular expression tester. I know if you aren't familiar with regular expressions this all seems pretty nutty. But, trust me, once you learn even a little bit about regular expressions, you'll use them, well, regularly.

For the highly curious

I've probably already overestimated your "I really care about this" ratio (a highly-cleansed version of a phrase my father often used), but check this out.

A feature of .NET regular expressions is named capture groups. The code below uses that feature to capture the first name and last name from the CustomerName field.

DclFld m Type(Match) 
DclFld CustomerName Type(*String)
DclFld FirstName Type(*String)
DclFld LastName Type(*String) 
DclFld Pattern Type(*String)

CustomerName = 'Neil Young'
Pattern = "^\s*(?'firstname'.*)\s*[ ]\s*(?'lastname'.*)\s*$"

m = Regex.Match(CustomerName, Pattern) 
If (m.Success) 
    FirstName = m.Groups['firstname'].Value 
    LastName = m.Groups['lastname'].Value 
EndIf

You can study this regex, with a breakdown of what each token is doing, here.
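One thing to watch with the pattern above: .* is greedy, so when the input has extra blanks they can end up inside the captured names. As a variation, here is a minimal sketch that swaps .* for \S+ (one or more non-white-space characters) and requires at least one white space character between the names. It is an alternative of my own under those assumptions, not a replacement for studying the original, and it again assumes Using System.Text.RegularExpressions.

```
DclFld m Type(Match)
DclFld CustomerName Type(*String)
DclFld FirstName Type(*String)
DclFld LastName Type(*String)
DclFld Pattern Type(*String)

// Leading and trailing blanks, plus two blanks between the names.
CustomerName = ' Neil  Young '
Pattern = "^\s*(?'firstname'\S+)\s+(?'lastname'\S+)\s*$"

m = Regex.Match(CustomerName, Pattern)
If (m.Success)
    // FirstName is 'Neil' and LastName is 'Young', with no stray blanks.
    FirstName = m.Groups['firstname'].Value
    LastName = m.Groups['lastname'].Value
Else
    // Input with one word, or with three or more words, does not match.
    FirstName = String.Empty
    LastName = String.Empty
EndIf
```

The trade-off is strictness: this version accepts exactly two words, so a name like 'Neil Percival Young' fails the match instead of being split somewhere surprising.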