The macros referenced in this article see through these types of issues and can parse most (English) names into their constituent parts:
Suffix (Jr., Sr., PH.D, III, MD, etc.)
With the code referenced in this article, you can have your own Excel functions. For example, you may be familiar with Excel's =Sum(). With Excel's macro language, you can write your own function, such as "=ReturnLastName()", essentially inventing your own keywords.
Even with no macro experience, you can take the code from this article, add it to your spreadsheet, and use these functions, without having to write a single line of code or debug any routines. You should be able to download, apply and use these routines in about 5 minutes.
The Macros were written as Excel User Defined Functions (UDF), where you can create your own Excel keywords. Once installed, the macros can be applied and used like any other Excel function. Today, I used these very routines to parse a 200,000 row spreadsheet.
See this article for a tutorial on how to build and use User-Defined Excel Functions or you can follow these step-by-step instructions and use the code as-is.
What it Handles:
These routines correctly interpret all manner of English (and likely other language) names, including those with:
- First-letter initials, punctuated or not (J. Smith, J Smith)
- First-letter initials, with a second name assumed as first= ("J. Alan")
- Two-letter firstnames (JP Morgan)
- Two-letter first names plus mid-names (JP Alan Morgan)
- Hyphenated first and last-names
- Common English names, such as Van, Vander, Mc, Mac, Saint, Le, Le', De, etc.
- Common Suffixes (Jr., Sr., "J. Smith III", PH.D, PHD, MD, M.D., DDS, etc)
- Common Suffixes, punctuated with/without commas (J. Smith, Jr. returns "Smith")
- Names with extra embedded / internal spaces
- Three, four and five-word names (Mary-Anne Lynn Miller, etc.)
- Single-word names ("Smith", "Cher", assumes LastName)
- All names are returned in the upper/lower case, as typed. A separate "Proper" routine may be written in the future.
Where this routine struggles:
There are some areas where even humans have a hard time interpreting.
- Names such as "Mary Anne Smith", where "Mary Anne" is the first name. Anne will be tracked as a middle-name. Note that "Mary-Anne Smith" will parse correctly.
- Similarly, "Kathy Smith Jones" will parse as a mid-name "Smith" and a last-name "Jones".
- Hyphenated last names, such as "Kathy Smith-Jones", will parse a last name as "Smith-Jones", as does "Kathy Q. Smith Jones".
- If the source name has a special character (~ tilde), "Kathy Smith~Jones", the name will be correctly stored as last-name "Smith Jones" (with the tilde removed) - a programming trick that may help you if you have control over the name assembly.
- Uncommon Suffixes, such as educational-degrees, are not detected and will be incorrectly treated as last-names. However, the code is easy to modify with additional suffixes.
- These routines do not deal with prefixes, such as Dr., Rev., etc. Advise if needed.
Also referenced are utility functions, such as "=SuperTrim" and "=ReturnLastWord", which are found in the A800_UtilStrings.bas module, downloaded below.
Installation and Use
Source-code can be downloaded from the Keyliner Public Gmail site (the code is too ungainly to post in a blog). You are welcome to use these routines in your personal or commercial projects.
1. Download two files from this link:
(You do not need to register as a user or login in order to download. Highlight the files, then click the 'Download' button on the top ribbon-bar)
From this link, download these two files; both are required for this article:
Keyliner Public Gmail - Excel Macros
2. Save the downloaded files to any directory, such as
(or any other directory of your choosing)
3. Launch Excel and Enable Excel Macros.
Microsoft considers all macros a security risk and this can be a nuisance.
Do the following to enable the macros. This is a one-time step for all macros.
- Launch Excel; click the Orb (formerly the File Menu); choose Excel Options
- Click Trust Center, Trust Center Settings
- Macro Settings
- Enable All Macros
- Older versions of Office have similar settings.
4. Open the sheet where you need to parse Names or open a blank worksheet to test the routines.
5. Expose Developer Options
a. Click the "orb" (or File Menu), Excel Options
b. On the left, choose "Popular"
c. Click [x] Show Developer tab in the ribbon
d. Click OK and return to the sheet
a. Click File, Excel Options, "Customize Ribbon"
b. In the second-column ("Customize Ribbon: Main Tabs"), check [x] Developer
c. Click OK and return to the sheet
6. Import the Downloaded Excel Macros
a. Start the VBA Macro Editor by pressing Alt-F11.
The VBA Macro Editor will open in a new window.
b. On the tree-diagram, illustrated below, select your sheet ("Book1"), then:
- Other-Mouse-Click (Book1),
- Choose "Import File"
- Browse and select the previously-saved ".bas" module,
- You will see the code appear under the "Modules" folder (not illustrated)
- Import C:\Data\Source\CommonVB\A800_UtilString.bas (the second macro file)
7. Close the Visual Basic editor by clicking the editor's "X". This returns you to the sheet and the macros are ready to use.
Note: All the macro editor did was "attach" the code. You can optionally choose to view the programming code by using the editor.
Also, the macros will save with the sheet, but only if you follow the Office 2007/2010 Save warning, documented near the end of this article.
A. Type a name "John Q. Smith" in cell A1. Press Enter to commit.
B. In Cell B1, type this formula:
Important Dependencies in a Sheet:
The A240_ParseNames macros has an inter-dependency, where the =ReturnMidName routine depends on the =ReturnFirstName and =ReturnLastName. This was done for efficiency and saves those routines from re-parsing previously parsed fields.
Because of this, build your parsing fields in this recommended order, with these formulas (technically, fields can be in any order, but the MidFormula may end up pointing ahead):
1. Parse the Suffix first, if they exist, using
2. Parse the FirstName next, using
3. Parse the LastName next, using
4. Parse the MidName last, using
=ReturnMidName(celladdress, firstname, lastname)
Note the inter-dependencies; this field uses two other fields as input
For example, with a name typed in cell B3 (John Q. Smith, Jr.), build the =ReturnSuffixName formula in cell C3. Continue with the other formulas, finishing with the =ReturnMidName in cell F3, as illustrated:
|Click illustration for a larger view; click right-X to return|
Once the formulas are built, you can "fill-down" columns C, D, E, and F, to calculate other names in the list. Commonly, once parsed, you would copy all of the names and Paste them using "Paste Special, Values only" -- converting them from formulas to fixed text.
If you were inclined to look under the hood, you would see the ReturnMidName macro's signature line in this VBA code. Note the three fields passed into the routine:
|Click illustration for a larger view; click right-X return|
This concludes the tutorial on how to use these parsing routines.
Possible Macro Errors:
If you get a "#NAME?" when typing =ReturnLastName, the library was not linked in properly or was not linked to this particular sheet (the macros are only available to the sheets where they were copied). Review the steps above. If a routine, such as "SuperTrim" or "FindLastWordPos" fails, confirm you linked in the A800_UtilStrings library.
Compiler Error, Variable not found
Starting in Excel 2010, Microsoft is much more particular about having to declare variable types, especially if "Option Explicit" is used or if, in the VBA Editor, Tools, Options, [Editor], [x] Require Variable Declarations. The Author's code worked properly in Office 2007 and earlier, but failed in 2010. New code has been uploaded.
Excel 2007/2010 Save Warning:
When Excel 2007/2010 saves the sheet, it will save it as an ".XLSX" and will strip the macros out as it saves (Microsoft is doing this as a security precaution.)
When prompted: 'Do you want to save as a macro-safe) workbook', click "Yes" or optionally, do a File-Save-As, and choose "Excel Macro-enabled Workbook (*.xlsm).
Be aware, if you modify the Macros and save the sheet, only this sheet will see the changes. From the macro editor, re-highlight the A240 and A800 routines and "Export" them back to the 'C:\data\Source\CommonVB' routines where they were originally stored. This way, the changes can be used in other spreadsheets.
Here are some normal and unusual names this routine can handle:
John SmithJ Smith
John Q. Smith
John Q Smith, Jr.
John Q Smith,Jr (No space after comma)
John Q. Martin Smith, III
John Q. Smith
John Q Smith, Jr.
John Q Smith,Jr (No space after comma)
John Q. Martin Smith, III
J. Alan Smith
J Alan Smith
JP Albion Smith
JP Albion Smith PHD
John Mac Neal (LastName = Mac Neal)
John Mc Neal (LastName = Mc Neal)
John O' Brian (LastName = O' Brian, O' Leary, etc.)
John Mc Neal MD (extra embedded spaces)
(various Le', De' and other such names, with/without punctuation)
John Van Nuys (LastName = Van Nuys, also Vander, Von, etc.)
Mary-Anne Jacob Smith MD
Tom Smith, M.D.
Mary Ann Nichole Smith
Mary Q Anne Nichole Smith Barney
Mac Von Nuys, Jr (lastName = Von Nuys)
Lori Vander Hoff (LastName = Vander Hoff)
Excel Macros to Parse City State ZipCodes
Excel Macros to return First Word, Last Word, Supertrim, Count Words
Writing Excel UDF functions
Excel VLookup - A complete Tutorial
Excel Coloring Alternate Rows
Enable Excel Developer Options