EXCEL SUBSTITUTE FUNCTION
I have been coming across more and more cases where people have problems with lookup formulas because databases do not have matching information.
The most common type of incident I come across are where one database contains a name like “Adam O’Reilly” and another database has the name as “Adam OReilly”, with no apostrophe. So a simple VLookUp between the two databases will not work. The same goes for where spaces are used in one database and not the other, like “Breda McDonald” and “Breda Mc Donald”.
In these cases, I use the Excel substitute function to make both databases more compatible.
The two tables below will be used to demonstrate the Excel substitute function and multiple uses of the Excel substitute function.
Column B contains a VLookUp which is looking for the name in column A and retrieving the age from column F of the second table.
Only in 2 cases do the names match. The other 3 names do not return a valid result because the names are different in both cases.
Excel Substitute Function
The Excel substitute function replaces existing text in a string with other text specified by you.
The formula syntax that we will be dealing with is “=SUBSTITUTE(text, old_text, new_text)“.
- text: This is the existing string of text that you want to amend. In our case, the first name in A3 will be our text.
- old_text: This is the existing text in your string that you want to amend. We want to remove any spaces, so we put a space between double apostrophes (quotes) ” “.
- new_text: This is what we want to replace the old_text with. In our case we are removing the spaces and we do not want to replace them, so we use two double apostrophes to set the new_text as blank “”.
Single Excel Substitute Function
As above, we will use a single Excel substitute function to remove the spaces in the names and then instead of replacing the spaces with a character we will leave them blank.
In cell C3, our formula will be =SUBSTITUTE(A3,” “,””).
Copy this formula down for the other names.
In cell D3, do the same to remove the spaces from the names in column E.
If we change our VLookUp in column B to look for the name in column C in the table from column D to F we get…
The two remaining names are not comparable in the two tables because of the apostrophe.
Using a single Excel substitute function, we could remove the apostrophes, but we also need to remove the spaces. So, next we will amend our single substitute function so there are multiple substitutes.
Multiple Excel Substitute Function
We have already written a single Excel substitute function, this existing formula is now going to be used as the “text” parameter in our formula.
The multiple substitute formula will be:
- text: As stated above, the string will be the result of our existing string SUBSTITUTE(A3,” “,””).
- old_text: The old text will be the apostrophes that we want to remove, so we put a single apostrophe between two double apostrophes “‘”.
- new_text: Again, we do not want to replace the old_text with anything, so we will put blank as the new text, this is just two double apostrophes “”.
Amend the formula in C3 so that it becomes =SUBSTITUTE(SUBSTITUTE(A3,” “,””),”‘”,””)
We wrap a new substitute function around our existing one, so this will remove spaces and apostrophes.
The data in both tables is now comparable and the VLookUp gives the results we would expect.
Why Use The Excel Substitute Function
I often hear from people that they shouldn’t need the substitute function. Some people use the find and replace function to remove apostrophes, but run into problems when it comes to removing spaces this way.
Using the Excel substitute function leaves your raw data untouched and in its original format. You can put the substitute functions in hidden columns so any users of the file do not see them or have to worry about them. If they do need to amend them, they can unhide the columns.
Another possible issue that may come from amending databases is that other people may be referring to one of these databases and they may have set up their database to be compatible, if you change the database their data may no longer be comparable. I would suggest adding columns and hiding them, that way you are not upsetting anybody else’s files and functions.