How to delete duplicate values from a data set in all versions of Excel. This includes Excel 2003 and earlier and also Excel 2007 and later.
For Excel 2007 and later, we can use a simple feature within the program, but for Excel 2003 and earlier, we must use a macro, which is provided below.
Remove Duplicates in Excel 2007 and Later Versions
Remove Duplicates in Excel 2003 and Earlier Versions
Notice that Excel removed the entire row of data and not just the data in Column A.
If you had selected Column A and Column B in Step 3, then no value would have been removed. This is because, Excel looks at all the values in all the selected columns to see if there are duplicates across them instead of individually in each column. That is important if you want to perform more complex duplicate removal in Excel.
If you made an error, hit Ctrl + Z on the keyboard to undo the duplicate removal and start over.
To remove duplicate values from Excel 2003 and earlier versions, we need to use a Macro.
Here are the steps to install and use the macro:
Public Sub DeleteDuplicateRows()
' This macro will delete all duplicate rows which reside under
' the first occurrence of the row.
'
'Use the macro by selecting a column to check for duplicates
'and then run the macro and all duplicates will be deleted, leaving
'the first occurrence only.
Dim R As Long
Dim N As Long
Dim V As Variant
Dim Rng As Range
On Error GoTo EndMacro
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Set Rng = Application.Intersect(ActiveSheet.UsedRange, _
ActiveSheet.Columns(ActiveCell.Column))
Application.StatusBar = "Processing Row: " & Format(Rng.Row, "#,##0")
N = 0
For R = Rng.Rows.Count To 2 Step -1
If R Mod 500 = 0 Then
Application.StatusBar = "Processing Row: " & Format(R, "#,##0")
End If
V = Rng.Cells(R, 1).Value
If V = vbNullString Then
If Application.WorksheetFunction.CountIf(Rng.Columns(1), vbNullString) > 1 Then
Rng.Rows(R).EntireRow.Delete
N = N + 1
End If
Else
If Application.WorksheetFunction.CountIf(Rng.Columns(1), V) > 1 Then
Rng.Rows(R).EntireRow.Delete
N = N + 1
End If
End If
Next R
EndMacro:
Application.StatusBar = False
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
MsgBox "Duplicate Rows Deleted: " & CStr(N)
End Sub
This method works pretty much the same, though is not as robust, as the Remove Duplicates feature for Excel 2007 and later.
You cannot "undo" this action by hitting Ctrl + Z or anything else, so, be aware that whatever the macro removes is irreversible. Have a backup of the original data made before running any macro in case an error happens.
If you want to keep this macro in your workbook, make sure to save the workbook in a format compatible with macros; when you go to save the file, Excel will tell you if there is an issue, and, if you don't see a warning, it should save just fine in the spreadsheet.
Follow this tutorial to run the macro when you click a button in the worksheet.
If you are in Excel 2007 and later, use the built-in method to remove duplicates since it is so much easier compared to having to install a macro and since you can undo the action if you make a mistake.
If you use the macro, you cannot undo any changes it makes to the worksheet; always make a backup of your data before running a macro.
Be careful removing duplicates on spreadsheets with a lot of data spread-out throughout the worksheet because it might remove data to the right or left of your current data set. The best thing to do is to have a separate worksheet that only contains the data that you want to check for duplicates. This way, you are certain to not mess with any other data.
Make sure to download the accompanying spreadsheet for this tutorial. It contains the macro listed here and some sample data for you to try everything out.