In order to find and remove duplicates or just highlight them in an complex and messy Excel spreadsheet be prepared to work hard or just smart and elegant.
Too many times we end up with an extralarge Excel file that contains some very complex information and a lot of duplicates that we need to find and delete before processing the data. The source spreadsheet usually is received undesirable organised and we have to extract some summary conclusions from it.
Remove duplicates in Excel
The first step that needs to be done before removing duplicates is finding and highlight them and decide the way that the final result must look like.
My advise is to preserve the original source spreadsheet and start the summary work in a new excel sheet.
Find highlight and delete duplicates
The easiest way in identifying duplicatest is to use the Countif() Excel formula in a blank column, which will return how many times the current element was found in the whole column above it’s position. The formula for row 3 lets say should be: =COUNTIF($A$1:A2,A3). In this way we can highlight the elements that are duplicated, they will have countif() results greater than zero. You can then use autofilter and select only the duplicates. If you want only to highlight duplicates in excel then you must fill these duplicate cell with some color of font type. The summary sheet can use SUMIFS or some other summarising formula in order to pick up the information in an centralised and concise manner. You can also, of course, delete duplicates using the same autofilter for the Countif() column results.
Another way to remove duplicates in Excel
My favorite way to identify and remove duplicates in an Excel sheet consist in the following actions:
- Select the column that contains duplicates
- Sort that column using menu tab Data ->Sort smallest to Largest (can be done alse from Large to Small)
- Select Subtotal button from the same Data menu tab and choose Use function: Count for the column you need to process
After this step you will end up with a subtotal style sheet with the duplicated elements grouped together and can select the number 2 subtotal tab (upper left) to show only second level of elements (one example of each element).
4. Select the range of non duplicated elements, then use Select only visible cells function from Home menu tab Editing category Find & Select and choose Go To Special…
5. Click copy
6. Go to a new sheet and paste the copied content
7. Replace (Ctrl+H) ” Count” with nothing
You now have copied only non duplicating elements.