
VBA - Search and remove duplicates

I'm looking for an algorithm that I don't have the VBA knowledge to script myself, so I'm stuck. It isn't for lack of effort: I have given it a go (and this bit of code is the last remaining piece of my bigger VBA code), but I simply lack the knowledge, experience and skill.

Basically, I have an Excel file containing a sheet, "sheet1", which holds many rows of data. The number of rows can vary from 1 to n: sometimes I have 50, other times 30, and so on. What is consistent is the layout of the book: column A holds the codes that identify a product in my database.

What I want to do is this:

1. Scan the sheet for empty rows (due to the way the workbook is generated, I sometimes have blank rows) and remove them. These blank rows are sometimes in-between rows with data while at other times may be trailing at the end of the sheet.

2. After removing the blank rows, find the last used row and store it in a variable. I have found this piece of code useful for doing that:

mylastrow = myBook.Sheets("Results").Cells.Find(what:="*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row

3. Starting from the row determined in (2), I want to take the product code in A(x), where x = mylastrow, and find any other occurrences of it in column A. If any are found, the entire row for each of them should be deleted. Importantly, this loop must run in reverse. For example, if mylastrow = 40, the loop needs to begin at A40 and move to A39 on the next iteration (or A38 if a row has been removed?). This is because, for any given product code, the rows further down the sheet contain more data (because of the way the sheet was generated). Essentially, the entry closest to the last row is the most recent.

Hopefully I've been able to explain the situation properly. If you're willing to take the challenge (my burden?) off me, I would be very grateful.

QF

Comments

The only way to develop that knowledge and skill is to get in there and code! I'm sure someone may come in and write you the entire procedure, but in the meantime these resources should give you the tools to do it yourself.

First, check out the method here to delete blank rows. It relies on "Selection" for the range, so you can either manually select all the cells of the sheet, then run the macro, or replace it with the following:

Dim r As Range
Set r = Sheet1.Cells 'now use r instead of Selection

OR (even better) use your code for finding the last used row and set the range from row 1 to "mylastrow".
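As a rough sketch of that second option (not the linked macro itself): the helper below looks up the last used row with the same Find call as in the question, then walks upward and deletes every row that CountA reports as completely empty. The name DeleteBlankRows and the ws parameter are made up for illustration, and LookIn:=xlFormulas is added so that Find doesn't reuse whatever setting the last search left behind.

```vba
' Hypothetical helper: deletes every completely empty row on the given sheet.
Sub DeleteBlankRows(ByVal ws As Worksheet)
    Dim lastCell As Range
    Dim i As Long

    ' Last cell that holds anything; Find returns Nothing on an empty sheet.
    Set lastCell = ws.Cells.Find(What:="*", LookIn:=xlFormulas, _
                                 SearchOrder:=xlByRows, SearchDirection:=xlPrevious)
    If lastCell Is Nothing Then Exit Sub

    ' Walk upward so a deletion never shifts a row that is still to be checked.
    For i = lastCell.Row To 1 Step -1
        If Application.WorksheetFunction.CountA(ws.Rows(i)) = 0 Then
            ws.Rows(i).Delete
        End If
    Next i
End Sub
```

You would call it with something like DeleteBlankRows Sheet1. Trailing blank rows need no special handling, because the search for the last used cell already stops at the last row that has content.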

Next, beginning from "mylastrow", start adding the values in Column A to a Dictionary object (example here). You can use a row counter to decrement from "mylastrow" to 1. Here's an example of how it would work. The key is assumed to be in the 1st column ("A").

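Sub RemoveOlderDuplicates() 'procedure name is illustrative
  Dim dict As Object
  Dim rowCount As Long
  Dim strVal As String

  'Late-bound Dictionary, so no library reference is needed
  Set dict = CreateObject("Scripting.Dictionary")

  'Last row of the data block around A1 (blank rows must already be gone)
  rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count

  'Work upward from the last row and stop before the header in row 1
  Do While rowCount > 1
    strVal = Sheet1.Cells(rowCount, 1).Value2

    If dict.Exists(strVal) Then
      'Code already seen further down, so this row is an older duplicate
      Sheet1.Rows(rowCount).EntireRow.Delete
    Else
      dict.Add strVal, 0
    End If

    rowCount = rowCount - 1
  Loop

  Set dict = Nothing
End Sub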

[Screenshots: the sheet before and after running the code]

Note that the 1st row hasn't been touched since we stopped when rowCount is 1 (assumes there's a header).
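If the sheet is large, deleting rows one at a time can get slow. Here is a variation on the same idea, only a sketch under a few assumptions: it takes the worksheet as a parameter, finds the last row with the Find call from the question, collects the older duplicates into one range with Union, and deletes them in a single call at the end. The name KeepLatestPerCode is made up, and the Trim$ and vbTextCompare lines assume that codes differing only in surrounding spaces or letter case should count as the same product; drop them if that isn't true for your data. It also assumes the blank rows have already been removed and that row 1 is a header.

```vba
' Hypothetical helper: keeps the bottom-most row for each code in column A
' and deletes the earlier ones. Row 1 is assumed to be a header.
Sub KeepLatestPerCode(ByVal ws As Worksheet)
    Dim dict As Object
    Dim lastCell As Range
    Dim toDelete As Range
    Dim key As String
    Dim i As Long

    ' Last cell that holds anything; Find returns Nothing on an empty sheet.
    Set lastCell = ws.Cells.Find(What:="*", LookIn:=xlFormulas, _
                                 SearchOrder:=xlByRows, SearchDirection:=xlPrevious)
    If lastCell Is Nothing Then Exit Sub

    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare   ' assumption: codes are not case-sensitive

    For i = lastCell.Row To 2 Step -1
        key = Trim$(CStr(ws.Cells(i, 1).Value2))
        If dict.Exists(key) Then
            ' Seen further down already, so this row is the older copy.
            If toDelete Is Nothing Then
                Set toDelete = ws.Rows(i)
            Else
                Set toDelete = Union(toDelete, ws.Rows(i))
            End If
        Else
            dict.Add key, 0
        End If
    Next i

    ' Delete all marked rows in one call.
    If Not toDelete Is Nothing Then toDelete.Delete
End Sub
```

With the workbook variable from the question, the call would look like KeepLatestPerCode myBook.Sheets("Results"). Because nothing is deleted until the loop has finished, the row numbers stay stable while the sheet is being scanned.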
