Home > DemandTools for AppExchange Modules > DemandTools Cleansing Modules > Single Table Dedupe Wizard
Single Table Dedupe Wizard
Capabilities & Features:
- Flexible identification of potential duplicates
- Easy to use interface for standard and custom objects
- Custom Merge capabilities (allows the user to select fields from all records in the duplicate group to retain in the final merged record)
- Pre-Built Scenarios aid in quicker setup (are customizable and should be reviewed by each user)
- Save custom scenarios for quick recall and use
- Sub-Object reassignment
- Custom Master Rule creation
DeDuplication Best Practices & Guidelines:
Please test your deduplication scenarios thoroughly before deduping a large number of records, as there is no automatic "undo"/"rollback" option. If you have a Salesforce sandbox, we recommended that you test in this environment first before deduping your production data.
More information on pointing DemandTools to the Salesforce Sandbox can be found here.
1. Dedupe in the following order:
- Account; Contact; Lead; Lead to Contact; Lead to Account; Opportunity; Custom Objects
2. Use at least a 3 pass strategy for identifying duplicates
- Start with very rigid criteria
- Rigid Criteria means matching on at least 4 or 5 fields and using strict matching types
- Loosen criteria by matching on less fields and choosing less strict mapping types
- Loosen criteria with each pass
- This strategy will help clean up the duplicates in a quicker more manageable manner
3. It is recommended that name, address, phone fields are standardized prior to searching for duplicates.
- This is not a requirement as DemandTools has lots of advanced mapping techniques to find matches when these fields are not standardized. However, the cleaner and more complete the data, the easier it will be to find matches.
- MassImpact can be used to standardize most of these fields. Please refer to the MassImpact documents for details on how to make basic changes (i.e. find records where state - Texas and change to TX), and how to use prebuilt formulas for more complex standardizations (i.e. "Normalized_US_Address", "NaPhoneFix" etc.).
- The Address Verification module is also extremely helpful in standardizing as well as completing address data. If Address Verification will be used, there is no need to use MassImpact to standardize address fields first, as this will be done as part of the verification process.
Screen One:
Step 1. Select the Object to De-Duplicate
The object for de-duplication can be selected by using one of the pre-built scenarios or by manual selection.
Pre-Built Scenario:
- Pre-built scenarios take care of some of the setup work and can be helpful for beginners and power users
- All pre-built scenarios can be modified and it is recommended that each user review (and modify the scenario to fit their data needs) the scenario they choose prior to running

- Scenarios are listed in the left hand column
- When a scenario is highlighted the center portion of the screen displays an overview of the scenario
- The right side of the screen shows the fields selected to view on the found duplicates/duplicate grid
- Load Scenario: Loads the all settings of the selected scenario. The user must manually click through to the next screens.
- Run Scenario: Loads scenario settings, automatically processes each step and applies the master rule to the found duplicates. The user must execute the merge.
- Delete Scenario: Removes the scenario from the list and deletes the source xml file.
- Save Scenario: Used for newly created or modified pre-built scenarios. When saving a modified pre-built scenario click the "Rebuild Description" button (this ensures the modification are saved in a new scenario). Scenarios will be saved in the DemandTools directory.
- Clear Scenario: Clears out all selections and allows the user to start over either by selecting a different scenario or by creating one manually.
Manual Object Selection:

- Select the object to de-duplicate in the "Salesforce Object To DeDupe" drop down box and click "Use Object"
- Proceed to manually select all fields to view on the duplicate grid
Step 2: Select the fields to view on the dupe grid
- In pre-built scenarios some standard fields have been selected; phone, name, owner id, addres information, etc.
- Be sure to select fields that will help determine if the duplicates returned should be merged; DUNS number, account number, or any internal unique ID's (may or may not be fields chosen to match on)
- Other fields to include are those that will help in the selection of the master record
- To select a field, click the checkbox next to the field, to de-select a field un-check the fields checkbox

Step 3: Select the records to search
- Sub-setting records in the SingleTableDedupe can be done for various reasons.
Ex: Leads need to be sub-setted to only look at unconverted Leads (isconverted = False)- Sub-setting can also be done to ensure that records of different record types are not merged.
Conditions are used when subsetting which objects will be updated, exported, and/or searched, depending on the module. In most cases the option to use all objects or specify conditions on which object(s) to use will be available.

When "Use Conditions" is selected the conditions area is no longer greyed out and open for criteria specification.

Step 1. Select the Field to Base a Condition On
- Fields from the selected table will be listed on the left.
- To access fields in a related table select the "+" to the left of the table name. This will display the fields from the related table. Multi-table join is available for Parent tables only, based on foreign key relationships between tables. For example, you can access the Account table from within the Contact table, because the Account ID is physically stored on the Contact Table. You cannot access the Contact table from the Account table.
Step 2. Select the Operator
- Available operators will change based on the type of field selected.
- "!=" represents "not equal to".
- When using dates, operators will include "Today", "Yesterday", etc.".
Step 3. Specify a Value(s)
- Text Fields: Type in the value. Multiple values for the same field are to be separated by commas. Commas are translated as "or". For example, "ma,nh,me" will be translated to "ma or nh or me". Note: values entered are not case sensitive.
- Picklists: Select the value from the specified list. Multiple values can be selected and will be translated as "or".
- Valid picklist values are shown (as defined in Salesforce.com). Right click on a list of picklist values to "Add a Value". This is used to add a value not currently in the picklist and make it available for use in the criteria. The value is added for the purpose of selection and will not be added to the actual picklist in Salesforce.
- Right click on a list of picklist values to "Filter List" to subset a long picklist based on a search string to quickly find a value(s).
- Dates: The current date will default in the value window. Click the arrow on the right to access a calendar and select a specific date.
-
To isolate a specific day regardless of time when dealing with a Date/Time field (i.e. created date), specify 2 conditions, greater than the day before and less than the day after. This is not necessary when using an operator such as "Today" or "Yesterday".
Step 4. Add Condition(s)
- Select "Add Condition" to add the condition to the list of current conditions.
- By default all conditions are and'd together. To create "or" groups see Step 5.
- To delete one condition from the list, click the box to the left to highlight the condition and press the delete key on the keyboard.
Step 5. Creating "OR Groups" (this option is not applicable to all modules)
- To create "or" groups, check the "Expert Mode" checkbox.
- Change the "OR Group" number for specific conditions (click on the number on the right and adjust up or down). All conditions WITHIN a group are and'd and BETWEEN groups are or'd. For example:

Reads as:
(MailingState=ma AND LeadSource=External Referral) OR (MailingState=ny AND LeadSource= Trade Show)
- Unchecking the "Expert Mode" checkbox will cause all conditions to be and'd once again.
Screen Two:
Step 1: Editing the Match Criteria
- When running a pre-built scenario the match criteria is pre-populated on Screen 2 but can be modified to fit each individual users needs
To change a field click on the box of the field name and choose the new field, the mapping typed should be reviewed if the field is changed. - To change the mapping type click in the box of the mapping type and choose a suitable option
- If a row/condition needs to be added, click the "Add Mapping Condition" button (on bottom left of the screen)
- If a single row/condition needs to be removed, click the box to the left of the criteria and click the "Delete" key on the keyboard
- To clear out all mapping click the "Clear Mapping" button (on bottom left of the screen)


Step 2: Understanding Mapping Types:
- When a field is used in the match criteria the only records that will be returned in the dupe grid (screen 3) are those that have a value populated in all of the fields listed as a field to match
- Ex: "IBM Co 1236 Main St Cleveland 44115 4406678956" will match to "IBM 1236 Main cleveland 44115-2268 440-667-8956"
- "Match Blank Values" checkbox will allow for the statement above to be true, it will also allow for blank values to be matched to other blank values for the fields selected
Ex: Match Blank Values has been selected for phone: "ibm co 1236 main cleveland 44115 (blank phone number)" will match to "Ibm co 1236 Main St. Cleveland 44115 (blank phone number)"
- In order to have a populated field match to an unpopulated field, the field can be viewed on the dupe grid but should not be used as a field to match in the match criteria setup in Screen 2
Match Categories
- All - All Categories
- Alpha - Techniques for text
- Contact - Techniques specifically for dealing with Contact names
- CRM - Techniques for Customer Relationship Management Categories
- Geo - Techniques for address information
- Internet - Techniques for domain information
- Numeric - Techniques for dealing with number based matches
Mapping Types
|
Type Description: |
Found in Category: |
|
The Cleaned Account Name mapping type uses the built in Account Name Cleaning List. To see the Cleaning list tool select the "Edit Account Cleaning List" button.
The cleaning list standardizes punctuation, spaces and removes common business prefixes and suffixes. These lists are customizable for language(s) and/or line of business.
|
All, CRM |
|
The Country Match mapping type is used to standardize field values for the recognized countries of the world. It will recognize the long name of a country, the 2 digit ISO short form, the 3 digit ISO short form and the numeric ISO country value as possible matches of each other.
|
All, GEO |
|
The Domain mapping type is used when mapping web pages and/or email addresses. It allows for the independent analysis of the domain information contained within the URL or the email address. For email addresses it uses any information to the right of the @ sign. For web pages it parses the XXXXX.com portion. This mapping types allows for easy comparison of web page field vs. web page field or email field vs. email field. It also allows for the comparison of email addresses to web pages and vice versa.
|
All, Internet |
The Exact mapping type is a 100% match of every character (assuming no options apply (see Fuzzy, Transpose & Alpha Clean))
|
All, Alpha, Numeric |
|
The First Name mapping type uses the built in Nickname List. To see the Nickname list select the "Edit Nickname List" button at the top of the interface.
The Nickname list will see Bill, William and Billy as potential duplicates/matches of each other. This list is also customizable by the end user for localization or for non contact substitution on any field by replacing the nickname list with synonyms.
|
All, Contact |
The First XX Letters mapping technique is used to compare the first XX letters in a field. When selected a pop up box will appear in order to specify the number of letters on which to match. Applicable to text fields only.
|
All, Alpha |
The Numeric mapping type is used to compare only the numeric values in a field. All other characters, such as spaces or punctuation, will be ignored by the deduper. A field with a value of "Apt # 31" is seen by the deduper as the numeric characters "31", ignoring "Apt #".
|
All, Numeric |
|
The Relaxed Address Match mapping type parses the street address to the lowest common denominator. Based on North American standards it has also proved effective with most country address formats.
Example:
Apt#4, 123 Pavillion Street 123 Pavillion, Apt 4 4-123 Pavillion Ave NW
Are all seen as "123 Pavillion"
|
All, CRM |
The Relaxed NA Phone Match mapping type removes all non-numeric characters and spaces, leading 0's and 1's, area codes and extensions on the back end, returning the 7 primary digits of the phone number. If just 7 digits are left use those 7 digits, else just return digits 4 - 10. It will not match the "Phone-word" values and will trim off the "SPOT" in the phone number and only look at the numeric portion.
Example:
+1 879 555 1212 ext 500 (879) 555 1212 1.879.555.1212 ext 408
Are all seen as "5551212"
Note: This is based on a 10-digit North American phone number. Although it can be used with international phone numbers (will just return 7 digits in the middle), the Numeric mapping type is the recommeded mapping type to be used with international phone numbers.
|
All, CRM |
The Salesforce.com ID Match mapping type is used to match any Salesforce.com objects 15 digit ID to its 18 digit equivalent ID and vice versa.
|
All, CRM |
The Street Address Match mapping type is a slightly more rigid criteria than the relaxed address match mapping type. The deduper will ignore the differences in street type short forms such as crescent - cres, road - rd, street - st, etc.
|
All, CRM |
| The Zip 5 and 9 mapping type will automatically match USPS 5 and 9 digit zip codes without the need to standardize them to a common number of digits. |
All, CRM |
Additional Mapping Options
| Option: |
Description: |
Applicable Mapping Types:
|
| Fuzzy |
Phonetics engine capable of analyzing words for how they sound when pronounced. Through a technique of removing vowels and analyzing the remaining consonants the fuzzy engine works well for matching fields with spelling mistakes. |
Cleaned Account Name Exact FirstName
|
|
Transpose |
The transpositional engine allows for fields to appear as duplicates even when there are differences in their word order. For example, "Jones, Smith and Jackson" will appear to be a duplicate of Jackson, Smith and Jones.
|
Cleaned Account Name Exact FirstName Street
|
|
Alpha-Clean |
The alpha cleaner extends some of the capabilities of the account name cleaner to other fields for easier matching. The alpha cleaner is used for ascii (north american) data to ensure that the only characters that are analyzed are the 26 characters of the english alphabet and the numbers 0-9. Any other character that the field may contain will be ignored by the deduplication matching algorithms.
|
Cleaned Account Name Exact FirstName Numeric Street Zip 5 and 9 |
Select the "Screen 3 - Merge Control" Button to search for potential duplicates and move to screen 3 to analyze and eventually process.

Screen Three:
Step 1: Grid Options
- Grid Field Arrangement:
- By default, populated fields are shown at the top of the interface
- Fields on the grid can be rearranged by dragging and dropping them in the desired order

- Remove a field from the grid by dragging and dropping it out of view

- View Record Details and Sub-Objects:
- Explore the records with the view dialog box with sub-object viewing:
- Duplicate Set Controls:
- Expand All: will expand all duplicate groups
- Collapse All: will collapse all duplicate groups
- Check All: will place a check mark in all checkboxes of the records in the duplicate grid
- Uncheck All: will remove the check marks from all checkboxes of the records in the duplicate grid
- Clear All: will clear all checks and master record selections from all records in the duplicate grid
- Current Master Rule:
- Apply Rule to All: will apply the selected Master Rule to all records in the duplicate grip and place a check in their checkboxes
- Apply Rule to Checked: will apply the selected Master Rule only to the records with a check mark in their checkbox
- Grid Setup:
- Show More Fields: will allow for additional fields of data to be added to the grid for all records in the grid
- Export Grid Data: will allow for the grid data to be exported to a spreadsheet
- Edit Field Values:
- Right clicking on any field in the duplicate grid will give the user an option to:
- "Update Value in Salesforce" (will only update that field and only for the record that was right clicked on with a value specified by the user) and
- The single field update option can also be accessed by double clicking on the value to change
- "Update Groups Rows in Salesforce" (will update that field for all records in the duplicate group with a value specified by the user)
- Either option will make the change in real time and does not require a merge to make the change to the field

Step 2: Selecting a Master Record
- Each duplicate set needs a master record designation in order to be merged (as well as a check mark in their check box)
- There are 3 ways to select a master record
- Double clicking the pin (of record in duplicate set)
- Setting up a Custom Merge
- Application of a Master Rule (see step 3)
- Double Clicking the Pin:
- Double click the pin beside the duplicate record to activate it as the master record, turning the pin green.
- Gray pins
indicate no record has been selected as master - Green pins
indicate that this record has been selected as the master record - Red pins
indicate the records that have been selected or determined to be servant records
- Setting up a Custom Merge:
- Right click on the duplicate key and select "Setup Custom Mergeā


- Anywhere from 2 - 100 records may be placed in the custom merge window
- Select the Master Record and double click the field values to retain from the displayed record sets
- The values that will be retained for the final merged record will be highlighted in yellow
- When the field selections are complete, click the "Accept" button
- The user will be returned to the duplicate grid
- The pins of these records will now be Blue
to indicate that they are a part of a custom merge
- Other Options from the Duplicate Key Right Click Menu:
- Apply Action to Group:
- "Check Group": will place check marks in the checkboxes for all records in a single duplicate group
- "Uncheck Group": will remove marks from the checkboxes for all records in a single duplicate group
- "Merge Group": will merge the records with a check in their checkbox in a single duplicate group (if three records are in a group and only two are checked, only the checked records will be merged)
- Apply Rules to Group:
- will apply the selected rule in the menu to the duplicate group
NOTE: All merges, whether manual or automatic, will be processed based upon the defined Merge Options listed along the right hand side!! (See Step 4)
Step 3: Choose or Design a Master Rule
- Select a Pre-Built Master Rule
When using strict/rigid deduplication criteria the decision may be made (after a quick review of the records) to process all the records in the grid view - Master rules do not need to be applied to each duplicate set individually (but can be as mentioned above using the duplicate key right click menu)
- Select the "Check All" button at the bottom of the grid and all duplicate sets will be checked
- To view the master-slave designations from the selected rule, click the "Apply Rule to All" button (the "Apply Rule to All" button can also be used, this will check the boxes as well as apply the selected master rule to all records)
- When the user is satisfied with the master record selection, they should review the options in the right hand pane (see step 4)
- Alternatively, masters records can be selected based on the currently selected master rule in the upper right hand of the interface when the end user selects "Merge Checked Objects" and the merge process will be completed (this option will not allow for master-slave designation to be reviewed before merging, the master selection and merging happen in one step)

Currently selected master rule
Back to top
- Creation and Editing of Master Rules
- "New Rule" button: opens a pop up box for the user to create a custom master rule
- "Edit Rule" button: used to edit an existing rule or to just view the setup of an existing rule (will open the rule with the check mark)
- "Delete Rule" button:
- A master rule can be created for any object
- When building a master rule there is the option to add unlimited conditions for analyzing either the record (as a whole) or the field values that the record contains.
- At the record level, options such as oldest, newest, most recently modified (note that RECORD is the field selection for these options to appear) and many other advanced constraints are available. For field values, the Single Table Dedupe displays constraints appropriate for the data type of the field being analyzed.
| |
Sample constraints on different field types | |
- Scoring a rule
- After the setup of the initial field and constraint the user can score the positive incident of the constraint with an arbitrary point value.
- The default rules that ship with the software are based upon a 100 point scoring system but any system can be used.
- When adding multiple constraints the record is scored in cumulative fashion with each additional positive expression adding to the total score of the records.
- Additionally negative scores can also be applied to penalize a record.

A complex rule with multiple expressions and a score delta set
- Using the Score Delta
- The final component to creating a Master Rule is the decision to use or not to use a score delta
- The score delta will be used by the deduper to help determine master vs. servant records
- The value of the score delta determines how many points away a master record has to score from the next nearest servant record in order to be selected as the master
If the score delta threshold is set and not met: (records tie and have the same point value)
- The deduper will use the rule as a scoring system only
- It will not be able to select a master
- The scores will be displayed, but the pins will remain grey and the boxes, unchecked
- If no score delta is set:
The deduper will score the records - The record with the highest score will be chosen as the master
- If there is a tie, it will pick the first record in the list
Step 4: -Select the merge control options
- Update Fields where Master is Empty
- For any field that is selected in the "Update Fields Where Master Empty" option box, the deduper will keep the value in that field from a subordinate record if that field is not populated with a value on the master record.
will condense the options box below, when clicked again this icon will expand the options box below

- "Check all" - will select all fields
- "Clear all" - will de-select all fields
- Manually select fields - the user can check and uncheck the fields they wish to use
- Combine Field Values

- Used to select the fields to concatenate in the master and subordinate records
- Available for:
- Multi-select pick lists
- Long text area fields
- Boolean Fields (like Check Boxes)
- Choose Long Text Area Field separator:
- Default is a Carriage Return/Line Feed; the user may select a specific character like ; by selecting the radio button next to "Custom"
- Salesforce Merge CheckBox (use is recommended):
- If the Salesforce merge is used, all objects will be merged to the master and the slave records will be moved to the recycle bin.
- Reassign to Master Owner

- Allows for sub-objects to be reassigned to the ownership of the master record owner during the merger (for both subordinate record sub objects and master record sub objects)
Step 4: Processing records in batch mode
Once the records to merge are checked, and the master records determined (either by master rule or manual selection) select "Merge Checked Objects" to begin the merge process.

Step 5: Saving A Custom Scenario:
- Even if a user begins the de-dupe process using a pre-built scenario it is very likely that some changes/modifications will have been made to fit the users data needs.
- Modified pre-built scenarios can be saved for future use with the changes made by the user
- On screen 3, once all settings have been selected, click the "Save Scenario" button on the lower left hand side
- The pop up screen will still show the pre-built settings
- To save the modifications the user must click the "Re-Build Description" button
- This forces the pop up to refresh and reflect the modifications
- The user can rename the scenario (this is reccomended so there are not two scenarios with the same name; an effective naming convention is to use the company name or abbreviation as a prefix to the scenario name)
See also