Duplicate records often exist in one or more source systems, so the goal of data matching is to determine whether records refer to the same entity. This involves evaluating how well the individual fields, or record attributes, match each other.

Matching algorithms can tolerate data entry errors, character transpositions, and other data errors when matching records. You can set rules that require combinations of elements to match at a certain threshold. For example, you may require both the address line and the first name to match before records are flagged as a possible match.
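A threshold rule of this kind can be sketched in a few lines of Python. This is only an illustration, not how Data Services implements matching: the `similarity` function uses the standard library's `difflib` as a stand-in for a real match algorithm, and the field names and threshold values in `RULE` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score, ignoring case and outer whitespace."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return 100.0 * ratio


# Hypothetical rule: every listed field must reach its own threshold.
RULE = {"first_name": 80.0, "address_line": 85.0}


def is_possible_match(rec_a: dict, rec_b: dict, rule: dict = RULE) -> bool:
    """Flag two records as a possible match only if all fields pass."""
    return all(
        similarity(rec_a[field], rec_b[field]) >= threshold
        for field, threshold in rule.items()
    )


a = {"first_name": "Jonathan", "address_line": "123 Main Street"}
b = {"first_name": "Jonathon", "address_line": "123 Main Stret"}
print(is_possible_match(a, b))
```

Requiring all fields to pass keeps a strong score on one field from masking a clear mismatch on another.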

Once matches have been identified, data from these matched groups can be salvaged and posted to form a single best record, or posted to update all matching records.
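As a rough illustration of forming a single best record, the sketch below takes, for each field, the first non-empty value found in a match group. The function name, the field names, and the "first non-empty value wins" strategy are assumptions made for this example. Real consolidation strategies vary (most recent, longest, most trusted source, and so on).

```python
def build_best_record(group: list, fields: list) -> dict:
    """Build one record from a match group.

    Records are scanned in the order given (highest priority first), and
    the first non-empty value for each field is kept.
    """
    best = {}
    for field in fields:
        best[field] = next(
            (rec[field] for rec in group if rec.get(field)), ""
        )
    return best


# Hypothetical match group: two records for the same person.
group = [
    {"name": "J. Smith", "phone": ""},
    {"name": "John Smith", "phone": "555-0100"},
]
print(build_best_record(group, ["name", "phone"]))
```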

Match User blueprint

New in DS 4.0:
• New Approximate Substring Matching options to support name matching in Mexico and South America
• New algorithms
- Numeric range (for example, +/- 3 mm)
- Date range (for example, +/- 35 days)
- Geographic (for example, closest store to a customer record, all stores in a radial range)
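The three comparison types listed above can be illustrated with a short Python sketch. It shows the general idea only and is not the Data Services implementation: all function names are hypothetical, and the geographic comparison assumes great-circle (haversine) distance on a spherical Earth with a radius of 6371 km.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def within_numeric_range(a: float, b: float, tolerance: float) -> bool:
    """True if two numbers differ by no more than the tolerance (e.g. 3 mm)."""
    return abs(a - b) <= tolerance


def within_date_range(a: date, b: date, max_days: int) -> bool:
    """True if two dates are no more than max_days apart (e.g. 35 days)."""
    return abs((a - b).days) <= max_days


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def closest_store(customer: tuple, stores: dict) -> str:
    """Return the name of the store nearest to the customer's (lat, lon)."""
    return min(stores, key=lambda name: haversine_km(*customer, *stores[name]))


def stores_within_radius(customer: tuple, stores: dict, radius_km: float) -> list:
    """Return the names of all stores within radius_km of the customer."""
    return [
        name
        for name, location in stores.items()
        if haversine_km(*customer, *location) <= radius_km
    ]
```

For example, `within_date_range(date(2024, 1, 1), date(2024, 2, 5), 35)` returns `True`, because the two dates are exactly 35 days apart.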

• Performance improvements for transactional matching and large break groups
• Match Wizard enhancements to support non-party data custom fields

Think about the answers to these questions before deciding on a match strategy:

• What does my data consist of? (Customer data, international data, and so on)

• What fields do I want to compare? (Last name, firm, and so on)

• What are the relative strengths and weaknesses of the data in those fields?

Tip: You will get better results if you cleanse your data before matching. Also, data profiling can help you answer this question.

• What end result do I want when the match job is complete? (One record per family, per firm, and so on.)
