Fuzzy Matching (Identity Mapping and De-Duplicating)
In an enterprise, merging master data, such as customer data, from multiple sources is a common problem. Typically, the sources do not share a single key that identifies the same customer everywhere, so you have to match records based on the similarity of strings such as names and addresses. In this session, we introduce a couple of matching algorithms. Then we use Power Query Fuzzy Merge and SQL Server 2025 built-in and user-defined functions to perform the fuzzy matching. Finally, we develop an efficient custom algorithm for the task.
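To give a feel for what matching on string similarity means, here is a minimal sketch in Python. It assumes Levenshtein edit distance as the similarity measure, which is only one common choice; the abstract does not say which algorithms the session covers. The function names and the 0.8 threshold are hypothetical and chosen purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical after cleanup."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(name, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate, or (None, score)
    when nothing reaches the threshold (an assumed, illustrative cutoff)."""
    scored = [(similarity(name, c), c) for c in candidates]
    score, match = max(scored, default=(0.0, None))
    return (match, score) if score >= threshold else (None, score)
```

For example, `best_match("Jon Smith", ["John Smith", "Jane Smyth"])` picks "John Smith", because the two names differ by a single inserted character. Comparing every record against every candidate this way grows quadratically with the number of records, which is why an efficient algorithm matters for real master data.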
About the Speaker
Dejan Sarka, MCT and Data Platform MVP Alumni, is an independent trainer and consultant who focuses on database development and data science.
He is the founder of the Slovenian community. He is the author or coauthor of twenty books on SQL Server and data science.
