On the Importance of Multiple Personalities
A common business problem that companies face today is trying to get to a single view of the customer. I’ve been in enough meetings to know that this concept is well understood on the surface - we want to make sure that we have the best information for “George” and that we use it across the company - but its implications are much less well understood.
Any problem like this - which falls into the broad category of Master Data Management (or MDM) problems - requires a combination of business intelligence and data intelligence to address effectively. And there are plenty of tools that can give businesses a hand up in solving these problems. But for a minute, let’s take a look at what we’re trying to do.
First, we’re trying to combine our behavior records so that, for example, sales information for a given individual is all stored with that individual. Second, we’re trying to update our demographic records so that we can determine when someone has moved, or changed their name, or updated some critical preference, or whatever else. Third, we’re trying to select the best information for any given individual so that we can contact them in a way that feels appropriate, natural, and frankly familiar.
The challenge, of course, is that these three goals tend to pull against each other when it comes to implementation. Combining behavioral records means somehow tagging all of the records belonging to “George” as, in fact, belonging to George. But this is typically necessitated by the fact that two or more records disagree on the details about who George is. As someone with a surname that’s a bit atypical, I’ve been a victim of this throughout my life. The “L” in my last name has been chopped off entirely or mistaken for my middle initial; the apostrophe has been removed, or replaced with a period or a dash. The “H” in my surname has been capitalized and uncapitalized. Some people, feeling a lack of vowel representation, have thrown an “a” into the mix for good measure. As a result, I’ve had more variations on the theme of my last name than I can count.
So when a system attempts to combine my records, it ends up with a mish-mosh of names for me - which frustrates the other two goals of updating and selecting the best information.
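To make the matching problem concrete, here is a minimal sketch of one way a system might decide that several spellings belong together without changing any of them. The surname “D'Arcy” is a stand-in I made up for illustration, and match_key is a hypothetical helper, not a function from any particular MDM tool.

```python
import re


def match_key(surname: str) -> str:
    """Hypothetical matching key: lowercase, ASCII letters only.

    Assumption: stripping everything except a-z is acceptable for matching.
    Real data with accented letters would need a gentler rule.
    """
    return re.sub(r"[^a-z]", "", surname.lower())


# Spellings of one made-up surname, as different feeds might deliver them.
variants = ["D'Arcy", "D.Arcy", "D-Arcy", "DArcy", "d'arcy"]

# The key is used only for comparison; the original spellings are kept as-is.
keys = {match_key(v) for v in variants}
```

All five spellings share one key, so they can be tagged as the same person. A key this simple does nothing for a dropped leading letter or an extra vowel, which is part of why the raw spellings are worth keeping around.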
Similarly, keeping a single record that is updated based on some rules risks replacing a name (or address, or phone number, and so forth) with new information that is, in fact, less accurate than what it replaced. I’ve seen some systems that try this, with the result that one feed updates a phone number in the morning and another feed updates it right back in the afternoon. This tug-of-war can continue for weeks.
Finally, with selection, there are multiple ways to do this, but one way I’ve observed is to use heuristics to determine the “best” information, and then to go back across all the raw records and update them with the “winning” details. In many ways, this is a success - you’ve homogenized the data that you’ve combined, and you’ve found the best information, and updated everything appropriately. But you’ve lost all that other data!
Treating data as an asset and being a bit more data fluent suggests a different approach: let your customers have multiple personalities. What do I mean by that?
For starters, every time you get information from some system, store it. Forever. Just the way it came. No “L”? Lowercase “H”? Missing apostrophe? Store it all. As your business grows, it will become increasingly challenging to know the ground truth for each and every one of your customers, so start early, storing it all.
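As a rough sketch of what “store it, just the way it came” could look like, the snippet below keeps an append-only list of immutable records. The names Personality and RawStore, the fields, and the sample feeds are all assumptions for illustration; a real system would use a database table rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a stored record cannot be edited afterwards
class Personality:
    personality_id: int
    source: str            # which feed delivered the record
    received_at: datetime  # when it arrived
    surname: str           # exactly as received, quirks included
    phone: str


class RawStore:
    """Hypothetical append-only store: records go in and are never changed."""

    def __init__(self) -> None:
        self._records: list[Personality] = []

    def append(self, source: str, surname: str, phone: str) -> Personality:
        record = Personality(
            personality_id=len(self._records) + 1,
            source=source,
            received_at=datetime.now(timezone.utc),
            surname=surname,
            phone=phone,
        )
        self._records.append(record)
        return record

    def all(self) -> tuple[Personality, ...]:
        # Return a tuple so callers cannot modify the underlying list.
        return tuple(self._records)


store = RawStore()
first = store.append("web_signup", "D'Arcy", "555-0100")
second = store.append("call_center", "darcy", "555-0100")
```

Both spellings are kept side by side. Nothing in the store decides which one is right; that decision belongs to a later, separate step.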
Next, design systems to combine your records that aren’t destructive. Treat each of the raw records you stored in the first step as a “personality”, and combine those personalities into “customers”. You never get “customer” information directly - only “personalities”. Remember, you’re only getting the information that your customer wants to share with you or that a third party is able to share with you. You’re never getting the full story. It’s on you to combine your customers’ multiple personalities, and you can call that entity a “customer” - and you can relate it to all of the personalities that make it up.
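One way to picture a non-destructive combine is a separate link table that says which personality belongs to which customer, leaving the raw records alone. This is only a sketch: the dictionary layout, the IDs, and the customers_from_links helper are assumptions of mine, and how the links get decided is left out entirely.

```python
from collections import defaultdict

# Raw personalities, exactly as received (made-up sample data).
personalities = {
    1: {"source": "web_signup", "surname": "D'Arcy"},
    2: {"source": "call_center", "surname": "darcy"},
    3: {"source": "mail_order", "surname": "Smith"},
}

# Link table: personality_id -> customer_id. Combining records means
# writing here, never editing the personalities themselves.
links = {1: "C1", 2: "C1", 3: "C2"}


def customers_from_links(links: dict[int, str]) -> dict[str, list[int]]:
    """Group personality IDs under the customer they are linked to."""
    customers: dict[str, list[int]] = defaultdict(list)
    for personality_id, customer_id in sorted(links.items()):
        customers[customer_id].append(personality_id)
    return dict(customers)


customers = customers_from_links(links)
```

Here customer C1 is made up of personalities 1 and 2, and both spellings of the surname are still intact. The customer is a derived grouping, not a record that replaces its parts.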
Have your “best” information update frequently by rerunning the process that calculates it. Make it available to your users for each customer, but don’t use it to overwrite your raw “personality” data. That’s an asset that you don’t want to destroy. What if you get new information tomorrow that tells you that there are actually two different individuals that you’ve been treating as one? If you’ve destroyed your “personality” information, there’s no going back. But by allowing multiple personalities, you can recalculate your customers, and their best information, without missing a step.
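The recalculation described above might look something like the following sketch. The “best surname” rule here (most frequent spelling, ties broken by the most recently received) is just one heuristic I picked for illustration, and the data, IDs, and function names are all hypothetical.

```python
from collections import Counter

# Raw personalities stay untouched throughout (made-up sample data).
personalities = [
    {"id": 1, "surname": "D'Arcy", "received": "2024-01-05"},
    {"id": 2, "surname": "Darcy", "received": "2024-02-10"},
    {"id": 3, "surname": "D'Arcy", "received": "2024-03-15"},
    {"id": 4, "surname": "Darcy", "received": "2024-04-01"},
]


def best_surname(records: list[dict]) -> str:
    """Assumed heuristic: most frequent spelling, most recent wins a tie."""
    counts = Counter(r["surname"] for r in records)
    latest: dict[str, str] = {}
    for r in records:
        # ISO dates compare correctly as plain strings.
        latest[r["surname"]] = max(latest.get(r["surname"], ""), r["received"])
    return max(counts, key=lambda s: (counts[s], latest[s]))


def derive_customers(links: dict[int, str]) -> dict[str, str]:
    """Recompute each customer's best surname from the raw personalities."""
    by_customer: dict[str, list[dict]] = {}
    for r in personalities:
        by_customer.setdefault(links[r["id"]], []).append(r)
    return {cid: best_surname(recs) for cid, recs in by_customer.items()}


# Today: all four personalities are believed to be one customer.
links = {1: "C1", 2: "C1", 3: "C1", 4: "C1"}
before = derive_customers(links)

# Tomorrow: new evidence says personalities 2 and 4 are someone else.
links.update({2: "C2", 4: "C2"})
after = derive_customers(links)
```

Because the raw spellings were never overwritten, splitting one customer into two is just a change to the links followed by a rerun. Had the “winning” name been written back over every record, there would be nothing left to split on.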
It’s not the simplest approach, but it’s not terribly taxing to implement either, with a bit of consideration as to the business need and the nature of the data. Getting your whole team thinking about “dirty” data and its implications can lead to more solutions - like multiple personalities - that improve your data position.