Zero, One, Infinity.


I began my career at LexisNexis Risk Solutions and spent a lot of my time there designing and implementing approaches for the company’s Person, Business and Location Reports. These reports drew on public and non-public information to develop a single cohesive view of everything we knew about a person, a business or a street address: who their associates might be, what property they owned, what cars they drove, whether they had filed for bankruptcy… all that data compiled into a single place, for use by law enforcement, financial institutions and others to ensure they knew who they were dealing with, to track down criminals, or for other good-for-society purposes.

In the course of designing these, one question would inevitably arise: how many of a given thing should we allow each person to have? How many phone numbers? How many addresses? How many cars should we list on a given report? How many of one thing should we allow to be associated with this other thing? My answer was always the same, and it led to a joke among my colleagues, who called it L’Heureux’s Maxim Number One.

So, without further ado: L’Heureux’s Maxim Number One: there are no numbers except for zero, one, and infinity.

Now, we know that there are actually more numbers than that, but when it comes to developing data models, I always thought in these terms. Either there is no relationship between two entities, or an entity can refer to at most a single instance of another entity, or there is literally no limit to the number of references. Zero, one, or infinity.

For practical purposes, we often want to fudge this: there might not be room on our screens to allow for more than, say, three phone numbers. So we assume that the number of phone numbers we can associate with a person is three. Or, conversely, we assume that a person has a single mailing address. So we display only a single mailing address, and we account for only a single mailing address in our systems.

Unfortunately, it’s rarely that simple. When modeling we should try to take into account that our mechanism for receiving data, or displaying data, may change, but the underlying relationships between data are much less likely to. I always prefer to see a data model that permits a boundless number of phone numbers or addresses, and UI/UX that knows how to choose or display the proper items for the use case at hand. This cuts down on a couple of practices that end up causing lots of problems.
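To make that split concrete, here is a minimal Python sketch. The names `PhoneNumber`, `Person` and `phones_for_display`, the `preferred` flag, and the limit of three are all my own assumptions for illustration, not anything from a real system. The model keeps every phone number, and the display code decides how many to show.

```python
from dataclasses import dataclass, field


@dataclass
class PhoneNumber:
    number: str
    preferred: bool = False  # hypothetical flag used only to rank for display


@dataclass
class Person:
    name: str
    # The model places no upper bound on how many phone numbers are stored.
    phones: list[PhoneNumber] = field(default_factory=list)


def phones_for_display(person: Person, limit: int = 3) -> list[PhoneNumber]:
    """Pick at most `limit` numbers for a screen, preferred ones first."""
    # sorted() is stable, so numbers keep their stored order within each group.
    ranked = sorted(person.phones, key=lambda p: not p.preferred)
    return ranked[:limit]


person = Person("Pat Example")
for i in range(5):
    person.phones.append(PhoneNumber(f"555-010{i}", preferred=(i == 4)))

shown = phones_for_display(person)
print([p.number for p in shown])  # ['555-0104', '555-0100', '555-0101']
```

The screen still shows three numbers, but that limit lives in the display function. If the screen changes, only `limit` changes, and the other two numbers were never thrown away.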

First, it eliminates the loss of data. If you only allow for the storage of a single address, for example, then there’s nowhere to put a second one, and so it just won’t get entered. Think of your snowbirds who split time between Maine and Florida, or the student whose parents are separated and who thus lives in two different places. That data is lost, and the loss often leads to another problem.

Second, it prevents the misappropriation of data fields. When you need to record another address and your screen has no place for it, you’ll find an empty field - notes, or something like that - and put it there. Now you’ve got valuable data sitting unstructured in a catchall field, doing little for you.
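One way to keep that second address out of the notes field is to give addresses a table of their own. The sketch below uses Python’s built-in `sqlite3` module with an in-memory database. The table and column names, and the sample snowbird, are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per address: a person can have zero, one, or any number.
    CREATE TABLE address (
        id        INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        label     TEXT,
        line1     TEXT NOT NULL,
        state     TEXT NOT NULL
    );
""")

snowbird_id = conn.execute(
    "INSERT INTO person (name) VALUES (?)", ("Sam Snowbird",)
).lastrowid
conn.executemany(
    "INSERT INTO address (person_id, label, line1, state) VALUES (?, ?, ?, ?)",
    [
        (snowbird_id, "summer", "1 Harbor Rd", "ME"),
        (snowbird_id, "winter", "2 Palm Ave", "FL"),
    ],
)

# Both addresses are structured and queryable; neither is hiding in a notes field.
states = [
    row[0]
    for row in conn.execute(
        "SELECT state FROM address WHERE person_id = ? ORDER BY state",
        (snowbird_id,),
    )
]
print(states)  # ['FL', 'ME']
```

A person with no known address simply has no rows in `address`, so the same two tables cover zero, one, and infinity without any change to the schema.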

By instead taking a moment to consider what the true relationship is between your data entities - zero, one, or infinity - and designing your databases and data warehouses appropriately, you set yourself up to be much more flexible in the future, and ready to handle the more unexpected data situations that might arise.
