Traveling at the Right Speed
With the computing power available to companies today, there's a drive to make data and analytics available all the time, through every channel, with zero lag and up-to-the-second accuracy. These are good goals, but keep in mind that not everything needs to fit that pattern.
For some types of data, in some applications, for some organizations, having it available and summarized in near real time is going to be critical, and a huge help. For example, it is extraordinarily important for traders on a Wall Street trading floor to have up-to-the-moment information: critical breaking news releases, key shareholder transactions, and more will drive prices up or down and will shape traders' views on whether to buy, hold, or sell, and at what price. That isn't something they can wait until the next day for.
On the other hand, a report that summarizes the effectiveness of your snail-mail marketing campaign does not need to be updated at some subsecond interval. That level of freshness just isn't required, nor is it going to be helpful, given that the campaign is filtering through the USPS delivery process, which is probably best measured in increments of days.
For that report, a daily summary, prepared in time for the stakeholders to view it when they walk into the office in the morning, is likely more than sufficient. Attempting to keep it more up-to-date than that will simply cause problems that you don’t need.
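A once-a-day cadence is also easy to express. The sketch below is a minimal, hypothetical illustration in plain Python: the report name and refresh logic are placeholders, and in practice you would more likely hand this to cron or your orchestrator. The point is the cadence, not the tooling.

```python
import datetime
import time

def refresh_campaign_report():
    # Placeholder for the real work: query, aggregate, publish.
    print(f"Report refreshed at {datetime.datetime.now():%Y-%m-%d %H:%M}")

def seconds_until(hour, minute=0):
    """Seconds from now until the next occurrence of hour:minute."""
    now = datetime.datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()

# Refresh once a day at 6:00 a.m., in time for the morning, not every minute.
while True:
    time.sleep(seconds_until(6))
    refresh_campaign_report()
```

The equivalent cron schedule would simply be `0 6 * * *`.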
You don’t have to make data available at light speed; just the right speed.
Trying to accelerate the creation of that report so that it runs every minute is an unnecessary stretch, for several reasons:
You're taking compute time away from your servers that could be used for something else (or, in the cloud, you may be paying for server time you don't need to); the rough arithmetic after this list puts numbers on this.
You're introducing additional opportunities for failure. Every process that you run is a potential point of failure, and a potential need for operators to identify, diagnose, and address that failure. Why give a process the opportunity to fail 1,440 times a day when it really is needed only once?
You're setting up an unrealistic expectation. In this specific case, the underlying data change very slowly; a report that's refreshed much more rapidly implicitly suggests that the data change much more quickly than they do. As a result, you're likely to lose productivity as report consumers keep refreshing the output, expecting changes that aren't there.
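To put rough numbers on the first two points, here's a back-of-the-envelope comparison. The per-run cost is a made-up placeholder rather than a real price, but the ratio of run counts, and therefore of failure opportunities, holds regardless:

```python
# Back-of-the-envelope: refreshing every minute versus once a day.
# COST_PER_RUN is purely illustrative; substitute your own numbers.
DAYS_PER_MONTH = 30
COST_PER_RUN = 0.02  # hypothetical dollars of compute per report run

for label, runs_per_day in [("every minute", 24 * 60), ("once a day", 1)]:
    monthly_runs = runs_per_day * DAYS_PER_MONTH
    print(f"{label:>12}: {monthly_runs:6,} runs/month, "
          f"~${monthly_runs * COST_PER_RUN:,.2f} of compute, "
          f"and {monthly_runs:,} chances to fail")
```

Every minute works out to 43,200 runs in a 30-day month versus 30 for the daily schedule: roughly 1,440 times the compute and 1,440 times as many chances for something to break.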
Taking the time to understand the usage patterns of the data within your organization, and building your processing and reporting pipelines accordingly, will save money as well as time. That's not to say you can't change your mind down the road; you might want or need to. But in the meantime, build data to arrive at the right speed.