What is the modern data stack?
And why are so many people talking about it?
Kevin Koenitzer | June 13, 2023
What is the “modern data stack”?
The elements that make up the “modern data stack” are a hotly debated topic today. Many of the frustrated operators and investors I speak with feel that, after 15 years of rapid development in the data space, it’s time for a return to fundamentals in the way companies develop data infrastructure.
Whether struggling with multiple conflicting sources of data, or complex data with little to no context or documentation available, companies are now intimately familiar with the problems that arise over time from poor data management.
The discussion of the modern data stack (MDS) is really a discussion about the best way to organize data to enable maximum value generation from insights, and the tooling and infrastructure needed to facilitate that value creation. Myriad combinations of tools and practices can meet the needs of a specific organization.
The two principles described in the article below are the same ones Snowpack Data uses to assess our clients’ needs when working on custom builds:
The “Max/Min” Principle –
What does the modern stack look to maximize? What does it minimize?
This guiding principle is the foundation for assessing the value of each component of the MDS. It’s a sort of benchmark you can use to figure out whether the current implementation of your stack is having a net positive or negative impact on your organization, and to identify the problem areas you need to address in order to maximize value relative to cost.
The modern data stack maximizes ease-of-use.
In data circles, when we think “ease-of-use” we’re usually talking about the end user. In this case, however, it’s important to consider ease-of-use for every party involved at every level of the stack. The modern data stack is easy to use for everyone: analysts, engineers, data scientists, and CEOs alike.
The modern data stack minimizes cost.
In this case, cost means literal dollar cost and other costs as well, for example the time it takes to pull metrics for a report, deploy new models, build a dashboard, or add a new data source to your cloud database. The two components of cost to consider are monetary and efficiency-focused: an effective MDS minimizes both spend and time-to-complete for key workflows.
When assessing a component of your data stack, ask yourself the following questions based on the “Max/Min” principle:
Is this tool easy to use?
Is the process of completing work using the tool fast? Repeatable? Automatable?
Does the tool integrate easily with the rest of my stack?
Does this tool do something unique in my stack?
If I had to change tools, how easy or difficult would it be?
Will this product scale with me as my organization grows? To what point?
Is there a cheaper product that suits my needs? What would I have to give up if I went with a cheaper option?
Can I build a solution for this myself? What would be the cost/benefit to doing so?
If the tools in this section of the stack are all expensive, am I maximizing functionality relative to what I’m paying?
Do I have the resources available to leverage this tool effectively?
Do I need this tool for a specific purpose?
The answers to these questions for each of the tools in your stack will let you know quickly whether your organization is getting the most out of its data stack. Not only that, you’ll know based on your answers where to focus to reduce cost and maximize efficiency.
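As a rough illustration, the checklist above can be turned into a simple scorecard. The sketch below is only one way to do it: the question labels, the `ToolReview` class, the 0.5 threshold, and the tool names and costs are all hypothetical, invented for the example rather than taken from any real stack.

```python
from dataclasses import dataclass, field

# Hypothetical short labels for a few of the checklist questions above.
QUESTIONS = [
    "easy_to_use",
    "fast_and_repeatable",
    "integrates_with_stack",
    "does_something_unique",
    "scales_with_org",
    "team_can_leverage_it",
]


@dataclass
class ToolReview:
    name: str
    monthly_cost: float  # illustrative dollar figure, not real pricing
    answers: dict = field(default_factory=dict)  # question label -> True/False

    def score(self) -> float:
        """Fraction of checklist questions answered 'yes' (unanswered counts as 'no')."""
        yes = sum(1 for q in QUESTIONS if self.answers.get(q, False))
        return yes / len(QUESTIONS)


def problem_areas(reviews, min_score=0.5):
    """Return tools scoring below min_score, most expensive first."""
    flagged = [r for r in reviews if r.score() < min_score]
    return sorted(flagged, key=lambda r: r.monthly_cost, reverse=True)


# Made-up reviews for three made-up tools.
reviews = [
    ToolReview("warehouse", 2000.0, {q: True for q in QUESTIONS}),
    ToolReview("legacy_bi", 1500.0, {"easy_to_use": True}),
    ToolReview("catalog", 300.0, {"easy_to_use": True, "integrates_with_stack": True}),
]

for r in problem_areas(reviews):
    print(f"{r.name}: score={r.score():.2f}, cost=${r.monthly_cost:,.0f}/mo")
```

Sorting the low scorers by cost puts the tools that are both expensive and underperforming at the top of the list, which is usually where a review should start. A real assessment would weigh the questions differently for each organization; equal weighting is an assumption made here for simplicity.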
The “More later” Principle –
You can always add more, later.
Many clients who find themselves building a data stack from scratch for the first time tend to over-invest relative to their need for data tooling. It often feels like one needs a specific product for every part of the analytics stack. I’ve heard too many stories about “multiple sources of truth” stemming from inconsistencies in documentation across tools and over time to believe that you need a tool for every problem you have. The truth is that having a dedicated data product for a specific function does not mean your teams will make use of it.
One of the most common issues organizations encounter as they scale is the need to remove or replace legacy tools. Entrenched products that no longer meet the needs of an organization can be nearly impossible to remove, and the process is time-intensive and costly.
In many cases, starting with a lean stack or revisiting your stack to cut back on underused products has huge benefits in terms of simplifying and streamlining data processes and operations.
Companies starting to build an analytics practice for the first time can benefit from the “more later” principle: Start with the minimum, get to know your tools inside and out, identify gaps and grow your stack from there. You can always add more, later.
The modern data stack isn’t a specific thing, but rather a concept. A modern stack is just the simplest combination of data tools needed to deliver the desired result at the lowest possible cost.
Snowpack specializes in helping organizations answer these questions and build and deploy a modern data stack tailored to them that will scale as they grow. For more information, follow our blog, or shoot us an email at [email protected]