Agreeable Data

What is a Data Stack?

Explore the evolution of data stacks from traditional to modern and post-modern architectures, and understand how each approach impacts organizational data capabilities.

John Wessel
CEO
Jan 11, 2022
Data Stack, Data Architecture, Data Engineering

A data stack is the collection of technologies that together provide a company's data storage, transformation, and analytics capabilities. A properly functioning data stack lets everyone in an organization access relevant data quickly and use it to make better decisions.

The Traditional Data Stack

Decentralized systems produce data silos

Figure: Traditional Data Stack Architecture

The traditional data stack was often implemented to produce reports that supplemented the data produced by an existing centralized ERP system. As cloud computing and SaaS products have evolved, many companies no longer keep all of their data in a centralized ERP system. Instead, they run many specialized systems that each hold their own data. As these decentralized systems start to produce useful data, companies want to combine it easily with their centralized ERP data to create new insights.

The Modern Data Stack

A centralized, modular system allows data to stay organized and be accessed by anyone.

Figure: Modern Data Stack Architecture

The modern data stack has several elements to help better consolidate data for analysis:

Data Warehouse Layer

The foundational element of the data stack is the modern data warehouse. A modern data warehouse is optimized for storing large amounts of data from multiple sources. It is tuned to return data quickly to analytics tools for analysis.

Extraction Layer

The next layer is the extraction layer. Previously, IT teams would often try to perform the extraction and transformation of data in one step. This approach locked business rules (transformations) inside the IT team when they belonged in the hands of data analysts. In this new model, the data is extracted from the source system and stored unaltered in the data warehouse in a raw state. Because the data is unaltered from the source system, it is also much easier to audit for wrong or missing data.
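
The idea of landing source data unaltered can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the data warehouse, and the table name raw_records, the function load_raw, and the sample orders are all hypothetical, not part of any particular product.

```python
import json
import sqlite3
from datetime import datetime, timezone


def load_raw(conn, source_name, records):
    """Store each source record as-is (serialized to JSON) in a raw table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        "source TEXT, loaded_at TEXT, payload TEXT)"
    )
    loaded_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO raw_records (source, loaded_at, payload) VALUES (?, ?, ?)",
        [(source_name, loaded_at, json.dumps(record)) for record in records],
    )
    conn.commit()
    return len(records)


# Hypothetical records standing in for an export from a source system.
orders = [
    {"order_id": 1, "customer": "a@example.com", "revenue": 120.0, "cost": 80.0},
    {"order_id": 2, "customer": "b@example.com", "revenue": 60.0, "cost": None},
]

conn = sqlite3.connect(":memory:")  # assumption: stand-in for a real warehouse
loaded = load_raw(conn, "shop_orders", orders)

# Audit step: no business rules were applied on the way in, so a missing
# value in the source is still visible in the raw table.
missing_cost = [
    json.loads(payload)["order_id"]
    for (payload,) in conn.execute("SELECT payload FROM raw_records")
    if json.loads(payload)["cost"] is None
]
print(loaded, missing_cost)
```

Note that no cleaning or defaulting happens during the load. Deciding what a missing cost should mean is a business rule, so in this model it is left to a later step.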

Transformation Layer

The next layer is the transformation layer. This layer holds all of the business logic and rules, such as how data from different sources relate to each other and the formulas used to calculate metrics like gross margin, average order value, or customer lifetime value.
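
As a rough illustration of what such business rules might look like, here is a minimal Python sketch. The sample orders and function names are hypothetical. The definitions are common textbook ones and may differ from how a given company defines them. In particular, customer lifetime value has many competing definitions, so this sketch uses total historical revenue per customer as a simple stand-in.

```python
from collections import defaultdict

# Hypothetical order rows, as they might look after being joined across sources.
orders = [
    {"customer_id": "c1", "revenue": 120.0, "cost": 80.0},
    {"customer_id": "c1", "revenue": 80.0, "cost": 50.0},
    {"customer_id": "c2", "revenue": 100.0, "cost": 70.0},
]


def gross_margin(rows):
    """(revenue - cost) / revenue, as a fraction of total revenue."""
    revenue = sum(r["revenue"] for r in rows)
    cost = sum(r["cost"] for r in rows)
    return (revenue - cost) / revenue


def average_order_value(rows):
    """Total revenue divided by the number of orders."""
    return sum(r["revenue"] for r in rows) / len(rows)


def revenue_per_customer(rows):
    """Assumption: historical revenue per customer as a simple CLV proxy."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer_id"]] += r["revenue"]
    return dict(totals)


print(round(gross_margin(orders), 4))
print(average_order_value(orders))
print(revenue_per_customer(orders))
```

Keeping each rule in one named place means analysts can review and change a definition without touching how the data was extracted. The sketch does not handle empty input or zero revenue, which a real implementation would need to.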

Analytics Layer

The analytics layer is the final layer. It connects to the transformation layer but not to the raw extraction layer, so the business is presented with a clean, organized, curated set of datasets for analysis. It also allows sensitive information, such as personally identifiable information (PII), to be removed.
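
One way to picture a curated dataset with PII removed is a database view that exposes only aggregated, non-sensitive columns. The sketch below assumes an in-memory SQLite database as a stand-in for the warehouse. The table transformed_orders, the view analytics_revenue_by_region, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: stand-in for a real warehouse

# Hypothetical output of the transformation layer; it still contains PII.
conn.execute(
    "CREATE TABLE transformed_orders ("
    "order_id INTEGER, customer_email TEXT, region TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO transformed_orders VALUES (?, ?, ?, ?)",
    [
        (1, "a@example.com", "East", 120.0),
        (2, "b@example.com", "West", 60.0),
        (3, "c@example.com", "East", 30.0),
    ],
)

# The curated view exposes aggregates only; customer_email is never selected.
conn.execute(
    "CREATE VIEW analytics_revenue_by_region AS "
    "SELECT region, COUNT(*) AS order_count, SUM(revenue) AS total_revenue "
    "FROM transformed_orders GROUP BY region"
)

cursor = conn.execute("SELECT * FROM analytics_revenue_by_region ORDER BY region")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

In a real warehouse, the same effect usually also depends on access controls, so that analytics tools can query the curated view but not the underlying table.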

The Post-Modern Data Stack

The future is clean, organized data that can actually be used in Machine Learning and AI.

Figure: Post-Modern Data Stack Architecture

The post-modern data stack is the next phase of data stack development, but it is at a very early stage for most companies. The primary additions are Machine Learning and AI capabilities. These capabilities can help companies improve demand forecasting, lead scoring, and fraud detection, among many other possibilities.

Key Benefits of Modern Data Stacks

  • Scalability: Modern data stacks can handle growing data volumes without significant performance degradation
  • Flexibility: Modular architecture allows for easy addition of new data sources and analytics tools
  • Cost-Effectiveness: Cloud-based solutions offer pay-as-you-go pricing models
  • Data Quality: Clear separation of concerns between layers helps maintain data integrity
  • Accessibility: Business users can access and analyze data without deep technical knowledge

Implementation Considerations

When implementing a modern data stack, organizations should consider:

  • Data governance and security requirements
  • Integration capabilities with existing systems
  • Team expertise and training needs
  • Budget constraints and ROI expectations
  • Compliance and regulatory requirements

Conclusion

The evolution from traditional to modern and post-modern data stacks represents a significant shift in how organizations handle and leverage their data. Modern data stacks provide the foundation for data-driven decision-making, while post-modern stacks open the door to advanced analytics and AI capabilities. Organizations that successfully implement these architectures will be better positioned to compete in an increasingly data-driven world.
