        Matt Aslett's Analyst Perspectives

        Explore Wide-Column Stores for Database Flexibility

        I have previously written about the functional evolution and emerging use cases for NoSQL databases, a category of non-relational databases that first emerged 15 or so years ago and are now well established as potential alternatives to relational databases. NoSQL is a term used to describe a variety of databases that fall into four primary functional categories: key-value stores, wide-column stores, document-oriented databases and graph databases. Each is worthy of further exploration, which is why I have examined them over a series of Analyst Perspectives. Following a closer look at graph databases, document-oriented databases and key-value stores, I now conclude this series of Perspectives with wide-column stores.

        The development and adoption of wide-column stores were initially driven by digital native enterprises in areas such as social media, the internet and gaming. Perhaps the most well-known, Apache Cassandra, was created at Facebook (now Meta), and early adopters included Reddit, Twitter (now X) and Rackspace. Other prominent open-source wide-column stores include Apache HBase and ScyllaDB, while commercial products include DataStax Enterprise and Astra DB from DataStax (both based on Apache Cassandra); ScyllaDB Enterprise and ScyllaDB Cloud from Scylla; and Google Cloud Bigtable.

        Managed cloud services are also available from Amazon Web Services, Microsoft Azure and Aiven, amongst others, and are typically based on, or compatible with, Apache Cassandra. Tech companies continue to dominate adoption of wide-column stores alongside organizations in industries such as retail and communications. Known Apache Cassandra users include Apple, Best Buy, eBay, Home Depot, Hulu, Macy’s, Netflix, Spotify, Target, Uber and Walmart, while users of ScyllaDB include Comcast, Discord, Epic Games, Expedia, GE Digital, Grab, Rakuten and Strava.

        Wide-column stores should not be confused with columnar relational databases. Although they have similar names, they have very different data models and use cases. Columnar databases are relational databases in which data is stored as columns rather than rows to improve query performance in analytic data processing use cases. Wide-column stores are non-relational and are typically used in operational use cases. The two have different approaches to storing data as columns: While each column in a columnar database is stored separately to disk, groups of related columns (called “column families”) can be stored to disk together in a wide-column store. More significantly, the two have different data models: Columnar databases are a type of relational database, while the wide-column data model is non-relational and provides a flexible approach that does not have the strict schema requirements associated with the relational model. 
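The difference in storage layout described above can be illustrated with a minimal Python sketch. It models both layouts as plain in-memory dictionaries; the sample rows, the family names (`profile`, `activity`) and the function names are hypothetical and chosen only for illustration, and no real database engine lays data out this simply.

```python
# Illustrative only: both layouts are modeled as plain Python structures.
rows = [
    {"id": 1, "name": "Ada", "city": "London", "clicks": 10},
    {"id": 2, "name": "Bob", "city": "Paris", "clicks": 7},
]


def columnar_layout(rows):
    """Keep every column in its own sequence, one per column name."""
    columns = {}
    for row in rows:
        for col, value in row.items():
            columns.setdefault(col, []).append(value)
    return columns


def column_family_layout(rows, families, key="id"):
    """Keep groups of related columns together, addressed by row key."""
    layout = {family: {} for family in families}
    for row in rows:
        for family, cols in families.items():
            layout[family][row[key]] = {c: row[c] for c in cols if c in row}
    return layout


columnar = columnar_layout(rows)
by_family = column_family_layout(
    rows, {"profile": ["name", "city"], "activity": ["clicks"]}
)
```

In this sketch, `columnar["clicks"]` holds the values of a single column for all rows, which suits analytic scans, while `by_family["profile"][2]` returns the related columns of one row together, which suits operational lookups by key.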

        I previously explained that the key-value model that underpins key-value stores also forms the basis of wide-column stores. Key-value databases store data as simple pairs of keys and associated values, but wide-column stores extend the key-value model by enabling multiple values to be associated with an individual key. Each additional value is added as a new column, which results in a combination of rows and columns similar to a table in a relational database.  
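As a rough sketch of this extension, the following Python models a key-value store and a wide-column table side by side. The class and method names (`KeyValueStore`, `WideColumnTable`, `put`, `get_row`) are hypothetical and do not correspond to the API of any product named in this Perspective.

```python
from collections import defaultdict


class KeyValueStore:
    """One opaque value per key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class WideColumnTable:
    """Many named values (columns) per row key."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # Each additional value for the same key becomes a new column.
        self._rows[row_key][column] = value

    def get_row(self, row_key):
        return dict(self._rows.get(row_key, {}))


kv = KeyValueStore()
kv.put("user:1", "Ada")

table = WideColumnTable()
table.put("user:1", "name", "Ada")
table.put("user:1", "email", "ada@example.com")
table.put("user:2", "name", "Bob")
```

Here `table.get_row("user:1")` returns two columns and `table.get_row("user:2")` returns one; the second row simply has no `email` column rather than holding a placeholder value.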

        Unlike relational databases, however, wide-column stores do not require a strictly defined schema for all rows and columns in a table. Adding a new column to a row in a relational database requires all other rows in the table to have values in the same column, necessitating the use of null or default values if no data exists. Storing and indexing null values can have performance and complexity implications. Wide-column stores are not impacted by storing and indexing null values, as there is no requirement for each row in a wide-column store to have the same set of columns.

        Another important characteristic of wide-column stores is that the storage of data can be distributed across multiple database nodes. Like Distributed SQL databases, wide-column stores can therefore be used to provide scalability, resiliency and availability by replicating data across multiple servers in a single data center, multiple servers across multiple data centers or even multiple servers across multiple cloud providers in multiple geographic regions. I assert that by 2027, more than one-third of enterprises will adopt data platforms that span distributed architecture, supporting applications that require data processing across geographic and availability zones.

        Unlike Distributed SQL databases, which by default provide strong data consistency across a distributed architecture, wide-column stores can be configured to deliver strong or relaxed (eventual) consistency, depending on the requirements of the associated application. It should be noted, however, that many wide-column stores do not currently fully support atomic, consistent, isolated and durable (ACID) transactions, although this support has been slated for inclusion in the forthcoming version 5.0 of Apache Cassandra.
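One common way to reason about the choice between strong and eventual consistency in replicated stores is the quorum overlap rule: if the number of replicas that acknowledge a write plus the number consulted on a read exceeds the replication factor, every read overlaps with the latest acknowledged write. The Python below is a simplified sketch of that arithmetic only. The function names are hypothetical, and it ignores node failures, repair mechanisms and the specific consistency settings of any particular product.

```python
def quorum(replication_factor):
    """Smallest majority of replicas for a given replication factor."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_acks, read_acks):
    """True when read and write replica sets must overlap (R + W > N)."""
    return read_acks + write_acks > replication_factor


n = 3
strong = reads_see_latest_write(n, quorum(n), quorum(n))  # quorum writes and reads
relaxed = reads_see_latest_write(n, 1, 1)                 # single-replica writes and reads
```

With three replicas, quorum writes and quorum reads (two replicas each) satisfy the rule, while writing to and reading from a single replica does not, trading consistency for lower latency and higher availability.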

        The flexible data model makes wide-column stores well suited to write-intensive, sparse and diverse datasets, while the distributed architecture is well aligned to the needs of storing and processing very large datasets, particularly those with high-performance and localized data sovereignty requirements. Primary use cases for wide-column stores include the storage and processing of application and infrastructure log data, sensor data from internet of things devices, time-series data and user preferences data. The latter can be used to drive personalization and recommendations as well as fraud detection and authentication, and wide-column stores are well suited to intelligent operational applications driven by artificial intelligence and machine learning models that depend on the processing of large and diverse datasets. Wide-column stores are not suitable for all use cases, but I recommend that all enterprises considering options for databases evaluate the most appropriate data model to fulfill the task at hand and consider the potential suitability of wide-column stores where appropriate.

        Regards,

        Matt Aslett

        Authors:

        Matt Aslett
        Director of Research, Analytics and Data

        Matt Aslett leads the software research and advisory for Analytics and Data at Ventana Research, now part of ISG, covering software that improves the utilization and value of information. His focus areas of expertise and market coverage include analytics, data intelligence, data operations, data platforms, and streaming and events.


