Matt Aslett's Analyst Perspectives

DataStax Provides a Platform for Data in Motion and at Rest

Posted by Matt Aslett on Jul 27, 2022 3:00:00 AM

Streaming data has been part of the industry landscape for decades but has largely been focused on niche applications in segments with the highest real-time data processing and analytics performance requirements, such as financial services and telecommunications. As demand for real-time interactive applications becomes more pervasive, streaming data is becoming a more mainstream pursuit, aided by the proliferation of open-source streaming data and event technologies, which have lowered the cost and technical barriers to developing new applications that take advantage of data in motion. Ventana Research’s Streaming Data Dynamic Insights enables an organization to assess its relative maturity in achieving value from streaming data. I assert that by 2024, more than one-half of all organizations’ standard information architectures will include streaming data and event processing, allowing organizations to be more responsive and provide better customer experiences.

This trend is also impacting the vendor landscape. While streaming data was initially the preserve of specialist providers, increasingly we see data vendors looking to address the spectrum of customers’ data processing requirements, including data in motion and data at rest, with portfolios that include stream data processing and database products. DataStax is a prime example of an operational data platform vendor that has moved into the data streaming segment in recent years. The company offers both database and data streaming functionality for customers looking to reduce data silos and support the development of interactive real-time data-driven applications.

DataStax was founded in 2010 to build a business around Apache Cassandra, a distributed, non-relational database developed at Facebook, released as open source in 2008 and accepted into the Apache Incubator the following year. One of the early NoSQL database projects, Apache Cassandra gained adoption thanks to its ability to manage large data volumes at scale with high availability and fault tolerance. DataStax was the first company to provide a commercial support subscription for Apache Cassandra, now known as Luna for Apache Cassandra. The company also launched its own DataStax Enterprise commercial distribution with added security and other enterprise features.

In addition to contributing to the development of both Apache Cassandra and DataStax Enterprise, the company also acquired capabilities for graph data processing as well as cloud management services. In 2020, it launched the Astra DB managed database-as-a-service offering. DataStax also recognized the growing demand for stream data processing as a complement to high-performance database workloads, and in 2021 announced the acquisition of cloud messaging managed service provider Kesque to support distributed event streaming, expanding its purview to address the management and processing of data in motion as well as at rest. DataStax supports hundreds of customers, including Audi, Barracuda Networks, Capital One, ESL Gaming, Macy's, Saab and US Bank. It has also raised over $340 million in funding, including a recent $115 million funding round led by the growth equity business within Goldman Sachs Asset Management.

As I recently explained, the range of use cases for which NoSQL databases are a valid option has grown in recent years, thanks to evolving functionality and vendor maturity. While products based on Apache Cassandra have always been suitable for high-performance workloads, organizations had to look to specialist streaming data and event technologies for continuous data processing. Data streaming and stream analytics have become more widely adopted in recent years as organizations look to support increasing customer demand for interactive applications as well as the proliferation of sensor data generated by the Internet of Things. Almost one-half (48%) of participants in our Analytics and Data Benchmark Research are using streaming data in operational workloads today, and an additional 33% plan to do so.

The acquisition of Kesque helped DataStax expand its addressable market with the addition of streaming data processing functionality and expertise based on the Apache Pulsar open-source project, which provides a cloud-native platform for publish-and-subscribe messaging and serverless stream data processing. DataStax now offers the Astra Streaming managed service, which is available on Amazon Web Services, Microsoft Azure and Google Cloud Platform.

For those that prefer to self-manage, the company also offers Luna Streaming, a commercially supported distribution of Apache Pulsar for deployment on-premises or on cloud infrastructure. Both offerings are targeted primarily at developers to enable them to create applications that take advantage of streamed data. While Astra Streaming and Astra DB can be used independently of each other, DataStax also highlights the benefits of using them in combination. Astra Streaming can be used to build pipelines to transport data into and out of the Astra DB database-as-a-service in real time, with Astra Streaming serving as a core component of CDC for Astra DB, providing change data capture functionality to synchronize data from Astra DB to other data platforms and applications as it is updated.

The Astra DB managed service is now DataStax’s flagship database, delivering global scalability with data replication across multiple cloud providers, regions and availability zones, without the need for manual configuration or database sizing. DataStax Enterprise also provides native Kubernetes support for deployment on-premises or on multiple clouds, albeit with associated management and operations requirements. Both Astra DB and DataStax Enterprise support Storage Attached Indexing, which enables the creation of multiple secondary indexes on the same database table. Both also support Stargate, an open-source data application programming interface gateway designed to abstract Cassandra-specific concepts, making it easier for application developers to interact with data using APIs including GraphQL, REST, schemaless JSON and gRPC.
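
For readers less familiar with change data capture, the pattern described above (a database publishing each write as an event so that downstream systems stay in sync) can be illustrated with a small Python sketch. This is a toy, in-memory model only, not the Astra DB, Astra Streaming or Apache Pulsar API; the names `Topic`, `SourceTable`, `Replica` and `ChangeEvent` are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change: the row key and its new value (None means the row was deleted)."""
    key: str
    value: Optional[Dict[str, Any]]


class Topic:
    """Hypothetical in-memory stand-in for a publish-and-subscribe topic."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[ChangeEvent], None]] = []

    def subscribe(self, handler: Callable[[ChangeEvent], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: ChangeEvent) -> None:
        # Deliver the event to every subscriber, in subscription order.
        for handler in self._subscribers:
            handler(event)


class SourceTable:
    """Hypothetical stand-in for a database table that emits an event on every write."""

    def __init__(self, changes: Topic) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self._changes = changes

    def upsert(self, key: str, value: Dict[str, Any]) -> None:
        self.rows[key] = value
        # Publish a copy so subscribers cannot mutate the source row.
        self._changes.publish(ChangeEvent(key, dict(value)))

    def delete(self, key: str) -> None:
        self.rows.pop(key, None)
        self._changes.publish(ChangeEvent(key, None))


class Replica:
    """Downstream system that stays in sync by applying change events as they arrive."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}

    def apply(self, event: ChangeEvent) -> None:
        if event.value is None:
            self.rows.pop(event.key, None)
        else:
            self.rows[event.key] = event.value


def demo() -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
    topic = Topic()
    source = SourceTable(topic)
    replica = Replica()
    topic.subscribe(replica.apply)

    source.upsert("u1", {"name": "Ada"})
    source.upsert("u2", {"name": "Grace"})
    source.upsert("u1", {"name": "Ada L."})
    source.delete("u2")
    return source.rows, replica.rows
```

Calling `demo()` returns the source and replica contents, which should match after the four writes. In a real deployment the topic would be a durable, distributed stream rather than an in-process list of callbacks, and delivery, ordering and failure handling would be the streaming platform's responsibility.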

While Apache Cassandra and DataStax Enterprise have long been viable enterprise database platforms, DataStax has made significant inroads in recent years in making them easier for developers to work with, both through the Stargate API gateway and managed cloud services that reduce operational requirements. The addition of support for streaming data via Astra Streaming has further expanded the company’s addressable market with CDC for Astra DB, illustrating how the two can be used in combination. I recommend that organizations consider DataStax when evaluating data platforms to support applications requiring the processing of data in motion and at rest.

Regards,

Matt Aslett

Topics: Data, Streaming Analytics, Streaming Data & Events, operational data platforms

Written by Matt Aslett

Matt leads the expertise in Digital Technology covering applications and technology that improve the readiness and resilience of business and IT operations. His focus areas of expertise and market coverage include: analytics and data, artificial intelligence and machine learning, blockchain, cloud computing, collaborative and conversational computing, extended reality, Internet of Things, mobile computing and robotic automation. Matt’s specialization is in operational and analytical use of data and how businesses can modernize their approaches to business to accelerate the value realization of technology investments in support of hybrid and multi-cloud architecture. Matt has been an industry analyst for more than a decade and has pioneered the coverage of emerging data platforms including NoSQL and NewSQL databases, data lakes and cloud-based data processing. He is a graduate of Bournemouth University.