I recently wrote about the growing range of use cases for which NoSQL databases can be considered, given the increased breadth and depth of functionality available from providers of the various non-relational data platforms. As I noted, one category of NoSQL databases, graph databases, is inherently suitable for use cases that rely on relationships, such as social media, fraud detection and recommendation engines, since the graph data model represents not only entities and values but also the relationships between them. The native representation of relationships can also be significant in surfacing “features” for use in machine learning modeling. There has been a concerted effort in recent years by graph database providers, including TigerGraph, to encourage and facilitate the use of graph databases by data scientists to support the development, testing and deployment of machine learning models.
TigerGraph was founded in 2012 and emerged from stealth five years later with its distributed graph database platform for real-time analytics. In addition to the graph data model, core features of TigerGraphDB include its massively parallel architecture based on co-located graph storage and graph processing engines, as well as a SQL-like graph query language (GSQL) to support ad hoc and interactive analytics. The combination enables TigerGraph to target high-performance analytics on very large datasets, including support for graph algorithms for in-database machine learning. The company also offers the GraphStudio interface for graph visualization as well as Graph Data Science Library and the recently introduced ML Workbench, illustrating TigerGraph’s growing focus on data science and machine learning workloads.
The company has attracted the attention of investors and customers alike. Customers include the likes of Ford, Intuit, Jaguar Land Rover, JPMorgan Chase, Microsoft Xbox, financial services firm NewDay, and UnitedHealth Group. TigerGraph has raised over $170 million in funding from investors, including Tiger Global and Susquehanna International Group, with the latest $105 million Series C round announced in February 2021.
Initial adoption of graph databases was reliant on developers and data teams understanding that they had a problem or opportunity that was suitable for the graph data model. As such, graph database vendors spent a lot of initial energy and marketing dollars evangelizing the potential benefits of the graph data model compared to the relational model, adoption of which is significantly more widespread. Usage of graph databases remains nascent, but it is growing. Fewer than 1 in 6 (15%) participants in our Analytics and Data Benchmark Research are in production with graph databases today, but 11% plan to use them within 12 months, and another 9% within two years.
The growing use of non-relational databases is driven by requirements for new applications, including personalization and artificial intelligence-driven recommendations, both of which are well-aligned with graph databases. As such, graph database vendors have been stepping up their engagement with key personas — including developers and data scientists — who are exerting increasing influence over databases selected to support new applications.
In addition to TigerGraphDB for deployment on-premises, TigerGraph also offers the TigerGraph Cloud managed service on Amazon Web Services, Google Cloud and Microsoft Azure, which lowers the barriers to adoption by removing requirements for upfront investment in related infrastructure. This is potentially significant in facilitating adoption for new application development projects. I assert that through 2026, relational databases from incumbent vendors will continue to be deployed for the majority of existing operational workloads, with emerging relational and non-relational database providers primarily adopted for new applications.
TigerGraph also provides GraphStudio, a graphical user interface for graph data model design, graph exploration and interactive analytics. GraphStudio provides a no-code environment for graph analytics via Visual Query Builder as well as an editor to write queries using the GSQL query language.
The company recognized that it needed to take a different approach for data scientists: one that would help them apply their expertise to data in graph databases while reflecting the tools and skills data scientists use today. Graph Data Science Library provides in-database data science algorithms (spanning clustering, similarity, centrality, dependencies, matching and flow) designed to enable users to bring machine learning workloads to data stored in TigerGraphDB, rather than extracting features engineered in the graph database to be trained in an external machine learning platform. ML Workbench, meanwhile, enables data scientists to explore the use of graph neural networks by providing a Jupyter-based Python development framework that is interoperable with deep learning frameworks such as PyTorch, Deep Graph Library and TensorFlow as well as cloud services such as Amazon SageMaker, Microsoft Azure ML and Google Vertex AI. While ML Workbench is initially focused specifically on graph neural networks, there is the potential for it to evolve over time to become a hub for data scientists to apply any suitable machine learning algorithms to graph data.
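To make the idea of graph-derived features concrete, here is a minimal sketch in plain Python of two of the algorithm families mentioned above, centrality and similarity, computed on a tiny made-up graph. It is an illustration only: it does not use GSQL or the Graph Data Science Library API, and the edge list, node names and function names are all hypothetical.

```python
# Hypothetical toy graph: undirected edges between made-up account IDs.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def jaccard_similarity(adjacency, u, v):
    """Shared neighbors of u and v divided by their combined neighbors."""
    union = adjacency[u] | adjacency[v]
    if not union:
        return 0.0
    return len(adjacency[u] & adjacency[v]) / len(union)


adjacency = build_adjacency(EDGES)
centrality = degree_centrality(adjacency)

# One feature row per node, ready to join onto a conventional training table.
features = {node: {"degree_centrality": centrality[node]} for node in adjacency}
```

In this sketch, each node ends up with a numeric value that could sit alongside ordinary tabular attributes in a training set, which is the sense in which relationships surface as machine learning features. An in-database approach would compute such values where the graph is stored rather than in application code, as done here.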
TigerGraph’s focus on data scientists is a work in progress, as is its embrace of cloud architecture and managed services. Both stand to gain from further research and development as well as increased sales and marketing. The company’s foundation for developing new, intelligent applications based on the graph data model is firm, however, as is its expertise in graph database and graph analytics use cases. I recommend that organizations consider TigerGraph when evaluating potential use cases for the graph data model as well as graph-based machine learning.