Kumo launches KumoRFM-2 for enterprise relational data

Tue, 14th Apr 2026

Kumo has launched KumoRFM-2, a foundation model for enterprise relational data that it says outperforms supervised machine learning models on a range of benchmark tests.

The model is designed to work directly with relational data spread across multiple connected tables, rather than requiring that information to be flattened into a single table before analysis. According to Kumo, it can scale to datasets with more than 500 billion rows and be queried in natural language without task-specific training.

Kumo says it evaluated KumoRFM-2 on 41 predictive tasks across four benchmark suites. On Stanford RelBenchV1, the company says the model outperformed its predecessor by 10% and exceeded the strongest supervised machine learning model by 5% across classification and regression tasks.

The company also highlighted results on the SAP SALT benchmark, which is based on enterprise resource planning data with about 5 million records. There, a single fine-tuned model reached a mean reciprocal rank of 0.89, compared with 0.77 for AutoGluon and 0.79 for CARTE.
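Mean reciprocal rank, the metric quoted for SAP SALT, scores each query by the reciprocal of the position at which the correct answer first appears in a ranked list of predictions, then averages across queries. The sketch below shows the standard calculation on made-up data. The function name and the sample values are illustrative assumptions, not Kumo's or SAP's evaluation code.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    ranked_predictions: Sequence[Sequence[Hashable]],
    true_labels: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct label; a miss contributes 0."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("one ranked list is required per true label")
    if not true_labels:
        raise ValueError("at least one query is required")

    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        # Ranks are 1-based: the top prediction has rank 1.
        for rank, candidate in enumerate(ranking, start=1):
            if candidate == truth:
                total += 1.0 / rank
                break
    return total / len(true_labels)


# Hypothetical example: three queries with ranked guesses for a field value.
example_rankings = [
    ["A", "B", "C"],  # correct answer "A" is at rank 1 -> 1.0
    ["B", "A", "C"],  # correct answer "A" is at rank 2 -> 0.5
    ["B", "C", "D"],  # correct answer "A" is missing  -> 0.0
]
example_truth = ["A", "A", "A"]
example_mrr = mean_reciprocal_rank(example_rankings, example_truth)
```

On this scale a score of 1.0 means the correct answer is always ranked first, so a higher figure indicates that correct values sit nearer the top of the list on average.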

Customer Example

Kumo included a Databricks customer example to illustrate how its software is being used for lead scoring.

"Kumo.ai has transformed how we approach lead scoring at Databricks. Since deploying their platform, we've seen conversion rates from leads to opportunities improve from 1.2x to 6x, and we've doubled the volume of high-intent, quality leads entering our pipeline. The impact on our marketing performance has been substantial," said Anoop Muraleedharan, Sr Director Data & Analytics, Databricks.

Kumo argues that existing approaches to predictive AI on enterprise data lose valuable signals by flattening multi-table datasets before modelling begins. KumoRFM-2 instead uses what it describes as a Relational Graph Transformer architecture that preserves links across rows, columns and foreign keys.
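The flattening argument can be illustrated with a toy example. In the sketch below, two customers with different order histories collapse to identical rows once their orders are aggregated into a single table, while a simple foreign-key graph keeps every order as a separate linked node. The table names, columns and helper functions are hypothetical, and the code is not a representation of Kumo's Relational Graph Transformer.

```python
from collections import defaultdict

# Hypothetical two-table database: customers and their orders.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 50.0, "day": 1},
    {"order_id": 11, "customer_id": 1, "amount": 50.0, "day": 2},
    {"order_id": 12, "customer_id": 2, "amount": 10.0, "day": 1},
    {"order_id": 13, "customer_id": 2, "amount": 90.0, "day": 30},
]


def flatten(customers, orders):
    """Aggregate orders into one feature row per customer."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for order in orders:
        counts[order["customer_id"]] += 1
        totals[order["customer_id"]] += order["amount"]
    return {
        c["customer_id"]: {
            "order_count": counts[c["customer_id"]],
            "total_amount": totals[c["customer_id"]],
        }
        for c in customers
    }


def build_graph(customers, orders):
    """Keep each order as its own node, linked to its customer by foreign key."""
    edges = {("customer", c["customer_id"]): [] for c in customers}
    for order in orders:
        edges[("customer", order["customer_id"])].append(
            ("order", order["order_id"])
        )
    return edges


flat = flatten(customers, orders)
graph = build_graph(customers, orders)

# After flattening, both customers look the same: 2 orders, 100.0 in total.
flattened_rows_match = flat[1] == flat[2]
# In the graph, each customer still points at distinct order records,
# so per-order amounts and timing remain available to a model.
graph_neighbours_match = graph[("customer", 1)] == graph[("customer", 2)]
```

Hand-built features such as recency or spend variance could separate these two customers in a flat table, but each has to be anticipated and engineered in advance, which is the manual step Kumo says its approach avoids.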

The architecture was published at ICLR 2026, according to Kumo, and replaces graph neural network methods that are limited by local neighbourhoods. The company says the model processes data at 5 GB a second and handles 20 million lookups a second through a custom graph engine connected directly to SQL databases and cloud data warehouses including Snowflake, Databricks and Spark.

Founders' View

Kumo was founded by Dr. Vanja Josifovski, Dr. Jure Leskovec and Dr. Hema Raghavan. The founding team previously held senior roles at Airbnb, Pinterest, LinkedIn and Stanford, and created the PyTorch Geometric graph machine learning library.

"Enterprise data - customer records, transactions, product catalogs - holds enormous untapped revenue potential. Until now, using that data to generate business predictions required months of feature engineering and deep data science expertise, putting it out of reach for most teams," said Josifovski.

"KumoRFM-2 changes that: it's the only model that actually understands the relationships across your tables instead of destroying them, it scales to hundreds of billions of rows, and it lets any team ask predictive questions in natural language. No feature engineering. No data science expertise required," he said.

Leskovec explained the technical rationale for the system in terms of how structured data should be represented.

"For years, AI has been constrained by a fundamental limitation of not being able to reason over structured enterprise data. Database is not a document, it is a graph of relationships," said Leskovec.

"KumoRFM-2 is the first model that sees the full graph. We developed Relational Graph Transformers, where the AI model can attend to any datapoint, preserving the complete structure of relational data at arbitrary scale. And by adding a natural language interface, we make it possible for teams across the organization to ask not just what happened, but what will happen next, and why."

Model Claims

Kumo says the system is the first few-shot foundation model to outperform task-specific supervised approaches on common benchmark tasks for structured enterprise data. According to the company, it beat the best single-table foundation model by 18%, outperformed large language model-based approaches by more than 10%, and exceeded the best supervised relational models by 1.5%.

Kumo also says KumoRFM-2 can produce strong results using in-context learning alone, without task-specific model building or feature engineering. According to the company, the approach may require as little as 0.2% of the labelled data used by supervised methods.

The model was pre-trained on a combination of synthetic and real-world relational databases, Kumo says, and none of the evaluation datasets were included in pre-training. The company is backed by Sequoia Capital and says its software is deployed at customers including DoorDash, Snowflake, Databricks, Reddit, Coinbase and Sainsbury's.

The system also includes a natural-language layer that translates questions into what Kumo calls Predictive Query Language, a structured representation used to generate predictions and explanations for results.
