Introduction to SAP-RPT-1

In this lesson, we are going to explore SAP’s RPT-1 (Relational Pretrained Transformer), a specialized foundation model designed specifically for relational and structured data prediction.

SAP-RPT-1 represents a new class of Relational Foundation Models (RFMs) - large-scale machine learning models designed to understand, process, and make predictions on tabular and relational data. Unlike traditional large language models that excel at text generation, RPT-1 is table-native and purpose-built for structured business data.

RPT-1 is a pretrained model that doesn’t require additional training or fine-tuning steps. It solves predictive tasks such as classification and regression out-of-the-box through tabular in-context learning. Due to its table-native architecture, prediction quality on enterprise data is typically very high, ahead of state-of-the-art narrow AI models and LLMs employed for such tasks.

The model supports both classification tasks (predicting categorical values like cost centers, sales groups, or payment terms) and regression tasks (predicting numerical values like prices, quantities, or amounts), making it versatile for enterprise business applications.


Key Concepts

Context vs Prediction Rows: RPT-1 requires a minimum of 2 complete rows (context rows) with actual data to understand patterns and relationships. These context rows serve as examples that help the model learn the underlying data patterns before making predictions on incomplete rows.

Prediction Placeholders: Cells that need predictions must contain the special placeholder [PREDICT]. This tells RPT-1 exactly which values need to be predicted while preserving other complete data in the same row for context.

Data Schema Detection: RPT-1 automatically detects data types (string, numeric, date) for each column, enabling it to apply appropriate prediction strategies based on the nature of the data.
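To build intuition for what type detection involves, the following is a minimal client-side sketch. It is not RPT-1's actual detection logic, which the lesson does not describe. The function name `infer_dtype`, the list of accepted date formats, and the use of "numeric" and "date" as type labels are assumptions made for illustration.

```python
from datetime import datetime

# Assumed date formats for this illustration; extend as needed.
DATE_FORMATS = ("%Y-%m-%d", "%d-%m-%Y")


def infer_dtype(values):
    """Guess 'numeric', 'date', or 'string' from a column's known values."""
    known = [v for v in values if v != "[PREDICT]"]

    def all_parse(parser):
        # True only if every known value can be parsed by `parser`.
        try:
            for v in known:
                parser(v)
            return True
        except (ValueError, TypeError):
            return False

    if not known:
        return "string"
    if all_parse(float):
        return "numeric"
    for fmt in DATE_FORMATS:
        if all_parse(lambda v, fmt=fmt: datetime.strptime(v, fmt)):
            return "date"
    return "string"


print(infer_dtype(["150.8", "2200.00", "[PREDICT]"]))
print(infer_dtype(["02-11-2025", "28-11-2025"]))
print(infer_dtype(["Office Chair", "Couch"]))
```

Placeholder cells are skipped so that a target column is typed from its known values only.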

Multi-Column Predictions: The model can predict multiple columns simultaneously within the same dataset, understanding cross-column relationships and dependencies.

Tabular In-Context Learning: RPT-1 produces predictions through tabular in-context learning. Prompts are given in table form, containing context examples and fields marked for prediction with a placeholder value ([PREDICT]).
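The distinction between context rows and prediction rows can be sketched in a few lines of Python. This is only an illustration of the concept: the helper name `split_rows` and the sample rows are made up, and the check for at least 2 context rows mirrors the minimum stated above.

```python
PLACEHOLDER = "[PREDICT]"
MIN_CONTEXT_ROWS = 2  # minimum stated in this lesson


def split_rows(rows):
    """Separate complete context rows from rows that contain a placeholder."""
    context, to_predict = [], []
    for row in rows:
        if PLACEHOLDER in row.values():
            to_predict.append(row)
        else:
            context.append(row)
    if len(context) < MIN_CONTEXT_ROWS:
        raise ValueError(
            f"need at least {MIN_CONTEXT_ROWS} context rows, got {len(context)}"
        )
    return context, to_predict


# Made-up sample data: two complete rows and one row to predict.
rows = [
    {"Order ID": "001", "Customer": "ACME Corp", "Sales Group": "SG301", "Region": "Europe"},
    {"Order ID": "002", "Customer": "Gamma Inc", "Sales Group": "SG302", "Region": "Asia"},
    {"Order ID": "003", "Customer": "Beta Ltd", "Sales Group": "[PREDICT]", "Region": "Asia"},
]
context, to_predict = split_rows(rows)
print(len(context), len(to_predict))
```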


Model Versions and Specifications

SAP-RPT-1 comes in two versions optimized for different use cases:

sap-rpt-1-small - Recommended when:

  • Prediction scenarios are of medium complexity
  • Low latency and high prediction throughput are the primary objectives

sap-rpt-1-large - Recommended when:

  • Prediction scenarios are complex
  • Best prediction quality and lowest error rates are the primary objectives
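
| Specification                             | sap-rpt-1-small  | sap-rpt-1-large  |
|-------------------------------------------|------------------|------------------|
| Max. Context Length                       | 2048 rows        | 65536 rows       |
| Max. Columns                              | 100 columns      | 256 columns      |
| Number of Target Classes (Classification) | 256*             | 1024*            |
| Recommended Context Length                | 500–2000 rows+   | 4000–8000 rows+  |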

Both model versions allow up to 128 simultaneous prediction rows and up to 10 simultaneous prediction columns in each prediction request. Supported task types are classification and regression.

*The number of target classes is a recommendation for best prediction quality, not a hard limit. Depending on the specific use case, models can produce high-quality predictions for larger numbers of target classes.

+Generic recommendations on context length are best practice values based on internal tests, balancing prediction quality, model runtime, and costs. Optimal settings vary across use cases and data sets.
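The limits above can be checked on the client before a request is sent. The sketch below encodes the values from the specification table. The names `ModelLimits` and `check_request` are hypothetical, and the lesson does not say how the service itself reacts to oversized requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    max_context_rows: int
    max_columns: int


# Values copied from the specification table in this lesson.
LIMITS = {
    "sap-rpt-1-small": ModelLimits(max_context_rows=2048, max_columns=100),
    "sap-rpt-1-large": ModelLimits(max_context_rows=65536, max_columns=256),
}
MAX_PREDICTION_ROWS = 128  # per request, both versions
MAX_TARGET_COLUMNS = 10    # per request, both versions


def check_request(model, n_context_rows, n_prediction_rows, n_columns, n_target_columns):
    """Return a list of human-readable limit violations (empty if none)."""
    limits = LIMITS[model]
    problems = []
    if n_context_rows > limits.max_context_rows:
        problems.append(f"{n_context_rows} context rows exceed {limits.max_context_rows}")
    if n_columns > limits.max_columns:
        problems.append(f"{n_columns} columns exceed {limits.max_columns}")
    if n_prediction_rows > MAX_PREDICTION_ROWS:
        problems.append(f"{n_prediction_rows} prediction rows exceed {MAX_PREDICTION_ROWS}")
    if n_target_columns > MAX_TARGET_COLUMNS:
        problems.append(f"{n_target_columns} target columns exceed {MAX_TARGET_COLUMNS}")
    return problems


print(check_request("sap-rpt-1-small", 3000, 10, 20, 1))
print(check_request("sap-rpt-1-large", 3000, 10, 20, 1))
```

The number of target classes is left out of the check on purpose, because the table marks it as a recommendation rather than a hard limit.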


Business Use Cases

RPT-1 excels in various enterprise scenarios:

  • Order Processing: Predicting shipping methods, payment terms, or sales groups based on customer and product information
  • Customer Segmentation: Classifying customers into categories based on purchase history and demographics
  • Inventory Management: Predicting stock levels, reorder quantities, or product classifications
  • Financial Forecasting: Estimating prices, costs, or budget allocations based on historical patterns
  • Data Quality: Completing missing values in business datasets with intelligent predictions

Data Format Requirements

RPT-1 expects data in a specific JSON format with four main sections:

1. Prediction Configuration: Defines which columns need predictions and their task types (classification or regression):

Code Snippet
"prediction_config": {
  "target_columns": [
    {
      "name": "Sales Group",
      "prediction_placeholder": "[PREDICT]",
      "task_type": "classification"
    }
  ]
}

2. Index Column: Specifies which column serves as the unique identifier for each row:

Code Snippet
"index_column": "Order ID"

3. Rows Data: Contains all the actual data rows, with [PREDICT] placeholders where predictions are needed:

Code Snippet
"rows": [
  {
    "Order ID": "001",
    "Customer": "ACME Corp",
    "Sales Group": "SG301",
    "Region": "Europe"
  },
  {
    "Order ID": "002",
    "Customer": "Beta Ltd",
    "Sales Group": "[PREDICT]",
    "Region": "Asia"
  }
]

4. Data Schema: Defines the data type for each column:

Code Snippet
"data_schema": {
  "Order ID": {"dtype": "string"},
  "Customer": {"dtype": "string"},
  "Sales Group": {"dtype": "string"},
  "Region": {"dtype": "string"}
}
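The four sections can be assembled programmatically. The sketch below assumes they are top-level keys of a single JSON object, which the fragments above suggest but do not state. The helper name `build_payload` and the second context row are made up for illustration.

```python
import json


def build_payload(rows, index_column, targets, dtypes):
    """Combine the four sections into one dict.

    targets maps a column name to its task type; dtypes maps a column name to its dtype.
    """
    return {
        "prediction_config": {
            "target_columns": [
                {
                    "name": name,
                    "prediction_placeholder": "[PREDICT]",
                    "task_type": task_type,
                }
                for name, task_type in targets.items()
            ]
        },
        "index_column": index_column,
        "rows": rows,
        "data_schema": {col: {"dtype": dtype} for col, dtype in dtypes.items()},
    }


rows = [
    {"Order ID": "001", "Customer": "ACME Corp", "Sales Group": "SG301", "Region": "Europe"},
    {"Order ID": "002", "Customer": "Gamma Inc", "Sales Group": "SG302", "Region": "Asia"},
    {"Order ID": "003", "Customer": "Beta Ltd", "Sales Group": "[PREDICT]", "Region": "Asia"},
]
payload = build_payload(
    rows,
    index_column="Order ID",
    targets={"Sales Group": "classification"},
    dtypes={col: "string" for col in rows[0]},
)
print(json.dumps(payload, indent=2))
```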

Integration Architecture

RPT-1 integrates into SAP AI Core through a standard REST API endpoint:

  • Authentication: Uses the SAP AI Core OAuth2 client credentials flow
  • Endpoint: /v2/inference/deployments/{deployment-id}/predict
  • Request Format: JSON payload with prediction configuration
  • Response Format: JSON array with predicted values and confidence scores

The model processes requests synchronously, returning predictions with confidence scores for each predicted cell. Response times vary based on dataset size and complexity, typically ranging from seconds for small datasets to minutes for larger ones.

Resource Groups: RPT-1 deployments are organized within AI Core resource groups, allowing for proper access control and resource management.
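As a rough sketch of how such a call could be prepared from Python, the snippet below builds the HTTP request without sending it. The endpoint path comes from this lesson. The base URL, the token value, the helper name `build_predict_request`, and the use of an `AI-Resource-Group` header to select the resource group are assumptions to verify against your AI Core service key and documentation. Obtaining the bearer token through the OAuth2 client credentials flow is not shown.

```python
import json
import urllib.request


def build_predict_request(base_url, deployment_id, access_token, resource_group, payload):
    """Prepare (but do not send) a POST request to the predict endpoint."""
    url = f"{base_url.rstrip('/')}/v2/inference/deployments/{deployment_id}/predict"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "AI-Resource-Group": resource_group,  # assumed header name
            "Content-Type": "application/json",
        },
    )


# All values below are placeholders, not real endpoints or credentials.
request = build_predict_request(
    base_url="https://api.example.invalid",
    deployment_id="d123",
    access_token="TOKEN",
    resource_group="default",
    payload={"index_column": "Order ID"},
)
print(request.get_method(), request.full_url)
# To send: urllib.request.urlopen(request), then parse the JSON response body.
```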

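Scalability: The service handles datasets up to the per-model limits listed under Model Versions and Specifications (2048 context rows and 100 columns for sap-rpt-1-small, 65536 context rows and 256 columns for sap-rpt-1-large), though performance is optimized for typical business datasets of moderate size.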


Try it out!

Note

For this hands-on exercise, we’ll use the SAP AI Core Playground to interact directly with RPT-1 without writing code.
  1. Navigate to the RPT-1 Playground and ensure you have access to RPT-1 capabilities.

  2. Prepare Sample CSV Data: Create a CSV file with the following content to test RPT-1’s prediction capabilities. This example shows a table where cost centers are associated with ordered products, with [PREDICT] placeholders requesting predictions:

Code Snippet
ID,PRODUCT,PRICE,ORDERDATE,COSTCENTER
44,Office Chair,150.8,02-11-2025,Office Furniture
104,Server Rack,2200.00,01-11-2025,Data Infrastructure
35,Couch,999.99,28-11-2025,[PREDICT]
22,Desk Lamp,89.99,15-11-2025,[PREDICT]

  3. Upload the CSV: Use the playground’s file upload feature to upload your CSV file containing the sample data with [PREDICT] placeholders.

  4. Submit for Processing: Use the playground interface to submit your CSV file to RPT-1 for processing.

  5. Analyze the Results: Examine the returned predictions, noting:

    • Predicted values for each [PREDICT] placeholder (likely “Office Furniture” for the Couch and Desk Lamp based on context)
    • Confidence scores for each prediction
    • How the model used context from complete rows to inform predictions

  6. Experiment with Variations: Try modifying the sample CSV data:

    • Change the products, prices, or dates in prediction rows
    • Add or remove context rows to see how it affects prediction quality
    • Modify which columns contain [PREDICT] placeholders
    • Test with different data types (numerical values for regression tasks)
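When experimenting with variations, it can help to check locally which cells a CSV file marks for prediction before uploading it. The following sketch uses Python's standard `csv` module on the sample data from this exercise. The variable names are illustrative only.

```python
import csv
import io

CSV_TEXT = """ID,PRODUCT,PRICE,ORDERDATE,COSTCENTER
44,Office Chair,150.8,02-11-2025,Office Furniture
104,Server Rack,2200.00,01-11-2025,Data Infrastructure
35,Couch,999.99,28-11-2025,[PREDICT]
22,Desk Lamp,89.99,15-11-2025,[PREDICT]
"""

# Parse the CSV into one dict per row, keyed by the header names.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Columns that contain at least one placeholder are the prediction targets.
target_columns = sorted(
    {col for row in rows for col, value in row.items() if value == "[PREDICT]"}
)
# IDs of the rows that contain a placeholder.
prediction_ids = [row["ID"] for row in rows if "[PREDICT]" in row.values()]

print(target_columns)   # ['COSTCENTER']
print(prediction_ids)   # ['35', '22']
```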

Understanding Results

RPT-1 predictions come with several important characteristics:

Confidence Scores: Each prediction includes a confidence value between 0.0 and 1.0, indicating the model’s certainty in its prediction. Higher scores suggest more reliable predictions.

Context Dependency: Predictions are influenced by the patterns learned from context rows. More diverse and representative context data typically leads to better predictions.

Pattern Recognition: The model identifies relationships between columns, such as correlations between customer country, sales organization, and appropriate sales groups or payment terms.

Consistency: Given similar input patterns, RPT-1 tends to produce consistent predictions, making it reliable for business process automation.
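One way to act on confidence scores is to accept high-confidence predictions automatically and route the rest to manual review. The sketch below assumes each prediction has been parsed into a dict with a `confidence` field. This lesson does not specify the exact response layout, so the field names, the sample values, and the 0.8 threshold are all hypothetical.

```python
def split_by_confidence(predictions, threshold=0.8):
    """Partition predictions into auto-accepted and needs-review lists."""
    accepted = [p for p in predictions if p["confidence"] >= threshold]
    review = [p for p in predictions if p["confidence"] < threshold]
    return accepted, review


# Made-up predictions in an assumed shape, for illustration only.
predictions = [
    {"id": "35", "column": "COSTCENTER", "value": "Office Furniture", "confidence": 0.93},
    {"id": "22", "column": "COSTCENTER", "value": "Office Furniture", "confidence": 0.61},
]
accepted, review = split_by_confidence(predictions)
print([p["id"] for p in accepted], [p["id"] for p in review])
```

A suitable threshold depends on the cost of a wrong prediction in the business process and is best chosen by testing against known data.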


Best Practices

Key Benefits of RPT-1:

Pretrained Model: Removes the need to collect training datasets, manage compute infrastructure, or wait weeks for model training cycles. Deploy immediately with production-ready accuracy.

In-Context Learning: Accepts representative context examples at inference time and returns predictions immediately, without setup or customization for your specific use cases.

Fast Iterations and Adaptation: Adapts to the context data provided in your requests, without the need for additional training steps or deployments.

Grounded to Business: Produces predictions that are grounded in the provided context examples, resulting in a lower risk of prediction errors or hallucinations.

To maximize RPT-1 effectiveness:

Data Quality:

  • Ensure context rows contain accurate, representative data
  • Include diverse examples that cover different scenarios
  • Clean data of inconsistencies before processing

Context Strategy:

  • Provide sufficient high-quality context rows (500–2000 for the small model, 4000–8000 for the large model)
  • Include edge cases in your context data when possible
  • Balance the ratio of context to prediction rows
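If more complete rows are available than the recommended context length, one simple option is to draw a reproducible random sample. The sketch below shows this approach with a hypothetical helper `sample_context`. Uniform random sampling is an assumption made for illustration, not a documented recommendation, and a sampling scheme that preserves rare classes or edge cases may work better for your data.

```python
import random


def sample_context(context_rows, max_rows, seed=0):
    """Return at most max_rows context rows, sampled reproducibly."""
    if len(context_rows) <= max_rows:
        return list(context_rows)
    return random.Random(seed).sample(context_rows, max_rows)


# 5000 made-up context rows, capped at the upper recommendation for the small model.
all_context = [{"ID": str(i), "COSTCENTER": f"CC{i % 7}"} for i in range(5000)]
context = sample_context(all_context, max_rows=2000)
print(len(context))
```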

Column Selection:

  • Choose meaningful columns that have logical relationships
  • Avoid predicting highly random or unique values
  • Consider business logic when selecting target columns

Validation:

  • Review prediction confidence scores before accepting results
  • Validate predictions against business rules
  • Test with known data to verify model performance
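Testing with known data can be done by hiding some known target values behind the placeholder and comparing the predictions with the hidden truth. The sketch below covers only the masking and scoring steps. The helper names are hypothetical, and the `predicted` mapping stands in for values you would extract from an actual RPT-1 response.

```python
def mask_rows(rows, target, holdout_ids, index_column):
    """Copy rows, replacing the known target value with the placeholder for held-out IDs."""
    masked, truth = [], {}
    for row in rows:
        row = dict(row)  # copy so the original data is left untouched
        if row[index_column] in holdout_ids:
            truth[row[index_column]] = row[target]
            row[target] = "[PREDICT]"
        masked.append(row)
    return masked, truth


def accuracy(truth, predicted):
    """Share of held-out rows whose predicted value equals the hidden value."""
    if not truth:
        raise ValueError("no held-out rows to score")
    hits = sum(1 for key, value in truth.items() if predicted.get(key) == value)
    return hits / len(truth)


known = [
    {"ID": "44", "PRODUCT": "Office Chair", "COSTCENTER": "Office Furniture"},
    {"ID": "104", "PRODUCT": "Server Rack", "COSTCENTER": "Data Infrastructure"},
    {"ID": "7", "PRODUCT": "Bookshelf", "COSTCENTER": "Office Furniture"},  # made-up row
]
masked, truth = mask_rows(known, "COSTCENTER", holdout_ids={"7"}, index_column="ID")

# Stand-in for predictions read from a real response.
predicted = {"7": "Office Furniture"}
print(accuracy(truth, predicted))
```

For regression targets, an error measure such as mean absolute error would replace the exact-match comparison.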

This hands-on exploration demonstrates RPT-1’s capabilities for intelligent tabular data prediction. The model’s understanding of data patterns and relationships makes it a powerful tool for automating data completion tasks and enhancing business processes with AI-driven insights.

RPT-1 represents a significant advancement in applying transformer technology to structured business data, offering enterprises a practical solution for intelligent data processing and automation scenarios.

For further information, please check out the following resources:

SAP AI Core

SAP Generative AI Hub

SAP-RPT1 Transformer Model