In this lesson, the Enterprise Architect’s role as orchestrator shifts toward ensuring the integrity and consistency of enterprise data. In modern organizations, data is the shared foundation for both human decision-making and AI-driven automation. The challenge is not just accessing data, but ensuring data is understood, governed, and interpreted consistently wherever it is used.
By applying reference patterns from the SAP Architecture Center, architects separate data supply (systems of record) from data demand (analytics and AI use cases). This separation ensures that business logic is defined once, shared consistently, and governed centrally, regardless of the number of tools or consumers that rely on it.
Following Reference Architecture RA0013, architects assemble a cohesive data foundation using SAP Business Data Cloud. Each component plays a distinct role in transforming raw enterprise data into trusted, actionable insights.

1. SAP Datasphere: The Integration & Semantic Engine
SAP Datasphere forms the semantic backbone of the data foundation.
- Semantic Consistency: Core business definitions—such as "Net Margin"—are defined once in a virtual model. This ensures that the same logic and numbers are used consistently across all reporting, analytics, and AI scenarios.
- The Data Fabric: Metadata is automatically analyzed to identify relationships between technical tables and systems, linking key business objects across the enterprise landscape.
- Semantic Onboarding: Metadata from SAP Signavio and SAP LeanIX is integrated, allowing data products to be connected directly to business processes. This provides the business context AI needs to interpret data correctly, ensuring that AI outputs stay aligned with real business processes and definitions.
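To make the "define once, reuse everywhere" idea concrete, here is a minimal Python sketch of a central semantic registry in which a measure such as Net Margin is defined a single time and evaluated identically by every consumer. The registry, measure name, and formula are illustrative assumptions, not Datasphere APIs:

```python
# Hypothetical sketch of a semantic registry, not a Datasphere API.
# The measure "net_margin" is defined once; every consumer (reporting,
# analytics, AI) evaluates it through the same shared formula.

MEASURES = {
    # Illustrative formula: (revenue - cost) / revenue
    "net_margin": lambda row: (row["revenue"] - row["cost"]) / row["revenue"],
}

def evaluate(measure: str, row: dict) -> float:
    """Every consumer resolves the measure through the registry."""
    return MEASURES[measure](row)

row = {"revenue": 200.0, "cost": 150.0}

# A BI dashboard and an AI feature pipeline get the same number,
# because the logic lives in exactly one place.
dashboard_value = evaluate("net_margin", row)
ai_feature_value = evaluate("net_margin", row)
assert dashboard_value == ai_feature_value == 0.25
```

If the definition of Net Margin changes, it changes in one place, and every report and model picks up the new logic automatically.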
2. SAP Analytics Cloud (SAC): The Decision Insights Layer
SAP Analytics Cloud builds on this semantic foundation to support decision-making.
- Just-in-Time Querying: Queries are executed in real time against Datasphere, rather than relying on stored copies of data. Insights therefore reflect the latest information, which is critical for timely and accurate business decisions.
- Bi-Directional Planning: Planning is interactive: users can analyze data, adjust forecasts, and store those changes directly in the planning model rather than working with static or disconnected analyses.
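The bi-directional planning idea can be illustrated with a short sketch: analysis and write-back operate on the same planning model rather than a disconnected copy. The class and field names below are assumptions for illustration, not SAC APIs:

```python
# Illustrative sketch of bi-directional planning: a user's forecast
# adjustment is stored back into the planning model itself, so the
# next analysis immediately reflects the change.

class PlanningModel:
    def __init__(self, forecasts: dict):
        self.forecasts = forecasts            # version -> {period: value}

    def analyze(self, version: str) -> float:
        """Read path: aggregate the current plan."""
        return sum(self.forecasts[version].values())

    def adjust(self, version: str, period: str, new_value: float) -> None:
        """Write path: store the user's change in the model, in place."""
        self.forecasts[version][period] = new_value

model = PlanningModel({"FY25": {"Q1": 100.0, "Q2": 120.0}})
before = model.analyze("FY25")       # 220.0
model.adjust("FY25", "Q2", 150.0)    # interactive adjustment, written back
after = model.analyze("FY25")        # 250.0: analysis sees the change
assert (before, after) == (220.0, 250.0)
```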
3. SAP Databricks: The Advanced AI & Compute Engine
SAP Databricks provides scalable compute for advanced analytics and AI workloads.
- Clean Core for Compute: Resource-intensive tasks, such as training fraud detection models, are handled outside the ERP core, protecting transactional performance.
- Zero-Copy Integration: Using Delta Sharing, data scientists can work directly with live SAP data products, eliminating the need to copy data into separate systems or maintain complex data extraction and transformation processes.
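The zero-copy idea can be shown with a toy sketch in plain Python (this is an analogy, not the Delta Sharing protocol): consumers receive read-only views onto the single live, governed copy of the data rather than private extracts, so there is no extraction pipeline to build or maintain:

```python
# Toy illustration of zero-copy sharing (not the Delta Sharing client):
# the provider publishes a live table once, and every consumer works
# against a read-only view of that same data instead of an ETL copy.
from types import MappingProxyType

class DataProductShare:
    def __init__(self, rows: list):
        self._rows = rows                      # the single governed copy

    def open(self):
        """Consumers get read-only access to the live rows, not a copy."""
        return tuple(MappingProxyType(r) for r in self._rows)

rows = [{"order": 1, "amount": 500.0}]
share = DataProductShare(rows)
bi_view = share.open()                         # e.g. a reporting consumer
ds_view = share.open()                         # e.g. a data scientist

# Both consumers read the same underlying records; when the source row
# changes, every open view sees the change without any reload.
rows[0]["amount"] = 750.0
assert bi_view[0]["amount"] == ds_view[0]["amount"] == 750.0
```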
4. SAP BDC Cockpit: The Central Control Plane
The SAP BDC Cockpit provides centralized oversight of the entire data foundation.
- Unified Identity: Integration with SAP Cloud Identity Services ensures that user roles and permissions are consistently applied through SCIM-based provisioning.
- Formation Management: The cockpit manages lifecycle operations and monitoring across the full data stack, treating the environment as a single, coordinated system.
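For a sense of what SCIM-based provisioning exchanges look like, here is a minimal user payload following the SCIM 2.0 core user schema (RFC 7643). The attribute values and helper function are illustrative; in a real landscape, SAP Cloud Identity Services issues requests of this shape to keep users and roles consistent across the connected systems:

```python
# Sketch of a SCIM 2.0 user-provisioning payload (RFC 7643 core schema).
# Helper name and sample values are illustrative assumptions.
import json

def scim_user_payload(user_name: str, given: str, family: str, email: str) -> str:
    payload = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": user_name,
        "name": {"givenName": given, "familyName": family},
        "emails": [{"value": email, "primary": True}],
        "active": True,
    }
    return json.dumps(payload)

body = scim_user_payload("jdoe", "Jane", "Doe", "jane.doe@example.com")
assert json.loads(body)["userName"] == "jdoe"
```

Because every connected system consumes the same provisioning payloads, a role change made once is applied consistently everywhere.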
The Data Product Paradigm
In modern architectures, data access is no longer about exposing raw tables. Instead, organizations increasingly provide Data Products: clearly defined, reusable, and governed data assets designed for both people and AI to consume.
A data product bundles data with meaning, rules, and context, so consumers don’t need to understand the underlying technical complexity, enabling faster adoption of analytics and AI across the organization.
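The bundling described above can be sketched as a simple structure: the data itself plus the meaning (description), rules (quality checks), and context (owner) a consumer needs to trust it. All field names here are illustrative assumptions, not an SAP data product schema:

```python
# Hedged sketch of a data product as a governed bundle of data,
# meaning, rules, and context. Field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    name: str
    description: str                  # meaning: what the data represents
    owner: str                        # context: who governs it
    quality_rules: list = field(default_factory=list)   # rules
    rows: list = field(default_factory=list)            # the data itself

    def validate(self) -> bool:
        """Consumers can trust the product without knowing its internals."""
        return all(rule(r) for rule in self.quality_rules for r in self.rows)

orders = DataProduct(
    name="SalesOrders",
    description="Confirmed sales orders, net of cancellations",
    owner="finance@example.com",
    quality_rules=[lambda r: r["amount"] >= 0],
    rows=[{"order": 1, "amount": 500.0}],
)
assert orders.validate()
```

A consumer, human or AI, works with "SalesOrders" as a named, documented, pre-validated asset instead of reverse-engineering raw tables.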
Data Tiering: Balancing Performance and TCO
Not all data requires the same treatment. Some information must be accessed instantly, some is queried frequently, and some is kept mainly for historical or analytical purposes. To balance performance, cost, and scalability, architects typically organize data into distinct tiers rather than storing everything in a single location.
A multi-tiered data strategy helps avoid performance bottlenecks and unnecessary cloud costs:
- Hot Tier (Federated): Real-time data is accessed directly from the source system, such as SAP S/4HANA. This tier supports operational scenarios where up-to-date information is crucial, and delays are unacceptable.
- Warm Tier (Replicated): Frequently used data is stored in the SAP Business Data Cloud (SAP HANA Cloud). This enables faster joins and more complex queries without placing repeated query load on the core source systems.
- Cold Tier (Data Lake): Large historical datasets or unstructured data ("big data") are stored in cost-efficient data lakes. These datasets remain connected through the Business Data Cloud fabric but are optimized for scale rather than speed.
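The tiering decision above can be expressed as a simple routing rule. The thresholds and tier names in this sketch are assumptions for illustration, not SAP defaults:

```python
# Illustrative tier-selection sketch: route a dataset to hot (federated),
# warm (replicated), or cold (data lake) storage based on its access
# profile. Thresholds are invented for illustration.

def choose_tier(needs_realtime: bool, queries_per_day: int) -> str:
    if needs_realtime:
        return "hot"      # federate live from the source (e.g. SAP S/4HANA)
    if queries_per_day >= 10:
        return "warm"     # replicate into SAP HANA Cloud for fast joins
    return "cold"         # park in the data lake, optimized for scale/cost

assert choose_tier(True, 0) == "hot"       # operational dashboard
assert choose_tier(False, 50) == "warm"    # daily analytics workload
assert choose_tier(False, 1) == "cold"     # ten-year history archive
```

In practice the decision involves more dimensions (data volume, latency SLAs, licensing), but the principle is the same: match each dataset's storage tier to how it is actually consumed.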
Lesson Summary
This lesson focused on the role of a unified semantic data layer in enabling analytics and AI across hybrid landscapes. You learned how separating data supply from data consumption ensures consistent business logic, governance, and reuse across tools and use cases. By applying SAP Business Data Cloud and related components, Enterprise Architects can deliver trusted data products, balance performance and cost through data tiering, and provide a stable foundation for both human decision-making and AI-driven automation.