Exploring AI Foundation

Objective

After completing this lesson, you will be able to describe the role of AI Foundation and the capabilities it provides.

Introduction To The Lesson: Exploring AI Foundation

In this final lesson of the AI unit, we will look at the role of AI Foundation, which enables organizations to make AI development more productive and to simplify AI operations.

This lesson contains the following topics:

  • What are the challenges of adopting AI?
  • What's the solution?

What are the challenges of adopting AI?

Challenges Faced With Implementing AI

Implementing AI is far more than just writing an algorithm; it's a complex, multi-faceted process that presents significant technical challenges. Organizations often underestimate these hurdles, leading to stalled projects and wasted resources.

Here is a breakdown of the key technical challenges, categorized by the different stages of an AI project lifecycle:

Data-Related Challenges (The Underpinning)

This is often the most underestimated and time-consuming part of any AI implementation. Without good data, even the most advanced algorithm will fail.

  • Data Scarcity and Sufficiency: Many advanced AI models, especially deep learning, are data-hungry. An organization might not have enough relevant data to train a robust model. For example, predicting a rare manufacturing defect requires many examples of that defect, which by definition are scarce.
  • Poor Data Quality: The "Garbage In, Garbage Out" principle is paramount in AI. Common quality issues include:
    • Incompleteness: Missing values in datasets.
    • Inaccuracy: Incorrect or outdated information.
    • Inconsistency: The same data point is represented in different ways (e.g., "New York," "NY," "N.Y.C.").
  • Data Labeling: Most supervised machine learning (the most common type of AI) requires labeled data (e.g., images of cats tagged with the label "cat"). This process is often:
    • Expensive: It requires significant human labor (domain experts are often needed).
    • Time-Consuming: Labeling millions of data points can take months.
    • Subjective: For complex tasks like sentiment analysis, different labelers may disagree, introducing noise into the data.
  • Data Bias: AI models learn from the data they are given. If that data reflects historical or societal biases, the model may learn and even amplify them, so care must be taken to detect and mitigate bias.
  • Data Accessibility and Silos: In large organizations, data is often fragmented across different departments, stored in various formats (databases, spreadsheets, legacy systems), and protected by different access controls. Simply getting the right data into one place for an AI project can be a massive engineering and political undertaking.
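
The quality issues above can be made concrete with a tiny audit script. This is an illustrative sketch only; the records and the alias map are invented:

```python
# Illustrative data-quality audit over a toy customer dataset.
records = [
    {"customer": "Acme", "city": "New York", "revenue": 1200},
    {"customer": "Beta", "city": "NY", "revenue": None},   # incompleteness
    {"customer": "Gamma", "city": "N.Y.C.", "revenue": 800},
]

# Inconsistency fix: map known aliases onto one canonical form.
CITY_ALIASES = {"NY": "New York", "N.Y.C.": "New York"}

def audit_and_clean(rows):
    # Count missing fields, then normalize inconsistent spellings.
    missing = sum(1 for r in rows for v in r.values() if v is None)
    cleaned = [{**r, "city": CITY_ALIASES.get(r["city"], r["city"])} for r in rows]
    return missing, cleaned

missing, cleaned = audit_and_clean(records)
print(missing)                       # number of missing fields
print({r["city"] for r in cleaned})  # one canonical spelling remains
```

In real projects this step is far larger than the modeling step, which is exactly why it is so often underestimated.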

Model and Algorithm Challenges (The Nucleus)

Once you have the data, building and training the model presents its own set of technical difficulties.

  • Choosing the Right Model: There is no "one-size-fits-all" AI model. The choice depends on the problem, the type of data, and the business constraints. A simple logistic regression model might be fast and easy to interpret, while a complex deep neural network might be more accurate but is a "black box."
  • The "Black Box" Problem (Explainability): Many powerful models, like deep learning networks, are notoriously difficult to interpret. If an AI model denies a customer a loan, a bank needs to explain why. This lack of explainability is a major barrier to adoption in regulated industries like finance and healthcare. This has led to the rise of a whole field called Explainable AI (XAI).
  • Overfitting and Underfitting: This is a classic machine learning challenge.
    • Overfitting: The model learns the training data too well, including its noise and outliers. It performs great on the data it has seen but fails to generalize to new, unseen data.
    • Underfitting: The model is too simple to capture the underlying patterns in the data. Finding the right balance (the "Goldilocks zone") requires careful tuning and validation.
  • Hyperparameter Tuning: AI models have numerous "dials" or settings (hyperparameters) that are not learned from the data but must be set before training. Finding the optimal combination of these settings is a computationally expensive and often brute-force process that requires significant expertise.
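
A minimal sketch of that tuning loop: hold out a validation set, try each hyperparameter setting, and keep the one that generalizes best. The one-parameter ridge model and the synthetic data are invented for illustration:

```python
# Hyperparameter tuning sketch: grid search scored on held-out data.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
valid = [(5, 10.1), (6, 11.8)]

def fit(data, lam):
    # Closed-form ridge solution for a single coefficient (y ≈ a·x).
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data) + lam
    return num / den

def mse(data, a):
    return sum((y - a * x) ** 2 for x, y in data) / len(data)

# Try each "dial" setting; score on data the model has not seen.
grid = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam = min(grid, key=lambda lam: mse(valid, fit(train, lam)))
print(best_lam)
```

An over-regularized setting (here, a large lambda) underfits and scores badly on the validation set, which is how the search rules it out.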

Infrastructure and Computational Challenges (The Engine)

AI models, particularly for deep learning, require immense computational resources.

  • High Computational Cost: Training large models can require specialized, expensive hardware like GPUs (Graphics Processing Units) or TPUs (Tensor Processing Units). This can involve a significant capital investment for on-premise hardware or substantial ongoing costs with cloud providers (like AWS, Google Cloud, or Azure).
  • Scalability: A model that works on a data scientist's laptop is very different from one that can serve millions of users in real-time. Engineering the infrastructure to handle this scale, with low latency, is a complex software engineering problem involving concepts like microservices, containerization (Docker, Kubernetes), and load balancing.
  • MLOps (Machine Learning Operations): Getting a model into production is not the end. Organizations need a robust pipeline (MLOps) for automating, monitoring, and governing the entire machine learning lifecycle. This includes tools for:
    • Experiment Tracking: Logging every model, dataset, and result.
    • Model Versioning: Keeping track of different model versions, just like code.
    • Automated Deployment: Pushing new models into production seamlessly.
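
The experiment-tracking idea can be sketched in a few lines: persist each run's settings, data version, and result so it can be reproduced later. Real MLOps platforms (e.g., MLflow) record far richer metadata; the file name and fields here are assumptions:

```python
# Minimal experiment-tracking sketch: append one JSON record per training run.
import json
import time

def log_run(logfile, params, dataset_version, metric):
    entry = {
        "timestamp": time.time(),
        "params": params,            # hyperparameters used for this run
        "dataset": dataset_version,  # which data version the model saw
        "metric": metric,            # e.g., validation accuracy
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

run = log_run("runs.jsonl", {"learning_rate": 0.01}, "sales-2024-06", 0.93)
print(run["metric"])
```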

Deployment, Maintenance, and Governance Challenges (The Long Tail)

This is where many AI initiatives fail—after the model has been built.

  • Integration with Existing Systems: An AI model is useless in isolation. It must be integrated into existing business processes and software (e.g., a CRM, an e-commerce website, a manufacturing control system). This requires skilled software engineers to build APIs and ensure seamless communication.
  • Model Drift and Concept Drift: The world is not static. The statistical properties of the data the model sees in production can change over time, causing its performance to degrade. For example, a fraud detection model trained before a new type of scam emerges will fail to detect it. This "model drift" requires continuous monitoring and a plan for periodic retraining.
  • Monitoring and Alerting: Unlike traditional software, an AI model can fail silently. It might still produce an output, but the output is wrong. Organizations need sophisticated monitoring systems to track model accuracy, data distributions, and other key performance indicators to know when a model needs attention.
  • Security and Adversarial Attacks: AI models can be vulnerable to unique types of attacks. Adversarial attacks involve making tiny, often imperceptible changes to input data (like an image or a text file) to trick the model into making a wrong prediction. Securing AI systems against these attacks is an active and challenging area of research.
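
The silent-failure and drift problems above reduce to a monitoring check: alert when production data has shifted far from the training baseline. The threshold and data below are illustrative only:

```python
# Drift-monitoring sketch: flag when the live feature mean drifts far from
# the training mean, measured in training standard deviations.
from statistics import mean, stdev

def drift_alert(train_values, live_values, z_threshold=3.0):
    mu, sigma = mean(train_values), stdev(train_values)
    z = abs(mean(live_values) - mu) / sigma
    return z > z_threshold

train = [100, 102, 98, 101, 99, 100, 103, 97]
print(drift_alert(train, [100, 101, 99]))   # → False (similar data)
print(drift_alert(train, [140, 150, 145]))  # → True (distribution shifted)
```

Production monitors typically track full distributions and many features at once, but the principle is the same: the model keeps producing outputs, so only the data tells you something is wrong.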

What's The Solution?

How SAP AI Foundation Can Help

Recall that in the first lesson of this unit, Describing SAP's AI Strategy, we talked about AI Foundation as one of the "building blocks" of SAP's AI strategy. More specifically, we mentioned that AI Foundation is a "one-stop shop that has everything an organization needs." Let's explore further what we mean by that phrase.

An AI Operating System

A good way to think of AI Foundation is that it's an "operating system" for AI. Let's review that term:

An operating system is the fundamental software that manages a computer's hardware and software resources, providing a platform for applications to run. It acts as an intermediary between the user and the computer hardware, making it possible for users to interact with the machine and its various components without needing to understand the complexities of the underlying hardware.

AI Foundation works in the same way. So taking the above definition and adjusting it we would have the following:

AI Foundation comprises several fundamental services (all running on SAP BTP) that together provide a platform for AI models to be trained (production) and for AI-enabled applications to be designed and run (consumption). It acts as an intermediary between the user (whether an end user, developer, or administrator) and the underlying infrastructure, making it possible to both produce and consume AI capabilities without needing to understand the complexities of the underlying hardware.

Going deeper into AI Foundation we can say that the "operating system" capabilities it provides allow AI practitioners to do three things:

  • Build
  • Extend
  • Run

Let's take a closer look at the tools providing these capabilities.

Build, Extend & Run

AI Foundation contains a powerful set of products with every feature needed for AI success. They are:

  • Unified AI Portal
  • Joule Studio
  • Generative AI Hub
  • SAP Document AI

Unified AI Portal

The Unified AI Portal, like all portals, offers a secure, web-based entry point enabling access to all tools, applications, and services that comprise AI Foundation. Because it is available through SAP BTP, it benefits from SAP BTP's security, compliance, and other built-in features.

Joule Studio

Recall that in an earlier lesson in this unit (Exploring Joule & Joule Agents) we discussed the difference between GenAI and Agentic AI. Briefly restated:

  • Reactive (GenAI) vs. Proactive (Agentic AI): Most AI models (like current large language models in their basic form) are primarily reactive. They await a prompt, generate a response, and then stop, waiting for the next prompt. They don't inherently decide what to do next to achieve a larger, unstated goal.
  • Tool (GenAI) vs. Executor (Agentic AI): Non-agentic AI acts as a powerful tool that humans wield. Agentic AI, while still a tool, takes on more of an executor or assistant role, autonomously driving towards an objective.

We have also discussed Joule. Briefly restated, we can think of Joule as the conversational AI assistant you talk to (like Alexa or Google Assistant).

So the final piece of the puzzle so to speak is Joule Studio. Joule Studio is the "developer toolkit" (like the Alexa Skills Kit for Alexa) that allows developers and IT teams to teach Joule new tricks, connect it to new information, and build custom capabilities tailored specifically to their company's needs.

In short: Joule is the copilot. Joule Studio is where you build and enhance the copilot.

Going a bit deeper Joule Studio is a low-code/no-code development environment built on SAP BTP and available through SAP Build. It provides the tools, frameworks, and services necessary to connect Joule to various data sources, define new skills and agents (further explained in a moment), and embed custom AI capabilities directly into the flow of work across SAP applications and beyond. It's the bridge that connects the generic power of a large language model (LLM) with the specific, secure, and context-rich data of an enterprise.

Joule Studio is designed to achieve three main goals:

  • Extend Joule's Knowledge (Grounding): Out of the box, Joule understands SAP processes. But it doesn't know an organization's specific policies, product catalogs, or internal knowledge base documents. Joule Studio (by utilizing SAP Knowledge Graph) allows the "grounding" of Joule in an organization's data.
    • How: Organizations can connect Joule to internal SharePoint sites, Confluence pages, custom databases, and other document repositories. Using Retrieval-Augmented Generation (RAG; discussed in the previous lesson), Joule can securely search this private data to provide relevant, context-aware answers.
      • Example: A user asks, "What is our company's travel policy for international flights?" Joule Studio enables Joule to find and summarize the answer from your internal HR policy PDF.

  • Create New Skills And Agents: This is about teaching Joule how to do things (Agentic AI), not just answer questions (GenAI).
    • How: Developers can define new intents (what the user wants to do) and link them to actions (like calling an API). It allows Joule to read, create, and update business objects.
      • Example: A developer uses Joule Studio to build a "Create Sales Order" skill. A sales rep can then say, "Joule, create a sales order for Customer XYZ with 100 units of Product A." Joule Studio provides the framework for Joule to understand the request, call the appropriate SAP Cloud ERP API, and complete the transaction.

  • Integrate with Third-Party Systems (Connectivity): Businesses don't run on SAP alone. Joule Studio provides the connectors and API management tools to link Joule to non-SAP systems.
    • How: Leveraging SAP BTP's integration capabilities, you can connect Joule to systems like Salesforce, ServiceNow, or custom-built applications.
      • Example: A skill can be built that allows a manager to ask, "Joule, show me the open high-priority support tickets for my team." Joule could pull data from both SAP Cloud ERP (for team structure) and a separate custom-built application (for the tickets) to provide a single, unified answer.
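
The grounding flow described in the first goal (retrieve relevant internal content, then build a context-rich prompt) can be sketched as follows. Word overlap stands in for real vector retrieval, and the documents are invented; this is not the actual Joule Studio API:

```python
# Illustrative Retrieval-Augmented Generation (RAG) flow.
DOCS = {
    "travel-policy": "International flights must be booked in economy class.",
    "expense-policy": "Meal expenses are reimbursed up to 50 EUR per day.",
}

def retrieve(question):
    # Score each document by shared words with the question; real systems
    # use embedding similarity instead.
    words = set(question.lower().split())
    return max(DOCS, key=lambda d: len(words & set(DOCS[d].lower().split())))

def build_prompt(question):
    # Ground the LLM by restricting it to the retrieved internal context.
    doc_id = retrieve(question)
    return f"Answer using only this context: {DOCS[doc_id]}\nQuestion: {question}"

prompt = build_prompt("What is the policy for international flights")
print(prompt.splitlines()[0])
```

Because the retrieved text is private company data, the answer stays specific to the organization rather than generic LLM knowledge.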

Joule Skills and Joule Agents

As Joule Studio is the solution for making the promise of Agentic AI a reality, we need to briefly distinguish Joule "Skills" from Joule "Custom AI Agents". Agentic AI is not an all-or-nothing proposition. In some use cases the task is straightforward: imagine an agent simply needing to obtain the latest weather forecast from an API. Other tasks are much more complicated: imagine a flight being cancelled due to weather and an agent needing to rebook travel for all passengers, keeping in mind those who may need one or more connections to reach their final destination. In the first case the API call is simple, straightforward, and predictable. In the second case the agent's task is far more demanding and requires much more analysis (i.e., "thinking"). While the agent may call an API as part of its work, that call is just a means to an end; it is the end goal (rebooking the passengers) that determines success or failure. Recognizing the difference between skills and agents is paramount to choosing the correct approach when designing and implementing Joule scenarios.

The following should help in identifying skills versus custom AI agents use cases:

  • Aspect: Joule Skills are part of an advanced AI model and can act on a specific operation by leveraging the context of the conversation; custom AI agents are autonomous, multi-tool systems that solve complex problems by selecting and executing appropriate tools dynamically.
  • Purpose: Skills perform single, predefined, repetitive, atomic operations; agents perform multi-step, adaptive problem solving.
  • Complexity: Skills handle simple operations with known inputs and outputs; agents handle complex, dynamic workflows.
  • Flexibility: Skills are limited to predefined actions; agents are highly flexible and goal-oriented.
  • Integration: Skills call specific APIs via SAP Build actions; agents orchestrate multiple tools.
  • Resource consumption: Skills consume minimal resources; agents are more resource-intensive due to their use of large language models.
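
The control-flow difference between a skill and an agent can be sketched in a few lines. The tool names and rebooking logic are invented; only the shape of the logic matters:

```python
# Skill vs. agent sketch (illustrative, not the Joule runtime).

def weather_skill(city):
    # Skill: one predefined, atomic operation with known inputs/outputs.
    forecasts = {"Berlin": "sunny", "Paris": "rain"}
    return forecasts.get(city, "unknown")

def rebooking_agent(passengers, tools):
    # Agent: multi-step loop that selects tools dynamically per passenger
    # until the goal (everyone rebooked) is met.
    plan = {}
    for p in passengers:
        flights = tools["search_flights"](p["destination"])
        if not flights:  # no direct flight: fall back to another tool
            flights = tools["search_connections"](p["destination"])
        plan[p["name"]] = flights[0] if flights else "waitlist"
    return plan

tools = {
    "search_flights": lambda dest: ["LH123"] if dest == "Rome" else [],
    "search_connections": lambda dest: ["LH200+AZ54"],
}
print(weather_skill("Berlin"))  # → sunny
print(rebooking_agent(
    [{"name": "Ada", "destination": "Rome"},
     {"name": "Bo", "destination": "Oslo"}], tools))
```

The skill is a single lookup; the agent decides, per passenger, which tool to try next, which is why agents are both more flexible and more resource-intensive.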

Generative AI Hub

Think of Generative AI Hub as a central control panel for developers and administrators. It provides instant access to a range of large language models (LLMs) from different providers (e.g., OpenAI's GPT models, Anthropic, Cohere, and others). This "bring your own model" flexibility allows organizations to choose the best LLM for a specific task while benefiting from SAP BTP's orchestration and grounding capabilities. Prompts can be created directly in the hub using its dedicated front end, with the response displayed alongside, or the prompt and response can be integrated programmatically into custom applications using the generative AI hub SDK.
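
The "bring your own model" idea can be sketched as a simple routing layer: one interface in application code, several interchangeable providers behind it. The stub functions below are stand-ins, not actual generative AI hub SDK calls:

```python
# Model-routing sketch (illustrative only; not the real SDK API).

def _openai_stub(prompt):
    return f"[gpt] {prompt}"

def _anthropic_stub(prompt):
    return f"[claude] {prompt}"

PROVIDERS = {"openai": _openai_stub, "anthropic": _anthropic_stub}

def complete(prompt, provider="openai"):
    # Orchestration layer: route the same prompt to whichever model best
    # fits the task, without changing the application code that calls it.
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    return PROVIDERS[provider](prompt)

print(complete("Summarize this invoice.", provider="anthropic"))
```

Swapping models then becomes a configuration change rather than a code change, which is the point of centralizing access in a hub.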

A Short Digression on SAP AI Core and SAP AI Launchpad

These two SAP BTP services still exist and still provide the functionality that implements the technical details of SAP AI. Among other things, they:

  • Manage the lifecycle of AI scenarios and executables
  • Run pipelines as batch jobs for preprocessing, model training, or batch inference
  • Host trained machine learning models

However, as focus shifts to SAP BTP providing an "AI Foundation" that serves as the "operating system" for AI, these services may not be referred to as explicitly as they have been in the past. It is nevertheless fine to think of them as incorporated into, and part of, AI Foundation.

(Another) Short Digression on APIs

In addition to the generative AI hub SDK, there are two other SDKs available that can be utilized:

  • SAP AI Core SDK: A Python-only SDK for interacting with SAP AI Core.
  • SAP Cloud SDK for AI: Supports Python, Java, and JavaScript for interacting with SAP AI Core.

SAP Document AI

Think of the following types of documents processed by organizations large and small every day:

  • Invoices
  • Purchase Orders
  • Payment Advice
  • Delivery Notes

SAP Document AI is available as part of AI Foundation and can automate the processing of these types of business documents. In simple terms, it "reads" and "understands" unstructured documents like PDFs, scans, and images (e.g., invoices, purchase orders, delivery notes) and transforms the information into structured data that can be used directly within SAP and other business applications.

Imagine the following use cases:

  • Accounts Payable Automation:
    • Automatically extracts data from supplier invoices, matches them against purchase orders, and posts them in SAP Cloud ERP, flagging only exceptions for human review.
  • Sales Order Processing
    • Reads customer purchase orders received via email/PDF and automatically creates sales orders in the system, reducing order-to-cash cycle time.
  • HR Onboarding
    • Extracts information from résumés, ID cards, and other employee documents to automatically populate employee profiles in SAP SuccessFactors.
  • Procurement and Logistics
    • Processes delivery notes, proofs of delivery, and other shipping documents to automate goods receipt and other supply chain transactions.
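
The unstructured-to-structured step behind all of these use cases can be illustrated with a toy extractor. SAP Document AI uses trained models, not regular expressions; the field names and invoice text below are invented:

```python
# Toy "document understanding" sketch: raw text in, structured record out.
import re

INVOICE_TEXT = """
Invoice No: INV-2024-0042
Supplier: Acme Industrial GmbH
Total Amount: 1,250.00 EUR
"""

def extract_invoice(text):
    # Each pattern captures one field; missing fields come back as None.
    fields = {
        "invoice_no": r"Invoice No:\s*(\S+)",
        "supplier": r"Supplier:\s*(.+)",
        "total": r"Total Amount:\s*([\d,.]+)",
    }
    return {k: (m.group(1).strip() if (m := re.search(p, text)) else None)
            for k, p in fields.items()}

record = extract_invoice(INVOICE_TEXT)
print(record["invoice_no"])  # → INV-2024-0042
```

Once the data is structured like this, it can be validated against master data and posted into downstream systems automatically.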

The benefits of SAP Document AI are:

  • Increased Efficiency: Drastically reduces or eliminates manual data entry.
  • Cost Savings: Frees up employees from tedious, low-value tasks to focus on more strategic work.
  • Improved Accuracy: AI models are often more accurate and consistent than human data entry, reducing costly errors.
  • Faster Processing Times: Accelerates business cycles like paying suppliers (enabling early payment discounts) or fulfilling customer orders.
  • Enhanced Compliance & Data Quality: Ensures data is validated against master data before it enters the core ERP system.

The Final Pieces of The Puzzle

SAP Knowledge Graph and SAP Foundation Model

As a final piece of AI Foundation, we return to the topic of grounding, which we earlier defined as ensuring the responses and actions of GenAI and AI agents are grounded in reality and related to the context of what they're doing. This is especially critical in business software, as AI can be involved in business processes spanning logistics, financials, and HR. This cannot be left to chance, and AI Foundation doesn't leave it to chance: it includes SAP Knowledge Graph along with SAP Foundation Model.

While LLMs are useful for generating text, they have limitations when it comes to the structured business data commonly found in database tables. SAP Foundation Model seeks to overcome this limitation: with it, LLMs can incorporate business data in their design and execution.

SAP Knowledge Graph builds on SAP Foundation Model by ensuring that skills and agents created using Joule Studio are grounded in an organization's business data. The best way to understand SAP Knowledge Graph is with an analogy:

Imagine that an organization's business data (customers, products, suppliers, orders, employees) is like a group of people who don't know each other. They exist in different departments (data silos) like Sales, Manufacturing, and HR.

SAP Knowledge Graph acts like a professional network (like LinkedIn) for this data. It doesn't move the people; it simply maps out their relationships:

  • This Customer (a node) placed this Sales Order (another node).
  • This Sales Order contains this Product.
  • This Product is manufactured using parts from this Supplier.
  • This Supplier is currently experiencing a shipping delay due to an event.

Suddenly, both GenAI and Agentic AI can see the entire network of connections. Users can ask complex questions like, "Which customer orders will be affected by the shipping delay from this specific supplier?"—a question that SAP Knowledge Graph makes very easy to answer.
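
That question can be sketched against a toy graph of (subject, relation, object) triples mirroring the relationships listed above. The entity names are invented:

```python
# Toy knowledge graph: which customer orders depend on a delayed supplier?
TRIPLES = [
    ("CustomerA", "placed", "Order1"),
    ("CustomerB", "placed", "Order2"),
    ("Order1", "contains", "ProductX"),
    ("Order2", "contains", "ProductY"),
    ("ProductX", "uses_parts_from", "SupplierS"),
    ("SupplierS", "has_event", "shipping_delay"),
]

def orders_affected_by(supplier):
    # Walk the graph backwards: supplier -> products -> orders.
    products = [s for s, r, o in TRIPLES
                if r == "uses_parts_from" and o == supplier]
    return [s for s, r, o in TRIPLES
            if r == "contains" and o in products]

print(orders_affected_by("SupplierS"))  # → ['Order1']
```

The answer falls out of a simple traversal precisely because the relationships are explicit, which is what a knowledge graph adds on top of siloed tables.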

Summary

SAP's AI Foundation is a comprehensive suite of tools, services, and runtimes built on SAP BTP. Its primary purpose is to empower developers, data scientists, and business experts to build, deploy, manage, and scale powerful, business-centric AI solutions.