Using generative-AI-hub-SDK to interact with Orchestration Services

Objectives

After completing this lesson, you will be able to:
  • Describe the orchestration feature in generative AI hub
  • Use generative-AI-hub-SDK to interact with orchestration services

Need for Orchestration

You have seen how we can use generative AI hub to get an output that can be used in creating custom applications. However, business scenarios usually require more than the bare consumption of Foundation Models for generative tasks. They need to scale, secure, and manage these solutions.

Figure: A flowchart illustrating the orchestration process for handling a request. Steps include Grounding, Prompt Templating, Data Masking, Content Filter, and LLM Access, leading to a response.

Access to generative AI models often needs to be combined with other functionalities. These include:

  • Prompting models using scenario-specific templates from a Prompt Repository.
  • Ensuring compliance with AI Ethics and Responsible AI through Content Filtering.
  • Maintaining data privacy by using Data Masking techniques.
  • Enhancing models with business context through Retrieval-Augmented Generation (RAG).

You need a service for coordinating and managing the deployment, integration, and interaction of various AI components.

In the Facility solutions company scenario in this learning journey, you've seen that you must manually update the model for each prompt. You'll see that even in generative-AI-hub-SDK code, you need to write a separate function for each model.

This can be an error-prone and time-consuming process, leading to complex and redundant code and workflows.

This is where orchestration services can be helpful.

AI orchestration is the process of coordinating and managing how various AI components are deployed, integrated, and interact within a system or workflow.

Orchestration services and workflows in generative AI hub help you create sophisticated workflows without complex code.

You can use orchestration services with different foundation models without changing the client code.

This approach reduces maintenance, enhances control, and optimizes efficiency, helping teams focus on innovation rather than integration. It helps you design powerful AI workflows visually and bring your AI vision to life faster with modular capabilities and intuitive interfaces.

In addition, it allows seamless integration and management of diverse components like data pipelines, AI models, and prebuilt modules (content filtering, data masking). It also ensures efficient execution of multiple AI models, optimizes computational resources, and automates the end-to-end AI lifecycle.

Before starting to develop prompts using generative-AI-hub-SDK, let's explore orchestration services in generative AI hub.

Introduction to Orchestration in Business AI

Orchestration in business AI scenarios integrates content generation with several essential functions.

Key Functions:

  • Templating: Templating allows you to create prompts with placeholders that the system fills during inference.
  • Content Filtering: Content filtering lets you control the type of content that's sent to and received from a generative AI model.
  • Orchestration Workflow: In a basic orchestration setup, you can combine different modules into a pipeline executed with a single API call. Each module's response serves as the input for the next module.
  • Configuration and Execution: Orchestration defines the execution order of the pipeline centrally. You can configure each module's details and exclude optional modules by passing a JSON-formatted orchestration configuration in the request body.

    See the following video for an example of the execution order.

    Note

    The screens shown in the demo refer to one of the possible models.
  • Harmonized API: The harmonized API enables the use of different foundation models without changing the client code. It uses the OpenAI API as a standard and maps other model APIs to it. This includes standardizing message formats, model parameters, and response formats. The harmonized API integrates into the templating module, model configuration, and orchestration response.
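Conceptually, the pipeline described above can be sketched in plain Python. This is an illustration only, not the SDK's actual implementation: each module is a function, the orchestration runs them in a configured order with a single call, and each module's output is the next module's input.

```python
# Conceptual sketch only -- not the SDK's implementation.
# Each "module" transforms the request; orchestration chains them in order.

def templating(request):
    # Fill placeholders in the prompt template.
    request["prompt"] = request["template"].format(**request["inputs"])
    return request

def data_masking(request):
    # Stand-in for masking sensitive values before the model call.
    request["prompt"] = request["prompt"].replace("ACME Corp", "MASKED_ENTITY")
    return request

def llm_access(request):
    # Stand-in for the actual model call.
    request["response"] = f"LLM answer for: {request['prompt']}"
    return request

def run_pipeline(request, modules):
    # Single entry point: each module's output feeds the next module.
    for module in modules:
        request = module(request)
    return request

result = run_pipeline(
    {"template": "Summarize: {text}", "inputs": {"text": "ACME Corp results"}},
    [templating, data_masking, llm_access],
)
print(result["response"])
# -> LLM answer for: Summarize: MASKED_ENTITY results
```

Excluding an optional module (say, data masking) then amounts to leaving it out of the configured module list, without changing any of the modules themselves.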

You'll see an implementation in the next lesson to use multiple models in a seamless manner.

Using the Orchestration Service

You need to perform the following steps to use the Orchestration Service:

  • Get an Auth Token for Orchestration
  • Create a Deployment for Orchestration
  • Consume Orchestration

You can refer to the detailed steps here.

Using the Orchestration Service Through generative-AI-hub-SDK

Here is some code demonstrating how generative-AI-hub-SDK can interact with the Orchestration Service.

You need to perform the following steps:

  • Initialize the Orchestration Service: Before you use the SDK, set up a virtual deployment of the Orchestration Service. After deployment, you'll receive a unique endpoint URL.
    Python

    YOUR_API_URL = "..."
  • Define the Template and Default Input Values: You can use this URL to access the orchestration service.
    Python

    from gen_ai_hub.orchestration.models.message import SystemMessage, UserMessage
    from gen_ai_hub.orchestration.models.template import Template, TemplateValue

    template = Template(
        messages=[
            SystemMessage("You are a helpful translation assistant."),
            UserMessage(
                "Translate the following text to {{?to_lang}}: {{?text}}"
            ),
        ],
        defaults=[
            TemplateValue(name="to_lang", value="German"),
        ],
    )

    This example sets up a template for a translation assistant. The system message establishes the assistant's role, while the user message includes placeholders for the target language and the text to translate. A default target language of German makes the template ready for immediate use.

  • Define the LLM:
    Python

    from gen_ai_hub.orchestration.models.llm import LLM

    llm = LLM(
        name="gpt-4o",
        version="latest",
        parameters={"max_tokens": 256, "temperature": 0.2},
    )

    The code imports the LLM class from the gen_ai_hub module, then creates an instance of an advanced language model, specifically "gpt-4o". This instance is configured with the latest version and tailored parameters like maximum token limit and temperature setting to control response variability. It sets the stage for scalable AI-driven text generation tasks.

  • Create the Orchestration Configuration:
    Python

    from gen_ai_hub.orchestration.models.config import OrchestrationConfig

    config = OrchestrationConfig(
        template=template,
        llm=llm,
    )

    The code here imports the OrchestrationConfig class from the gen_ai_hub.orchestration.models.config module. It then initializes an instance of OrchestrationConfig using the template and LLM variables. This setup is necessary to configure the orchestration process, ensuring the system uses the specified template and language model parameters.

  • Run the Orchestration Request:
    Python

    from gen_ai_hub.orchestration.service import OrchestrationService

    orchestration_service = OrchestrationService(api_url=YOUR_API_URL, config=config)

    result = orchestration_service.run(
        template_values=[
            TemplateValue(name="text", value="The Orchestration Service is working!")
        ]
    )

    print(result.orchestration_result.choices[0].message.content)

    This code leverages the OrchestrationService to run a predefined task using specific template values. It connects to the service through the provided API URL and configuration, executes the task by supplying a text value, and then prints the resulting content. The code ensures streamlined communication with the orchestration system and retrieval of results.
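To make the template mechanics in the steps above concrete, here is a small plain-Python sketch (not the SDK's internals) of how default and per-request template values could be merged and substituted into the `{{?name}}` placeholders at inference time:

```python
import re

# Conceptual sketch of {{?name}} placeholder filling -- not the SDK internals.
def render(template: str, defaults: dict, values: dict) -> str:
    merged = {**defaults, **values}  # per-request values override defaults
    # Replace each {{?name}} placeholder with its merged value.
    return re.sub(r"\{\{\?(\w+)\}\}", lambda m: str(merged[m.group(1)]), template)

prompt = render(
    "Translate the following text to {{?to_lang}}: {{?text}}",
    defaults={"to_lang": "German"},
    values={"text": "The Orchestration Service is working!"},
)
print(prompt)
# -> Translate the following text to German: The Orchestration Service is working!
```

This mirrors the behavior seen above: `to_lang` falls back to its default of German because only `text` is supplied at run time.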

You can further use modules for content filtering and other tasks. See a notebook that demonstrates how to use the SDK to interact with the Orchestration Service.
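As a conceptual illustration of what a content-filtering module does, here is a deliberately simplified deny-list check in plain Python. This is not the SDK's filtering module, which offers configurable, model-backed filters; it only shows the idea of screening text before it is sent to or returned from a model.

```python
# Simplified deny-list content filter -- conceptual only, not the SDK module.
BLOCKED_TERMS = {"password", "credit card"}

def passes_filter(text: str) -> bool:
    # Reject any text containing a blocked term (case-insensitive).
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

print(passes_filter("Translate this invoice summary."))   # -> True
print(passes_filter("Here is my PASSWORD: hunter2"))      # -> False
```

In an orchestration pipeline, such a check would typically run both on the user input before the LLM call and on the model output before it is returned.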

You will also see an example in the Facility solutions company scenario in the next lesson.

Log in to track your progress & complete quizzes