Multi Model with Vectors and Text Embeddings

Multi Model with Vectors and Text Embeddings

The SAP HANA Cloud vector engine works independently as well as in conjunction with the other multi-model engines such as spatial, graph, relational, and the JSON document store. Because SAP HANA Cloud provides a unified approach to data management, the database supports comprehensive solutions to use cases using multiple types of data.

In a sales strategy scenario, the relational engines can store transactional and master data such as products, suppliers, and sales orders. The graph engine makes known the hierarchy of relationships that exist between the sales organization, customers, suppliers and products. The spatial engine enables geometric analysis allowing the user to view and interact with the data from a GIS perspective – interactive maps are great examples. Product catalogs with varying schemas can reside in the JSON document store yet easily combine with any data accessible by the database. Finally, the vector engine can compare the vector representations of all the products within a given inventory in order to determine alternatives that are most similar.

SAP HANA Cloud can then apply machine learning to this full sales perspective exposing hidden patterns and predicting future behavior. Thanks to SAP HANA Cloud’s advanced engines working in unison to provide a “full-spectrum view” across many different types of data, sales management can then proceed to optimize sales strategy.


Try it out!

This lesson demonstrates some further benefits of vectors with text embedding and how they can be used in conjunction with the SAP HANA Cloud spatial and document engine to derive a practical use of data.

Note

The exercises in this lesson use the table
  1. Create the GX_VECTOR_PRODUCTS table with vector data in the local schema:

    Code Snippet
    1
    CREATE TABLE GX_VECTOR_PRODUCTS AS (SELECT * FROM HC_DEV.GX_VECTOR_PRODUCTS);

    Note

    The
  2. Inspect the data of the newly created table in the local schema :

    Code Snippet
    12345
    SELECT TOP 10 PRODUCT_ID, PRODUCT_NAME, DESCRIPTION FROM GX_VECTOR_PRODUCTS;

  3. Create embedding on product description, a new EMB field will be added to the GX_VECTOR_PRODUCTS table and using the VECTOR_EMBEDDING() function update the EMB filed with embedded product description data:

    Code Snippet
    12
    ALTER TABLE GX_VECTOR_PRODUCTS ADD (EMB REAL_VECTOR); UPDATE GX_VECTOR_PRODUCTS SET EMB = VECTOR_EMBEDDING( DESCRIPTION, 'DOCUMENT', 'SAP_NEB.20240715' );

    Note

    The VECTOR_EMBEDDING function in SAP HANA Cloud converts text into vector representations. This function enables users to perform advanced text search and similarity queries.
  4. Inspect the data, using the TO_REAL_VECTOR() function to display the VEC_TEXT field as a REAL_VECTOR data type:

    Code Snippet
    12
    SELECT TOP 10 PRODUCT_ID, PRODUCT_NAME, DESCRIPTION, EMB, TO_REAL_VECTOR(EMB) AS VEC_VECTOR FROM "GX_VECTOR_PRODUCTS";

  5. The next query finds products that are most similar to a ‘3M Wireless Mouse’ based on their vector representations. Order the records by SIMILARITY cosine similarity - indicating they are highly similar. Note that this technique can be useful for recommending alternative or related products in a catalog.

    Code Snippet
    12345
    SELECT TOP 4 PRODUCT_ID, PRODUCT_NAME, DESCRIPTION, COSINE_SIMILARITY( VECTOR_EMBEDDING( '3M wireless mouse', 'QUERY', 'SAP_NEB.20240715' ), EMB) AS "SIMILARITY" FROM GX_VECTOR_PRODUCTS ORDER BY "SIMILARITY" DESC;

  6. Now create a local JSON document collection table called GX_RAG_PRODUCTS and populate it with pre-built data:

    Code Snippet
    1234
    CREATE COLLECTION TABLE "GX_RAG_PRODUCTS"; INSERT INTO "GX_RAG_PRODUCTS" ( SELECT * FROM "HC_DEV"."GX_RAG_PRODUCTS");

    Note

    The data in
  7. Inspect Collection data. The data in the GX_RAG_PRODUCTS table contains latitude and longitude attributes. This can be used with the SAP HANA Cloud geospatial function, ST_POINT, to identify the points on a map.

    Code Snippet
    1
    SELECT * FROM "GX_RAG_PRODUCTS";

  8. Create a database view VW_SUPPLIER_LOCATIONS from the GX_RAG_PRODUCTS collection which contains all the relevant details of the suppliers and their location data:

    Code Snippet
    1234567891011121314151617181920
    CREATE VIEW VW_SUPPLIER_LOCATIONS AS ( SELECT PRODUCT_ID, PRODUCT_NAME, CATEGORY, DESCRIPTION, UNIT_PRICE, UNIT_MEASURE, SUPPLIER_ID, SUPPLIER_NAME, LEAD_TIME_DAYS, MIN_ORDER, CURRENCY, SUPPLIER_COUNTRY, SUPPLIER_ADDRESS, CITY_LONG, CITY_LAT, AVAILABILITY_DAYS, SUPPLIER_CITY, STOCK_QUANTITY FROM GX_RAG_PRODUCTS);

  9. Create table function TF_VW_PRODUCTS. This function is designed to identify and rank similar products based on their embedding vectors using cosine similarity, enriching the results with detailed supplier information and geospatial data, including calculated distances to a fixed point.

    By joining product and supplier datasets, it provides insights such as supplier country, city, stock quantity, and availability days, while also calculating the supplier’s geographical location and distance.

    This functionality is particularly useful for recommending similar products, enhancing product discovery with supplier and logistical details, and supporting supply chain or logistics analyses by integrating geospatial considerations:

    Code Snippet
    123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657
    CREATE OR REPLACE FUNCTION TF_VW_PRODUCTS (PRODUCT_NAME NVARCHAR(100), SIM DECIMAL(10,8)) RETURNS TABLE ( SIMILARITY DECIMAL(10,8), R_PRODUCT_NAME NVARCHAR(255), R_PRODUCT_ID NVARCHAR(50), R_SUPPLIER_ID NVARCHAR(20), SUPPLIER_COUNTRY NVARCHAR(100), SUPPLIER_ADDRESS NVARCHAR(255), CITY_LONG DECIMAL(10,6), CITY_LAT DECIMAL(10,6), AVAILABILITY_DAYS INT, SUPPLIER_CITY NVARCHAR(100), STOCK_QUANTITY INT, SUPPLIER_DISTANCE DOUBLE, GEO_LOCATION ST_POINT ) LANGUAGE SQLSCRIPT AS BEGIN RETURN SELECT DISTINCT SIMILARITY, R_PRODUCT_NAME, R_PRODUCT_ID, R_SUPPLIER_ID, VW.SUPPLIER_COUNTRY, VW.SUPPLIER_ADDRESS, VW.CITY_LONG, VW.CITY_LAT, VW.AVAILABILITY_DAYS, VW.SUPPLIER_CITY, VW.STOCK_QUANTITY, NEW ST_Point('POINT (6.7825 51.2230)', 4326).ST_Distance( NEW ST_POINT(TO_DOUBLE(VW.CITY_LONG), TO_DOUBLE(VW.CITY_LAT), 4326), 'meter') / 1000 AS SUPPLIER_DISTANCE, NEW ST_POINT(TO_DOUBLE(VW.CITY_LONG), TO_DOUBLE(VW.CITY_LAT),1000004326) AS GEO_LOCATION FROM ( SELECT L.PRODUCT_ID AS L_PRODUCT_ID, R.PRODUCT_ID AS R_PRODUCT_ID, L.PRODUCT_NAME AS L_PRODUCT_NAME, R.PRODUCT_NAME AS R_PRODUCT_NAME, R.CATEGORY AS R_CATEGORY, R.SUPPLIER_ID AS R_SUPPLIER_ID, COSINE_SIMILARITY(TO_REAL_VECTOR(L.EMB), R.EMB) AS SIMILARITY FROM GX_VECTOR_PRODUCTS AS L JOIN GX_VECTOR_PRODUCTS AS R ON L.PRODUCT_ID != R.PRODUCT_ID WHERE L.PRODUCT_NAME LIKE '%' || :PRODUCT_NAME || '%' AND COSINE_SIMILARITY(TO_REAL_VECTOR(L.EMB), R.EMB) > :SIM ) AS MAIN JOIN VW_SUPPLIER_LOCATIONS AS VW ON MAIN.R_SUPPLIER_ID = VW.SUPPLIER_ID ORDER BY SIMILARITY DESC; END;

Note

Table functions in SAP HANA Cloud are pieces of code that return tabular data sets. They are used to perform complex queries, execute custom functions, and handle loops when graphical views are not sufficient.

  1. Query product data, select both (or all rows) in the GEO_LOCATION field, right-click the mouse to get the pop-up then select View spatial data:

    Code Snippet
    1234567
    SELECT R_PRODUCT_NAME, SUPPLIER_COUNTRY, SUPPLIER_ADDRESS, SUPPLIER_CITY, SUPPLIER_DISTANCE, GEO_LOCATION FROM TF_VW_PRODUCTS('3m Wireless Mouse', 0.91);

  2. Select one of the pins on the map to reveal data attributes: