Overview of Knowledge Graphs

Objective

After completing this lesson, you will be able to describe the concept of a knowledge graph.

Knowledge Graphs

Knowledge graphs serve as a foundation for many of today’s information systems. They are at work when we ask a virtual assistant about tomorrow’s weather forecast or use Google to search for the latest news on climate change. In addition, knowledge graphs have the potential to explain, evaluate, and substantiate information produced by deep learning models such as ChatGPT and other large language models. Their applications include improving search results, answering questions, providing recommendations, and developing explainable AI systems.

A knowledge graph is a structured representation of facts about the world, describing entities and their interrelations, organized in a graph format. Here are some key aspects of a knowledge graph:

Entities

In a knowledge graph, nodes represent entities (such as people, places, things, and concepts), and edges represent the relationships or connections between these entities.

Relationships

Define the connections between entities. For example, in a knowledge graph about movies, a relationship might be "acted_in" between an actor and a movie.

Attributes

Describe the characteristics of entities and relationships. For example, an entity "movie" might have attributes like "director," "release_date," and "genre."

RDF

The Resource Description Framework (RDF) is a standard model for data interchange on the web. It extends the linking structure of the web by using URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a "triple").

This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes.

Triples

The basic unit of data in a knowledge graph is a triple, which consists of a subject, a predicate, and an object. For example, "Paris (subject) - is the capital of (predicate) - France (object)". RDF represents data as triples, each consisting of:
  • Subject: The entity being described (e.g., Paris).
  • Predicate: The relationship or attribute (e.g., "is the capital of").
  • Object: The value or related entity (e.g., France).

[Diagram: a triple (subject, predicate, object), illustrated with the example of Paris, the capital of France.]

Semantic Meaning

Knowledge graphs often use standardized vocabularies or ontologies to define the types of entities and relationships, giving them semantic meaning. This makes knowledge graphs well suited to AI applications such as question answering and recommendation systems.

Sources of Data

Data in a knowledge graph can come from various sources, including structured data (like databases), semi-structured data (like XML or JSON), and even unstructured data (like text, using techniques like Named Entity Recognition and Relation Extraction).

Examples

Some prominent examples of knowledge graphs include DBpedia, Wikidata, Freebase (now retired; its data fed into the Google Knowledge Graph and Wikidata), and proprietary graphs developed by companies like SAP, Amazon, Facebook, and Microsoft.

Applications

Knowledge graphs are used in various applications, such as search engines, chatbots, digital assistants, recommendation systems, and data integration tasks.

Ontologies

Provide a formal structure for the knowledge graph, defining the types of entities, relationships, and attributes. Ontologies are needed to ensure the consistency and interoperability of your model.

[Diagram: an illustration of an ontology including town, country, country population, and continent.]

In essence, a knowledge graph provides a way to represent, store, and query structured knowledge, enabling machines to understand and process human-readable information more effectively.
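The triple structure described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not an RDF implementation: the `Triple` type, the `match` helper, and identifiers such as `capital_of` and `located_in` are hypothetical names chosen for this example, and plain strings stand in for the URIs that RDF would use.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One (subject, predicate, object) statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts; a real RDF graph would identify these with URIs.
GRAPH = {
    Triple("Paris", "capital_of", "France"),
    Triple("France", "located_in", "Europe"),
    Triple("Berlin", "capital_of", "Germany"),
    Triple("Germany", "located_in", "Europe"),
}


def match(graph, subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# "Which capitals does the graph know about?"
capitals = match(GRAPH, predicate="capital_of")
for t in capitals:
    print(f"{t.subject} is the capital of {t.obj}")
```

Leaving any position of the pattern open is the same basic idea that graph query languages build on: a query is a triple with some parts unknown, and the answer is every stored triple that fits.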

Ontologies

Ontologies are formal representations of knowledge, providing a structured way to describe domains of interest. They define the types, properties, and relationships of entities within a specific domain, creating a common vocabulary and understanding that can be shared among users and applications. Ontologies play a crucial role in the fields of knowledge representation, the Semantic Web, and artificial intelligence.

Key Components of Ontologies

Classes (or Concepts)

Represent categories or types of entities within the domain. For example, in a medical ontology, classes might include "Patient," "Doctor," and "Disease."

Properties (or Attributes)

Describe the characteristics or attributes of the classes. For example, a "Patient" class might have properties like "name," "age," and "medical history."

Relationships (or Relations)

Define how classes are interrelated. For example, a "Doctor" class might be related to a "Patient" class through a relationship like "treats" or "diagnoses".

Individuals (or Instances)

Represent specific objects or entities that belong to a class. For example, "John Doe" might be an individual of the "Patient" class.

Axioms

Logical statements that constrain the interpretation of the classes, properties, and relationships. Axioms define rules and restrictions that must be satisfied within the ontology.
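As a rough illustration of how these components fit together, the following Python sketch models a tiny medical ontology. The `Ontology` class, its field names, and the individual "Dr. Smith" are assumptions made for this example only. The axiom check is also a deliberate simplification: it treats the domain and range of a relationship as hard constraints, whereas ontology languages such as OWL typically use them to infer new facts.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    # Classes: the categories of entities in the domain.
    classes: set = field(default_factory=set)
    # Relationships: name -> (class of the subject, class of the object).
    relationships: dict = field(default_factory=dict)
    # Individuals: name -> the class the individual belongs to.
    individuals: dict = field(default_factory=dict)
    # Asserted facts as (subject, relationship, object) tuples.
    facts: list = field(default_factory=list)

    def violations(self):
        """Check every fact against the domain/range axioms."""
        problems = []
        for subj, rel, obj in self.facts:
            if rel not in self.relationships:
                problems.append(f"unknown relationship: {rel}")
                continue
            domain, range_ = self.relationships[rel]
            if self.individuals.get(subj) != domain:
                problems.append(f"{subj} is not a {domain}")
            if self.individuals.get(obj) != range_:
                problems.append(f"{obj} is not a {range_}")
        return problems


medical = Ontology(
    classes={"Patient", "Doctor", "Disease"},
    relationships={"treats": ("Doctor", "Patient")},
    individuals={"John Doe": "Patient", "Dr. Smith": "Doctor"},
    facts=[
        ("Dr. Smith", "treats", "John Doe"),   # consistent with the axiom
        ("John Doe", "treats", "Dr. Smith"),   # violates domain and range
    ],
)

for problem in medical.violations():
    print(problem)
```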

Purpose and Benefits of Ontologies

Shared Understanding

Provide a common vocabulary and structure that can be understood and used by both humans and computers.

Data Integration

Facilitate the integration of data from different sources by mapping them to a common ontology.

Reasoning

Enable automated reasoning and inference about the entities and relationships within the domain. For example, given a suitable rule, if a doctor treats a patient and the patient has a particular disease, a reasoner can infer that the doctor is involved in the treatment of that disease.

Knowledge Reuse

Allow for the reuse of knowledge across different applications and domains.

Semantic Interoperability

Promote interoperability between different systems and applications by ensuring that data is consistently interpreted and understood.
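The reasoning example in the table above (a doctor treats a patient, and the patient has a disease) can be written as a single inference rule. The sketch below is a hand-coded rule in plain Python, not the output of an actual reasoner; the relationship names `treats`, `has_disease`, and `involved_in_treatment_of`, as well as the individuals and the disease, are hypothetical.

```python
def infer_involvement(facts):
    """Apply one rule:
    treats(d, p) and has_disease(p, x)  =>  involved_in_treatment_of(d, x).
    Returns only the facts that are new."""
    inferred = set()
    for doctor, rel1, patient in facts:
        if rel1 != "treats":
            continue
        for patient2, rel2, disease in facts:
            if rel2 == "has_disease" and patient2 == patient:
                inferred.add((doctor, "involved_in_treatment_of", disease))
    return inferred - set(facts)


facts = {
    ("Dr. Smith", "treats", "John Doe"),
    ("John Doe", "has_disease", "Asthma"),
}

new_facts = infer_involvement(facts)
print(new_facts)
```

The inferred statement was never entered by anyone; it follows from the two asserted facts and the rule, which is what distinguishes reasoning from simple lookup.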

Examples of Ontologies

Gene Ontology (GO)

A collection of terms that describe gene product attributes across species. It is widely used in bioinformatics for annotating genes and gene products.

FOAF (Friend of a Friend)

An ontology used to describe people and their relationships, commonly used in social networks.

SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms)

A comprehensive clinical terminology used in healthcare for encoding clinical data.

Schema.org

A collaborative initiative that provides a common vocabulary for marking up web pages with structured data, helping search engines interpret their content.

Ontology Languages

RDF (Resource Description Framework)

A standard used for representing information about resources on the web.

RDFS (RDF Schema)

An extension of RDF that provides a vocabulary for describing classes and properties.

OWL (Web Ontology Language)

A richer and more expressive language that builds on RDF and RDFS, providing a framework for defining and reasoning about ontologies.

SKOS (Simple Knowledge Organization System)

SKOS is a W3C recommendation designed for representing knowledge organization systems such as thesauri, classification schemes, taxonomies, and subject heading systems. It is less formal than OWL and is focused on applications where informal relationships and loose definitions are sufficient.

Example: skos:Concept, skos:broader, skos:narrower, skos:related.

DAML+OIL

A predecessor to OWL, DAML+OIL is a combination of DAML, developed by DARPA, and OIL, developed by EU research projects. It provides constructs for complex class descriptions and axioms. After the development of OWL, DAML+OIL was largely superseded, but it played a crucial role in the evolution of ontology languages.

SHACL (Shapes Constraint Language)

SHACL is a language for validating RDF graphs against a set of conditions. It allows for the definition of constraints on RDF data and provides mechanisms to identify and report violations.

Example: sh:NodeShape, sh:PropertyShape, sh:minCount, sh:maxCount.

Frame Logic

Frame Logic is an older ontology language, based on the concept of frames, which are data structures for representing stereotypical situations. Frames include slots and fillers, where slots are the properties or attributes of the frame and fillers are the values of these properties.
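To make the idea behind SHACL's sh:minCount and sh:maxCount more concrete, the following sketch checks cardinality constraints over a set of triples in plain Python. It does not use SHACL syntax or any SHACL library; the function `check_cardinality`, the node names, and the `name` property are hypothetical and only mimic the kind of report a validator would produce.

```python
from collections import Counter


def check_cardinality(triples, focus_nodes, predicate,
                      min_count=0, max_count=None):
    """Report focus nodes with too few or too many values for a predicate."""
    # An RDF graph is a set of triples, so duplicates are dropped first.
    counts = Counter(s for s, p, _ in set(triples) if p == predicate)
    report = []
    for node in focus_nodes:
        n = counts[node]
        if n < min_count:
            report.append((node, f"expected at least {min_count}, found {n}"))
        if max_count is not None and n > max_count:
            report.append((node, f"expected at most {max_count}, found {n}"))
    return report


triples = [
    ("patient1", "name", "John Doe"),
    ("patient2", "name", "J. Roe"),
    ("patient2", "name", "Jane Roe"),   # a second name for the same node
]

# Every patient must have exactly one name (min 1, max 1).
report = check_cardinality(
    triples, ["patient1", "patient2", "patient3"], "name",
    min_count=1, max_count=1,
)
for node, message in report:
    print(node, "->", message)
```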

Creating and Managing Ontologies

Ontology Editors

Tools like Protégé that provide a graphical interface for creating, editing, and visualizing ontologies.

Reasoning Engines

Software that can perform automated reasoning and inference based on the ontology. Examples include HermiT and Pellet.
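Reasoners such as HermiT and Pellet support far richer OWL semantics than can be shown here. As a minimal sketch of what "automated inference" means, the code below applies two RDFS-style rules until no new facts appear: subclass relationships are transitive, and an individual of a class is also an individual of its superclasses. The short predicate names `subClassOf` and `type` stand in for rdfs:subClassOf and rdf:type, and the class hierarchy and the individual are hypothetical.

```python
def rdfs_closure(triples):
    """Forward-chain two RDFS-style rules until a fixed point is reached."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            if p != "subClassOf":
                continue
            for s2, p2, o2 in closure:
                # Transitivity: s subClassOf o, o subClassOf o2.
                if p2 == "subClassOf" and s2 == o:
                    new.add((s, "subClassOf", o2))
                # Type propagation: s2 type s, s subClassOf o.
                if p2 == "type" and o2 == s:
                    new.add((s2, "type", o))
        if new <= closure:      # nothing new was derived
            return closure
        closure |= new


asserted = {
    ("Cardiologist", "subClassOf", "Doctor"),
    ("Doctor", "subClassOf", "Person"),
    ("Dr. Smith", "type", "Cardiologist"),
}

derived = rdfs_closure(asserted) - asserted
for triple in sorted(derived):
    print(triple)
```

Only one type was asserted for "Dr. Smith", yet the closure also classifies this individual as a Doctor and a Person, because the class hierarchy implies it.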

Conclusion

Knowledge graphs are powerful tools for organizing, integrating, and leveraging knowledge. They provide a structured and semantically rich representation of information, enabling sophisticated queries, reasoning, and contextual understanding. By using knowledge graphs, organizations can enhance their data management, improve decision-making, and develop more intelligent systems.

Ontologies are essential for structuring knowledge in a way that is understandable and usable by both humans and machines. They provide a foundation for semantic interoperability, data integration, and intelligent systems. By using ontologies, organizations can enhance the quality and efficiency of their knowledge management processes, leading to better decision-making and innovation.