Egeria Dojos and the IBM Developer community
Explore Egeria, an open source project under LF AI & Data, and learn how it solves the difficult problem of real-time integration between heterogeneous products, services, and solutions
I am an Egeria maintainer at IBM. If you would like to look up what code I have written, my GitHub ID is davidradl. Many thanks to Nigel Jones for reviewing this content. Nigel is also an Egeria maintainer at IBM.
What is Egeria?
Egeria is an open source project under LF AI & Data that provides open metadata and governance for enterprises. It automatically captures, manages, and exchanges metadata between tools and platforms in a vendor-neutral way. From the Egeria documentation:
“Although it is desirable to synchronize metadata between the tools and platforms used by an organization, there are so many, each using a different format, naming conventions and interfaces, that it would be a complex and expensive project to build point-to-point metadata exchange integrations between them.”
The following diagram shows how, in Egeria, each tool is linked through an integration service (yellow circles) that is tailored for that specific type of tool. The tool might call the integration service directly, or Egeria can host a connector that converts between the formats, naming conventions, and interfaces of the specific tool and the open standards and interfaces of Egeria. Behind the scenes, Egeria manages the exchange of metadata, using various techniques to make the exchange as efficient and timely as possible.
Egeria exposes APIs that are independent of any particular metadata repository. The APIs route create, delete, and update requests to the owning metadata repository for action, and queries are federated across all of the metadata repositories. In this way, callers use one open API and do not need to learn a proprietary API for each connected technology.
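The routing and federation described above can be pictured with a small sketch. This is not Egeria code: the `Repository` and `OpenApi` classes and their methods are hypothetical names invented for illustration, and a real deployment involves far more (events, security, type checking). The sketch only shows the idea that updates go to the owning repository while queries fan out to every repository.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FederationSketch {

    /** Hypothetical stand-in for one connected metadata repository. */
    static class Repository {
        final String name;
        final Map<String, String> assets = new LinkedHashMap<>(); // asset id -> display name

        Repository(String name) {
            this.name = name;
        }
    }

    /** Hypothetical single API that sits in front of several repositories. */
    static class OpenApi {
        private final List<Repository> repositories = new ArrayList<>();
        private final Map<String, Repository> owners = new LinkedHashMap<>(); // asset id -> owning repository

        void register(Repository repository) {
            repositories.add(repository);
        }

        /** Creates the asset in the chosen repository, which becomes its owner. */
        void create(Repository owner, String id, String displayName) {
            owner.assets.put(id, displayName);
            owners.put(id, owner);
        }

        /** Routes an update to the repository that owns the asset. */
        void update(String id, String displayName) {
            Repository owner = owners.get(id);
            if (owner == null) {
                throw new IllegalArgumentException("Unknown asset: " + id);
            }
            owner.assets.put(id, displayName);
        }

        /** Federated query: asks every registered repository and merges the answers. */
        List<String> findByName(String fragment) {
            List<String> results = new ArrayList<>();
            for (Repository repository : repositories) {
                for (String displayName : repository.assets.values()) {
                    if (displayName.contains(fragment)) {
                        results.add(repository.name + ":" + displayName);
                    }
                }
            }
            return results;
        }
    }

    public static void main(String[] args) {
        Repository catalog = new Repository("catalog");
        Repository lake = new Repository("lake");
        OpenApi api = new OpenApi();
        api.register(catalog);
        api.register(lake);

        api.create(catalog, "a1", "customer table");
        api.create(lake, "a2", "customer file");
        api.update("a2", "customer archive"); // lands in "lake", the owner of a2

        // prints [catalog:customer table, lake:customer archive]
        System.out.println(api.findByName("customer"));
    }
}
```

The caller never names a repository when updating or querying, which is the point of having one open API in front of many technologies.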
How do I get involved as a developer?
Because Egeria connects many types of tools and services built on different technologies, code must be written to integrate each of them with Egeria. Organizations want to connect using the technology that they already use, so the community keeps writing new integration services and the Egeria ecosystem keeps growing.
Egeria Dojo Days
Egeria Dojos are online tutorials that guide you through hands-on activities that showcase the value and capability of using Egeria. There are also hands-on classes on working with Egeria. Throughout 2022, members of the Egeria Community will lead education sessions on each of these new Dojos. The Dojos include:
The following image depicts the value of each Dojo Day. For more information, see Egeria Dojo Days on the LF AI & Data site.
Day 2: Developing on Egeria
Day 2 of the Egeria Dojo Days is a key day for Java developers. This day explains how you can develop new connectors, which are one of the styles of integration into Egeria.
Coding Egeria connectors to enable integration involves:
- Coding in an opinionated framework. You should look to provide minimal technology-specific code, letting the framework do the heavy lifting.
- Thinking like a middleware developer. It is better to follow the established patterns and use the existing framework code rather than writing it yourself. This way, connectors will have the same capabilities that users expect, and adopting a new connector will be a similar experience to the existing connectors.
- Taking advantage of existing super classes to subclass what is special about your technology.
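To make the "subclass what is special" idea concrete, here is a minimal sketch of the pattern. The class names (`SketchConnectorBase`, `FileFolderConnector`) and their methods are assumptions made up for this illustration and are not the classes that Egeria ships; the real superclasses and their signatures are covered in the Dojo material. The sketch only shows how a framework superclass can own the lifecycle and logging while the subclass supplies the technology-specific part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConnectorSketch {

    /** Hypothetical framework superclass: owns the lifecycle, state checks, and logging. */
    abstract static class SketchConnectorBase {
        private boolean active = false;
        final List<String> auditLog = new ArrayList<>();

        final void start() {
            active = true;
            auditLog.add("started " + technologyName());
        }

        final List<String> refresh() {
            if (!active) {
                throw new IllegalStateException("Connector is not started");
            }
            List<String> names = listAssetNames();
            auditLog.add("refreshed " + names.size() + " assets");
            return names;
        }

        final void disconnect() {
            active = false;
            auditLog.add("disconnected " + technologyName());
        }

        // The only two things a new technology has to supply in this sketch.
        protected abstract String technologyName();

        protected abstract List<String> listAssetNames();
    }

    /** The subclass holds only what is special about one (made-up) technology. */
    static class FileFolderConnector extends SketchConnectorBase {
        private final List<String> fileNames;

        FileFolderConnector(List<String> fileNames) {
            this.fileNames = fileNames;
        }

        @Override
        protected String technologyName() {
            return "file folder";
        }

        @Override
        protected List<String> listAssetNames() {
            List<String> names = new ArrayList<>();
            for (String fileName : fileNames) {
                if (fileName.endsWith(".csv")) {
                    names.add(fileName); // only CSV files count as assets here
                }
            }
            return names;
        }
    }

    public static void main(String[] args) {
        FileFolderConnector connector =
                new FileFolderConnector(Arrays.asList("sales.csv", "notes.txt", "stock.csv"));
        connector.start();
        System.out.println(connector.refresh()); // prints [sales.csv, stock.csv]
        connector.disconnect();
        System.out.println(connector.auditLog);
    }
}
```

Because the lifecycle methods are `final` in the superclass, every connector written this way starts, refreshes, and disconnects the same way, which is what gives users a consistent experience across connectors.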
Egeria has repository connectors, which are connectors that allow a metadata repository to be part of the Egeria ecosystem (cohort). The advantages of this include:
- Egeria and its frameworks provide code that minimizes the amount of code that must be written, so a new connector only needs to scope what it supports and map requests to the proprietary metadata repository.
- The content of the metadata repository is included in Egeria queries, which are federated across all of the metadata repositories.
- The repository receives events as metadata in other repositories changes, and is expected to hold this content as read-only (reference) copies.
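The last point, holding other repositories' metadata as read-only reference copies, can be sketched as follows. The `CohortMember` and `MetadataEvent` classes are hypothetical and greatly simplified; they are not the Egeria repository connector interfaces. The sketch assumes that each instance has exactly one home repository and that only the home repository may change it.

```java
import java.util.HashMap;
import java.util.Map;

public class ReferenceCopySketch {

    /** Hypothetical event announcing that an instance changed in its home repository. */
    static class MetadataEvent {
        final String homeRepository;
        final String id;
        final String value;

        MetadataEvent(String homeRepository, String id, String value) {
            this.homeRepository = homeRepository;
            this.id = id;
            this.value = value;
        }
    }

    /** Hypothetical cohort member: instances it owns, plus reference copies from others. */
    static class CohortMember {
        final String name;
        private final Map<String, String> homeInstances = new HashMap<>();
        private final Map<String, String> referenceCopies = new HashMap<>();

        CohortMember(String name) {
            this.name = name;
        }

        /** Saves an instance locally and returns the event to share with other members. */
        MetadataEvent save(String id, String value) {
            if (referenceCopies.containsKey(id)) {
                throw new UnsupportedOperationException(id + " is a read-only reference copy");
            }
            homeInstances.put(id, value);
            return new MetadataEvent(name, id, value);
        }

        /** Stores or refreshes a reference copy when another member reports a change. */
        void onEvent(MetadataEvent event) {
            if (!event.homeRepository.equals(name)) {
                referenceCopies.put(event.id, event.value);
            }
        }

        /** Reads from home instances first, then from reference copies. */
        String read(String id) {
            String home = homeInstances.get(id);
            return home != null ? home : referenceCopies.get(id);
        }
    }

    public static void main(String[] args) {
        CohortMember repoA = new CohortMember("repoA");
        CohortMember repoB = new CohortMember("repoB");

        repoB.onEvent(repoA.save("term-1", "Customer"));
        repoB.onEvent(repoA.save("term-1", "Customer (retail)"));
        System.out.println(repoB.read("term-1")); // prints Customer (retail)

        try {
            repoB.save("term-1", "Client"); // repoB does not own term-1
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage()); // prints term-1 is a read-only reference copy
        }
    }
}
```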
Egeria also has integration connectors. The advantages of these connectors include:
- Egeria and its frameworks provide code that minimizes the amount of code that must be written, so a new connector only needs to scope what it supports and map requests to the proprietary metadata repository formats.
- A repository that is connected through an integration connector is not a member of the Egeria ecosystem (cohort).
- The connector is driven by configuration that contains parameters and the direction of synchronization.
- The connectors can be simpler and use only a few types. They call the Egeria access services.
- For a given type of connector, the only code that must be written in the integration connector is the mapping code between the open format and the third-party format. New types of connectors need additional code to call the Egeria access services appropriately.
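As a rough illustration of the points above, the following sketch shows a configuration-driven connector whose only technology-specific code is a mapper between two formats. Everything here is hypothetical: the `VendorTable` and `OpenAsset` shapes, the `direction` and `qualifiedNamePrefix` configuration keys, and the `SketchIntegrationConnector` class are invented for the example. Instead of calling an access service, the sketch returns text descriptions of the requests that it would send.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntegrationMappingSketch {

    /** Direction of synchronization, read from configuration. */
    enum Direction { TO_OPEN_METADATA, FROM_OPEN_METADATA }

    /** Made-up third-party format. */
    static class VendorTable {
        final String tblNm;
        final String ownerId;

        VendorTable(String tblNm, String ownerId) {
            this.tblNm = tblNm;
            this.ownerId = ownerId;
        }
    }

    /** Made-up open format. */
    static class OpenAsset {
        final String qualifiedName;
        final String displayName;
        final String owner;

        OpenAsset(String qualifiedName, String displayName, String owner) {
            this.qualifiedName = qualifiedName;
            this.displayName = displayName;
            this.owner = owner;
        }
    }

    /** The only technology-specific part: mapping between the two formats. */
    static class TableMapper {
        private final String prefix;

        TableMapper(String prefix) {
            this.prefix = prefix;
        }

        OpenAsset toOpen(VendorTable table) {
            return new OpenAsset(prefix + "::" + table.tblNm, table.tblNm, table.ownerId);
        }

        VendorTable fromOpen(OpenAsset asset) {
            return new VendorTable(asset.displayName, asset.owner);
        }
    }

    /** Reads the hypothetical configuration keys and decides which way metadata flows. */
    static class SketchIntegrationConnector {
        private final Direction direction;
        private final TableMapper mapper;

        SketchIntegrationConnector(Map<String, String> config) {
            this.direction = Direction.valueOf(config.get("direction"));
            this.mapper = new TableMapper(config.get("qualifiedNamePrefix"));
        }

        /** Returns descriptions of the requests that a real connector would send. */
        List<String> synchronize(List<VendorTable> vendorTables, List<OpenAsset> openAssets) {
            List<String> requests = new ArrayList<>();
            if (direction == Direction.TO_OPEN_METADATA) {
                for (VendorTable table : vendorTables) {
                    OpenAsset asset = mapper.toOpen(table);
                    requests.add("save asset " + asset.qualifiedName + " owned by " + asset.owner);
                }
            } else {
                for (OpenAsset asset : openAssets) {
                    VendorTable table = mapper.fromOpen(asset);
                    requests.add("save table " + table.tblNm + " owned by " + table.ownerId);
                }
            }
            return requests;
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("direction", "TO_OPEN_METADATA");
        config.put("qualifiedNamePrefix", "warehouse");

        SketchIntegrationConnector connector = new SketchIntegrationConnector(config);
        List<VendorTable> tables = Arrays.asList(new VendorTable("ORDERS", "dba1"));

        // prints [save asset warehouse::ORDERS owned by dba1]
        System.out.println(connector.synchronize(tables, new ArrayList<OpenAsset>()));
    }
}
```

Changing the `direction` value in the configuration reverses the flow without touching the mapper, which mirrors the idea that the connector is driven by configuration rather than by code changes.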
Day 4: Developing in Egeria (contributing)
Day 4 of the Egeria Dojo Days covers coding in Egeria, which requires you to think like a middleware developer. Points to keep in mind when contributing include:
- First, and most importantly, be a good member of the Egeria community.
- Understand the scope of the change:
- Consider what testing can be added to Egeria to ensure that capabilities continue to run as expected.
- Framework changes are rare and should be done with care to prevent regressions.
- Enhancements must consider open types and API impact.
- Bug fixes should not introduce regressions. Consider how to test automatically.
- Most capabilities use the connector framework to allow pluggability.
- Be aware of the status of the component that you are changing. More care must be taken if the component is released because the APIs must continue to work and might need new versions.
- Larger changes like introducing a new Access Service involve proposing the design, scope, and testing to the community so that your contribution will be accepted and merged by the maintainers. You can propose this in an issue, and, if required, in one of the community calls.
What has this got to do with IBM?
IBM embraces Egeria. In particular, Watson Knowledge Catalog uses Egeria to communicate between its repositories. To learn more, see Egeria open source standard enhances hybrid cloud metadata and data governance initiatives. There are also Egeria connectors for the IBM Information Governance Catalog, DataStage, and Cognos Analytics products. There are ongoing efforts to further integrate other IBM capabilities with Egeria, and IBM has multiple maintainers on the Egeria project.
Egeria solves a difficult problem: real-time integration between heterogeneous products, services, and solutions. Many resources, including the Egeria Dojos, are available to help you develop on and in Egeria quickly.
I really enjoy working with Egeria, both promoting it and coding it. It seems to me to be a great way to solve the integration problem in a vendor-neutral way. We think a diverse, active community is a healthy one, and I hope that you will join us in the open source community to make Egeria even better.