Knative joins the CNCF as a serverless incubation project
Why Knative joining CNCF is a win-win for both communities and why developers will win even more
Today, IBM joins the Cloud Native Computing Foundation (CNCF) and the Knative community in welcoming Knative as the foundation's newest incubating project. This milestone is important for the future of both Knative and the CNCF.
As mentioned in a previous blog post, by joining the CNCF, the Knative community gains a vibrant open source organization which allows it to grow and gain more users. By accepting Knative as an incubating project, the CNCF gains a project that extends, simplifies, and enhances the Kubernetes platform for serverless and event-driven workloads. This is a win-win combination.
Knative’s journey to being an incubating project
This blog post briefly retraces the journey to today's announcement, discusses IBM's contributions, and highlights a few features in progress that will further extend Knative's capabilities.
IBM has been involved in the Knative community since the project's release in the summer of 2018. Since then, the community has grown to include contributions from dozens of companies and thousands of individuals, a clear sign of a vibrant community working together to make Knative the best serverless platform for Kubernetes today.
Along with colleagues from Red Hat, VMware, and other companies, IBM engineers participate actively in various working groups, including taking on leadership roles in the technical oversight and trademark committees. This blog post retraces our past contributions in more detail, including specific ongoing and incubating sub-projects started or being led by IBMers.
IBM is currently involved in improving Knative's performance, adding asynchronous invocation support, and leading development of the Knative operator, features that we believe will improve Knative dramatically. Let's take a closer look at each of these areas.
Improve Knative's performance

The time it takes to start a new container to serve a request (cold-start latency) is the Achilles' heel of serverless technologies and, more broadly, of any distributed network system. An attractive feature of serverless platforms is scaling services down when they are no longer in use, which makes latency and start-up times crucial: depending on a service's access pattern, scale-up may have to occur often, so a serverless user incurs frequent start-up costs.
Minimizing service start-up time is therefore a key goal of any serverless architecture. Currently, the start-up time of Knative services is not as fast as it could be. Several factors contribute to this, and IBM engineers are addressing the following ones with promising results.
Increase the speed of networking setup for pods. The Kubernetes stack delegates container runtime integration to a standard called the Container Runtime Interface (CRI), which links the kubelet on each worker node to container runtimes such as containerd and CRI-O. These runtimes use the Container Network Interface (CNI) and its plugins for networking. Since every Knative service uses Kubernetes primitives, which are in turn executed by the containerd/CRI-O runtimes, making that layer faster yields benefits across the whole stack.
A recent set of PRs by IBM engineers and their CNI colleagues significantly speeds up networking setup for pods by making CNI setup execute more efficiently in parallel and by changing how the plugins handle duplicate address detection. Additionally, IBM is working on a Kubernetes Enhancement Proposal (KEP) that would allow cached container images to be used even in multi-tenant scenarios. Together, these improvements are promising and could reduce container start-up times in Kubernetes, and therefore Knative, by orders of magnitude.
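The intuition behind the parallelization work can be sketched with a toy model. This is not the actual CNI plugin code; the step names and timings are invented stand-ins for the independent pieces of pod network setup (interface creation, address assignment, routes, duplicate address detection) that previously ran one after another:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent pod-network setup steps.
def setup_step(name: str) -> str:
    time.sleep(0.05)  # pretend this step takes 50 ms
    return f"{name}: done"

STEPS = ["veth", "ipam", "routes", "dad"]

def setup_serial() -> float:
    """Run steps one after another; total time is the sum of all steps."""
    start = time.monotonic()
    for s in STEPS:
        setup_step(s)
    return time.monotonic() - start

def setup_parallel() -> float:
    """Run independent steps concurrently; total time approaches the
    duration of the single slowest step."""
    start = time.monotonic()
    with ThreadPoolExecutor() as pool:
        list(pool.map(setup_step, STEPS))
    return time.monotonic() - start
```

With four 50 ms steps, the serial path takes roughly 200 ms while the parallel path takes roughly 50 ms, which is the kind of win the CNI PRs target at the real networking layer.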
Increase probing frequency to improve communication. Another improvement in the underlying stack that promises better performance was contributed by IBMers in 2020 as a Kubernetes Enhancement Proposal (KEP). In it, IBM engineers propose optionally increasing the frequency at which Kubernetes pods are probed. Shorter probing periods communicate key information faster to the layers above (such as Knative's autoscaler), with the goal of enabling faster reactions. This could translate into faster scale-up and scale-down, a paramount feature of Knative.
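The effect of the probe period on reaction time is simple arithmetic. The sketch below uses Kubernetes' default probe settings (a 10-second period and a failure threshold of 3) as the baseline; treat the 1-second figure as an illustrative example of what faster probing could allow, not an exact number from the KEP:

```python
def worst_case_detection_s(period_s: float, failure_threshold: int) -> float:
    """Rough upper bound on the time to notice a pod state change via
    periodic probing: each of the `failure_threshold` consecutive probe
    results needed to confirm the change is up to one period apart."""
    return period_s * failure_threshold

# Kubernetes defaults: periodSeconds=10, failureThreshold=3
assert worst_case_detection_s(10, 3) == 30

# A hypothetical 1-second period reacts roughly 10x faster
assert worst_case_detection_s(1, 3) == 3
```

For an autoscaler that scales on readiness signals, shrinking this bound from tens of seconds to a few seconds directly shortens scale-up and scale-down reaction times.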
Freeze containers to speed up start times. Finally, IBMers have implemented container-freezer functionality that pauses (idles) containers when they are not handling requests. This lets users keep "warm" containers ready to be revived to handle requests, avoiding a cold start while also preventing background capacity from being consumed unintentionally.
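The freeze/thaw idea can be illustrated with a POSIX analogy (this is an assumption for illustration only; the real container-freezer operates at the container-runtime/cgroup level, not via process signals). A stopped process stays resident in memory, consuming no CPU, and resumes instantly:

```python
import os
import signal
import subprocess

# Start a long-running worker: a stand-in for an idle Knative container.
proc = subprocess.Popen(["sleep", "60"])

# "Freeze": the process stays warm in memory but stops consuming CPU.
os.kill(proc.pid, signal.SIGSTOP)

# ... later, a request arrives ...

# "Thaw": resume immediately, with no cold start.
os.kill(proc.pid, signal.SIGCONT)

# Clean up the demo worker.
proc.terminate()
proc.wait()
```

The key trade-off this illustrates: a frozen container keeps its memory footprint (unlike scale-to-zero) but skips the container start-up cost entirely when the next request arrives.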
Implement asynchronous invocation patterns
Adding new invocation patterns to Knative services is another promising feature that IBMers are working on. Currently, all Knative services are called synchronously: a caller's request blocks until a response is returned or an error occurs. This blocking request/response pattern is common, popular, and mirrors the fundamental way the web works (HTTP requests).
However, in many use cases a blocking request/response primitive is not sufficient. For data processing and AI workloads in particular, a blocking invocation approach is sub-optimal: these services are often long running, so their execution exceeds response timeouts or leaves the client managing a multitude of pending blocking requests. A more natural pattern is "fire and forget," or asynchronous invocation, in which the client does not block while the service executes.
The Knative async-component aims to provide exactly this invocation pattern. Best of all, it does so in a natural and progressive manner: a simple label makes any service capable of asynchronous execution, and the service's caller decides per request whether to invoke it synchronously or asynchronously. The project is still incubating, but once it reaches beta level, we encourage Knative users with similar async use cases to download it and try it in their own Knative clusters.
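From the caller's side, the per-request choice can be sketched as follows. The service URL is hypothetical, and the `Prefer: respond-async` header follows the convention used by the async-component's design; treat the exact header name as an assumption if you adapt this:

```python
# Hypothetical Knative service URL.
SERVICE_URL = "http://my-service.default.example.com"

def build_headers(async_mode: bool) -> dict:
    """Build headers for a Knative service call. A synchronous call blocks
    for the full response; an asynchronous call asks the platform to accept
    the request immediately (e.g. 202 Accepted) and run it in the background.
    NOTE: the 'Prefer: respond-async' header name is an assumption based on
    the async-component's design, not a verified stable API."""
    headers = {"Content-Type": "application/json"}
    if async_mode:
        headers["Prefer"] = "respond-async"
    return headers

# The same service, two invocation styles, chosen per request:
sync_headers = build_headers(async_mode=False)
async_headers = build_headers(async_mode=True)
# e.g. urllib.request.Request(SERVICE_URL, headers=async_headers, method="POST")
```

This is what "progressive" means here: nothing about the service itself changes between the two calls; the caller opts in to async behavior request by request.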
New features in the Knative Operator
In 2021, the Knative operator was downloaded over 564K times, more than 100 times the total download count from all years prior. The Knative operator manages the full lifecycle of all Knative components by leveraging the custom resources for Knative Serving and Eventing. New features in the Knative operator that readied it for the CNCF submission include:
- Install and uninstall Knative Serving and Eventing components
- Enable and disable the ingresses of Knative Serving and the sources of Knative Eventing
- Configure the affinities, tolerations, nodeSelectors, resources, etc., for the deployments
- Upgrade Knative Serving and Eventing
- Install custom manifests
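The features above are all driven through the operator's custom resources. The sketch below expresses a KnativeServing resource as a Python dict (the structure you would serialize to YAML and apply to the cluster). Field names follow the operator's documented schema, but treat the exact paths and values as illustrative:

```python
# A KnativeServing custom resource as a Python dict; exact field paths are
# based on the operator's schema but should be checked against its docs.
knative_serving = {
    "apiVersion": "operator.knative.dev/v1beta1",
    "kind": "KnativeServing",
    "metadata": {"name": "knative-serving", "namespace": "knative-serving"},
    "spec": {
        # Enable/disable an ingress implementation for Knative Serving.
        "ingress": {"kourier": {"enabled": True}},
        # Per-deployment overrides: nodeSelectors, tolerations, resources, ...
        "deployments": [
            {
                "name": "activator",
                "nodeSelector": {"kubernetes.io/arch": "amd64"},
                "resources": [
                    {
                        "container": "activator",
                        "requests": {"cpu": "300m", "memory": "60Mi"},
                    }
                ],
            }
        ],
        # Additional custom manifests to install alongside Knative.
        "additionalManifests": [{"URL": "https://example.com/extra.yaml"}],
    },
}
```

Upgrades follow the same declarative model: bumping the operator (or a `spec.version` field) reconciles the installed Serving and Eventing components to the target release.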
IBM's contributions run throughout the Knative operator, including leadership of the project's engineering and management. A new Knative Client plugin for the operator is under development; it will let end users configure Knative through the operator via the kn command line, further lowering the barrier to entry.
The above are just some of the many improvements to Knative that IBMers have been working on recently, in addition to helping maintain the current code base by addressing reported issues, improving the user experience, and updating documentation.
Knative constitutes the core of the IBM Cloud Code Engine product, and with the initiatives listed above and others in the works in the community, we could not be happier about Knative joining the broader community in the CNCF.
To learn more about these initiatives, meet IBM engineers, and find out more about everything Knative, we invite you to attend (and/or submit a talk before March 8th) the first KnativeCon at KubeCon Europe in Valencia, Spain, in May 2022.