What Is Linkerd?
Linkerd is a lightweight open source service mesh developed primarily for Kubernetes. It provides added security, visibility, and reliability to cloud native applications, allowing you to observe all microservices running in a cluster without changing the code.
You can monitor requests, success rates, and latency issues for individual services. Linkerd also offers real-time network traffic analysis to help you diagnose failures. It is designed for ease of use, and can be implemented without complicated setup. Linkerd can process thousands of requests every second.
Linkerd: 13 key features
Here are the main features of Linkerd.
1. Authorization policies
Linkerd has authorization policies that let you control the type of traffic allowed in your meshed pods. For instance, you can limit communication with certain services (or HTTP paths to services) to specific other services. Linkerd can enforce mTLS on specific ports.
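For example, here is a minimal sketch of such a policy, assuming Linkerd 2.11 or later where authorization is expressed with Server and ServerAuthorization resources; the namespace, labels, and service account names are illustrative:

kubectl apply -f - <<EOF
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  namespace: my-app
  name: web-http
spec:
  # Select the meshed pods and port that this policy protects.
  podSelector:
    matchLabels:
      app: web
  port: http
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: my-app
  name: web-from-frontend
spec:
  server:
    name: web-http
  # Only meshed clients running under this service account may call the server.
  client:
    meshTLS:
      serviceAccounts:
        - name: frontend
EOF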
2. Automated mTLS
Linkerd enables mutually authenticated Transport Layer Security (mTLS) by default, securing all TCP traffic between the pods in the service mesh. This means Linkerd adds encrypted and authenticated communication to applications without requiring any further action.
The Linkerd control plane runs on the data plane, which means that all communication between the control plane components has automatic mTLS-based security.
3. Automated proxy injection
If a workload or namespace includes the injection annotation – linkerd.io/inject: enabled – Linkerd will automatically add a data plane proxy to the pod. This capability, called proxy injection, works with any workload, including pods and deployments.
Linkerd implements proxy injection as a Kubernetes webhook to add the proxy directly to a pod in the cluster. This works regardless of how the pods were created (i.e., by a CI/CD solution or kubectl). Learn more in our detailed guide to Linkerd proxy (coming soon).
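As a simple illustration (the namespace name is hypothetical), you can opt an existing namespace into injection and let the webhook add the proxy when the pods are re-created:

# Mark every workload in the namespace for proxy injection.
kubectl annotate namespace my-app linkerd.io/inject=enabled

# Restart the deployments so the webhook injects the proxy into the new pods.
kubectl -n my-app rollout restart deploy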
4. Tracing
Distributed tracing is a useful tool for debugging performance issues in a distributed system, such as identifying the latency and bottlenecks associated with each system component. You can configure Linkerd to emit traces from each proxy, showing exactly how long requests and responses take.
Distributed tracing differs from most other Linkerd features because it requires code and configuration changes. Linkerd also offers features typically associated with distributed tracing that don’t require application or configuration changes.
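As a rough sketch, assuming a recent Linkerd release where tracing ships as the linkerd-jaeger extension, enabling the collector and Jaeger backend looks like this (your application still has to propagate trace context headers itself):

linkerd jaeger install | kubectl apply -f -
linkerd jaeger check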
5. High availability (HA) mode
The Linkerd control plane can operate in a high availability (HA) mode for production workloads. HA mode runs three replicas of critical control plane components and sets production-ready memory and CPU resource requests for both the control plane components and the data plane proxies.
In this mode the proxy injector becomes a required component: pods will not be scheduled unless a functional injector can process them. HA mode also configures anti-affinity policies on key control plane components so that, by default, their replicas are spread across different nodes and zones.
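A minimal sketch of enabling HA mode at install time, assuming a recent release where the CRDs are installed in a separate step:

linkerd install --crds | kubectl apply -f -
linkerd install --ha | kubectl apply -f -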
6. Fault injection
Fault injection is a chaos engineering technique that artificially increases a service’s error rate to determine its impact on the overall system. Traditionally, this required modifying the service code and adding a library to perform the fault injection. Linkerd can implement fault injection without changing your service code, requiring little or no configuration.
7. Load balancing
Linkerd provides automatic load balancing for requests using HTTP or gRPC connections, increasing resource efficiency across all target endpoints without requiring configuration. Linkerd also balances TCP connections.
Linkerd automatically routes requests to the fastest endpoint using the Exponentially Weighted Moving Average (EWMA) algorithm. This load balancing capability helps reduce end-to-end latency.
8. Multi-cluster communication
Linkerd enables you to connect Kubernetes services in different clusters in a secure, application-transparent, and network topology-independent way.
Like intra-cluster connections, the cross-cluster connections in Linkerd are transparent to your application code. Whether communication occurs within a cluster, between clusters in a VPC or data center, or over the internet, Linkerd establishes the connection using mTLS for encryption and authentication at both ends.
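As an illustration, assuming two kubeconfig contexts named east and west, linking the clusters with the multicluster extension looks roughly like this:

# Install the multicluster extension in both clusters.
linkerd --context=east multicluster install | kubectl --context=east apply -f -
linkerd --context=west multicluster install | kubectl --context=west apply -f -

# Create a link so services exported from east become resolvable from west.
linkerd --context=east multicluster link --cluster-name east | kubectl --context=west apply -f -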
9. CNI plugin
Linkerd’s data plane transparently routes all TCP traffic from and to each meshed pod to the relevant proxy. This functionality enables Linkerd to perform actions without the application being aware.
By default, this rewiring occurs with an init container, which uses iptables to install routing rules for each pod during pod startup time. This requires the CAP_NET_ADMIN capability, which may not be granted to pods in some clusters.
To address this challenge, Linkerd can optionally run iptables rules in a CNI plugin rather than in an init container, eliminating the need for CAP_NET_ADMIN capability.
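A sketch of that option: install the CNI plugin first, then tell the control plane to rely on it instead of init containers:

linkerd install-cni | kubectl apply -f -
linkerd install --crds | kubectl apply -f -
linkerd install --linkerd-cni-enabled | kubectl apply -f -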
10. Service profiles
Service profiles are custom Kubernetes resources (CRDs) that provide additional information about a service. You can use a service profile to define a list of routes for a service. Each route uses a regular expression to match requests to the service, allowing Linkerd to report per-path metrics and enable per-path features like retries and timeouts.
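Here is a minimal sketch of a service profile; the service, namespace, and route are illustrative, and the resource name must be the service’s fully qualified DNS name. Profiles can also be generated from OpenAPI or protobuf definitions with the linkerd profile command instead of being written by hand.

kubectl apply -f - <<EOF
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  # Must match the fully qualified name of the Kubernetes service.
  name: web-svc.my-app.svc.cluster.local
  namespace: my-app
spec:
  routes:
  - name: GET /api/list
    condition:
      method: GET
      pathRegex: /api/list
EOF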
11. Timeouts and retries
Automated retries are a powerful mechanism that allows the service mesh to handle transient or partial application failures gracefully. However, if retries are implemented incorrectly, small errors can escalate and bring down the entire system. Linkerd limits this risk with retry budgets, which cap how much extra load retries are allowed to add, increasing the system’s reliability.
Timeouts complement retries. Even when a request is retried, the client needs an upper bound on how long it will wait before giving up entirely; without a timeout, a request could be retried indefinitely, leaving the client waiting on every attempt.
A service profile can mark specific routes as retryable and set a timeout for each route. The Linkerd proxy then applies the appropriate retries and timeouts whenever it calls the service. Retries and timeouts are always applied on the client side.
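Continuing the hypothetical profile sketched above, marking one route as retryable with a retry budget and giving another a timeout could look like this:

kubectl apply -f - <<EOF
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: web-svc.my-app.svc.cluster.local
  namespace: my-app
spec:
  routes:
  - name: GET /api/list
    condition:
      method: GET
      pathRegex: /api/list
    isRetryable: true          # failed calls on this route may be retried
  - name: POST /api/vote
    condition:
      method: POST
      pathRegex: /api/vote
    timeout: 300ms             # give up on the call after 300 milliseconds
  retryBudget:
    retryRatio: 0.2            # retries may add at most 20% extra load
    minRetriesPerSecond: 10
    ttl: 10s
EOF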
12. Telemetry and monitoring
Another powerful Linkerd feature is its extensive set of observability tools for measuring and reporting the behavior of applications in the service mesh. Linkerd has no direct knowledge of what happens inside the service code, but it has deep insight into that code’s external behavior.
The Linkerd monitoring and telemetry features work automatically without any user effort. These features record the following metrics:
- Top-line metrics—including latency distribution, request volume, and success rate for gRPC and HTTP traffic
- TCP-level metrics—for example, bytes in or out for TCP traffic
- Per-service metrics—referring to specific services
- Per-caller-receiver-pair metrics
- Service profile metrics—referring to specific routes or paths
Linkerd can also generate topology diagrams that show runtime relationships between services. It also provides on-demand, real-time request sampling.
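These metrics are typically consumed through the viz extension. As a sketch (the emojivoto namespace refers to the demo application installed later in this guide):

# Install the on-cluster metrics stack (Prometheus, dashboard, tap).
linkerd viz install | kubectl apply -f -

# Golden metrics per deployment, live request sampling, and the web dashboard.
linkerd viz stat deployments -n emojivoto
linkerd viz tap deploy/web -n emojivoto
linkerd viz dashboard &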
13. Traffic splitting
The traffic splitting feature lets you dynamically divert any portion of traffic destined for a given Kubernetes service to another destination service. You can split traffic to implement complex rollout strategies like canary and blue/green deployments. For example, you can slowly ease traffic from an old service version to a new version.
Combining Linkerd’s metrics with traffic splitting allows for a more robust deployment approach that automatically accounts for the latency and success rates of older and newer service versions.
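As an illustration, here is a sketch of an SMI TrafficSplit that shifts roughly 10% of a service’s traffic to a newer version; the API version, namespace, and service names are assumptions and may vary by Linkerd release:

kubectl apply -f - <<EOF
apiVersion: split.smi-spec.io/v1alpha1
kind: TrafficSplit
metadata:
  name: web-split
  namespace: my-app
spec:
  # The apex service that clients address.
  service: web-svc
  backends:
  - service: web-svc           # current version keeps ~90% of requests
    weight: 900m
  - service: web-svc-v2        # new version receives ~10% of requests
    weight: 100m
EOF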
Linkerd Architecture
At a high level, the Linkerd architecture breaks down into a control plane and a data plane.
Control Plane
Linkerd’s control plane includes multiple services within a dedicated namespace in Kubernetes (the default namespace is “linkerd”). Control plane components include:
- Destination service—the data plane’s proxies use this service to determine several behavioral aspects. It can retrieve service discovery, policy, and service profile data. Discovery data specifies the destination of a request and its expected TLS identity. Policy data specifies the request types allowed, while service profile data informs per-route metrics, timeouts, and retries.
- Identity service—serves as the TLS Certificate Authority accepting CSRs from the proxies and returning signed certificates. It issues these certificates when the proxies initialize, enabling connections between proxies to implement mTLS.
- Proxy injector—a Kubernetes controller that receives webhook requests upon each pod creation. This admission controller inspects each resource for an annotation specific to Linkerd (linkerd.io/inject: enabled). If this annotation is present, the proxy injector modifies the pod specification to add linkerd-proxy and proxy-init containers alongside the appropriate start-time configuration.
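After installation, these components appear as deployments in the linkerd namespace, which you can list with kubectl:

# Expect deployments for the destination, identity, and proxy-injector components.
kubectl -n linkerd get deploy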
Data Plane
Linkerd’s data plane contains lightweight micro-proxies deployed as sidecars within the application pods. Each micro-proxy transparently intercepts the pod’s inbound and outbound TCP traffic, using iptables rules set up by the linkerd-init container or the CNI plugin.
Data plane components include:
- Linkerd2-proxy—a transparent, lightweight micro-proxy designed specifically for the service mesh, written in the Rust programming language. This proxy is not general-purpose. It supports service discovery via DNS and the Destination gRPC API. The Linkerd proxy provides transparent, zero-configuration proxying for TCP, HTTP, and WebSocket traffic, alongside automation features such as automatic export of Prometheus metrics, load balancing (Layer 4 and Layer 7), and TLS. It also has an on-demand diagnostics API.
- Linkerd-init container—every meshed pod includes a Kubernetes init container that runs before the other containers start. It uses iptables to route all of the pod’s inbound and outbound TCP traffic through the Linkerd proxy. The init container can run in different modes that determine which variant of iptables it uses.
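You can see both pieces on any meshed pod; for example (the pod name is a placeholder):

# List the init containers and regular containers of a meshed pod.
# Expect linkerd-init (unless the CNI plugin is used) alongside linkerd-proxy and the app container.
kubectl get pod <meshed-pod> -o jsonpath='{.spec.initContainers[*].name}{"\n"}{.spec.containers[*].name}{"\n"}'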
Linkerd Installation Tutorial: Getting Started
The following guide covers the Linkerd basics: installing the command-line interface (CLI) on your machine, installing the Linkerd control plane on a Kubernetes cluster, and “meshing” an application by adding the data plane. The guide and the code examples are based on the official Linkerd documentation.
Configuration
First, ensure you can access your Kubernetes cluster and kubectl command from your local machine. Run the following script to validate the Kubernetes setup:
kubectl version --short
The output should include client and server version components.
Installing the CLI
If you’re a first-time Linkerd user, you must install the CLI on your machine. The CLI lets you interact with the Linkerd deployment.
Manually install the CLI using:
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
Follow the printed instructions to add the CLI to your PATH.
Next, check the CLI runs correctly using:
linkerd version
You should see the CLI (client) version; the server version will appear as “unavailable” until you install the control plane on the cluster.
Validating the cluster
There are many ways to configure a Kubernetes cluster – before installing the Linkerd control plane, check that the configuration is correct by running:
linkerd check --pre
If a check fails, follow the relevant links to fix the issue before continuing.
Installing Linkerd on the cluster
Once the CLI runs locally, and your cluster is ready, you can install Linkerd on the Kubernetes cluster by running:
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
These commands create Kubernetes manifests with the requisite resources for Linkerd and add them to the cluster. They install the Linkerd custom resource definitions (CRDs) and control plane. After one or two minutes (depending on the connection speed), verify the installation using:
linkerd check
Installing a demo application
A fresh Linkerd installation doesn’t do much on its own; it needs an application to mesh. You can use a demo app like Emojivoto, a simple Kubernetes application that uses HTTP and gRPC calls to let users vote on emojis.
Use the following command to install the app in the emojivoto namespace in your cluster:
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/emojivoto.yml \
  | kubectl apply -f -
Next, you have to “mesh” the app to support Linkerd – this involves adding the Linkerd data plane proxies. You can mesh live applications with zero downtime using Kubernetes rolling deployments.
Run the following:
kubectl get -n emojivoto deploy -o yaml \
  | linkerd inject - \
  | kubectl apply -f -
This command pipes the manifests for all deployments in the namespace through linkerd inject, which adds the data plane proxy injection annotation, and reapplies them to the cluster. Like the install command, the inject command is a pure text operation, allowing you to inspect its inputs and outputs. Kubernetes then executes a rolling deployment and updates the pods with the data plane proxies.
After adding Linkerd to your application, you can verify that everything functions correctly on the data plane using:
linkerd -n emojivoto check --proxy
Linkerd vs. Istio
Istio is a popular open source service mesh platform. Organizations considering a service mesh are often faced with a choice between Linkerd and Istio. Let’s compare the two solutions on architecture, performance, and security.
Architecture
Istio has a similar architecture to Linkerd. A key difference between the two solutions is the proxy each deploys in a sidecar container alongside the microservices:
- Istio uses the open source Envoy proxy, which is considered a de facto standard with over 300 companies contributing to the project.
- Linkerd 2.0 has its own proxy, linkerd2-proxy, which is maintained primarily by Buoyant.
Performance
In most environments, Istio and Linkerd have similar performance. Key differences:
- Istio uses the Envoy proxy which is written in C++, offers excellent performance, and is proven in large scale production environments.
- Linkerd uses its linkerd2-proxy micro-proxy, written in Rust. It offers good performance in smaller environments.
Security
Both Istio and Linkerd support certificate rotation and external root certificates. In addition:
- Istio provides more security features, including mutual TLS (mTLS) for both HTTP and TCP traffic. Istio also allows security solutions to integrate with its policy management framework, making it possible to set granular rules that limit which applications can communicate with each other.
- Linkerd provides mTLS by default on TCP connections.
Linkerd vs. Gloo Mesh
While Linkerd has been in the market for quite a while, Solo.io has decided to support only Istio in the Gloo Mesh product. Istio has a more robust community of contributors to the open source project and has been proven in large, production enterprise environments.
From a business standpoint, adopting an enterprise solution for Istio service mesh management means you will have reduced risk, increased security, and easier management of the connectivity between Kubernetes-based and legacy applications. Istio management even helps with application modernization and “migration to cloud” initiatives by smoothing the adoption process and providing ongoing updates and support.
The Gloo Mesh difference
By default, basic open source distributions of Istio don’t go far enough to deliver features needed for comprehensive application networking. With Gloo Mesh, Solo builds upon and hardens the Istio distribution for production. Solo.io adds comprehensive functionality to your service mesh, reducing complexity while increasing security, reliability, and observability for consistent applications and microservices connectivity.
Solo.io believed early on that Istio’s powerful features would shape the future of the service mesh. As a top Istio contributor with more than 40,000 open source contributions and 9,400 GitHub stars, and a member of the Istio steering committee, Solo plays a significant role in guiding the project and building an essential foundation for today’s service mesh market.
It is also important to consider that community support for open source software such as Istio does not, on its own, meet the requirements of production deployments. Your organization will need a vendor like Solo.io on standby to help you out. Inevitably there will be issues, and when a CVE (common vulnerabilities and exposures) incident is discovered, it is reassuring to know that someone can quickly patch your code and even backport the fix to older versions if you haven’t kept up with the rapid pace of new releases.
Looking toward the future of Istio, Solo has contributed a great deal of work to build Ambient Mesh, which enables a new sidecar-less data plane for Istio. Istio Ambient Mesh is an alternative Istio architecture that moves the proxy function from the sidecar to the node level, reducing costs by lowering the compute and memory requirements per node and reducing the number of proxies to manage. Gloo Mesh will allow users to run both Istio’s sidecar-proxy mode and Ambient Mesh mode, giving them the flexibility to balance cost, operational simplicity, and performance according to application needs.