| Internet-Draft | Efficient Remote Protection | March 2026 |
| Clad, et al. | Expires 3 September 2026 | [Page] |
This document specifies Efficient Remote Protection (ERP), a mechanism for IP Fast Reroute (IP-FRR) that utilizes network notifications to activate pre-computed backup paths at nodes multiple hops upstream of a failure. ERP addresses scenarios where local protection mechanisms, such as Loop-Free Alternates (LFA) or Topology-Independent LFA (TI-LFA), result in suboptimal paths, specifically traffic hairpinning.¶
By activating protection at strategically selected upstream nodes rather than at the node immediately adjacent to the failure, ERP preserves routing optimality and prevents bandwidth waste. ERP applies to both complete link/node failures and performance degradations such as congestion or reduced link capacity. This makes ERP particularly beneficial in networks with high link utilization, such as AI data centers and Data Center Interconnect (DCI) networks.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 3 September 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
IP Fast Reroute (IP-FRR) mechanisms ([RFC5714]) enable rapid traffic rerouting upon link or node failures by pre-computing and installing backup paths. Traditional IP-FRR mechanisms perform protection at the Point of Local Repair (PLR), which is the node directly adjacent to the failed resource. While this local protection model provides robust and immediate failure response, it has limitations:¶
Topology Dependency: Loop-Free Alternates (LFA, [RFC5286]) and Remote LFAs (RLFA, [RFC7490]) do not provide complete protection coverage in all topologies, as described in [RFC6571].¶
Path Optimality: While Topology-Independent Loop-Free Alternate (TI-LFA, [RFC9855]) provides complete protection coverage and enforces the post-convergence path from the PLR's perspective, it may steer traffic through suboptimal paths from the perspective of upstream nodes. Specifically, when the PLR's backup path traverses nodes that are upstream of the PLR on the primary path, traffic originating from or transiting through those nodes experiences hairpinning, where packets flow toward the PLR before being redirected back toward the destination.¶
Efficient Remote Protection (ERP) addresses the path optimality limitation by introducing a notification-triggered protection model. ERP allows strategically selected upstream nodes to pre-compute and install backup paths that protect against failures and performance degradations multiple hops away, and to activate these paths upon receiving a network notification ([I-D.ietf-rtgwg-net-notif-ps]) from the node adjacent to the affected resource. For complete failures, ERP reroutes all traffic; for performance degradations, ERP may load-balance traffic across primary and backup paths. This approach enables traffic to be rerouted from the most efficient location(s) in the network, avoiding hairpins and preserving optimal routing under failure and degradation conditions.¶
ERP is particularly beneficial in networks with high link utilization, such as those supporting AI workloads, where traffic patterns are highly synchronized, flows are large, and link capacity must be used efficiently.¶
This document uses the following terms:¶
Point of Local Repair (PLR): The node directly adjacent to a failed resource (link or node). In traditional IP-FRR mechanisms, the PLR is responsible for activating backup paths.¶
Point of Remote Repair (PRR): A node that is one or more hops upstream of the PLR and that activates a pre-computed backup path upon receiving a network notification about a remote failure or performance degradation.¶
Protected Resource: A link or node for which backup paths are computed and installed.¶
Network Notification: A signal sent by a node upon detecting a local failure or performance degradation, intended to trigger protection mechanisms at remote nodes. Network notifications are described in [I-D.ietf-rtgwg-net-notif-ps].¶
Hairpin: A suboptimal routing condition where traffic is forwarded toward a destination via a node that is located further from the destination than the traffic's current location, resulting in unnecessary bandwidth consumption.¶
Q-Space: The set of nodes from which a destination can be reached without traversing a given failed resource. This term is defined in [RFC9855].¶
This section illustrates the path efficiency problem that ERP addresses through two scenarios.¶
Consider a generic scenario where a node R is adjacent to a resource (link or node) X, and traffic destined to D normally traverses X. Node R must protect destination D against the failure of resource X.¶
If node R has a directly attached LFA neighbor Q for destination D with respect to resource X, as shown in Figure 1 and Figure 2, R installs Q as a backup next-hop for destination D and activates it upon failure of resource X. In this case, the backup path is guaranteed not to create a hairpin, because a fundamental property of LFA is that the LFA neighbor Q is not upstream of X (and therefore not upstream of R) on the shortest path to D.¶
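As a non-normative illustration, the loop-free property can be checked with the basic inequality of [RFC5286]: a neighbor N of R is a loop-free alternate for D when Distance(N, D) < Distance(N, R) + Distance(R, D). The sketch below applies that test to two small topologies. The topologies, link costs, and function names are assumptions made for this example and are not part of this specification.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra over a dict-of-dicts graph; returns {node: cost from source}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist

def loop_free_alternates(graph, r, dest, primary_next_hop):
    """Neighbors N of r with Distance(N, D) < Distance(N, r) + Distance(r, D)."""
    from_r = shortest_distances(graph, r)
    lfas = []
    for nbr in graph[r]:
        if nbr == primary_next_hop:
            continue  # the primary next hop is not an alternate
        from_nbr = shortest_distances(graph, nbr)
        if from_nbr[dest] < from_nbr[r] + from_r[dest]:
            lfas.append(nbr)
    return lfas

# Hypothetical topologies; the protected resource X is the link R-D in both.
TRIANGLE = {"R": {"D": 1, "Q": 1}, "Q": {"R": 1, "D": 1}, "D": {"R": 1, "Q": 1}}
SQUARE = {
    "U": {"R": 1, "Q": 1},
    "R": {"U": 1, "D": 1},
    "Q": {"U": 1, "D": 2},
    "D": {"R": 1, "Q": 2},
}

print(loop_free_alternates(TRIANGLE, "R", "D", "D"))  # Q qualifies
print(loop_free_alternates(SQUARE, "R", "D", "D"))    # no neighbor qualifies
```

In the second topology R has no directly attached LFA, which is the situation where a TI-LFA backup path through upstream nodes becomes necessary.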
If node R does not have a directly attached LFA for destination D with respect to resource X, R may still be able to protect against the failure of X by using TI-LFA to enforce a path through one or more intermediate nodes U_0, U_1, ..., U_k before reaching a node Q in the Q-Space of D with respect to resource X. The Q-Space, defined in [RFC9855], is the set of nodes from which the path to D is unaffected by the failure of X.¶
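The Q-Space membership test can be sketched as follows for the case where X is a link. The sketch is non-normative and assumes symmetric link costs; it conservatively excludes a node if any of its equal-cost shortest paths to D uses the link. The topology and the function names are hypothetical.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra over a dict-of-dicts graph; returns {node: cost from source}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist

def q_space(graph, dest, link):
    """Nodes whose shortest paths to dest never traverse link (a, b)."""
    a, b = link
    cost = graph[a][b]
    inf = float("inf")
    to_dest = shortest_distances(graph, dest)  # symmetric costs assumed
    from_a = shortest_distances(graph, a)
    from_b = shortest_distances(graph, b)
    members = set()
    for node in graph:
        # Cost of the best path that is forced across the link, in either direction.
        via_ab = from_a.get(node, inf) + cost + to_dest.get(b, inf)
        via_ba = from_b.get(node, inf) + cost + to_dest.get(a, inf)
        if to_dest.get(node, inf) < min(via_ab, via_ba):
            members.add(node)
    return members

# Hypothetical topology: the primary path is S -> U -> R -> D, and X is link R-D.
GRAPH = {
    "S": {"U": 1},
    "U": {"S": 1, "R": 1, "Q": 1},
    "R": {"U": 1, "D": 1},
    "Q": {"U": 1, "D": 2},
    "D": {"R": 1, "Q": 2},
}

print(sorted(q_space(GRAPH, "D", ("R", "D"))))  # ['D', 'Q']
```

Here U and S fall outside the Q-Space because their shortest path to D crosses the link R-D, so a repair path from R has to pass through U before it reaches Q.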
However, these intermediate nodes (U_0, U_1, ..., U_k) that are outside the Q-Space are upstream of X, and may be upstream of R, on the primary path to D. When traffic originates at or transits through such a node U_i, it follows the primary path toward R. Upon reaching R, the traffic may be redirected via a TI-LFA backup path back through U_i and then onward through Q to reach D. This creates a hairpin, where traffic is unnecessarily transmitted from U_i to R and back, as depicted in Figure 3.¶
This hairpin consumes unnecessary bandwidth on the links between U_i and R in both directions, potentially causing congestion in networks with high link utilization, and increases end-to-end latency.¶
The principle of ERP is to prevent hairpins such as the one depicted in Figure 3 by installing a backup path to D (protecting against the failure of resource X) at a node U that is upstream of R and through which traffic would otherwise hairpin. When U receives a network notification from R indicating the failure of resource X, U activates its pre-installed backup path. This allows traffic originating at or transiting through U to be rerouted directly toward Q and then to D, without traversing R, as depicted in Figure 4.¶
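The difference between the two repair models can be made concrete by counting link traversals. The following non-normative sketch compares a path repaired at R with a path repaired at U; the node sequences describe a hypothetical topology in which a source S sends traffic through U, and the helper names are assumptions of this example.

```python
from collections import Counter

def directed_link_load(path):
    """Count how many times each directed link (a, b) is traversed."""
    return Counter(zip(path, path[1:]))

def hairpinned_links(path):
    """Links that the path crosses in both directions."""
    load = directed_link_load(path)
    return sorted({tuple(sorted(link)) for link in load if link[::-1] in load})

# Hypothetical node sequences after the failure of link R-D.
TI_LFA_PATH = ["S", "U", "R", "U", "Q", "D"]  # repaired at the PLR (R)
ERP_PATH = ["S", "U", "Q", "D"]               # repaired at the PRR (U)

wasted_hops = (len(TI_LFA_PATH) - 1) - (len(ERP_PATH) - 1)
print(hairpinned_links(TI_LFA_PATH))  # [('R', 'U')]
print(hairpinned_links(ERP_PATH))     # []
print(wasted_hops)                    # 2
```

Each packet repaired at R crosses the link between U and R twice, once in each direction, before it makes progress toward D.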
The ERP backup path is enforced using Segment Routing ([RFC8402]) and encoded as a loop-free segment list from node U that steers traffic to a node in the Q-Space of D with respect to resource X, without traversing the failed resource X.¶
This section describes the steps for computing and installing ERP backup paths. The specific algorithms for path computation and the protocol mechanisms for network notification subscription are outside the scope of this document.¶
For a given destination D and protected resource X adjacent to node R, a node U is a candidate Point of Remote Repair (PRR) if:¶
Node U is upstream of R on the primary path to D, and¶
The TI-LFA backup path computed by R for destination D (protecting resource X) would traverse U, creating a hairpin for traffic originating at or transiting through U.¶
ERP backup paths should preferably be installed at the PRR candidates closest to the protected resource, to limit the number of ERP backup paths required.¶
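The two candidate conditions and the proximity preference can be sketched as follows. This is a non-normative example: it assumes that the primary path and R's TI-LFA backup path are already known as node sequences, and the function name and topology are hypothetical.

```python
def prr_candidates(primary_path, plr, plr_backup_path):
    """Return PRR candidates, ordered from closest to the PLR to farthest.

    primary_path:    node sequence toward D, containing the PLR.
    plr_backup_path: node sequence of the PLR's backup path, starting at the PLR.
    """
    upstream = primary_path[:primary_path.index(plr)]  # condition 1
    on_backup = set(plr_backup_path[1:])               # condition 2
    return [node for node in reversed(upstream) if node in on_backup]

# Hypothetical example: R's backup path doubles back through U0 and U1.
PRIMARY = ["S", "U1", "U0", "R", "D"]
R_BACKUP = ["R", "U0", "U1", "Q", "D"]

print(prr_candidates(PRIMARY, "R", R_BACKUP))  # ['U0', 'U1']
```

S is upstream of R but is not on R's backup path, so it is not a candidate. The first element of the result is the candidate closest to the protected resource.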
A PRR candidate node U subscribes to network notifications for resource X from node R. The mechanism for establishing this subscription depends on the specific network notification protocol used and is outside the scope of this document.¶
The PRR node U computes a backup path to destination D that protects against the failure of resource X and does not create a hairpin. The backup path is encoded as a loop-free segment list that steers traffic to a node in the Q-Space of D with respect to resource X.¶
The segment list should terminate at the first node in the Q-Space along the backup path to minimize the length of the segment list and allow traffic to follow the regular shortest path from that point onward.¶
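The truncation rule can be illustrated with the non-normative sketch below. It computes the shortest path from the PRR to D with the protected link removed and keeps the hops up to the first node in the Q-Space. The result is an explicit hop list only; how those hops are compressed into node or adjacency segments is not shown. The topology, the precomputed Q-Space, and the function names are assumptions of this example.

```python
import heapq

def shortest_path(graph, src, dst, failed_link):
    """Shortest node sequence from src to dst that avoids failed_link."""
    banned = {failed_link, failed_link[::-1]}
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if (node, nbr) in banned:
                continue
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                prev[nbr] = node
                heapq.heappush(heap, (d + cost, nbr))
    if dst not in dist:
        return None  # destination unreachable without the failed link
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def erp_backup_hops(graph, prr, dest, failed_link, q_space):
    """Hops from the PRR up to and including the first Q-Space node."""
    path = shortest_path(graph, prr, dest, failed_link)
    if path is None:
        return None
    hops = []
    for node in path[1:]:
        hops.append(node)
        if node in q_space:
            break  # from here the regular shortest path avoids X
    return hops

# Hypothetical topology; X is link R-D and the Q-Space of D is {Q, D}.
GRAPH = {
    "S": {"U": 1},
    "U": {"S": 1, "R": 1, "Q": 1},
    "R": {"U": 1, "D": 1},
    "Q": {"U": 1, "D": 2},
    "D": {"R": 1, "Q": 2},
}

print(erp_backup_hops(GRAPH, "U", "D", ("R", "D"), {"Q", "D"}))  # ['Q']
```

The full post-failure path from U is U, Q, D, but the list stops at Q because traffic delivered to Q reaches D on its regular shortest path.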
The PRR node U installs the computed backup path in its forwarding plane and associates it with the network notification for resource X from node R.¶
Upon receiving a network notification for resource X from node R, a PRR node U activates its pre-installed ERP backup path for destination D as follows.¶
If the notification indicates a complete failure of resource X, node U immediately reroutes all traffic destined to D through the ERP backup path.¶
If the notification indicates a performance degradation of resource X (such as reduced link capacity or congestion), node U may load-balance traffic between the primary path (via R) and the ERP backup path.¶
When load-balancing is employed, the load-balancing ratio should be determined based on the severity of the performance degradation indicated in the notification and may be adjusted dynamically as conditions change, based on subsequent notifications.¶
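The installation and activation behavior can be sketched as a table keyed by the notifying node and the protected resource. The example below is non-normative. In particular, the `remaining_capacity` field and the rule that the backup share equals the lost capacity fraction are assumptions made for illustration; this document does not define the notification contents or the load-balancing policy. All class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ErpEntry:
    dest: str
    primary_next_hop: str
    backup_hops: list
    backup_share: float = 0.0  # fraction of traffic sent on the backup path

class PrrTable:
    """ERP backup paths at a PRR, keyed by (notifier, protected resource)."""

    def __init__(self):
        self._by_trigger = {}

    def install(self, notifier, resource, entry):
        self._by_trigger.setdefault((notifier, resource), []).append(entry)

    def on_notification(self, notifier, resource, remaining_capacity):
        """Assumed severity encoding: 0.0 = complete failure, 1.0 = healthy."""
        clamped = min(max(remaining_capacity, 0.0), 1.0)
        entries = self._by_trigger.get((notifier, resource), [])
        for entry in entries:
            # Assumed policy: shift the lost capacity fraction to the backup.
            entry.backup_share = 1.0 - clamped
        return entries

table = PrrTable()
entry = ErpEntry(dest="D", primary_next_hop="R", backup_hops=["Q"])
table.install("R", "R-D", entry)

table.on_notification("R", "R-D", 0.0)   # complete failure
print(entry.backup_share)                # 1.0
table.on_notification("R", "R-D", 0.75)  # degradation, 75% capacity left
print(entry.backup_share)                # 0.25
```

A later notification simply overwrites the share, which models the dynamic adjustment described above. A notification for a resource with no installed entry leaves the table unchanged.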
ERP is designed to coexist with existing IP-FRR mechanisms such as LFA and TI-LFA. Traffic that passes through a PRR with an installed and activated ERP backup path will be protected at that upstream location, while traffic that does not pass through an ERP-enabled PRR will continue to be protected by traditional local protection mechanisms at the PLR.¶
This allows for incremental deployment of ERP in a network. Operators may initially deploy ERP at strategic nodes (such as those carrying high traffic volumes or those most susceptible to hairpinning) without disrupting existing protection schemes. Over time, ERP deployment can be expanded to additional nodes as needed.¶
The PLR (node R) must maintain its local protection mechanisms (e.g., TI-LFA backup paths) even when ERP is deployed at upstream nodes. This ensures protection for traffic that does not pass through any PRR and provides a fallback when a PRR has not activated its ERP backup path.¶
The effectiveness of ERP depends on the reliable and timely delivery of network notifications from the PLR to PRR nodes. Operators should ensure that the network notification mechanism provides sufficient reliability and low latency to meet the protection requirements of the network.¶
If a PRR does not receive a notification within an expected timeframe after a failure (e.g., due to notification loss), traffic will continue to be protected by the PLR's local protection mechanism, albeit potentially with hairpinning.¶
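The fallback behavior can be illustrated by tracing the node sequence a packet follows with and without an activated PRR. This is a non-normative sketch; the paths describe a hypothetical topology and the function name is an assumption of this example.

```python
def forwarding_path(primary_path, prr, erp_backup, plr, plr_backup, prr_activated):
    """Node sequence followed after the protected resource has failed.

    erp_backup starts at the PRR and plr_backup starts at the PLR.
    """
    if prr_activated and prr in primary_path:
        # Repaired upstream: leave the primary path at the PRR.
        return primary_path[:primary_path.index(prr)] + erp_backup
    # Notification lost, or traffic never crosses the PRR: repaired at the PLR.
    return primary_path[:primary_path.index(plr)] + plr_backup

PRIMARY = ["S", "U", "R", "D"]
ERP_BACKUP = ["U", "Q", "D"]
PLR_BACKUP = ["R", "U", "Q", "D"]

print(forwarding_path(PRIMARY, "U", ERP_BACKUP, "R", PLR_BACKUP, True))
# ['S', 'U', 'Q', 'D']
print(forwarding_path(PRIMARY, "U", ERP_BACKUP, "R", PLR_BACKUP, False))
# ['S', 'U', 'R', 'U', 'Q', 'D']
```

Both outcomes deliver traffic to D. The second one corresponds to a lost notification and shows the hairpin between U and R that the PLR's local protection introduces.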
Operators should consider the capacity of backup paths when deploying ERP. While ERP avoids hairpinning and improves path efficiency, the backup paths themselves must have sufficient capacity to carry the redirected traffic without causing congestion.¶
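A simple planning check for this consideration is sketched below. It is non-normative: the traffic figures, the per-link capacities, and the function name are hypothetical, and a real deployment would take these values from its own telemetry and planning tools.

```python
def overloaded_links(backup_path, redirected_gbps, current_load_gbps, capacity_gbps):
    """Return (link, projected load) for backup links that would exceed capacity."""
    over = []
    for link in zip(backup_path, backup_path[1:]):
        projected = current_load_gbps.get(link, 0.0) + redirected_gbps
        if projected > capacity_gbps[link]:
            over.append((link, projected))
    return over

# Hypothetical figures for the ERP backup path U -> Q -> D.
BACKUP_PATH = ["U", "Q", "D"]
LOAD = {("U", "Q"): 50.0, ("Q", "D"): 70.0}
CAPACITY = {("U", "Q"): 100.0, ("Q", "D"): 100.0}

print(overloaded_links(BACKUP_PATH, 40.0, LOAD, CAPACITY))
# [(('Q', 'D'), 110.0)]
```

In this example, redirecting 40 Gbps fits on the link from U to Q but would push the link from Q to D past its capacity.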
To be done.¶
This document does not require any IANA actions.¶