FANTEL                                                           X. Geng
Internet-Draft                                                    Huawei
Intended status: Standards Track                                  P. Huo
Expires: 5 December 2025                                       ByteDance
                                                                  Y. Zhu
                                                           China Telecom
                                                                   D. Li
                                                     Tsinghua University
                                                                W. Cheng
                                                            China Mobile
                                                                  C. Liu
                                                            China Unicom
                                                             3 June 2025

Requirements of Fast Notification for Traffic Engineering and Load Balancing
draft-geng-fantel-fantel-requirements-00

Abstract

This document defines the requirements for Fast Notification for Traffic Engineering and Load Balancing (FaNTEL), a mechanism designed to deliver near real-time network status updates. FaNTEL enables fast failure and congestion notifications, supporting rapid protection switching and dynamic load balancing. By providing low-latency alerts, it helps networks respond quickly to link failures and congestion events, improving service reliability and performance.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 5 December 2025.

Geng, et al.             Expires 5 December 2025                [Page 1]
Internet-Draft           Requirements of Fantel                June 2025

Copyright Notice

Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1. Introduction to Fast Notification
  1.1. Background and Motivation
  1.2. Key Components of Fast Notification
2. Fast Notification for Load Balancing
  2.1. Background: Challenges in Load Balancing
  2.2. Requirements for Fast Notification in Load Balancing
  2.3. Core Benefits
3. Fast Notification for Protection
  3.1. Background: Challenges in Network Protection
  3.2. Requirements for Fast Notification in Protection
  3.3. Core Benefits
  3.4. Integration Requirements with Existing Protection Mechanisms
4. Fast Notification for Flow Control
  4.1. Background: Challenges in Flow Control
  4.2. Requirements for Fast Notification in Flow Control
  4.3. Core Benefits
  4.4. Integration with Existing Flow Control Mechanisms
5. Summary
6. Informative References
Authors' Addresses

1. Introduction to Fast Notification

1.1. Background and Motivation

In today's increasingly dynamic and complex network environments, efficient traffic management and rapid adaptation to network changes are critical.
Traditional network management systems are often limited in their ability to react quickly to sudden traffic shifts, failures, or congestion. As a result, these networks may experience performance degradation, prolonged service disruptions, or inefficient resource utilization.

The need for faster, more responsive network management has become especially evident with the rise of emerging workloads such as AI training and inference. These workloads call for flexible and agile network response, but they also introduce additional delay and complexity. Maintaining high levels of performance and availability in such environments demands faster communication and decision-making processes within the network.

Fast Notification for Traffic Engineering and Load Balancing is a mechanism that provides rapid, near real-time notification of network events (e.g., link failures, congestion, traffic shifts, or load imbalances) to relevant network nodes. By enabling swift communication between devices, fast notification facilitates quicker decision-making and faster adjustments to network routing and traffic management strategies.

The core principle of Fast Notification is to reduce the time it takes for a network node to become aware of a change in its environment and to adjust accordingly. This is achieved through high-priority, low-latency signaling mechanisms that notify nodes of changes in traffic patterns or network conditions almost immediately.

1.2. Key Components of Fast Notification

* Fast Notification Messages: Lightweight, low-latency messages that convey state changes (such as traffic or network failure events) from one node to others.
* Notification Propagation Mechanism: A reliable and efficient way to disseminate notifications quickly throughout the network.
* Triggering Mechanism for Message Sending: A mechanism that detects significant network changes (e.g., link utilization thresholds, delay spikes, packet loss) and initiates the sending of fast notification messages.
* Triggering Mechanism for Action: A mechanism that initiates an action (such as rerouting traffic or applying flow control) once the notification is received.

This requirement framework enables network nodes to quickly adapt to real-time changes, thereby improving the overall performance and efficiency of the network.

Note: The detailed mechanisms and implementations (such as message format, propagation protocols, and triggering thresholds) are outside the scope of this document and may be specified in separate documents.

2. Fast Notification for Load Balancing

2.1. Background: Challenges in Load Balancing

Load balancing is a critical function in AI networks, ensuring that network resources are efficiently allocated and that no single node or link becomes overwhelmed with excessive traffic. Proper load balancing improves network performance, prevents bottlenecks, and ensures that network services remain responsive and reliable.

However, current load balancing techniques face significant challenges in highly dynamic environments. One of the core issues is the lack of timely awareness and adaptive response to network state changes. Traditional mechanisms often rely on periodic global state synchronization or static policies, which results in delayed and inaccurate decision-making. These delays make it difficult to capture instantaneous changes such as link congestion, node failures, or traffic bursts. Moreover, load balancing decisions based on local views cannot perceive downstream contention or routing fluctuations, potentially leading to persistent traffic injection into congested paths and increased queuing and packet loss.
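
The contrast between periodic synchronization and an event-driven trigger for message sending (Section 1.2) can be illustrated with a minimal sketch. It is not part of the requirements. The class name `UtilizationTrigger`, the raise and clear thresholds (0.9 and 0.7), and the event names are assumptions chosen only for illustration, since this document leaves triggering thresholds to separate specifications. The two thresholds form a hysteresis band so that a link hovering near one threshold does not generate a burst of notifications.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UtilizationTrigger:
    """Hypothetical trigger: raise at `high`, clear at `low` (hysteresis)."""

    high: float = 0.9       # assumed raise threshold (fraction of capacity)
    low: float = 0.7        # assumed clear threshold; the gap avoids flapping
    congested: bool = False

    def observe(self, utilization: float) -> Optional[str]:
        """Return an event name when a notification is due, else None."""
        if not self.congested and utilization >= self.high:
            self.congested = True
            return "congestion"
        if self.congested and utilization <= self.low:
            self.congested = False
            return "clear"
        return None  # state unchanged, nothing to send


def run(samples: List[float]) -> List[Optional[str]]:
    """Feed utilization samples to a fresh trigger and collect its events."""
    trigger = UtilizationTrigger()
    return [trigger.observe(u) for u in samples]


events = run([0.50, 0.92, 0.95, 0.80, 0.65, 0.91])
print(events)
```

In this sketch a notification is produced only when the state changes, so a node learns of congestion at the sample where the threshold is crossed rather than at the next synchronization interval.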
Fast Notification is intended to support load balancing by providing fast, efficient notification of changes in traffic patterns, network failures, and congestion. By using high-priority, low-latency messages, Fast Notification allows network nodes to immediately adjust their load balancing decisions in response to these changes, improving resource utilization and performance.

2.2. Requirements for Fast Notification in Load Balancing

1. Traffic State Detection: Monitoring of traffic patterns, link utilization, and node load to trigger notifications on significant deviations.
2. Notification Propagation: Real-time propagation of event details (e.g., congestion, traffic shift) to relevant devices.
3. Action Adjustments: Nodes can reroute or redistribute traffic immediately upon receiving a notification.

Once a fast notification message is received, the load balancing mechanism is expected to immediately reassess the routing and traffic allocation strategy. This may involve:

* Shifting flows to underutilized paths
* Splitting traffic across multiple paths
* Throttling traffic destined for congested regions

In addition, nodes may update their local state or forward the notification upstream to further optimize the network reaction. Timely and coordinated response across the network significantly enhances load balancing effectiveness.

2.3. Core Benefits

* Real-time Traffic Information: Fast notification provides up-to-date information about the current state of the network, including traffic volume, node utilization, and link load.
* Precise Load Rebalancing: Enables immediate notifications to the affected nodes for quick traffic redistribution.
* Optimized Resource Utilization: Supports fine-grained traffic distribution on a per-packet or per-flow basis.

3. Fast Notification for Protection

3.1.
Background: Challenges in Network Protection

Network protection ensures service availability and minimizes disruptions due to failures like link outages or device malfunctions. However, traditional protection mechanisms face several limitations:

* Slow Detection and Recovery: Traditional protection often relies on periodic failure detection and centralized rerouting, resulting in recovery times that are not fast enough for modern service expectations.
* Inefficient Failover: Without fast notification, failover paths may not be activated or optimized in time, leading to service interruption.

In high-reliability scenarios, network protection must be capable of rapid detection and notification of failures to meet performance goals such as sub-50 ms recovery.

Fast Notification enables rapid notification of failures, allowing near-instantaneous and dynamic protection responses, minimizing user impact.

3.2. Requirements for Fast Notification in Protection

1. Failure Detection and Notification: Notifications are generated when failures occur and propagated in real time.
2. Precise Notification Propagation: Notifications must reach relevant nodes quickly, such as upstream routers.
3. Optimization of Backup Paths: Failure notifications can trigger optimized rerouting or pre-established backup path activation.

Upon receiving a notification of failure, protection mechanisms may immediately switch to backup paths, reroute traffic, or suppress affected routes. This ensures minimal disruption and quick recovery. Coordinated response strategies may include upstream node notification, service-aware failover, and path re-optimization based on updated network topology.

3.3. Core Benefits

* Rapid Failure Response: Enables sub-second (or even sub-50 ms) reaction to failures.
* Improved Service Continuity: Minimizes traffic loss and recovery time.
* Efficient Resource Utilization: Ensures backup resources are used only when needed and as efficiently as possible.

3.4. Integration Requirements with Existing Protection Mechanisms

Fast Notification can be integrated with various existing protection schemes to improve their responsiveness and efficiency:

* Fast Reroute (FRR): Fast notification enhances FRR by delivering failure notifications almost instantaneously, allowing for faster and more efficient rerouting of traffic. This helps maintain high availability and minimizes service disruption.
* Hot Standby: Fast notification complements routing protocol convergence by providing real-time failure notifications, ensuring that devices can quickly switch to backup paths and maintain service continuity.
* Multi-Path Routing: In networks using ECMP or other multi-path routing protocols, fast notification enables the immediate readjustment of traffic flows when a failure is detected, ensuring optimal use of available paths.

4. Fast Notification for Flow Control

4.1. Background: Challenges in Flow Control

Fast Notification enhances flow control by providing a fast, low-latency notification system that can detect and alert network devices to congestion events in real time. With Fast Notification, congestion can be identified and communicated to relevant devices almost instantaneously, allowing for rapid mitigation actions such as traffic rerouting, rate limiting, or queuing adjustments.

A key challenge in flow control is the timely detection and dissemination of congestion events to avoid packet loss and throughput degradation. Traditional flow control mechanisms often rely on delayed feedback or reactive responses, which can lead to suboptimal network performance in highly dynamic environments.

4.2.
Requirements for Fast Notification in Flow Control

The integration of Fast Notification into flow control mechanisms involves several key processes:

1. Congestion Detection: Network devices continuously monitor traffic flows and link usage to identify potential congestion points. When congestion is detected, a notification is generated and sent through the Fast Notification system. These notifications include critical information, such as the affected link or device, the severity of the congestion, and the current traffic load.
2. Notification Propagation: Once the congestion event is detected, the Fast Notification system quickly propagates this information to other network devices that may be affected. This allows devices to be aware of the congestion in real time and begin planning their responses.
3. Rate Limiting and Traffic Buffering: In some cases, congestion may not be alleviated by rerouting traffic alone. Fast Notification can trigger rate limiting or traffic buffering actions, where traffic flows are temporarily adjusted to reduce the load on congested links. This helps to prevent packet loss and sustain network throughput.

4.3. Core Benefits

* Real-Time Congestion Detection: Fast Notification provides real-time updates on network conditions, enabling devices to detect congestion as soon as it occurs. This ensures that corrective actions can be taken promptly before the congestion worsens.
* Adaptive Congestion Management: Fast Notification enables adaptive congestion management by allowing devices to dynamically adjust to changing network conditions. For example, when congestion is detected, traffic can be rerouted, and resource allocation can be adjusted to avoid overloading any one path.
* Minimized Packet Loss: By enabling real-time congestion alerts, Fast Notification helps to avoid packet loss by triggering corrective actions (such as traffic steering or flow rate adjustments) before the congestion reaches critical levels.

4.4. Integration with Existing Flow Control Mechanisms

Fast Notification can be integrated with existing flow control strategies to improve their responsiveness and efficiency:

* TCP Flow Control: Fast Notification can complement traditional TCP flow control by providing faster notifications of network congestion, allowing for quicker response times compared to conventional end-to-end congestion feedback.
* Explicit Congestion Notification (ECN): Fast Notification can work alongside ECN by providing more granular, real-time updates on congestion status. This allows ECN-enabled devices to take immediate action to alleviate congestion without waiting for slower feedback mechanisms.

5. Summary

The increasing complexity of modern networks, driven by the proliferation of data-intensive applications, IoT devices, and high-speed communication demands, underscores the urgent need for enhanced responsiveness in network traffic engineering, protection, and flow control.

This document presents the requirements for *Fast Notification for Traffic Engineering and Load Balancing (FaNTEL)*, which aims to enable near real-time awareness and response to network state changes. These requirements cover essential capabilities such as:

* Timely detection of critical events such as congestion, link failures, and traffic shifts;
* Efficient and prioritized dissemination of notification messages;
* Support for immediate and adaptive network responses including rerouting, load redistribution, and flow control;
* Compatibility with existing protocols and control mechanisms (e.g., FRR, ECN).
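
As one illustration of the load redistribution response listed above, the following sketch shows how a node might shift traffic weight away from a path named in a congestion notification. It is a sketch under stated assumptions, not a mechanism defined by this document: the function `rebalance`, the path names, and the `factor` parameter (the share of the congested path's weight to move) are all hypothetical.

```python
from typing import Dict


def rebalance(weights: Dict[str, float], congested: str,
              factor: float = 0.5) -> Dict[str, float]:
    """Move `factor` of the congested path's weight onto the other paths.

    The moved weight is split in proportion to the other paths' current
    weights, so the total weight is preserved. All names and the default
    factor are illustrative assumptions.
    """
    if congested not in weights or len(weights) < 2:
        return dict(weights)  # unknown path or no alternative: no change

    moved = weights[congested] * factor
    others_total = sum(w for p, w in weights.items() if p != congested)

    result: Dict[str, float] = {}
    for path, w in weights.items():
        if path == congested:
            result[path] = w - moved
        elif others_total > 0:
            result[path] = w + moved * w / others_total
        else:
            # All other paths had zero weight: split the moved share evenly.
            result[path] = w + moved / (len(weights) - 1)
    return result


new_weights = rebalance({"A": 0.5, "B": 0.25, "C": 0.25}, congested="A")
print(new_weights)
```

Splitting the moved share proportionally keeps the relative preference among the uncongested paths unchanged; other policies (for example, favoring the least utilized path) would be equally consistent with the requirements.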
As future efforts progress toward protocol development, solution design and evaluation, this requirements document serves as a foundational reference to guide the development of interoperable and efficient fast notification mechanisms. Ultimately, satisfying the requirements outlined herein will facilitate more responsive, intelligent, and resilient networks in the face of growing operational complexity and performance expectations.

6. Informative References

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

[RFC7490]  Bryant, S., Filsfils, C., Previdi, S., Shand, M., and N. So, "Remote Loop-Free Alternate (LFA) Fast Reroute (FRR)", RFC 7490, DOI 10.17487/RFC7490, April 2015.

Authors' Addresses

Xuesong Geng
Huawei
Email: gengxuesong@huawei.com

PengFei Huo
ByteDance
Email: huopengfei@bytedance.com

Yongqing Zhu
China Telecom
Email: zhuyq8@chinatelecom.cn

Dan Li
Tsinghua University
Email: tolidan@tsinghua.edu.cn

Weiqiang Cheng
China Mobile
Email: chengweiqiang@chinamobile.com

Chang Liu
China Unicom
Email: liuc131@chinaunicom.cn