Telemetry Transport and Encoding Options
Here we will discuss transport and encoding options for enabling telemetry and how to determine which option is best for your organisation.
Transport Options
1. Dial-in
In the dial-in option, the router acts as a server and listens passively on a specific port until the collector "dials in". The collector sends a TCP SYN packet first to establish a TCP session. Once the session is established, the router pushes data to the collector at the configured interval. If gRPC is already used in the network for device configuration, gRPC dial-in can be used for simplicity.
In summary:
- the collector initiates the session
- the collector updates the router with the required subscription
- the router always pushes data to the collector based on that subscription.
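The three steps above can be sketched with plain TCP sockets. This is only an illustration of who connects and who pushes: it uses newline-delimited JSON rather than gRPC, and the function names, the subscription path and the counter values are all hypothetical.

```python
import json
import socket
import threading


def router_wait_for_dial_in(listener, samples):
    """Router side: listen passively, accept the collector, then push data."""
    conn, _ = listener.accept()                      # the collector "dials in"
    with conn, conn.makefile("r") as reader:
        subscription = json.loads(reader.readline())  # collector states what it wants
        for seq in range(samples):
            update = {"path": subscription["path"], "seq": seq, "value": 100 + seq}
            conn.sendall((json.dumps(update) + "\n").encode())  # router pushes


def collector_dial_in(router_address, path):
    """Collector side: initiate the session, subscribe, then read pushed updates."""
    with socket.create_connection(router_address) as conn, conn.makefile("r") as reader:
        conn.sendall((json.dumps({"path": path}) + "\n").encode())
        return [json.loads(line) for line in reader]  # read until the router closes


def run_dial_in_demo(samples=3):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))                  # ephemeral port on localhost
    listener.listen(1)
    router = threading.Thread(target=router_wait_for_dial_in, args=(listener, samples))
    router.start()
    updates = collector_dial_in(listener.getsockname(), "interfaces/counters")
    router.join()
    listener.close()
    return updates


if __name__ == "__main__":
    for update in run_dial_in_demo():
        print(update)
```

Note that the subscription lives only in the collector; the router learns it when the session comes up.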
Dial-in option security and load balancing considerations
As a general best practice, firewalls are used to protect network management systems from external and internal threats. In this type of network, the firewalls need to be configured to permit the required telemetry port, as per the configuration below.
To achieve load balancing and redundancy in the telemetry environment, device groups are configured from the collector farm perspective (see the image below). One device group sends its telemetry data streams to one set of collectors, and another device group sends its streams to another set of collectors. Be aware that this design adds management overhead in order to achieve redundancy and scale.
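The device-group idea can be pictured as a static mapping that an operator has to maintain. The sketch below assumes two hypothetical device groups and collector sets; none of these names come from a real deployment.

```python
# Hypothetical device groups and collector sets, for illustration only.
DEVICE_GROUPS = {
    "group-1": ["router-1", "router-2"],
    "group-2": ["router-3", "router-4"],
}
COLLECTOR_SETS = {
    "group-1": ["collector-a1", "collector-a2"],
    "group-2": ["collector-b1", "collector-b2"],
}


def collectors_for(device):
    """Return the set of collectors that should dial in to the given device."""
    for group, members in DEVICE_GROUPS.items():
        if device in members:
            return COLLECTOR_SETS[group]
    raise KeyError(f"{device} is not assigned to any device group")


if __name__ == "__main__":
    print(collectors_for("router-3"))
```

Every new device has to be added to a group by hand, which is the management overhead mentioned above.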
One key advantage of the dial-in method is that an organisation already using gRPC for device provisioning can reuse that gRPC setup: it sends the telemetry subscription configuration to devices over the existing gRPC channel and has the operational data streamed back. Another advantage is that the collector holds all the subscription definitions, so it knows exactly which streams it is subscribing to. Centralising the subscriptions in the collector means you don’t have to configure telemetry on every device in the network.
2. Dial-out
In the dial-out option, the router initiates a TCP session by sending a TCP SYN packet to the collector and tells the collector which data it will be receiving. This option adds configuration overhead, because the collector address must be configured on every device. Once the session is established, the router pushes data to the collector at the configured interval. In the dial-out option the collector is stateless: it does not initiate the session, it simply listens, subscribes and stores the collected data.
In the case of multiple collectors in the network, anycast and virtual addresses can be used for redundancy and load balancing purposes.
In summary, the router will:
- initiate the session
- update the collector about the required subscription
- always push data to the collector based on subscription.
Note that, in both the dial-in and dial-out models, the router will always push the data to the collector. This aspect doesn’t change; only the source of the session and location of the configuration changes.
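A minimal sketch of the dial-out direction, again using plain TCP with newline-delimited JSON instead of a real telemetry protocol. The message fields (`type`, `path`, `seq`) and function names are assumptions made for the example.

```python
import json
import socket
import threading


def collector_listen(listener, received):
    """Collector side: stateless, it only accepts a session and stores what arrives."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        for line in reader:                           # read until the router closes
            received.append(json.loads(line))


def router_dial_out(collector_address, path, samples):
    """Router side: initiate the session, announce the subscription, push data."""
    with socket.create_connection(collector_address) as conn:
        messages = [{"type": "subscription", "path": path}]
        messages += [{"type": "update", "path": path, "seq": s} for s in range(samples)]
        for message in messages:
            conn.sendall((json.dumps(message) + "\n").encode())


def run_dial_out_demo(samples=3):
    received = []
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))                  # ephemeral port on localhost
    listener.listen(1)
    collector = threading.Thread(target=collector_listen, args=(listener, received))
    collector.start()
    router_dial_out(listener.getsockname(), "interfaces/counters", samples)
    collector.join()
    listener.close()
    return received


if __name__ == "__main__":
    for message in run_dial_out_demo():
        print(message)
```

Here the collector address and the subscription are held on the router, which is why every device needs its own configuration.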
Load balancing and redundancy considerations
In the dial-out model, routers send network telemetry streams, and the destination can easily be configured as the virtual IP address of a load balancer. The load balancer then distributes the streams to the collectors. This method adds redundancy, load-balances the streams and allows for scaling.
gRPC dial-out provides several benefits when compared to gRPC dial-in:
- gRPC dial-out provides better load balancing and can be used for larger scale deployment. In the dial-out method, collectors are installed behind load balancer devices and network devices stream telemetry data to the virtual-IP of the load balancer. The load balancer intelligently distributes the streams to collectors. If one of the collectors goes offline, then the load balancer detects it and distributes the streams to remaining collectors. This way network operators can take advantage of both redundancy at the collectors’ level and at the streaming load distribution level.
- The gRPC dial-in method requires a collector to overcome a series of complex firewall configurations to gain access to a device. This is not the case with gRPC dial-out, which helps to reduce the target device’s exposure to threats from outside its topology.
- Collectors can be stateless; without the need to initiate a session, they simply listen, subscribe and store collected data.
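The load-balancer behaviour described in the first point can be sketched as follows. This is a toy model, assuming a hash-based choice among healthy collectors; real load balancers use their own health checks and distribution algorithms, and the class and collector names here are hypothetical.

```python
import zlib


class LoadBalancer:
    """Toy virtual-IP front end that spreads router streams over healthy collectors."""

    def __init__(self, collectors):
        self.collectors = list(collectors)
        self.healthy = set(collectors)

    def mark_down(self, collector):
        """Record that a health check failed for this collector."""
        self.healthy.discard(collector)

    def mark_up(self, collector):
        """Record that a known collector is reachable again."""
        if collector in self.collectors:
            self.healthy.add(collector)

    def pick(self, device):
        """Choose a healthy collector for the stream coming from `device`."""
        candidates = [c for c in self.collectors if c in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy collectors available")
        # A stable hash keeps one device on one collector while membership is unchanged.
        return candidates[zlib.crc32(device.encode()) % len(candidates)]


if __name__ == "__main__":
    lb = LoadBalancer(["collector-a", "collector-b", "collector-c"])
    chosen = lb.pick("router-1")
    lb.mark_down(chosen)                              # simulate a collector failure
    print(chosen, "->", lb.pick("router-1"))
```

The routers only ever know the virtual IP, so a collector failure is handled without touching device configuration.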
3. gRPC tunnelling
The gRPC tunnel is essentially an HTTP/2 tunnel (a TCP session on port 443) that originates from the network device and terminates on a gRPC endpoint (on the collector or a proxy). The telemetry collector then establishes a session in the opposite direction, towards the network device, through that tunnel. So, from a network perspective, there is a dial-out connection from the network device to the tunnel endpoint (collector or proxy), and the collector connects back over that tunnel to the network device, using gNMI to request the telemetry stream.
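The reversal of roles is easier to see in a small sketch: the device opens the TCP connection, but the collector is the one that sends the subscribe request through it. The example uses plain TCP and newline-delimited JSON, not HTTP/2, gRPC or gNMI, and the `register` and `subscribe` message shapes are invented for illustration.

```python
import json
import socket
import threading


def send_line(conn, message):
    conn.sendall((json.dumps(message) + "\n").encode())


def tunnel_endpoint(listener, path, received):
    """Collector (or proxy) side: accept the tunnel, then subscribe through it."""
    conn, _ = listener.accept()                       # the device dialled out to us
    with conn, conn.makefile("r") as reader:
        target = json.loads(reader.readline())["target"]  # device announces itself
        send_line(conn, {"type": "subscribe", "target": target, "path": path})
        for line in reader:                           # updates flow back over the tunnel
            received.append(json.loads(line))


def device_open_tunnel(endpoint_address, name, samples):
    """Device side: open the session outbound, then serve the collector's request."""
    with socket.create_connection(endpoint_address) as conn, conn.makefile("r") as reader:
        send_line(conn, {"type": "register", "target": name})
        request = json.loads(reader.readline())       # collector's subscribe request
        for seq in range(samples):
            send_line(conn, {"path": request["path"], "seq": seq})


def run_tunnel_demo(samples=3):
    received = []
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))                  # ephemeral port on localhost
    listener.listen(1)
    endpoint = threading.Thread(
        target=tunnel_endpoint, args=(listener, "interfaces/counters", received)
    )
    endpoint.start()
    device_open_tunnel(listener.getsockname(), "router-1", samples)
    endpoint.join()
    listener.close()
    return received


if __name__ == "__main__":
    for update in run_tunnel_demo():
        print(update)
```

From the firewall's point of view only an outbound session from the device exists, while the subscription is still driven by the collector.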
Encoding
Network devices send data in JSON or GPB (Google Protocol Buffers) encoding formats. The following diagrams illustrate a quick comparison of the various encoding formats.
Which encoding to choose?
The following image provides a comparison of message length and throughput between GPB, KV-GPB and JSON. GPB is the most efficient on the wire because its messages are the most compact. JSON, by contrast, generates the highest throughput on the wire for the same data. The encoding method should be chosen based on existing network requirements.
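To get a feel for why the formats differ in size, the sketch below encodes one made-up counter sample three ways. It does not use the real protobuf library: `encode_key_value` and `encode_compact` are hand-rolled stand-ins, built with `struct`, that only mimic the idea of a self-describing key-value encoding versus a compact encoding whose field layout is known to both sides.

```python
import json
import struct

# Made-up sample, for illustration only.
SAMPLE = {
    "interface": "GigabitEthernet0/0/0/0",
    "bytes_in": 123456789,
    "bytes_out": 987654321,
    "packets_in": 1234567,
}


def encode_json(sample):
    """Text encoding: keys and values both travel as readable text."""
    return json.dumps(sample).encode()


def encode_key_value(sample):
    """Self-describing binary stand-in: each key string travels with its value."""
    out = b""
    for key, value in sample.items():
        raw_key = key.encode()
        out += struct.pack("!B", len(raw_key)) + raw_key
        if isinstance(value, int):
            out += b"I" + struct.pack("!Q", value)    # 64-bit unsigned counter
        else:
            raw_value = value.encode()
            out += b"S" + struct.pack("!B", len(raw_value)) + raw_value
    return out


def encode_compact(sample):
    """Compact binary stand-in: no keys on the wire, both ends share the layout."""
    name = sample["interface"].encode()
    return struct.pack(
        f"!B{len(name)}sQQQ",
        len(name), name,
        sample["bytes_in"], sample["bytes_out"], sample["packets_in"],
    )


if __name__ == "__main__":
    for label, encoder in [("JSON", encode_json),
                           ("key-value", encode_key_value),
                           ("compact", encode_compact)]:
        print(f"{label:10s} {len(encoder(SAMPLE))} bytes")
```

The trade-off is that the compact form cannot be decoded without the shared layout, whereas the JSON form can be read directly by a person or a generic tool.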
Summary
So, to conclude:
- If your organisation is looking for a quick and simple solution for a single router and collector, then TCP dial-out is the simplest option. It’s simple to configure and there are no new protocols to learn.
- If your organisation is already using gRPC for configuration, gRPC dial-in can be considered. Dial-in also allows for centralised configuration management.
- With a gRPC tunnel, organisations do not need a series of complex firewall configurations. In addition, a gRPC tunnel allows for authentication and encryption via TLS, which adds an extra layer of security to protect operational data.
To understand more about telemetry architecture and telemetry practical use case, please take a look at our telemetry webinar recording.