Here we will discuss transport and encoding options for enabling telemetry and how to determine which option is best for your organisation.
In the dial-in option, the router acts as a server and listens passively on a specific port until the collector "dials in". The collector sends a TCP SYN packet first to establish a TCP session. Once the session is established, the router pushes data to the collector at the configured interval. If gRPC is already used in the network for device configuration, gRPC dial-in can be used for simplicity.
In summary:
As a general best practice, firewall devices are used to protect network management systems from external and internal threats. In this type of network, firewalls need to be configured to allow the required port, as per the configuration below.
To achieve load balancing and redundancy in the telemetry environment, device groups are configured from the collector farm perspective (see the image below). One device group sends its telemetry data streams to one set of collectors, and another device group sends its streams to a different set of collectors. Be aware that this design option adds management overhead in exchange for redundancy and scale.
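The device-group design can be sketched as a simple lookup. This is only an illustration: the device names, group names and collector names are hypothetical, and a real deployment would express the same mapping in device and collector configuration rather than in Python.

```python
from collections import defaultdict

# Hypothetical inventory: device name -> device group.
DEVICE_GROUPS = {
    "router-1": "group-a",
    "router-2": "group-a",
    "router-3": "group-b",
    "router-4": "group-b",
}

# Hypothetical collector farm: each device group streams to its own collector set.
COLLECTOR_SETS = {
    "group-a": ["collector-1", "collector-2"],
    "group-b": ["collector-3", "collector-4"],
}


def collectors_for(device: str) -> list[str]:
    """Return the collectors that receive telemetry from the given device."""
    return COLLECTOR_SETS[DEVICE_GROUPS[device]]


def devices_per_collector() -> dict[str, list[str]]:
    """Invert the mapping to see which devices each collector serves."""
    load = defaultdict(list)
    for device in DEVICE_GROUPS:
        for collector in collectors_for(device):
            load[collector].append(device)
    return dict(load)
```

Both tables have to be kept in step by hand whenever a device or collector is added, which is the management overhead this design carries.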
One key advantage of the dial-in method is that an organisation already using a gRPC setup for device provisioning can reuse that setup for dial-in telemetry and push the new configuration down to devices. For instance, it can send telemetry subscription configurations over the existing gRPC setup and have the operational data streamed back. Another advantage is that the collector holds all the subscription definitions, so it knows which streams it is subscribing to. Centralising this information in the collector is helpful because you don’t have to configure subscriptions on every device in the network.
In the dial-out option, the router initiates a TCP session by sending a TCP SYN packet to the collector, and tells the collector which data it will be receiving. This option adds configuration overhead, because the collector address must be configured on every device. Once a session is established, the router pushes data to the collector at the configured interval. In the dial-out option the collector is stateless: it does not initiate the session, and simply listens, subscribes and stores the collected data.
In the case of multiple collectors in the network, anycast and virtual addresses can be used for redundancy and load balancing purposes.
In summary, the router will:
Note that, in both the dial-in and dial-out models, the router always pushes the data to the collector. This aspect doesn’t change; only the source of the session and the location of the configuration change.
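The difference between the two models can be sketched with plain TCP sockets on localhost. This is a minimal illustration, not a real telemetry stack: the newline-delimited JSON framing, the `interfaces/counters` path and the function names are all assumptions made for the example. In both modes the router calls the same `push_samples` function; only the side that listens and the side that connects are swapped.

```python
import json
import socket
import threading


def push_samples(sock: socket.socket, count: int = 3) -> None:
    """Router side: push telemetry samples, one JSON object per line."""
    for i in range(count):
        sample = {"path": "interfaces/counters", "seq": i}
        sock.sendall((json.dumps(sample) + "\n").encode())
    sock.close()


def read_samples(sock: socket.socket) -> list[dict]:
    """Collector side: read samples until the router closes the session."""
    with sock, sock.makefile("r") as stream:
        return [json.loads(line) for line in stream]


def run_session(mode: str) -> list[dict]:
    """Run one session on localhost; mode is 'dial-in' or 'dial-out'."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    address = listener.getsockname()
    received = []

    def accept_side():
        conn, _ = listener.accept()
        if mode == "dial-in":
            push_samples(conn)  # the router listened, the collector dialled in
        else:
            received.extend(read_samples(conn))  # the collector listened

    thread = threading.Thread(target=accept_side)
    thread.start()
    client = socket.create_connection(address)  # this side sends the TCP SYN
    if mode == "dial-in":
        received.extend(read_samples(client))  # the collector initiated
    else:
        push_samples(client)  # the router initiated
    thread.join()
    listener.close()
    return received
```

Calling `run_session("dial-in")` and `run_session("dial-out")` returns the same samples, because the data always flows from router to collector regardless of which side opened the session.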
In the dial-out model, routers send network telemetry streams, and the destination can easily be configured as the virtual IP address of a load balancer. The load balancer then distributes the streams to collectors. This method adds redundancy, load-balances the streams and allows for scaling.
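As a rough sketch of what the load balancer does behind the virtual IP address, the class below pins each router's stream to one collector with a stable hash and moves streams when a collector fails. The class, its methods and the collector names are hypothetical; real load balancers use their own selection algorithms and health checks.

```python
import hashlib


class LoadBalancer:
    """Toy stand-in for a load balancer sitting behind one virtual IP."""

    def __init__(self, collectors):
        self.collectors = list(collectors)

    def pick(self, router: str) -> str:
        """Pin a router's stream to one collector using a stable hash."""
        if not self.collectors:
            raise RuntimeError("no healthy collectors")
        digest = hashlib.sha256(router.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.collectors)
        return self.collectors[index]

    def mark_down(self, collector: str) -> None:
        """Remove a failed collector; its streams move to the survivors."""
        self.collectors.remove(collector)
```

Routers only ever know the virtual address, so collectors can be added or removed without touching device configuration. Note that this simple modulo scheme can also move streams that were on healthy collectors when the pool changes; a production design would typically limit that with consistent hashing.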
gRPC dial-out provides several benefits when compared to gRPC dial-in:
The gRPC tunnel is essentially an HTTP/2 tunnel (a session on TCP port 443), originating from network devices and terminating on a gRPC endpoint (on the collector or a proxy). The telemetry collector then establishes a session over the tunnel in the opposite direction, towards the network devices. So, from a network perspective, the network device dials out to a tunnel endpoint (the collector or an alternative proxy), and the collector then connects back to the network device over that tunnel using gNMI to signal the telemetry stream.
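The role reversal in the tunnel model can be illustrated with a small sketch. It deliberately uses a plain TCP connection carrying JSON lines as a stand-in for the HTTP/2 tunnel and gNMI, so the message format, the `subscribe` operation and the function names are assumptions for the example only. The device opens the connection, but the collector is the one that sends the subscription request over it.

```python
import json
import socket
import threading


def device(tunnel_address) -> None:
    """Network device: dials out, then serves requests arriving over the tunnel."""
    with socket.create_connection(tunnel_address) as tunnel:  # device sends the SYN
        with tunnel.makefile("rw") as stream:
            request = json.loads(stream.readline())  # wait for the collector
            if request.get("op") == "subscribe":
                for seq in range(request["samples"]):
                    update = {"path": request["path"], "seq": seq}
                    stream.write(json.dumps(update) + "\n")
                stream.flush()


def collector(path: str, samples: int = 3) -> list[dict]:
    """Tunnel endpoint: accepts the device's session, then subscribes over it."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        worker = threading.Thread(target=device, args=(listener.getsockname(),))
        worker.start()
        conn, _ = listener.accept()
        with conn, conn.makefile("rw") as stream:
            request = {"op": "subscribe", "path": path, "samples": samples}
            stream.write(json.dumps(request) + "\n")
            stream.flush()
            updates = [json.loads(stream.readline()) for _ in range(samples)]
        worker.join()
    return updates
```

In this sketch the collector never opens a connection towards the device, yet it still decides which path is streamed, which is the property the tunnel is meant to provide.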
Network devices send data in JSON or GPB (Google Protocol Buffers) encoding formats. The following diagrams provide a quick comparison of the encoding formats.
The following image compares message length and throughput for GPB, KV-GPB and JSON. GPB produces shorter messages, so it is more efficient on the wire. However, JSON provides the highest level of throughput. The encoding method should be chosen based on existing network requirements.
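The difference in message length comes from what travels with each message. The sketch below is not real GPB: it uses a fixed binary layout from Python's `struct` module as a stand-in for a schema-based encoding, and the field names and values are made up. It only shows why an encoding that relies on a shared schema is shorter than a self-describing one such as JSON, which repeats the key names in every message.

```python
import json
import struct

# One hypothetical interface-counter sample.
sample = {"interface_id": 7, "bytes_in": 123456789, "bytes_out": 987654321}

# Self-describing encoding: the key names travel with every message.
json_message = json.dumps(sample).encode()

# Schema-based stand-in: both ends agree on field order and types in advance,
# so only the values are sent (one unsigned 16-bit id, two unsigned 64-bit counters).
LAYOUT = struct.Struct("!HQQ")
binary_message = LAYOUT.pack(*sample.values())


def decode_binary(message: bytes) -> dict:
    """Rebuild the sample using the field names from the shared schema."""
    return dict(zip(sample.keys(), LAYOUT.unpack(message)))


print(len(json_message), len(binary_message))
```

The binary message can only be decoded by a receiver that holds the same schema, which is the trade-off for its smaller size.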
So, to conclude:
To learn more about telemetry architecture and practical telemetry use cases, please take a look at our telemetry webinar recording.