multicast routing with pimd - a story of ups and downs with examples

By default, multicast traffic is not routed, and for good reason: usually you want to keep this traffic inside its own network. This is the first important lesson. Nearly all software that generates multicast traffic sets a default maximum hop count of 1 on its packets (the IP TTL field, time to live). Every router on the way from sender to recipient decreases this counter by 1, which prevents packets from looping through the network forever. With a TTL of 1, the packet is dropped at the first router.

$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=113 time=28.3 ms

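See? The reply arrived with a remaining TTL of 113. The sender started with a higher value (128 would be a typical default, which would mean 15 routers on the way), and every router subtracted 1. A packet that starts with a TTL of 1 never gets that far. We have to keep that in mind so that our multicast traffic does not get dropped.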
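To make the TTL point concrete, here is a minimal sketch of how a sending application could raise the multicast TTL on its own socket. The function names and the default of 8 hops are illustrative assumptions, not something pimd requires.

```python
import socket
import struct


def make_multicast_sender(ttl: int = 8) -> socket.socket:
    """Create a UDP socket whose outgoing multicast packets start with `ttl`.

    The default of 8 is an arbitrary example value (assumption).
    """
    if not 1 <= ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # IP_MULTICAST_TTL only affects packets sent to multicast destinations.
    # Passing a single unsigned byte is the common portable idiom.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("B", ttl))
    return sock


def current_multicast_ttl(sock: socket.socket) -> int:
    """Read back the multicast TTL that the kernel will use for this socket."""
    return sock.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL)
```

A sender would then call something like `sock.sendto(b"hello", ("239.2.5.31", 5000))`, where port 5000 is a made-up example. Each router on the path still subtracts 1, so the value has to be at least the number of routers plus one.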

Multicast-Basics

Multicast is a fairly simple mechanism for communicating with a group of receivers. A sender sends traffic to a group, and all group members receive the same data. Multicast is unidirectional (one-way!) communication. This is the next important lesson. It therefore relies on a protocol that does not require session initiation. In practice this means multicast works with UDP traffic and, as stated, only one way.

One-way means, for example, that a sender streams a video signal to a group of receivers, or sends a chat message to a group. There is no acknowledgment of data by receivers, no answer, and no resending of missed or lost packets. None of the fancy TCP features are available with multicast.

Group-Management with IGMP explained

Joining or leaving a group is something an application does automatically, or it can be done manually for testing purposes. Technically, from a client’s perspective, joining a group just means accepting packets from this moment on that have the multicast address as their destination. But since others need to know that we have joined or left a multicast group, or that we want to receive traffic at all, a protocol is required. The protocol that does all of this “group management” (querying for members, joining groups, leaving groups) is called IGMP (Internet Group Management Protocol). A better name would be membership information and management protocol, but what do I know? There are three versions, IGMPv1 to IGMPv3, but let’s keep it simple.

The protocol is basically used for 3 tasks:

Ask the local network which groups have members (this query comes from a multicast router; groups themselves are not announced): “Anyone still interested in group 239.2.5.31?”
Inform others about joining a specific group: “I’m now part of group 239.2.5.31, that will be awesome!”
Inform others about leaving a group: “I’m out of this group, it sucks”
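The join and leave tasks above are what an application triggers when it asks the kernel for group membership. The following is a minimal sketch of a receiver under that assumption; the function names and the choice of `0.0.0.0` (let the kernel pick the interface) are illustrative.

```python
import ipaddress
import socket


def build_mreq(group: str, iface_ip: str = "0.0.0.0") -> bytes:
    """Pack the membership request: group address followed by local interface address."""
    if not ipaddress.IPv4Address(group).is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    return socket.inet_aton(group) + socket.inet_aton(iface_ip)


def open_receiver(group: str, port: int, iface_ip: str = "0.0.0.0") -> socket.socket:
    """Bind a UDP socket and join `group`; the kernel then sends the IGMP report."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group, iface_ip))
    return sock


def close_receiver(sock: socket.socket, group: str, iface_ip: str = "0.0.0.0") -> None:
    """Leave `group` explicitly (this triggers the leave report) and close the socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, build_mreq(group, iface_ip))
    sock.close()
```

A test receiver could call `open_receiver("239.2.5.31", 5000)` and then `recvfrom(2048)` in a loop; port 5000 is a made-up example. The application never speaks IGMP itself, it only changes its membership and the kernel does the reporting.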

Let’s catch some IGMPv3 packets with tcpdump to show this in detail:

Joining the group 239.2.5.31 on a Linux box, and what tcpdump shows:

ip addr add 239.2.5.31/32 dev wlp3s0 autojoin

21:19:44.976186 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
192.168.179.252 > 224.0.0.22: igmp v3 report, 1 group record(s) [gaddr 239.2.5.31 to_ex, 0 source(s)]

Leaving the same group right after:

ip addr del 239.2.5.31/32 dev wlp3s0 autojoin

20:08.996151 IP (tos 0xc0, ttl 1, id 0, offset 0, flags [DF], proto IGMP (2), length 40, options (RA))
192.168.179.252 > 224.0.0.22: igmp v3 report, 1 group record(s) [gaddr 239.2.5.31 to_in, 0 source(s)]

Note the gaddr in the message, as well as to_ex and to_in.

gaddr is just the group we are interested in and reporting for.

One might think that a join should read IN and a leave should read EX, but these IGMPv3 messages are REPORTS about the client’s source filter for the group, not about the client itself. to_ex means “change to exclude mode”: the client 192.168.179.252 wants traffic for 239.2.5.31 from every sender except the listed sources. With 0 sources listed, nothing is excluded, so the client has joined the group.

The leave report uses to_in, “change to include mode”: the client only wants traffic from the listed sources. With 0 sources listed, it wants traffic from nobody, so it has left the group 239.2.5.31.
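The to_ex/to_in logic can be sketched as a small parser for the IGMP part of such a report. This is an illustration based on my reading of the IGMPv3 message layout, not code taken from tcpdump or pimd. The short labels other than to_ex and to_in are assumed to mirror tcpdump’s naming.

```python
import socket
import struct

# Group record types from the IGMPv3 specification; labels are short names (assumption).
RECORD_TYPES = {1: "is_in", 2: "is_ex", 3: "to_in", 4: "to_ex", 5: "allow", 6: "block"}


def parse_igmpv3_report(payload: bytes):
    """Return a list of (record_type, group, sources) from an IGMPv3 report payload."""
    msg_type, _res1, _cksum, _res2, n_records = struct.unpack_from("!BBHHH", payload, 0)
    if msg_type != 0x22:
        raise ValueError("not an IGMPv3 membership report")
    records, offset = [], 8
    for _ in range(n_records):
        rtype, aux_len, n_src, gaddr = struct.unpack_from("!BBH4s", payload, offset)
        offset += 8
        sources = [
            socket.inet_ntoa(payload[offset + 4 * i: offset + 4 * i + 4])
            for i in range(n_src)
        ]
        # Skip the source list and any auxiliary data (length given in 32-bit words).
        offset += 4 * n_src + 4 * aux_len
        records.append((RECORD_TYPES.get(rtype, str(rtype)), socket.inet_ntoa(gaddr), sources))
    return records


def meaning(record_type: str, sources: list) -> str:
    """Translate a record into plain words for the two cases shown in the captures."""
    if record_type == "to_ex" and not sources:
        return "join"    # exclude nobody: receive from every sender
    if record_type == "to_in" and not sources:
        return "leave"   # include nobody: receive from no sender
    return "source-filter change"
```

Feeding it the IGMP payload of the first capture would yield one record for 239.2.5.31 with type to_ex and no sources, which `meaning` reports as a join.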

Now one or many clients can form a multicast group just by opting in to or out of the group as described above.

A sender that then wants to send traffic to the group just sends UDP traffic to the group IP.

All of this works fine within a local network. To route multicast traffic between networks that are not connected directly (no layer 2 connection), you need at least one router with pimd installed in each network.

So what about routing and how does pimd work?

pimd operates in 3 steps:

1. pimd is set up on routers that have 2 or more interfaces and listens for IGMP reports all the time. This way pimd learns all groups in each network and their members.

2. Without further configuration, pimd “detects” other pimd routers, and together they elect a primary router. Detection happens through PIM hello messages that every pimd sends on its interfaces (to 224.0.0.13), not through IGMP reports. This primary router is called the RP (rendezvous point) because all multicast traffic comes together here: from now on it receives the multicast traffic from all other pimds.

3. Each pimd sends all multicast traffic that it sees in its own network to the elected rendezvous point. The rendezvous point then routes the traffic to all recipient pimds (all pimds that have local group members waiting for the multicast traffic) and/or sends it out of its own local network interfaces towards the final destinations.

Keep in mind that pimd does not simply forward multicast according to the operating system’s unicast routing table. The pimds build and maintain their own dynamic multicast routing tables. And, most importantly, the “routed traffic” is encapsulated on its way from one pimd to the RP or to other pimds. The IP packets that are routed from pimd to pimd are unicast packets. Let me repeat this to make it clear:

pimd encapsulates the multicast UDP packets in a specially crafted IP packet (in PIM terms, a Register message) and sends it via unicast to the RP or to the final recipient pimd. These IP packets do not have a multicast destination IP.

Last but not least: encapsulated inside this IP packet is the original UDP multicast packet.
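To illustrate the encapsulation, here is a rough sketch of how the original packet could be wrapped into and unwrapped from a PIM Register message. The layout (an 8-byte PIM header with a checksum over those 8 bytes only) is my reading of the PIM sparse mode specification and is not taken from the pimd source. The outer unicast IP header is left out because the operating system adds it when the message is sent.

```python
import struct

PIM_VERSION = 2
PIM_TYPE_REGISTER = 1
PIM_REGISTER_HEADER_LEN = 8  # 4 bytes PIM header + 4 bytes flags (assumption, see above)


def internet_checksum(data: bytes) -> int:
    """Standard 16-bit one's complement checksum used by IP-family protocols."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate_register(inner_packet: bytes) -> bytes:
    """Prefix the original multicast IP packet with a PIM Register header."""
    first_byte = (PIM_VERSION << 4) | PIM_TYPE_REGISTER
    header = struct.pack("!BBHI", first_byte, 0, 0, 0)
    checksum = internet_checksum(header)  # covers the 8 header bytes, not the data
    return struct.pack("!BBHI", first_byte, 0, checksum, 0) + inner_packet


def decapsulate_register(message: bytes) -> bytes:
    """Check the header and return the original multicast IP packet."""
    first_byte = message[0]
    if first_byte >> 4 != PIM_VERSION or first_byte & 0x0F != PIM_TYPE_REGISTER:
        raise ValueError("not a PIMv2 Register message")
    if internet_checksum(message[:PIM_REGISTER_HEADER_LEN]) != 0:
        raise ValueError("bad PIM checksum")
    return message[PIM_REGISTER_HEADER_LEN:]
```

The point of the sketch is the size relation: whatever goes in comes out unchanged on the other side, but on the wire it is 8 bytes larger, plus the outer IP header.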

Lessons learned

The TTL must not be 1 if routing is supposed to take place.
Keep the MTU in mind for the size of the UDP packets. I had to lower the MTU to 1450. With the default MTU of 1500, packets got dropped on routers: the maximum size of an IP packet is usually 1500 bytes, so a 1500-byte packet no longer fits once the encapsulation headers are added.
Make sure no firewall is in place or, if there is one, that it is configured properly.
Make sure all the independent networks with multicast receivers use unique subnets (we had issues with 192.168.0.0/24 being present more than once).
Use NAT/masquerading if the same network is present more than once.
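The MTU lesson can be checked with a little arithmetic. The sketch below assumes an overhead of 28 bytes per encapsulated packet (a 20-byte outer IPv4 header without options plus an 8-byte PIM Register header). These numbers are my assumption, not measured values from the setup described here.

```python
OUTER_IP_HEADER = 20      # outer unicast IPv4 header, no options (assumption)
PIM_REGISTER_HEADER = 8   # PIM header plus Register flags (assumption)
INNER_IP_HEADER = 20      # IPv4 header of the original multicast packet
UDP_HEADER = 8


def encapsulated_size(inner_packet_size: int) -> int:
    """Size on the wire once the original IP packet has been wrapped."""
    return inner_packet_size + OUTER_IP_HEADER + PIM_REGISTER_HEADER


def fits(inner_packet_size: int, path_mtu: int = 1500) -> bool:
    """True if the wrapped packet still fits into one packet on the path."""
    return encapsulated_size(inner_packet_size) <= path_mtu


def largest_inner_packet(path_mtu: int = 1500) -> int:
    """Largest original IP packet that survives encapsulation unfragmented."""
    return path_mtu - OUTER_IP_HEADER - PIM_REGISTER_HEADER


def max_udp_payload(sender_mtu: int) -> int:
    """Application payload that fits into one packet at the given sender MTU."""
    return sender_mtu - INNER_IP_HEADER - UDP_HEADER
```

Under these assumptions a full 1500-byte packet grows to 1528 bytes and no longer fits, while a sender MTU of 1450 leaves some headroom (1478 bytes on the wire) and allows 1422 bytes of UDP payload per packet.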

Finally, some debugging hints for pimd:

Debugging pimd:

pimd -d (run in foreground-mode with debug flag)

pimd -r (shows current routing tables, learned neighbors and multicast-members in each group, as well as live join/leaves for each group)

Published by cubewerk on 29 June 2022
