Understanding ICE for Teams Media - Part 2.
Overview
Last week, I started a series of articles about the ICE protocol and its implementation for Microsoft Teams. Today, I'm going to write a bit more about how ICE builds up the candidate pairs and how it tests their connectivity.
Here you can find part 1.
Build candidate pairs
Before we can build the candidate pairs, we also have to run a process called Determining Role. For each session, each agent takes on a role. There are two roles: controlling and controlled. The controlling agent is responsible for the choice of the final candidate pairs used for communications. This means nominating the candidate pairs that can be used by ICE for each media stream, and generating the updated offer based on ICE's selection, when needed.
The agent that generated the offer which started the ICE processing MUST take the controlling role, and the other MUST take the controlled role. Both agents will form check lists, run the ICE state machines, and generate connectivity checks. The controlling agent will execute the logic to nominate pairs that will be selected by ICE.
Finally we can start forming the candidate pairs. First, the agent takes each of its candidates for a media stream (called LOCAL CANDIDATES) and pairs them with the candidates it received from its peer (called REMOTE CANDIDATES) for that media stream.
A local candidate is paired with a remote candidate if and only if the two candidates have the same component ID and the same IP address version. It is possible that some of the local candidates won't get paired with remote candidates, and some of the remote candidates won't get paired with local candidates. This can happen if one agent doesn't include candidates for all of the components of a media stream. If this happens, the number of components for that media stream is effectively reduced, and is considered to be equal to the minimum across both agents of the maximum component ID provided by each agent across all components for the media stream.
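The pairing rule can be sketched in a few lines of Python. This is only an illustration: the Candidate class and the form_pairs and effective_component_count helpers are hypothetical names made up for this article, not part of any Teams or ICE library.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from itertools import product


@dataclass(frozen=True)
class Candidate:
    # Hypothetical, simplified candidate: only the fields the pairing rule needs.
    ip: str
    port: int
    component_id: int  # typically 1 = RTP, 2 = RTCP
    priority: int


def form_pairs(local, remote):
    """Pair each local candidate with each remote candidate that has
    the same component ID and the same IP address version."""
    return [
        (l, r)
        for l, r in product(local, remote)
        if l.component_id == r.component_id
        and ip_address(l.ip).version == ip_address(r.ip).version
    ]


def effective_component_count(local, remote):
    """Minimum across both agents of the maximum component ID each one provided."""
    return min(
        max(c.component_id for c in local),
        max(c.component_id for c in remote),
    )
```

If the local agent offers IPv4 candidates for components 1 and 2 plus an IPv6 candidate, and the peer only sends one IPv4 candidate for component 1, only a single pair is formed and the stream is treated as having one component.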
The candidate pair whose local and remote candidates are both the default candidates for a particular component is called, unsurprisingly, the default candidate pair for that component. This is the pair that would be used to transmit media if neither agent had been ICE aware.
Now that we have a list of candidate pairs, we need to compute the priority of each pair and order the pairs.
Once the pairs are formed, a candidate pair priority is computed. Let G be the priority for the candidate provided by the controlling agent. Let D be the priority for the candidate provided by the controlled agent. The priority for a pair is computed as:
Pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D?1:0)
Where G>D?1:0 is an expression whose value is 1 if G is greater than D, and 0 otherwise. Once the priority is assigned, the agent sorts the candidate pairs in decreasing order of priority. If two pairs have identical priority, the ordering amongst them is arbitrary.
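As a small worked example, the formula and the sort translate directly into Python. The function names pair_priority and sort_pairs are illustrative only, and each pair is reduced here to a tuple of the two candidate priorities (G, D).

```python
def pair_priority(g: int, d: int) -> int:
    """Pair priority from the formula above.

    g: priority of the candidate provided by the controlling agent
    d: priority of the candidate provided by the controlled agent
    """
    return (2 ** 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


def sort_pairs(pairs):
    """Sort (g, d) tuples in decreasing order of pair priority."""
    return sorted(pairs, key=lambda p: pair_priority(*p), reverse=True)
```

Note that the MIN term dominates: a pair is only as good as its weaker candidate. The final term is just a tie-breaker, so with G=3, D=2 the pair ranks one step above the mirrored case G=2, D=3.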
This sorted list of candidate pairs is used to determine a sequence of connectivity checks that will be performed. Each check involves sending a request from a local candidate to a remote candidate. Since an agent cannot send requests directly from a reflexive candidate, but only from its base, the agent next goes through the sorted list of candidate pairs. For each pair where the local candidate is server reflexive, the server reflexive candidate MUST be replaced by its base. Once this has been done, the agent MUST prune the list. This is done by removing a pair if its local and remote candidates are identical to the local and remote candidates of a pair higher up on the priority list. The result is a sequence of ordered candidate pairs, called the check list for that media stream.
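The replace-and-prune step could look roughly like the sketch below. The Candidate and Pair classes and the build_check_list function are hypothetical, candidates are identified by a plain "ip:port" string, and the sketch assumes that the base of a server reflexive candidate is a host candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    address: str       # "ip:port", a simplification for this sketch
    kind: str          # "host", "srflx", "prflx" or "relay"
    base_address: str  # equals address for host candidates


@dataclass(frozen=True)
class Pair:
    local: Candidate
    remote: Candidate
    priority: int


def build_check_list(pairs):
    """Sort by priority, swap server reflexive locals for their base, then prune."""
    check_list, seen = [], set()
    for pair in sorted(pairs, key=lambda p: p.priority, reverse=True):
        local = pair.local
        if local.kind == "srflx":
            # Checks can only be sent from the base (assumed to be a host candidate).
            local = Candidate(local.base_address, "host", local.base_address)
        key = (local.address, pair.remote.address)
        if key in seen:
            continue  # a higher-priority pair already covers these two addresses
        seen.add(key)
        check_list.append(Pair(local, pair.remote, pair.priority))
    return check_list
```

Because the list is walked in decreasing priority order, the duplicate that gets dropped is always the lower-priority one.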
Now we need to compute the states of the candidate pairs. Each candidate pair in the check list has a foundation and a state. The foundation is the combination of the foundations of the local and remote candidates in the pair. The state is assigned once the check list for each media stream has been computed. There are five potential values that the state can have:
- Waiting - A check has not been performed for this pair yet, and it can be performed as soon as it is the highest-priority Waiting pair on the check list.
- In-Progress - A check has been sent for this pair, but the transaction is still in progress.
- Succeeded - A check for this pair was done and produced a successful result.
- Failed - A check for this pair was done and it failed.
- Frozen - A check for this pair hasn't been performed, and it can't be performed until some other check succeeds. After that, the pair moves to the Waiting state.
The initial states for each pair in a check list are computed by performing the following sequence:
- The agent sets all of the pairs in each check list to the Frozen state
- The agent examines the check list for the first media stream (the first m line in the SDP offer). For all pairs with the same foundation, it sets the state of the pair with the lowest component ID to Waiting. If there is more than one such pair, the one with the highest priority is used.
The check list itself is also associated with a state: Running, Completed, or Failed. When a check list is first constructed as the consequence of an offer/answer exchange, it is placed in the Running state.
An agent performs ordinary checks and triggered checks. The generation of both is governed by a timer that fires periodically for each media stream. The agent maintains a FIFO queue, called the triggered check queue, which contains candidate pairs for which checks are to be sent at the next available opportunity. When the timer fires, the agent removes the top pair from the triggered check queue, performs a connectivity check on that pair, and sets the state of the candidate pair to In-Progress. If there are no pairs in the triggered check queue, an ordinary check is sent.
Once the agent has computed the check lists, it sets a timer for each active check list. When the timer fires and there is no triggered check to be sent, the agent MUST choose an ordinary check as follows:
- Find the highest-priority pair in that check list that is in the Waiting state.
- If there is such a pair: Send a STUN check from the local candidate of that pair to the remote candidate of that pair. Set the state of the candidate pair to In-Progress.
- If there is no such pair: Find the highest-priority pair in that check list that is in the Frozen state.
  - If there is such a pair: Unfreeze the pair. Perform a check for that pair, causing its state to transition to In-Progress.
  - If there is no such pair: Terminate the timer for that check list.
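To make the state handling and the timer logic more concrete, here is a minimal sketch in Python. Everything in it (State, Pair, set_initial_states, on_timer) is a hypothetical name chosen for this article. The actual sending of the STUN check is left out, and the function only decides which pair would be checked on a given timer tick.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    FROZEN = "Frozen"
    WAITING = "Waiting"
    IN_PROGRESS = "In-Progress"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


@dataclass
class Pair:
    foundation: str
    component_id: int
    priority: int
    state: State = State.FROZEN


def set_initial_states(first_check_list):
    """Freeze everything, then per foundation move the pair with the lowest
    component ID (highest priority on a tie) to Waiting."""
    best = {}
    for pair in first_check_list:
        pair.state = State.FROZEN
        cur = best.get(pair.foundation)
        if cur is None or (pair.component_id, -pair.priority) < (cur.component_id, -cur.priority):
            best[pair.foundation] = pair
    for pair in best.values():
        pair.state = State.WAITING


def on_timer(check_list, triggered_queue):
    """Return the pair to check on this tick, or None to terminate the timer."""
    if triggered_queue:
        pair = triggered_queue.popleft()  # triggered checks go first (FIFO)
    else:
        pair = None
        # Ordinary check: highest-priority Waiting pair, else highest-priority Frozen pair.
        for wanted in (State.WAITING, State.FROZEN):
            matching = [p for p in check_list if p.state is wanted]
            if matching:
                pair = max(matching, key=lambda p: p.priority)
                break
        if pair is None:
            return None
    pair.state = State.IN_PROGRESS  # a real agent would send the STUN check here
    return pair
```

Calling on_timer repeatedly with an empty triggered check queue drains the Waiting pairs in priority order, then the Frozen ones, and finally returns None, which corresponds to stopping the timer for that check list.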
The steps above illustrate how we generate the pairs and which checks we perform. A very similar sequence is executed when an agent receives the answer from its peer.
In this article I'm not going to describe all the STUN checks in detail; however, each outcome can eventually be grouped into either Failure Cases or Success Cases.
A check is considered to be a success if all of the following are true:
- The STUN transaction generated a success response.
- The source IP address and port of the response equal the destination IP address and port to which the Binding request was sent.
- The destination IP address and port of the response match the source IP address and port from which the Binding request was sent.
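The three conditions boil down to a symmetry test on the addresses. A minimal sketch, using hypothetical Request and Response records that only carry what the test needs:

```python
from dataclasses import dataclass
from typing import Tuple

Addr = Tuple[str, int]  # (ip, port)


@dataclass(frozen=True)
class Request:
    source: Addr       # where the Binding request was sent from
    destination: Addr  # where the Binding request was sent to


@dataclass(frozen=True)
class Response:
    is_success: bool   # True for a STUN success response
    source: Addr
    destination: Addr


def check_succeeded(request: Request, response: Response) -> bool:
    """A check succeeds only if the response is a success response and
    its addresses mirror those of the request."""
    return (
        response.is_success
        and response.source == request.destination
        and response.destination == request.source
    )
```

A success response that arrives from a different address or port than the one the request was sent to therefore does not count as a successful check.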
The agent checks the mapped address from the STUN response. If the transport address does not match any of the local candidates that the agent knows about, the mapped address represents a new candidate -- a peer reflexive candidate. Like other candidates, it has a type, base, priority, and foundation. They are computed as follows:
- Its type is equal to peer reflexive.
- Its base is set equal to the local candidate of the candidate pair from which the STUN check was sent.
- Its priority is set equal to the value of the PRIORITY attribute in the Binding request.
This peer reflexive candidate is then added to the list of local candidates for the media stream. Its username fragment and password are the same as all other local candidates for that media stream.
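The discovery of a peer reflexive candidate can be sketched as follows. The Candidate class and the learn_peer_reflexive function are hypothetical, and the foundation, username fragment and password are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


@dataclass(frozen=True)
class Candidate:
    address: Addr
    kind: str      # "host", "srflx", "prflx" or "relay"
    base: Addr
    priority: int


def learn_peer_reflexive(
    mapped_address: Addr,
    local_candidates: List[Candidate],
    checked_from: Candidate,
    request_priority: int,
) -> Optional[Candidate]:
    """Add a peer reflexive candidate if the mapped address is new.

    checked_from is the local candidate the check was sent from;
    request_priority is the PRIORITY attribute value of that Binding request.
    """
    if any(c.address == mapped_address for c in local_candidates):
        return None  # already a known local candidate, nothing to learn
    prflx = Candidate(
        address=mapped_address,
        kind="prflx",
        base=checked_from.address,
        priority=request_priority,
    )
    local_candidates.append(prflx)
    return prflx
```

Calling the function twice with the same mapped address only adds the candidate once, because on the second call it is already in the list of local candidates.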
The agent constructs a candidate pair whose local candidate equals the mapped address of the response, and whose remote candidate equals the destination address to which the request was sent. This is called a valid pair, since it has been validated by a STUN connectivity check. The valid pair may equal the pair that generated the check, may equal a different pair in the check list, or may be a pair not currently on any check list. If the pair equals the pair that generated the check or is on a check list currently, it is also added to the VALID LIST, which is maintained by the agent for each media stream. This list is empty at the start of ICE processing, and fills as checks are performed, resulting in valid candidate pairs.
If the agent was a controlling agent, and it had included a USE-CANDIDATE attribute in the Binding request, the valid pair generated from that check has its nominated flag set to true. This flag indicates that this valid pair should be used for media if it is the highest-priority one amongst those whose nominated flag is set. This may conclude ICE processing for this media stream or all media streams.
Nominate Final Path
Concluding ICE involves the controlling agent nominating pairs and updating the state machinery. The controlling agent nominates pairs to be selected by ICE by using one of two techniques: regular nomination or aggressive nomination.
Regular Nomination
With regular nomination, the agent lets some number of checks complete, each of which omits the USE-CANDIDATE attribute. Once one or more checks complete successfully for a component of a media stream, valid pairs are generated and added to the valid list. The agent lets the checks continue until some stopping criterion is met, and then picks amongst the valid pairs based on an evaluation criterion. The criteria for stopping the checks and for evaluating the valid pairs are entirely a matter of local optimization.
When the controlling agent selects the valid pair, it repeats the check that produced this valid pair (by enqueuing the pair that generated the check into the triggered check queue), this time with the USE-CANDIDATE attribute. This check should succeed (since the previous did), causing the nominated flag of that and only that pair to be set. Consequently, there will be only a single nominated pair in the valid list for each component, and when the state of the check list moves to completed, that exact pair is selected by ICE for sending and receiving media for that component.
Regular nomination provides the most flexibility, since the agent has control over the stopping and selection criteria for checks. The only requirement is that the agent MUST eventually pick one and only one candidate pair and generate a check for that pair with the USE-CANDIDATE attribute present.
Aggressive Nomination
With aggressive nomination, the controlling agent includes the USE-CANDIDATE attribute in every check it sends. Once the first check for a component succeeds, it will be added to the valid list and have its nominated flag set. When all components have a nominated pair in the valid list, media can begin to flow using the highest priority nominated pair. However, because the agent included the USE-CANDIDATE attribute in all of its checks, another check may yet complete, causing another valid pair to have its nominated flag set. ICE always selects the highest-priority nominated candidate pair from the valid list as the one used for media. Consequently, the selected pair may actually change briefly as ICE checks complete, resulting in a set of transient selections until it stabilises.
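The transient selection is easy to see in a small sketch. ValidPair and selected_pair are hypothetical names, and the valid list is reduced to a plain Python list.

```python
from dataclasses import dataclass


@dataclass
class ValidPair:
    name: str          # label used only for this illustration
    component_id: int
    priority: int
    nominated: bool = False


def selected_pair(valid_list, component_id):
    """Highest-priority nominated pair for a component, or None if none is nominated yet."""
    nominated = [
        p for p in valid_list
        if p.nominated and p.component_id == component_id
    ]
    return max(nominated, key=lambda p: p.priority, default=None)
```

Suppose the check over a relayed path completes first and is nominated: media starts on that pair. If the check for a higher-priority host pair then succeeds as well, selected_pair returns the host pair from then on, which is the brief change of path described above.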
After these steps media should start flowing between the two agents.
In Part 3, I will include some information on how Microsoft uses ICE and what security considerations are in place.
To be continued...
You can find this article on LinkedIn: