Commit 7b924d9d authored by ovari, committed by Adrien Béraud

developer/jami-concepts/calls.md: cleanup

Change-Id: I48889b81897deedbfbeefa2b5058d7b6835abaa8
parent 5b8c5a56
Calls
=====
# Calls
**NOTE: this page detail the principle for Jami accounts. For SIP accounts, the SIP protocol is used.**
```{important}
This page details the principle for Jami accounts.
For SIP accounts, the SIP protocol is used.
```
Let's do a call in Jami!
## Daemon side
When creating a call between two peers, Jami mainly uses known protocols such as ICE, SIP or TLS. However, to make it distributed, the process of creating a call is a bit different. To summarize, when someone wants to contact one of its contact, this is what they will do:
When creating a call between two peers, Jami mainly uses known protocols such as ICE, SIP, or TLS.
However, to make it distributed, the process of creating a call is a bit different.
To summarize, when someone wants to contact one of their contacts, this is what they will do:
1. Search the contact presence on the DHT (for more details, see {doc}`contact-management`)
2. Once the contact is found, send a call request, announcing the known candidates (the ip of each network interfaces + relay addresses (TURN) + reflexives addresses (UPnP, public ones).
3. Wait for the response of the contact (they will respond their known addresses).
1. Search for the contact's presence on the DHT (for more details, see {doc}`contact-management`).
2. Once the contact is found, send a call request announcing the known candidates: the IP address of each network interface, relay addresses (TURN), and reflexive addresses (UPnP, public ones).
3. Wait for the response of the contact (they will respond with their known addresses).
4. Negotiate the socket via ICE. Two ICE sessions are negotiated: one over TCP (preferred) and one over UDP (as a fallback).
5. Then, the socket is encrypted with TLS (if TCP) or DTLS (if UDP).
6. The contact is now able to accept or decline the call. When they accept, a ICE transport (UDP only for now) is negotiated to create 4 new sockets for the medias (2 for audio, 2 for video).
6. The contact is now able to accept or decline the call. When they accept, an ICE transport (UDP only for now) is negotiated to create 4 new sockets for the media (2 for audio, 2 for video).
7. The call is now alive!
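The seven steps above can be read as a linear sequence. The following is a minimal sketch of that ordering; `CallStep`, `nextStep`, and `stepsUntilAlive` are hypothetical names chosen for illustration and are not identifiers from the Jami daemon.

```cpp
#include <cassert>

// Hypothetical labels for the seven steps listed above.
enum class CallStep {
    SearchPresence = 1,   // 1. look up the contact on the DHT
    SendCallRequest,      // 2. announce our ICE candidates
    WaitForAnswer,        // 3. receive the peer's candidates
    NegotiateIce,         // 4. TCP (preferred) and UDP (fallback) sessions
    EncryptControlSocket, // 5. TLS over TCP, DTLS over UDP
    NegotiateMedia,       // 6. after acceptance, ICE for the media sockets
    Alive                 // 7. the call is running
};

// Returns the step that follows `step`; Alive is terminal.
CallStep nextStep(CallStep step)
{
    const int value = static_cast<int>(step);
    assert(value >= 1 && value <= 7); // only the seven steps above are valid
    if (step == CallStep::Alive)
        return CallStep::Alive;
    return static_cast<CallStep>(value + 1);
}

// Counts how many transitions remain before the call is alive.
int stepsUntilAlive(CallStep from)
{
    int count = 0;
    while (from != CallStep::Alive) {
        from = nextStep(from);
        ++count;
    }
    return count;
}
```

In this sketch a failure at any step is not modelled; the real daemon has to handle timeouts and declined calls as well.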
### Exchange ICE candidates
Everything really starts in `jamiaccount.cpp` (`JamiAccount::startOutgoingCall`). Once both ICE objects are ready and when the contact is found via the DHT, the call request for the contact is crafted. This request contains all the informations necessary for the remote ICE session defined by:
Everything really starts in `jamiaccount.cpp` (`JamiAccount::startOutgoingCall`).
Once both ICE objects are ready and when the contact is found via the DHT, the call request for the contact is crafted.
This request contains all the information necessary for the remote ICE session defined by:
```cpp
dht::IceCandidates(callvid, blob)
```
where `callvid` is a random number used to identify the call and blob contains two concatened ICE messages (`IceTransport::packIceMsg` in `ice_transport.cpp`) containing the password of the session, the *ufrag* and ICE candidates.) like:
where:
* `callvid` is a random number used to identify the call, and
* `blob` contains two concatenated ICE messages (`IceTransport::packIceMsg` in `ice_transport.cpp`) containing the password of the session, the *ufrag*, and ICE candidates like:
```
0d04b935
......@@ -45,67 +53,73 @@ Hc0a8007e 1 TCP 2130706431 192.168.0.123 9 typ host tcptype active
Sc0a8007e 1 TCP 1694498815 X.X.X.X 42751 typ srflx tcptype passive
```
and is sent via the DHT in an encrypted message for the device to `hash(callto:xxxxxx)` where `xxxxxx` is the device id. The peer will answer at the exact same place (but encrypted for the sender device) its own `dht::IceCandidates`. See `JamiAccount::replyToIncomingIceMsg` for more details.
This request is sent via the DHT, in a message encrypted for the device, to `hash(callto:xxxxxx)`, where `xxxxxx` is the device ID.
The peer answers at the exact same place with its own `dht::IceCandidates`, encrypted for the sender's device.
See `JamiAccount::replyToIncomingIceMsg` for more details.
The ICE session is created both side when they have all the candidates (so for the sender, when the reply from the contact is received).
The ICE session is created on both sides when they have all the candidates (so for the sender, when the reply from the contact is received).
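As a rough illustration of what the request carries, the sketch below builds a random call identifier, a blob made of two ICE messages, and the `callto:` string that is hashed to locate the request. `CallRequest`, `makeCallRequest`, and `callKeyInput` are hypothetical helpers, not part of OpenDHT or the daemon, and plain string concatenation is an assumption: the real layout is whatever `IceTransport::packIceMsg` produces.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical stand-in for the pair carried by dht::IceCandidates.
struct CallRequest {
    std::uint64_t callId; // random number identifying the call
    std::string blob;     // two ICE messages, one after the other
};

// Assumption: the two packed ICE messages are simply appended.
CallRequest makeCallRequest(const std::string& firstIceMsg,
                            const std::string& secondIceMsg)
{
    assert(!firstIceMsg.empty() && !secondIceMsg.empty());
    std::random_device seed;
    std::mt19937_64 rng(seed());
    CallRequest request;
    request.callId = static_cast<std::uint64_t>(rng());
    request.blob = firstIceMsg + secondIceMsg;
    return request;
}

// Builds the text that is hashed to find where the request is stored.
// The hash function itself is left out of this sketch.
std::string callKeyInput(const std::string& deviceId)
{
    return "callto:" + deviceId;
}
```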
### ICE negotiation
Pending calls are managed by `JamiAccount::handlePendingCallList()`, which first wait that the TCP negotiation finish (and if it fails, wait for the UDP one). The code for the ICE negotiation is mainly managed by [pjproject](https://github.com/pjsip/pjproject) but for Jami, the interesting part is located in `ice_transport.cpp`. Moreover, we add some important patches/features on top of *pjproject* not merged upstream for now (for example, ICE over TCP). These patches are present in `contrib/src/pjproject`.
Pending calls are managed by `JamiAccount::handlePendingCallList()`, which first waits for the TCP negotiation to finish (and, if it fails, waits for the UDP one).
The code for the ICE negotiation is mainly managed by [pjproject](https://github.com/pjsip/pjproject), but for Jami, the interesting part is located in `ice_transport.cpp`.
Moreover, we add some important patches/features on top of *pjproject* that are not merged upstream for now (for example, ICE over TCP).
These patches are present in `contrib/src/pjproject`.
### Encrypt the control socket
Once the socket is created and managed by an **IceTransport** instance, it is then wrapped in a **SipTransport** corresponding to a *TlsIceTransport*. The main code is located into `JamiAccount::handlePendingCall()` and the wrapping is done into `SipTransportBroker::getTlsIceTransport`. Finally, our session is managed by **TlsSession** in `daemon/src/security/tls_session.cpp` and uses the GnuTLS library.
Once the socket is created and managed by an **IceTransport** instance, it is then wrapped in a **SipTransport** corresponding to a *TlsIceTransport*.
The main code is located in `JamiAccount::handlePendingCall()` and the wrapping is done in `SipTransportBroker::getTlsIceTransport`.
Finally, our session is managed by **TlsSession** in `daemon/src/security/tls_session.cpp` and uses the GnuTLS library.
So, the control socket will be a TLS (1.3 if your and your peer gnutls version support it) if a TCP socket is negotiated. If a UDP socket is negotiated instead (due to firewall restrictions/problem in the negotiation/etc), the socket will use DTLS (still managed by the same parts).
So, the control socket will use TLS (1.3 if both your GnuTLS version and your peer's support it) if a TCP socket is negotiated.
If a UDP socket is negotiated instead (due to firewall restrictions, problems in the negotiation, etc.), the socket will use DTLS (still managed by the same parts).
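The choice described here (TCP preferred, UDP as a fallback, then TLS or DTLS accordingly) can be sketched as two small functions. The enum and function names are hypothetical and only mirror the rule stated in the text; the daemon's actual logic lives in the classes named above.

```cpp
#include <cassert>

enum class Transport { None, Tcp, Udp };
enum class Encryption { None, Tls, Dtls };

// TCP is preferred; UDP is only used when the TCP negotiation failed.
Transport pickControlTransport(bool tcpNegotiated, bool udpNegotiated)
{
    if (tcpNegotiated)
        return Transport::Tcp;
    if (udpNegotiated)
        return Transport::Udp;
    return Transport::None;
}

// TLS protects a TCP socket, DTLS protects a UDP socket.
Encryption encryptionFor(Transport transport)
{
    switch (transport) {
    case Transport::Tcp:
        return Encryption::Tls;
    case Transport::Udp:
        return Encryption::Dtls;
    case Transport::None:
        break;
    }
    return Encryption::None;
}

// Convenience wrapper combining both decisions.
Encryption controlEncryption(bool tcpNegotiated, bool udpNegotiated)
{
    const Transport transport = pickControlTransport(tcpNegotiated, udpNegotiated);
    const Encryption encryption = encryptionFor(transport);
    // Invariant from the text: TLS only ever goes with TCP.
    assert((encryption == Encryption::Tls) == (transport == Transport::Tcp));
    return encryption;
}
```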
The control socket is used to transmit SIP packets, like invites, custom messages (Jami sends the VCard of your profile on this socket at the start of the call, or the rotation of the camera), text messages.
The control socket is used to transmit SIP packets, like invites, custom messages (Jami sends the vCard of your profile on this socket at the start of the call, or the rotation of the camera), and text messages.
Related articles:
+ https://jami.net/improved-video-rotation-support/
+ https://jami.net/peer-to-peer-file-sharing-support-in-jami/
* <https://jami.net/improved-video-rotation-support/>
* <https://jami.net/peer-to-peer-file-sharing-support-in-jami/>
### Media sockets
Media sockets are SRTP sockets where the key is negotiated through the TLS Session previously created.
**TODO**
Media sockets are SRTP sockets where the key is negotiated through the TLS session previously created.
### Architecture
```{warning}
TODO: This section is incomplete.
```
**TOOD**
### Architecture
```{warning}
TODO: This section is incomplete.
```
## Multi-stream
Since daemon's version 13.3.0, multi-stream is fully supported. This
feature allows users to share multiple videos during a call at the
same time. In the following parts, we will describe all related
changes.
Since daemon version 13.3.0, multi-stream is fully supported.
This feature allows users to share multiple videos during a call at the same time.
In the following parts, we will describe all related changes.
### pjsip
### PJSIP
The first part is to negotiate enough media streams. In fact, every
media stream uses 2 UDP sockets. We consider three scenarios:
The first part is to negotiate enough media streams.
In fact, every media stream uses 2 UDP sockets.
We consider three scenarios:
1. If it's the host of a conference who wants to add media, there is
nothing more to negotiate, because we already mix the videos into
one stream. So, we add the new media directly to the video-mixer
without negotiations.
1. If it's the host of a conference who wants to add media, there is nothing more to negotiate, because we already mix the videos into one stream.
So, we add the new media directly to the video mixer without negotiations.
2. If we're in 1:1, for now, as there is no conference information,
multi-stream is not supported.
2. If we're in 1:1, for now, as there is no conference information, multi-stream is not supported.
3. Else, 2 new sockets are negotiated for new media.
3. Otherwise, 2 new sockets are negotiated for the new media.
To make pjsip able to generate more sockets per ICE session,
`PJ_ICE_COMP_BITS` was modified to `5` (which corresponds to `2^5`, so
32 streams).
To make PJSIP able to generate more sockets per ICE session, `PJ_ICE_COMP_BITS` was modified to $5$ (which corresponds to $2^5$, so $32$ streams).
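The arithmetic behind these numbers is small enough to check directly. This sketch only reproduces the figures given in the text (5 bits, 2 sockets per media stream); `kIceCompBits`, `capacityForBits`, and `socketsForStreams` are illustrative names, not PJSIP symbols.

```cpp
#include <cassert>

// Value the text gives for the patched PJ_ICE_COMP_BITS.
constexpr unsigned kIceCompBits = 5;

// Number of distinct values representable with `bits` bits: 2^bits.
unsigned capacityForBits(unsigned bits)
{
    assert(bits < 32); // assumes a 32-bit unsigned; a wider shift is undefined
    return 1u << bits;
}

// Each media stream uses 2 UDP sockets, per the text.
constexpr unsigned socketsForStreams(unsigned streams)
{
    return 2u * streams;
}

static_assert((1u << kIceCompBits) == 32, "2^5 should be 32");
static_assert(socketsForStreams(2) == 4, "audio + video need 4 sockets");
```

For example, `socketsForStreams(2)` gives the 4 sockets mentioned earlier for one audio and one video stream.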
### Deprecate switchInput, support requestMediaChange
In the daemon, the old API `switchInput` is now **DEPRECATED**; same
for `switchSecondaryInput`:
In the daemon, the old API `switchInput` is now **DEPRECATED**; same for `switchSecondaryInput`:
```xml
<method name="switchInput" tp:name-for-bindings="switchInput">
......@@ -154,17 +168,13 @@ for `switchSecondaryInput`:
</method>
```
### Compability
### Compatibility
If a call is done with a peer where the daemon's version is < 13.3.0,
multi-stream is not enabled and the old behavior is used (1 video
only).
If a call is made with a peer whose daemon version is earlier than 13.3.0, multi-stream is not enabled, and the old behavior is used (one video only).
### Identifications of streams
### Stream identification
Because there can be multiple streams now, every media stream is
identified by its identifier, and the format is "<type>_<idx>"; for
example: "audio_0", "video_2", etc.
Because there can be multiple streams now, every media stream is identified by its identifier, and the format is `<type>_<idx>`; for example: `audio_0`, `video_2`, etc.
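A minimal sketch of building and parsing such identifiers follows. `makeStreamId` and `parseStreamId` are hypothetical helpers written for this page, not functions from the daemon; the parser assumes the index is the decimal number after the last underscore and does not guard against very large indexes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a label such as "audio_0" or "video_2".
std::string makeStreamId(const std::string& type, unsigned idx)
{
    assert(!type.empty());
    return type + "_" + std::to_string(idx);
}

// Splits "<type>_<idx>" into its parts.
// Returns false when the label does not match the format.
bool parseStreamId(const std::string& id, std::string& type, unsigned& idx)
{
    const std::size_t sep = id.rfind('_');
    if (sep == std::string::npos || sep == 0 || sep + 1 == id.size())
        return false;
    unsigned value = 0;
    for (std::size_t i = sep + 1; i < id.size(); ++i) {
        if (id[i] < '0' || id[i] > '9')
            return false; // the index must be made of digits only
        value = value * 10 + static_cast<unsigned>(id[i] - '0');
    }
    type = id.substr(0, sep);
    idx = value;
    return true;
}
```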
### Rotation
......@@ -198,7 +208,7 @@ The XML was updated to add the wanted stream:
### Voice activity
The XML was updated to add the wanted stream:
The XML was updated to add the required stream:
```
<?xml version="1.0" encoding="utf-8" ?>
......@@ -218,13 +228,8 @@ Reflected changes are documented [here](conference-protocol).
## Client
Even if the back-end supports up to 32 media at the same time, except
for custom clients we currently recommend only giving the ability to
share one camera and one video at the same time. The camera is
controlled via the camera button, and the other media via the "Share"
button.
Although the back end supports up to 32 media at the same time, we currently recommend (except for custom clients) only giving the ability to share one camera and one video at the same time.
The camera is controlled via the camera button, and the other media via the "Share" button.
In client-qt, the interesting part is in `AvAdapter` (methods like
`isCapturing`, `shareAllScreens`, `stopSharing`). In the library's
logic, `addMedia` and `removeMedia` in the `callModel` directly use
the `requestMediaChange` and can be used as a design reference.
In client-qt, the interesting part is in `AvAdapter` (methods like `isCapturing`, `shareAllScreens`, `stopSharing`).
In the library's logic, `addMedia` and `removeMedia` in the `callModel` call `requestMediaChange` directly and can be used as a design reference.
\ No newline at end of file