Commit b3edcbaa authored by ovari, committed by Pierre Nicolas

developer/jami-concepts/swarm.md: cleanup

Change-Id: Ia1d5fce09e5b426f24747993f3de3be51eafefa2
parent ec06909c
......@@ -4,7 +4,8 @@
The goal of this document is to describe how group chats (a.k.a. **swarm chat**) will be implemented in Jami.
A *swarm* is a group able to discuss without any central authority in a resilient way. Indeed, if two people have no connectivity with the rest of the group (e.g. an Internet outage) but can contact each other (on a LAN or in a subnetwork, for example), they will be able to send messages to each other and will then be able to sync with the rest of the group when possible.
A *swarm* is a group able to discuss without any central authority in a resilient way.
Indeed, if two people have no connectivity with the rest of the group (e.g. an Internet outage) but can contact each other (on a LAN or in a subnetwork, for example), they will be able to send messages to each other and will then be able to sync with the rest of the group when possible.
So, the *swarm* is defined by:
1. Ability to split and merge following the connectivity.
......@@ -16,10 +17,10 @@ So, the *swarm* is defined by:
The main idea is to get a synchronized Merkle tree with the participants.
We identified four modes for swarm chat that we want to implement:
+ **ONE_TO_ONE**, basically the case we have today when you talk with a friend
+ **ADMIN_INVITES_ONLY** generally a class where the teacher can invite people, but not students
+ **INVITES_ONLY** a private group of friends
+ **PUBLIC** basically an open forum
* **ONE_TO_ONE**, basically the case we have today when you talk with a friend
* **ADMIN_INVITES_ONLY** generally a class where the teacher can invite people, but not students
* **INVITES_ONLY** a private group of friends
* **PUBLIC** basically an open forum
## Scenarios
......@@ -29,9 +30,9 @@ We identified four modes for swarm chat that we want to implement:
1. Bob creates a local git repository.
2. Then, he creates an initial signed commit with the following:
+ His public key in `/admins`
+ His device certificate in `/devices`
+ His CRL in `/crls`
* His public key in `/admins`
* His device certificate in `/devices`
* His CRL in `/crls`
3. The hash of the first commit becomes the **ID** of the conversation
4. Bob announces to his other devices that he creates a new conversation. This is done via an invite to join the swarm sent through the DHT to other devices linked to that account.
......@@ -40,8 +41,8 @@ We identified four modes for swarm chat that we want to implement:
*Alice adds Bob*
1. Alice adds Bob to the repo:
+ Adds the invited URI in `/invited`
+ Adds the CRL into `/crls`
* Adds the invited URI in `/invited`
* Adds the CRL into `/crls`
2. Alice sends a request on the DHT
### Receiving an invite
......@@ -68,7 +69,9 @@ Sending a message is pretty simple. Alice writes a commit-message in the followi
}
```
and adds her device and CRL to the repository if missing (others must be able to verify the commit). Merge conflicts are avoided because we are mostly based on commit messages, not files (unless CRLS + certificates but they are located). then she announces the new commit via the **DRT** with a service message (explained later) and pings the DHT for mobile devices (they must receive a push notification).
and adds her device and CRL to the repository if missing (others must be able to verify the commit).
Merge conflicts are avoided because we are mostly based on commit messages, not files (unless CRLS + certificates but they are located).
Then she announces the new commit via the **DRT** with a service message (explained later) and pings the DHT for mobile devices (they must receive a push notification).
For pinging other devices, the sender sends to other members a SIP message with mimetype = "application/im-gitmessage-id" containing a JSON with the "deviceId" which sends the message, the "id" of the conversation related, and the "commit"
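As a rough illustration of the notification described above, the following sketch builds the JSON body carried by the SIP message. The mimetype and the three keys (`deviceId`, `id`, `commit`) come from the text; the helper name `build_commit_notification` and its parameters are hypothetical and do not correspond to a daemon API.

```python
import json

# Mimetype quoted in the text above.
GIT_MESSAGE_MIMETYPE = "application/im-gitmessage-id"


def build_commit_notification(device_id: str, conversation_id: str, commit: str) -> str:
    """Serialize the JSON body announcing a new commit (hypothetical helper)."""
    return json.dumps(
        {
            "deviceId": device_id,   # device which sends the message
            "id": conversation_id,   # conversation the commit belongs to
            "commit": commit,        # hash of the announced commit
        }
    )
```

A receiver would parse this body, then fetch the announced commit from the sending device.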
......@@ -85,8 +88,10 @@ For pinging other devices, the sender sends to other members a SIP message with
To avoid users pushing some unwanted commits (with conflicts, false messages, etc), this is how each commit (from the oldest to the newest one) MUST be validated before merging a remote branch:
Note: if the validation fails, the fetch is ignored and we do not merge the branch (and remove the data), and the user should be notified
Note2: If a fetch is too big, it's not merged
```{note}
1. If the validation fails, the fetch is ignored and we do not merge the branch (and remove the data), and the user should be notified
2. If a fetch is too big, it's not merged.
```
+ For each commit, check that the device that tries to send the commit is authorized at this moment and that the certificates are present (in /devices for the device, and in /members or /admins for the issuer).
+ 3 cases. The commit has 2 parents, so it's a merge, nothing more to validate here
......@@ -150,16 +155,22 @@ This is needed to detect revoked devices, or simply avoid getting unwanted peopl
*Alice removes Bob*
Note: Alice MUST be admins to vote
```{important}
Alice MUST be an admin to vote.
```
+ First, she votes for banning Bob. To do that, she creates the file in /votes/ban/members/uri_bob/uri_alice (members can be replaced by devices for a device, or invited for invites or admins for admins) and commits
+ Then she checks if the vote is resolved. This means that >50% of the admins agree to ban Bob (if she is the only admin, the vote is necessarily above 50%).
+ First, she votes for banning Bob.
To do that, she creates the file in /votes/ban/members/uri_bob/uri_alice (members can be replaced by devices for a device, or invited for invites or admins for admins) and commits
+ Then she checks if the vote is resolved.
This means that >50% of the admins agree to ban Bob (if she is the only admin, the vote is necessarily above 50%).
+ If the vote is resolved, files into /votes/ban can be removed, all files for Bob in /members, /admins, /invited, /CRLs, /devices can be removed (or only in /devices if it's a device that is banned) and Bob's certificate can be placed into /banned/members/bob_uri.crt (or /banned/devices/uri.crt if a device is banned) and committed to the repo
+ Then, Alice informs other users (outside Bob)
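The ">50% of the admins" rule above can be sketched as follows. This is only an illustration under stated assumptions: the function name `vote_resolved` is hypothetical, admins and voters are represented as sets of URIs, and votes cast by non-admins are simply ignored.

```python
def vote_resolved(admins: set[str], voters: set[str]) -> bool:
    """Return True when strictly more than half of the admins have voted.

    `admins` holds the URIs found in /admins; `voters` holds the URIs of the
    vote files (e.g. under /votes/ban/members/uri_bob/). Both are assumptions
    about how a caller would collect the data.
    """
    valid_votes = admins & voters  # only admins are allowed to vote
    return 2 * len(valid_votes) > len(admins)
```

With a single admin, one vote is enough; with two admins, both must vote, because exactly 50% is not a majority.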
*Alice (admin) re-adds Bob (banned member)*
+ First, she votes for unbanning Bob. To do that, she creates the file in /votes/unban/members/uri_bob/uri_alice (members can be replaced by devices for a device, or invited for invites or admins for admins) and commits
+ Then she checks if the vote is resolved. This means that >50% of the admins agree to unban Bob (if she is the only admin, the vote is necessarily above 50%).
+ First, she votes for unbanning Bob.
To do that, she creates the file in /votes/unban/members/uri_bob/uri_alice (members can be replaced by devices for a device, or invited for invites or admins for admins) and commits
+ Then she checks if the vote is resolved.
This means that >50% of the admins agree to unban Bob (if she is the only admin, the vote is necessarily above 50%).
+ If the vote is resolved, files into /votes/unban can be removed, all files for Bob in /members, /admins, /invited, /CRLs, can be re-added (or only in /devices if it's a device that is unbanned) and committed to the repo
### Remove a conversation
......@@ -169,13 +180,16 @@ Note: Alice MUST be admins to vote
3. Now, if Jami starts up and the repo is still present, the conversation is not announced to clients
4. Two cases:
a. If there is no other member in the conversation, we can immediately remove the repository
b. If there are still other members, commit that we leave the conversation, and now wait until at least one other device syncs this message. This prevents other members from still detecting the user as a valid member and still sending new message notifications.
b. If there are still other members, commit that we leave the conversation, and now wait until at least one other device syncs this message.
This prevents other members from still detecting the user as a valid member and still sending new message notifications.
5. When we are sure that someone is synched, remove erased=time::now() and sync with other user's devices
6. All devices owned by the user can now erase the repository and related files
## How to specify a mode
A conversation's mode cannot be changed over time; otherwise, it is another conversation. So, this data is stored in the initial commit message.
A conversation's mode cannot be changed over time.
Otherwise, it is another conversation.
So, this data is stored in the initial commit message.
The commit message will be the following:
......@@ -190,19 +204,25 @@ For now, "mode" accepts values 0 (ONE_TO_ONE), 1 (ADMIN_INVITES_ONLY), 2 (INVITE
### Process for 1:1 swarms
The goal here is to keep the old API (addContact/removeContact, sendTrustRequest/acceptTrustRequest/discardTrustRequest) to generate swarm with a peer and its contact. This still implies some changes that we cannot ignore:
The goal here is to keep the old API (addContact/removeContact, sendTrustRequest/acceptTrustRequest/discardTrustRequest) to generate swarm with a peer and its contact.
This still implies some changes that we cannot ignore:
The process is still the same, an account can add a contact via addContact, then send a TrustRequest via the DHT. But two changes are necessary:
The process is still the same, an account can add a contact via addContact, then send a TrustRequest via the DHT.
But two changes are necessary:
1. The TrustRequest embeds a "conversationId" to inform the peer what conversation to clone when accepting the request
2. TrustRequests are retried when the contact comes back online. This is not the case today (as we don't want to generate a new TrustRequest if the peer discards the first one). So, if an account receives a trust request, it will be automatically ignored if the request with a related conversation is declined (as convRequests are synced)
2. TrustRequests are retried when the contact comes back online.
This is not the case today (as we don't want to generate a new TrustRequest if the peer discards the first one).
So, if an account receives a trust request, it will be automatically ignored if the request with a related conversation is declined (as convRequests are synced)
Then, when a contact accepts the request, a period of sync is necessary, because the contact now needs to clone the conversation.
removeContact() will remove the contact and related 1:1 conversations (with the same process as "Remove a conversation"). The only note here is that if we ban a contact, we don't wait for sync, we just remove all related files.
removeContact() will remove the contact and related 1:1 conversations (with the same process as "Remove a conversation").
The only note here is that if we ban a contact, we don't wait for sync, we just remove all related files.
#### Tricky scenarios
There are some cases where two conversations can be created. Here are at least two of those scenarios:
There are some cases where two conversations can be created.
Here are at least two of those scenarios:
1. Alice adds Bob
2. Bob accepts
......@@ -213,16 +233,20 @@ or
1. Alice adds Bob and Bob adds Alice at the same time, but they are not connected to each other
In this case, two conversations are generated. We don't want to remove messages from users or choose one conversation here. So, sometimes two 1:1 swarms between the same members will be shown. This will generate some bugs during the transition time (as we don't want to break the API, the inferred conversation will be one of the two shown conversations, but for now it's "ok-ish"; it will be fixed when clients fully handle conversationId for all APIs (calls, file transfer, etc)).
In this case, two conversations are generated.
We don't want to remove messages from users or choose one conversation here.
So, sometimes two 1:1 swarms between the same members will be shown.
This will generate some bugs during the transition time (as we don't want to break the API, the inferred conversation will be one of the two shown conversations, but for now it's "ok-ish"; it will be fixed when clients fully handle conversationId for all APIs (calls, file transfer, etc)).
#### Note while syncing
```{important}
After accepting a conversation's request, there is a period during which the daemon needs to retrieve the remote repository.
During this time, clients MUST show a syncing view to inform the user.
While syncing:
After accepting a conversation's request, there is a period during which the daemon needs to retrieve the remote repository. During this time, clients MUST show a syncing view to inform the user.
Note, while syncing:
+ ConfigurationManager::getConversations() will return the conversation's id even while syncing
+ ConfigurationManager::conversationInfos() will return {{"syncing": "true"}} if syncing.
+ ConfigurationManager::getConversationMembers() will return a map of two URIs (the current account and the peer who sent the request)
* ConfigurationManager::getConversations() will return the conversation's id even while syncing.
* ConfigurationManager::conversationInfos() will return {{"syncing": "true"}} if syncing.
* ConfigurationManager::getConversationMembers() will return a map of two URIs (the current account and the peer who sent the request).
```
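A client-side check for the syncing state could look like the sketch below. It assumes the client has already obtained the map returned by `ConfigurationManager::conversationInfos()` as a plain dictionary of strings; the function names `is_syncing` and `view_for` are hypothetical and only illustrate the "show a syncing view" requirement.

```python
def is_syncing(infos: dict[str, str]) -> bool:
    """True when the conversation infos map reports {"syncing": "true"}."""
    return infos.get("syncing") == "true"


def view_for(infos: dict[str, str]) -> str:
    """Pick which view a client should display (view names are illustrative)."""
    return "syncing" if is_syncing(infos) else "conversation"
```

Note that the value is the string `"true"`, not a boolean, because the map holds strings.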
### Conversations requests specification
......@@ -253,11 +277,15 @@ END:VCARD
#### Synchronization
To update the vCard, a user with enough permissions (by default: =ADMIN) needs to edit `/profile.vcf` and will commit the file with the mimetype `application/update-profile`. The new message is sent via the same mechanism and all peers will receive the **MessageReceived** signal from the daemon. The branch is dropped if the commit contains other files, is too big, or is made by a non-authorized member (by default: <ADMIN).
To update the vCard, a user with enough permissions (by default: =ADMIN) needs to edit `/profile.vcf` and will commit the file with the mimetype `application/update-profile`.
The new message is sent via the same mechanism and all peers will receive the **MessageReceived** signal from the daemon.
The branch is dropped if the commit contains other files, is too big, or is made by a non-authorized member (by default: <ADMIN).
##### Last Displayed
In the synchronized data, each device sends to other devices the state of the conversations. In this state, the last displayed is sent. However, because each device can have its own state for each conversation, and probably without the same last commit at some point, there are several scenarios to take into account:
In the synchronized data, each device sends to other devices the state of the conversations.
In this state, the last displayed is sent.
However, because each device can have its own state for each conversation, and probably without the same last commit at some point, there are several scenarios to take into account:
5 scenarios are supported:
+ if the last displayed sent by other devices is the same as the current one, there is nothing to do.
......@@ -268,7 +296,10 @@ In the synchronized data, each devices sends to other devices the state of the c
#### Preferences
Every conversation has attached preferences set by the user. Those preferences are synced across user's devices. This can be the color of the conversation, if the user wants to ignore notifications, file transfer size limit, etc. For now, the recognized keys are:
Every conversation has attached preferences set by the user.
Those preferences are synced across user's devices.
This can be the color of the conversation, if the user wants to ignore notifications, file transfer size limit, etc.
For now, the recognized keys are:
+ "color" - the color of the conversation (#RRGGBB format)
+ "ignoreNotifications" - to ignore notifications for new messages in this conversation
......@@ -298,7 +329,8 @@ struct ConversationPreferencesUpdated
### Merge conflicts management
Because two admins can change the description at the same time, a merge conflict can occur on `profile.vcf`. In this case, the commit with the higher hash (e.g. ffffff > 000000) will be chosen.
Because two admins can change the description at the same time, a merge conflict can occur on `profile.vcf`.
In this case, the commit with the higher hash (e.g. ffffff > 000000) will be chosen.
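The "higher hash wins" rule can be sketched as a simple comparison. This is an illustration only: it assumes commit hashes are available as hexadecimal strings, and the function name `pick_winning_commit` is hypothetical.

```python
def pick_winning_commit(commit_a: str, commit_b: str) -> str:
    """Return the commit whose hexadecimal hash is numerically higher."""
    # Comparing as integers avoids surprises with mixed-case hex strings.
    return commit_a if int(commit_a, 16) > int(commit_b, 16) else commit_b
```

Because every peer applies the same deterministic rule, all of them converge on the same version of `profile.vcf` without further coordination.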
#### APIs
......@@ -337,12 +369,16 @@ where `infos` is a `map<str, str>` with the following keys:
#### Re-import an account (link/export)
The archive MUST contain conversationId to be able to retrieve conversations on new commits after a re-import (because there is no invite at this point). If a commit comes for a conversation not present there are two possibilities:
The archive MUST contain conversationId to be able to retrieve conversations on new commits after a re-import (because there is no invite at this point).
If a commit comes for a conversation not present there are two possibilities:
+ The conversationId is there, in this case, the daemon is able to re-clone this conversation
+ The conversationId is missing, so the daemon asks (via a message `{{"application/invite", conversationId}}`) for a new invite that the user needs to (re)accept
Note, a conversation can only be retrieved if a contact or another device is there, else it will be lost. There is no magic.
```{important}
A conversation can only be retrieved if a contact or another device is there, else it will be lost.
There is no magic.
```
## Used protocols
......@@ -350,10 +386,15 @@ Note, a conversation can only be retrieved if a contact or another device is the
#### Why this choice
Each conversation will be a git repository. This choice is motivated by:
Each conversation will be a git repository.
This choice is motivated by:
1. We need to sync and order messages. The Merkle Tree is the perfect structure to do that and can be linearized by merging branches. Moreover, because it's massively used by Git, it's easy to sync between devices.
2. Distributed by nature. Massively used. Lots of backends and pluggable.
1. We need to sync and order messages.
The Merkle Tree is the perfect structure to do that and can be linearized by merging branches.
Moreover, because it's massively used by Git, it's easy to sync between devices.
2. Distributed by nature.
Massively used.
Lots of backends and pluggable.
3. Can verify commits via hooks and massively used crypto
4. Can be stored in a database if necessary
5. Conflicts are avoided by using commit messages, not files.
......@@ -366,7 +407,8 @@ Each conversation will be a git repository. This choice is motivated by:
#### Limits
History can not be deleted. To delete a conversation, the device has to leave the conversation and create another one.
History can not be deleted.
To delete a conversation, the device has to leave the conversation and create another one.
However, non-permanent messages (like messages readable only for some minutes) can be sent via a special message via the DRT (like Typing or Read notifications).
......@@ -401,7 +443,10 @@ However, non-permanent messages (like messages readable only for some minutes) c
### File transfer
Swarm massively changes file transfer. Now, all the history is synced, allowing all devices in the conversation to easily retrieve old files. This change allows us to move from a logic where the sender pushed the file to other devices by trying to connect to them (this was bad because it was not really resistant to connection changes/failures and needed a manual retry) to a logic where the sender allows other devices to download. Moreover, any device having the file can be the host for other devices, allowing files to be retrieved even if the sender is not there.
Swarm massively changes file transfer.
Now, all the history is synced, allowing all devices in the conversation to easily retrieve old files.
This change allows us to move from a logic where the sender pushed the file to other devices by trying to connect to them (this was bad because it was not really resistant to connection changes/failures and needed a manual retry) to a logic where the sender allows other devices to download.
Moreover, any device having the file can be the host for other devices, allowing files to be retrieved even if the sender is not there.
#### Protocol
......@@ -419,9 +464,11 @@ and creates a link in `${data_path}/conversation_data/${conversation_id}/${file_
Then, the receiver can now download the files by contacting the devices hosting the file by opening a channel with `name="data-transfer://" + conversationId + "/" + currentDeviceId() + "/" + fileId` and store the info that the file is waiting in `${data_path}/conversation_data/${conversation_id}/waiting`
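The channel name quoted above is a plain string concatenation. A minimal sketch, assuming the three identifiers are already known as strings (the function name is hypothetical, and `device_id` stands in for the value of `currentDeviceId()`):

```python
def data_transfer_channel_name(conversation_id: str, device_id: str, file_id: str) -> str:
    """Build the channel name used to request a file from a hosting device."""
    return "data-transfer://" + conversation_id + "/" + device_id + "/" + file_id
```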
The device receiving the connection will accept the channel by verifying if the file can be sent (if sha3sum is correct and if file exists). The receiver will keep the first opened channel, close the others and write into a file (with the same path as the sender: `${data_path}/conversation_data/${conversation_id}/${file_id}`) all incoming data.
The device receiving the connection will accept the channel by verifying if the file can be sent (if sha3sum is correct and if file exists).
The receiver will keep the first opened channel, close the others and write into a file (with the same path as the sender: `${data_path}/conversation_data/${conversation_id}/${file_id}`) all incoming data.
When the transfer is finished or the channel closed, the sha3sum is verified to validate that the file is correct (else it's deleted). If valid, the file will be removed from the waiting.
When the transfer is finished or the channel closed, the sha3sum is verified to validate that the file is correct (else it's deleted).
If valid, the file will be removed from the waiting.
In case of failure, when a device of the conversation comes back online, we will ask for all waiting files in the same way.
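The post-transfer check described above (verify the sha3sum, delete the file if it does not match) can be sketched as follows. The text does not state which SHA-3 variant is used, so SHA3-512 is an assumption here, as are the function name `finalize_transfer` and the idea that the expected digest is available as a hexadecimal string.

```python
import hashlib
from pathlib import Path


def finalize_transfer(path: Path, expected_sha3: str) -> bool:
    """Validate a received file; delete it when the digest does not match.

    Returns True when the file is valid (the caller can then remove it from
    the waiting list), False when it was invalid and has been deleted.
    """
    # Assumption: SHA3-512; the document only says "sha3sum".
    digest = hashlib.sha3_512(path.read_bytes()).hexdigest()
    if digest == expected_sha3.lower():
        return True
    path.unlink(missing_ok=True)  # corrupted or incomplete transfer
    return False
```

For large files, a real implementation would hash the data in chunks instead of reading the whole file into memory.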
......@@ -429,7 +476,8 @@ In case of failure, when a device of the conversation will be back online, we wi
#### Idea
A swarm conversation can have multiple rendez-vous. A rendez-vous is defined by the following uri:
A swarm conversation can have multiple rendez-vous.
A rendez-vous is defined by the following uri:
"accountUri/deviceId/conversationId/confId" where accountUri/deviceId describes the host.
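To make the URI layout concrete, here is a small parsing sketch. It assumes that none of the four components contains a `/`; the `RendezVous` type and `parse_rendezvous` function are hypothetical names used for illustration only.

```python
from typing import NamedTuple


class RendezVous(NamedTuple):
    account_uri: str      # accountUri/deviceId together describe the host
    device_id: str
    conversation_id: str
    conf_id: str


def parse_rendezvous(uri: str) -> RendezVous:
    """Split "accountUri/deviceId/conversationId/confId" into its parts."""
    parts = uri.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"not a rendez-vous URI: {uri!r}")
    return RendezVous(*parts)
```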
......@@ -445,15 +493,17 @@ So every part will receive the infos that a call has started and will be able to
#### Attacks?
+ Avoid git bombs
* Avoid git bombs
#### Notes
The timestamp of a commit cannot be trusted because it's editable. Only the user's timestamp can be trusted.
The timestamp of a commit cannot be trusted because it's editable.
Only the user's timestamp can be trusted.
### TLS
Git operations, control messages, files, and other things will use a p2p TLS v1.3 link with only ciphers which guarantee PFS. So each key is renegotiated for each new connection.
Git operations, control messages, files, and other things will use a p2p TLS v1.3 link with only ciphers which guarantee PFS.
So each key is renegotiated for each new connection.
### DHT (udp)
......@@ -610,13 +660,18 @@ Generated by administrators to add a vote for kicking or un-kicking someone.
**!! OLD DRAFT !!**
Note: The following notes are not organized yet. Just some lines of thought.
```{note}
The following notes are not organized yet.
Just some lines of thought.
```
## Crypto improvements.
For a serious group chat feature, we also need serious crypto. With the current design, if a certificate is stolen along with the previous DHT values of a conversation, the conversation can be decrypted. Maybe we need to go to something like **Double ratchet**.
Note: a lib might exist to implement group conversations.
```{note}
A lib might exist to implement group conversations.
```
Needs ECC support in OpenDHT
......