MSC3401: Native Group VoIP Signalling #3401

ara4n · 2021-09-19T23:58:30Z

Rendered

Obsoletes #2359

ShadowJonathan

🚀 ☎️


ShadowJonathan · 2021-09-20T08:01:42Z

proposals/3401-group-voip.md

+}
+```
+
+We mandate at most one call per room at any given point to avoid UX nightmares - if you want the user to participate in multiple parallel calls, you should simply create multiple rooms, each with one call.


I think that this is worth considering though, the UX nightmare might not be that bad (some clients might even work entirely with this possibility), and personally i think that putting the conf ID in a sub-field is just asking for problems (if the previous call information gets overridden by a person sending another state event for a "new" call while the last one is still in-progress.)

Why not move conf_id into the state_key, currently declare multiple calls UB and unsupported, while noting that speccing it and properly seating it would be a case for a future MSC?

Re-opening this one because we've just had a glare-like bug on Element Call where multiple people entered the call at the same time (as you do) and multiple conferences got created in the same room. In general, we're going to want some way to handle glare of several people hitting the 'start conference call' button at the same time. Allowing multiple calls in a room means we need to handle this somehow. It's not impossible (eg. we could define some common ID for 'the' call in a room allowing you to use other IDs for other calls?) but I'd just like to check that we really want to deal with this complexity.

I am also very much in favour of having the state_key be just "" because having multiple group calls in one room often leads to more problems rather than benefits

With MSC3985 we now also have a separate method to create break-out rooms, so it feels like multiple calls in one room are no longer necessary

I also think we should be able to use m.terminated to calculate the call length.

I think there is still an issue with relying on m.terminated to determine the call length: If a client wants to display a timeline tile with the duration at the point where the call was ended, then it works, but if clients want to display the tile at the point that the call was started (like Element Web does), and we're reusing the same state key for all calls, it's difficult to get the duration from that event. In fact, if there's a call ongoing in the room, there's no way to tell whether a given call event is part of the current call or not, short of crawling the timeline, so clients won't know whether to label it with "call ended".

With separate state keys, this is a lot easier, because it gives you a way to efficiently look up the current state of any call, current or historical.
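As a rough illustration of the point about separate state keys, the sketch below models room state as a map keyed by `(event type, state key)`. It assumes the call ID is used as the `state_key` of the `m.call` event and that a terminated call carries an `m.terminated` field; the helper names and the example values are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Room state modelled as a map from (event type, state key) to event content.
RoomState = Dict[Tuple[str, str], dict]


def call_state(state: RoomState, call_id: str) -> Optional[dict]:
    """Look up the latest m.call content for one call (assumes state_key == call ID)."""
    return state.get(("m.call", call_id))


def is_terminated(state: RoomState, call_id: str) -> bool:
    """True if the call is known and its content carries m.terminated."""
    content = call_state(state, call_id)
    return content is not None and "m.terminated" in content


# Example values are placeholders, not taken from the MSC.
state: RoomState = {
    ("m.call", "call-1"): {"m.intent": "m.ring", "m.terminated": "ended"},
    ("m.call", "call-2"): {"m.intent": "m.room"},
}
```

With a single `""` state key, the same lookup could only ever return the most recent call, which is the limitation described above for timeline tiles of historical calls.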


robertlong · 2021-09-20T18:08:40Z

proposals/3401-group-voip.md

+}
+```
+
+Rather than sending arrays one can send `"all"` to either `start` or `stop` to start or stop all streams.


Managing these streams via start/stop events seems a little prone to failure. Would it be easier to send the entire list of streams you wish to receive? This should be sufficiently small that the payload would never get too big and I think both the client/SFU logic would be simpler to manage.

My reason for not doing this is that switching streams can happen very rapidly (e.g. the client could request different streams as they receive different active speaker notifications), and the idea of sending the whole list of all streams you care about every time just feels like a big waste of bandwidth. If you're in a big cascading conference with thousands of users (which this architecture could support!) do you really want to list out all the stream IDs when you want to switch from one speaker to the next?
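For comparison, here is a minimal sketch of how a receiver might apply a `start`/`stop` delta to its current subscription set. The function name, the `available` parameter and the stop-before-start ordering are assumptions made for illustration; the proposal text quoted above only says that either field may be an array or `"all"`.

```python
from typing import Iterable, Set, Union

# Either the literal string "all" or a list of stream IDs.
StreamSelector = Union[str, Iterable[str]]


def apply_delta(
    subscribed: Set[str],
    available: Set[str],
    start: StreamSelector = (),
    stop: StreamSelector = (),
) -> Set[str]:
    """Return the new subscription set after applying one start/stop request."""
    result = set(subscribed)
    # Assumption: stops are applied before starts within a single request.
    if stop == "all":
        result.clear()
    else:
        result -= set(stop)
    if start == "all":
        result |= available
    else:
        # Unknown stream IDs are silently ignored in this sketch.
        result |= set(start) & available
    return result
```

Switching from one active speaker to the next is then a two-element request (`start=["b"], stop=["a"]`) regardless of how many streams the conference carries, which is the bandwidth argument made above. The trade-off raised in the first comment still applies: a lost or reordered delta leaves sender and receiver with different views of the set.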

robertlong · 2021-09-20T18:13:02Z

proposals/3401-group-voip.md

+               C
+```
+
+SFU (aka Focus):


Bikeshedding warning: I'm relatively new to the WebRTC/VoIP industry, but I have never heard the term focus used in place of SFU. Is this a commonly known term? Should we be using SFU in this spec instead? Including renaming m.foci -> m.sfus?

the reason i originally went with foci is because the field originally described the (mxid, deviceid) tuples where a given mxid could be contacted - which might either be a local device (for full mesh) or an SFU.

However, in the current simpler draft, the only time you include this field is if you are using a conferencing focus of some kind.

But, this proposal is not meant to just be for SFUs - the device you use to focus together your view of the conference could (in future) equally be an MCU as much as an SFU. Hence using the correct more generic term of 'focus' rather than making it specific to SFU technology. For instance, the server could advertise a stream which composites together a mosaic of different feeds for a non-E2EE call... at which point it's acting as a (hybrid) MCU.

The term 'focus' comes from SIP (e.g. https://datatracker.ietf.org/doc/html/rfc3840#section-10.18) and is the standard term there for "an endpoint you connect to which mixes together other endpoints". I'm slightly inclined to keep it, to keep things flexible for future more sophisticated foci tech.

Can we call it call_focus or stream_focus or something a bit more descriptive than a not-well-known dictionary word?

focus is a pretty well-known word, and foci is its plural. i don't particularly want to call it 'focuses', given that's a different word (the 3rd person present form of 'to focus'). not sure this is a showstopper.

It definitely isn't a showstopper but I would like to come up with a better name if we can. It is also a bit of a red-flag that just about everything else in the MSC is calling it a SFU.

While focus is a well-known word, outside of Britain its plural is 'focuses', so I would expect that a lot of people are going to be similarly confused over its meaning. Even the Cambridge Dictionary lists 'focuses' as the plural, while listing 'foci' as the formal plural in the UK.

Might it be possible to at least mention in the spec that it's used in this sense?

I'm coming around to using "foci" as the word and there are references out there in the wild for "foci" being used in SIP terminology

https://books.google.com/books?id=CyYAEAAAQBAJ&pg=PT66&lpg=PT66&dq=SIP+%22foci%22&source=bl&ots=zMu58i8hrj&sig=ACfU3U2VD7ts63JbE1HXkjWNoVRXi_3prA&hl=en&sa=X&ved=2ahUKEwjg9cT_j7r2AhUyGDQIHTtpCFoQ6AF6BAgCEAM#v=onepage&q=SIP%20%22foci%22&f=false

https://datatracker.ietf.org/doc/html/rfc4575#section-3.8

I think we should keep foci.


SimonBrandner · 2021-09-22T07:15:25Z

proposals/3401-group-voip.md

+     * `m.ring` if the call is meant to cause the room participants' devices to ring (e.g. 1:1 call or group call)
+     * `m.conference` if the call should be presented as a conference call which users in the room may connect to


I am wondering if these two are basically the same thing with different push rules? Is this influenced by push rules?

It seems like there is a sort of intersection. I can see a use case where in the same room we may have "weekly sync" where we should buzz everyone and "debugging session" where people may drop in. Of course there are some rooms where I may not care about m.ring.

Maybe it is better to rephrase this as "priority"? Intent is very vague. Intent for what?

i guess you could put this distinction into push rules, but it seems a bit simpler (especially given what a mess push rules are) to make it explicit here. After all, the difference between ringing and conferencing is not just the type of push notification you receive, but the whole UX (e.g. CallKit on iOS, or whether you display a dedicated ringing UX etc).

While that is true, I feel like we shouldn't use something other than push rules to influence notifications.

@ara4n do you see any resolution to this? I agree, it probably should be separate to the push rules. Implementations should use m.ring as the first clue as to whether or not to ring a device and push rules should apply on top of it. The m.ring type also defines what UI to render in a client.

It would make sense to use intentional mentions for this. If none are included, it's like a conference call. Otherwise, in conjunction with the fact that it's a call event, the client would know to start ringing and not just pinging. To ring everyone in a room, you'd simply mention @room.
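To make the mentions-based suggestion concrete, here is a minimal sketch of the decision a client could take. It assumes the call event content carries an `m.mentions` object with optional `user_ids` and `room` keys, as intentional mentions define; the function name and the "no mentions means conference" rule are taken from the comment above, not from the MSC.

```python
from typing import Any, Dict


def should_ring(content: Dict[str, Any], my_user_id: str) -> bool:
    """Ring only if this user, or the whole room, is explicitly mentioned."""
    mentions = content.get("m.mentions", {})
    if mentions.get("room") is True:
        return True  # equivalent to mentioning @room
    return my_user_id in mentions.get("user_ids", [])
```

Under this scheme a call event without `m.mentions` behaves like the `m.conference` case: it is shown in the room but does not ring anyone.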


kevincox

I think this is a solid improvement. There are a lot of rough edges that need minor massaging, but I think the code concepts are solid.

deepbluev7 · 2021-09-22T23:57:14Z

proposals/3401-group-voip.md

+
+## Encryption
+
+We get E2EE for 1:1 and full mesh calls automatically in this model.


One thing I don't like about this proposal is that it uses quite a few unencrypted state events. If you join a conference, you are leaking metadata about:

 * the call existing,
 * who (tried to) participate in the call,
 * maybe some info about the physical devices of the user.

Normal calls are not affected by that, because they don't use state events. State events of course make it easier to track that a room is a conference room or similar, but they currently can't be encrypted, and calls are imo somewhat more sensitive metadata. Verification gets around that by using relations instead of state events.

I currently can't think of a good alternative to state events, and maybe one day we will get magic encrypted state events that no one has figured out so far. But maybe someone has an idea, or we could at least call out this issue in the encryption, potential issues or security sections?

fair point. i'm assuming we will have magic encrypted state events sooner or later, however.

. o O ( make m.call and m.call.member state events with no body, but a state_key which contains an event_id for a timeline E2EE event. clients then call GET /event on the event_id in the state_key of the state event in order to grab the encrypted contents of the event in question :P )

or actually, keep the same state_keys as before, but just have the contents be { "encrypted": "$event_id" }. (shamelessly stolen from @turt2live)

...which has now turned into #3414

Should we mark this resolved? Should we add MSC3414 to the spec?

dbkr · 2021-09-23T08:28:37Z

proposals/3401-group-voip.md

+
+### Call participation
+
+Users who want to participate in the call declare this by adding an `m.conf` field to their `m.room.member` state event.  Ideally, we'd use a dedicated state event type for this, making it easier to rapidly spot who is in a conference.  But given we don't want other people editing our state event and Matrix doesn't yet provide that level of access control, instead we (ab)use the `m.room.member` event to declare our participation in the conference in the context of the room.  Therefore any profile updates need to be careful to preserve the `m.conf` field.


Do we need to worry about clients setting a conf state event and then falling off a cliff, leaving the user stuck as if they're in a conference?

I think https://github.com/matrix-org/matrix-doc/pull/3401/files#diff-a6d4763a98532010d751fbec7c898cb0fe27a757323befd0a1eb4ca443bbca4dR177 answers that

that line handles clients setting an m.call.member event and falling off a cliff. if a client sets an m.call event and falls off a cliff, however, anyone with permission to overwrite the state event in the room can go and remove the 'stuck' call.


ShadowJonathan · 2021-09-23T10:10:28Z

proposals/3401-group-voip.md

@@ -252,13 +267,15 @@ An alternative to to-device messages is to use DMs.  You still risk gappy sync p

 ## Security considerations

+State events are not encrypted currently, and so this leaks that a call is happening, and who is participating in it, and from which devices.


plus that it happened in the past and who participated in it there and then, by correlating it with the corresponding m.call.member state events.


bwindels

clarify how m.expires_ts should be interpreted


ara4n · 2022-09-25T21:18:40Z

proposals/3401-group-voip.md

+Call setup then uses the normal `m.call.*` events, except they are sent over to-device messages to the relevant devices (encrypted via Olm).  This means:
+
+ * When initiating a 1:1 call, the `m.call.invite` is sent to the devices listed in `m.call.member` event's `m.devices` array using the `device_id` field.
+ * `m.call.*` events sent via to-device messages should also include the following properties in their content:


we seem to have completely missed seq, needed to make up for todevice events having no intrinsic ordering otherwise: matrix-org/matrix-js-sdk@7f21f56

Maybe a dumb question here: does it mean that the order of the To-Device messages is not guaranteed? I'm asking since I've processed the To-Device messages on the SFU under the assumption that they arrive in the same order in which they were sent by the client. If the order is not guaranteed, it may cause some interesting (undesired) effects, e.g. if the "Invite -> Hangup" sequence arrives as "Hangup -> Invite" on a server, the end effect will not be what the user expects. The same goes for Invite -> SelectAnswer arriving as SelectAnswer -> Invite.

Yes, it's not guaranteed and we should rely on the seq field

> Yes, it's not guaranteed and we should rely on the seq field

Oh, that's interesting. Should we document how to deal with it and what the semantics are? I've noticed that a new commit has been pushed recently to add the seq to the request, and it says that it starts with 0 and gets incremented with each message. But where is the counter stored? I can imagine that it's a per-device counter (meaning that 2 different devices may generate the same seq)? And what happens when the value overflows?

It also has some practical implications: how are we (as receivers) expected to handle it in a proper way? - I.e. imagine that we receive a "New ICE candidates message" on the SFU with a seq=15 while the previous message from the sender had seq=5. We probably don't want to handle the message with seq=15 right now if we have not yet received the previous 10 messages (since the current message with seq=15 may not even be useful by the moment we process the previous 10, or maybe it's related to the invite that has been sent in seq=10). This means that in order to handle a message with seq=15, we would need to buffer a couple more messages (messages that went before seq=15) before we take a decision on whether to handle it.

However, this raises certain questions. If we're communicating with 1000 devices (the SFU use case), we would need to store the lastStoredSeq for 1000 users. The problem is that we don't really know when to release the counter for a particular user from memory (i.e. when to remove it from the map), since we don't know in advance the pool of devices that will communicate with us and we may theoretically receive a message from any of them. This means that memory usage would grow indefinitely, and once the SFU is restarted, the counters are lost.

Another issue is that the sender can attack the receiver by sending a message with seq=1 followed by messages with seq=99999999999, seq=99999999998, seq=99999999997 and so on. Knowing that the other side buffers them because it can't process them yet, the sender can keep going until the receiver runs out of memory and gets killed.
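One possible way to address both the buffering question and the memory-exhaustion attack is a reorder buffer with a bounded look-ahead window. The sketch below is an assumption-laden illustration, not something the MSC specifies: the class name, the `max_pending` limit and the policy of dropping messages that are too far ahead are all hypothetical, and it assumes one buffer per sender and that `seq` starts at `0`.

```python
from typing import Any, Dict, List


class SeqReorderBuffer:
    """Delivers messages from a single sender in seq order, with bounded memory."""

    def __init__(self, max_pending: int = 32) -> None:
        self.next_seq = 0                  # next seq we expect to deliver
        self.pending: Dict[int, Any] = {}  # out-of-order messages, keyed by seq
        self.max_pending = max_pending     # hypothetical look-ahead limit

    def push(self, seq: int, message: Any) -> List[Any]:
        """Accept one message; return the messages that are now deliverable, in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # stale or duplicate
        if seq - self.next_seq > self.max_pending:
            return []  # too far ahead: drop instead of buffering without limit
        self.pending[seq] = message
        ready: List[Any] = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

Because only sequence numbers within `max_pending` of `next_seq` are ever stored, a sender cannot grow the buffer by sending huge `seq` values. The sketch does not handle a message that is lost for good (the buffer would stall), so a real implementation would presumably also need a timeout, and it leaves open when a per-sender buffer can be discarded.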

dbkr · 2022-11-18T11:54:33Z

proposals/3401-group-voip.md

+
+The fields within the item in the `m.calls` contents are:
+
+ * `m.call_id` - the ID of the conference the user is claiming to participate in.  If this doesn't match an unterminated `m.call` event, it should be ignored.


This probably ought to be m.conf_id to differentiate it from IDs of 1:1 calls and match the conf_id field in m.call.* to-device events?

See previous discussion at https://github.com/matrix-org/matrix-spec-proposals/pull/3401/files#r823313876


daniel-abramov · 2022-12-02T17:37:01Z

proposals/3401-group-voip.md

+
+The fields within the item in the `m.calls` contents are:
+
+ * `m.call_id` - the ID of the conference the user is claiming to participate in.  If this doesn't match an unterminated `m.call` event, it should be ignored.


Note: currently the call_id and conf_id are not identical. This seems to be confusing if we're talking about the SFU calls (not sure how it's handled in a full-mesh).

When working on an SFU recently, I realized that conf_id was the ID of a conference (or a call if you will) which was quite logical and expected. However, what I did not expect is that in addition to the conf_id, each To-Device message has a call_id which does not match the conf_id and which seems to be uniquely generated by each participant.

The thing is: the call_id field does not make any sense for the SFU at the moment (see the SFU MSC), since the SFU does not know what the call_id is (it looks like a randomly generated string that is different for each participant who tries to join a conference). At the same time, the SFU is essentially obligated to store the call_id, because the To-Device messages from the SFU to the participants are expected to have a call_id that matches the call_id value sent from participants to the SFU when they contact it. (I tried setting the call_id to match conf_id when sending a message from the SFU to the client, but the client discarded the message if the call_id did not match the call_id that the client sent to the SFU.) So essentially, there is a conf_id whose semantics are defined (it's the unique ID of a conference/call) and a call_id (which does not have any meaning for the SFU).

daniel-abramov · 2022-12-02T18:11:16Z

proposals/3401-group-voip.md

+                                    {
+                                        "kind": "audio",
+                                        "id": "zvhjiwqsx", // WebRTC MediaStreamTrack id
+                                        "label": "Sennheiser Mic",


Yeah, right. Should we remove it? (I also don't seem to find a case where we would want to let others know what our devices are called when publishing 🤔)

daniel-abramov · 2022-12-02T21:33:11Z

proposals/3401-group-voip.md

+                                        "settings": {
+                                            "width": 1280,
+                                            "height": 720,
+                                            "facingMode": "user",
+                                            "frameRate": 30.0,
+                                            "m.maxbr": 512000,


Are these settings required for some matrix-specific logic? (maybe bridges that may need this data about a stream or a track?)

I'm just wondering if they are useful in the general case (like why would we need the information about the facingMode of a camera, for instance?). I.e. when we're talking about the SFU, the SFU does not really need to know the facing mode of a user, I guess, and I'm also not sure the other call participants would benefit from this information.

The only case where I assume the information about camera mode etc might be useful is when there is a specific app that runs over Matrix and needs to advertise the properties of the video/audio streams in order to implement a specific logic. But in this case, we're talking about application-specific data, i.e. something that must be the logic of the app rather than part of a [generic] Matrix protocol.

I think generally we only need the stream and track IDs, a purpose (for the use case of conference / using WebRTC for calls), and, perhaps basic information about certain tracks like the width and height of the video (theoretically it's not required, because we'll be able to access it when the track is received, but practically we would need it for the simulcast implementation on the SFU side, so such information would be useful for the conference use cases).

daniel-abramov · 2022-12-02T21:40:06Z

proposals/3401-group-voip.md

+
+This builds on [MSC3077](https://github.com/matrix-org/matrix-spec-proposals/pull/3077), which describes streams in `m.call.*` events via a `sdp_stream_metadata` field.
+
+** TODO: Do we need all of this data? Why would we need it? **


IMHO we only need to submit the minimally required information about streams and tracks that would be enough for the calls to happen (with or without the SFU).

The details (label, displaySurface, facingMode, or even bitrate) are probably something that we generally speaking don't need. The apps that run on top of Matrix could always exchange arbitrary metadata about their devices/tracks/streams if they need it. We won't be able to describe all possible use cases for matrixRTC in advance anyway, since we don't really know all of them: strictly speaking, video and audio tracks do not necessarily come from a webcam or a microphone in a general [matrixRTC] case; they could be anything ranging from a mirrorless camera feed attached to the laptop to the audio output of digital instruments forwarded via a DAW.


bwindels

a comment about scope of seq counter

bwindels · 2023-02-06T14:49:19Z

proposals/3401-group-voip.md

+    discarded.
+  * `seq` - The sequence number of the to-device message. This is done since the
+    order of to-device messages is not guaranteed. With each new to-device
+    message this number gets incremented by `1` and it starts at `0`


Can we specify here whether this counter is scoped to the dest_session_id or the call_id? E.g. should it be reset or not when a webrtc connection between two peers is being retried with a new peer connection? Or only be reset after refreshing the client? Hydrogen currently does not reset when retrying a peer connection, nor does Element Call I think, but I think resetting in this case, e.g. scoping it to the call_id, might actually make more sense as signalling between call_id's should be independent (you always only have one call_id per member in a group call) and there is no point in forcing clients to keep the counter in memory longer than needed, e.g. if they are not joined to the call.

Suggested change

-    message this number gets incremented by `1` and it starts at `0`
+    message this number gets incremented by `1` and it starts at `0`.
+    This counter is scoped to the `call_id`, not the `dest_session_id`,
+    e.g. it should be reset to 0 when the `call_id` has changed,
+    for example when retrying a peer connection between two group
+    call members after a `m.call.hangup` has been sent.
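A sender-side sketch of what scoping the counter to the `call_id` could look like is shown below. The class and method names are hypothetical; the only behaviour taken from the suggestion is that each `call_id` starts at `0`, increments by `1`, and starts over when a new `call_id` is used.

```python
from collections import defaultdict
from typing import Dict


class SeqCounters:
    """Keeps one outgoing seq counter per call_id."""

    def __init__(self) -> None:
        self._next: Dict[str, int] = defaultdict(int)

    def next_seq(self, call_id: str) -> int:
        """Return the seq to put on the next to-device message for this call_id."""
        seq = self._next[call_id]
        self._next[call_id] = seq + 1
        return seq

    def forget(self, call_id: str) -> None:
        """Drop the counter, e.g. after m.call.hangup has been sent for this call_id."""
        self._next.pop(call_id, None)
```

Since a retried peer connection gets a fresh `call_id`, its counter naturally begins at `0` again, and calling `forget` lets a client release state for calls it is no longer joined to.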

HybridEidolon · 2023-06-15T04:18:01Z

proposals/3401-group-voip.md

+
+### m.call state event
+
+The user who wants to initiate a call sends a `m.call` state event into the room to inform the room participants that a call is happening in the room. This effectively becomes the placeholder event in the timeline which clients would use to display the call in their scrollback (including duration and termination reason using `m.terminated`). Its body has the following fields:


The m.room.power_levels state event specifies that posting state events requires a power level of 50 by default. From a user experience standpoint, I would think it is reasonable for normal users in a room to be able to start calls in that room by default, but with the current power_levels policy it would need the m.call power level set lower. It may be desirable for room creation UX in clients to present the option to set this level upfront.

Perhaps there should be a way to specify a different power level requirement for different intents as well. A Discord user would expect to be able to start a room's call freely without disturbing other members of the room ala m.room intent. On the other hand, an m.ring is a much more disruptive intent that would be reserved for smaller group chats and should not normally be allowed in other kinds of rooms.

Imho outside of DMs (where both users have PL100 anyway usually) calls should not be allowed for normal users. It is still a vector of spam. Just imagine having calls being started in Matrix HQ. It would just cause issues imho.

Imho it is a sane default to restrict this and need active changes to allow it in a room.

> Just imagine having calls being started in Matrix HQ. It would just cause issues imho.

This is what I mean about the different call intents causing different levels of disruption. You're right, obviously m.ring has very different impact from m.prompt or m.room and the default should be to disallow that. But a room's administration may want users to be able to start calls with one intent and not the other.

Unless I'm misunderstanding the purpose of m.room? Is the idea for m.room intent that a room would always have a call "active", even if it has no participants, ala a "voice channel" in Discord, such that a level-0 user would typically not be able to end that call ergo not need to be able to publish state events for it other than m.call.member?
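For reference, the power-level check being discussed can be sketched as follows. It evaluates the content of an existing `m.room.power_levels` event, using the defaults that apply when keys are omitted from that content (`state_default` of 50, `users_default` of 0). The function name and the example user IDs are hypothetical, and the case of a room with no power-levels event at all is deliberately not covered.

```python
from typing import Any, Dict


def can_send_state(power_levels: Dict[str, Any], user_id: str, event_type: str) -> bool:
    """Check whether user_id may send a state event of event_type."""
    users = power_levels.get("users", {})
    user_level = users.get(user_id, power_levels.get("users_default", 0))
    events = power_levels.get("events", {})
    required = events.get(event_type, power_levels.get("state_default", 50))
    return user_level >= required


# Example power_levels contents (illustrative only).
default_room = {"users": {"@admin:example.org": 100}}
open_calls_room = {
    "users": {"@admin:example.org": 100},
    "events": {"m.call": 0, "m.call.member": 0},
}
```

With the defaults, a level-0 user cannot send `m.call`; a room has to opt in by lowering the entry under `events`, as the second example does. Note that this check is keyed on the event type alone, so it cannot by itself distinguish between intents such as `m.ring` and `m.room`.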

HybridEidolon · 2023-06-15T04:27:00Z

proposals/3401-group-voip.md

+
+### Call participation
+
+Users who want to participate in the call declare this by publishing a `m.call.member` state event using their matrix ID as the state key (thus ensuring other users cannot edit it).  The event contains an array `m.calls` of objects describing which calls the user is participating in within that room.  This array must contain one item (for now).


~~What happens if the user does not publish an m.call.member event to leave a call?~~ (nevermind, saw the consideration at the bottom) Additionally, does a user convey they are leaving a call by publishing with m.calls as an empty array?

nyabinary · 2024-09-16T15:12:07Z

Any reason why this seemingly stalled? :p

MSC3401: Native Group VoIP Signalling

05fd5af

ara4n added the proposal A matrix spec change proposal label Sep 19, 2021

ara4n added 2 commits September 20, 2021 01:03

comments & cosmetics

7f5ee49

grammar

083fd9a

turt2live added kind:core MSC which is critical to the protocol's success voip needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. labels Sep 20, 2021

ShadowJonathan reviewed Sep 20, 2021

View reviewed changes

deepbluev7 reviewed Sep 20, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

robertlong reviewed Sep 20, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

robertlong reviewed Sep 20, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

SimonBrandner reviewed Sep 21, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

SimonBrandner reviewed Sep 21, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

SimonBrandner reviewed Sep 22, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

kevincox reviewed Sep 22, 2021

View reviewed changes

incorporate review

5ee96fb

deepbluev7 reviewed Sep 22, 2021

View reviewed changes

ara4n added 2 commits September 23, 2021 01:04

more feedback

b90b85e

add purpose from #3077

ed37a0d

ara4n force-pushed the matthew/group-voip branch from 3580f2f to ed37a0d Compare September 23, 2021 00:10

dbkr reviewed Sep 23, 2021

View reviewed changes

ShadowJonathan reviewed Sep 23, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

ShadowJonathan reviewed Sep 23, 2021

View reviewed changes

proposals/3401-group-voip.md Show resolved Hide resolved

SimonBrandner reviewed Sep 23, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

SimonBrandner reviewed Sep 23, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

SimonBrandner reviewed Sep 23, 2021

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

giannissc mentioned this pull request Jun 28, 2022

Personal Wishlist Tracker (Ignore) element-hq/element-android#2501

Closed

11 tasks

SimonBrandner reviewed Jul 7, 2022

View reviewed changes

proposals/3401-group-voip.md Outdated Show resolved Hide resolved

robintown mentioned this pull request Jul 29, 2022

Port VideoChannelUtils to MSC3401 element-hq/element-web#22957

Closed

SimonBrandner mentioned this pull request Jul 29, 2022

Add support for group calls using MSC3401 matrix-org/matrix-js-sdk#2553

Merged

HarHarLinks reviewed Sep 21, 2022

bwindels reviewed Sep 23, 2022

bwindels reviewed Sep 23, 2022

This was referenced Sep 25, 2022

Rip out SFU bits out of MSC3401 #3897

Merged

[WIP] MSC3898: Native Matrix VoIP signalling for cascaded foci (SFUs, MCUs...) #3898

Draft

ara4n commented Sep 25, 2022

SimonBrandner mentioned this pull request Oct 20, 2022

MSC3914: Matrix native group call push rule #3914

Open

Rip out SFU bits out of MSC3401 (#3897)

32f566a

robintown mentioned this pull request Nov 2, 2022

Improve the README element-hq/element-call#696

Merged

dbkr reviewed Nov 18, 2022

robintown added 2 commits November 30, 2022 11:07

Move expiration timestamps to be per-device (#3941)

3fde32b

As discussed with Robert Long

Specify who calls who (#3942)

05b5db2

daniel-abramov reviewed Dec 2, 2022

SimonBrandner added 4 commits December 3, 2022 12:25

Clarify expires_ts

43dc42f

Signed-off-by: Šimon Brandner <[email protected]>

Add seq

5635cee

Signed-off-by: Šimon Brandner <[email protected]>

Use heading for Legend

b8ebe27

Signed-off-by: Šimon Brandner <[email protected]>

Fix-up some formatting

6b98d66

Signed-off-by: Šimon Brandner <[email protected]>

fkwp mentioned this pull request Dec 8, 2022

Calls Support (MSC3401) element-hq/hydrogen-web#946

Closed

ara4n mentioned this pull request Dec 27, 2022

MSC3888: Voice Broadcast #3888

Open

bwindels reviewed Feb 6, 2023

rriemann mentioned this pull request Mar 13, 2023

Multi-party E2EE voice calls via Jitsi element-hq/element-meta#1147

Open

HybridEidolon reviewed Jun 15, 2023

jplatte mentioned this pull request Oct 19, 2023

Add CallMemberEvent (All state events required for matrixRTC) ruma/ruma#1685

Merged

wrenix mentioned this pull request Aug 29, 2024

support new/modern Matrix Call krille-chan/fluffychat#1310

Open

		* `m.ring` if the call is meant to cause the room participants' devices to ring (e.g. 1:1 call or group call)
		* `m.conference` if the call should be presented as a conference call which users in the room may connect to


		## Encryption

		We get E2EE for 1:1 and full mesh calls automatically in this model.


		### Call participation

		Users who want to participate in the call declare this by adding an `m.conf` field to their `m.room.member` state event. Ideally, we'd use a dedicated state event type for this, making it easier to rapidly spot who is in a conference. But given we don't want other people editing our state event and Matrix doesn't yet provide that level of access control, instead we (ab)use the `m.room.member` event to declare our participation in the conference in the context of the room. Therefore any profile updates need to be careful to preserve the `m.conf` field.
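To illustrate the caveat about profile updates, here is a minimal sketch of a merge that keeps `m.conf` intact. The `MemberContent` type, the `applyProfileUpdate` helper and the placeholder value stored under `m.conf` are all hypothetical; the text does not specify the field's structure, and this is not taken from any SDK.

```typescript
// Hypothetical shape of an m.room.member content. Only the `m.conf` key
// comes from the proposal text; its inner structure is not assumed here.
type MemberContent = {
  membership: string;
  displayname?: string;
  avatar_url?: string;
  [key: string]: unknown;
};

// Merge a profile change into the existing content instead of rebuilding it,
// so that unrelated keys such as `m.conf` survive the update.
// Callers should pass only the keys they actually want to change.
function applyProfileUpdate(
  current: MemberContent,
  profile: { displayname?: string; avatar_url?: string },
): MemberContent {
  return { ...current, ...profile };
}

// Example content before the update; the `m.conf` value is an opaque placeholder.
const before: MemberContent = {
  membership: "join",
  displayname: "Alice",
  "m.conf": { placeholder: true },
};

const updated = applyProfileUpdate(before, { displayname: "Alice (away)" });
```

A client that instead built the new content from scratch (membership plus profile fields only) would silently drop its own call participation, which is the failure mode the paragraph above warns about.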

		@@ -252,13 +267,15 @@ An alternative to to-device messages is to use DMs. You still risk gappy sync p

		## Security considerations

		State events are not encrypted currently, and so this leaks that a call is happening, and who is participating in it, and from which devices.


		The fields within the item in the `m.calls` contents are:

		* `m.call_id` - the ID of the conference the user is claiming to participate in. If this doesn't match an unterminated `m.call` event, it should be ignored.
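The rule for ignoring stale entries could be sketched as follows. `CallState` is a simplified, hypothetical stand-in for an `m.call` state event (just an ID and a terminated flag), and `activeMemberships` is an illustrative helper rather than an API from any existing library.

```typescript
// Simplified, hypothetical view of an m.call state event: its ID and whether
// it has been terminated. How these are carried on the wire is not assumed.
interface CallState {
  callId: string;
  terminated: boolean;
}

// One item of a member's `m.calls` array. Only `m.call_id` is taken from the text.
interface CallMembership {
  "m.call_id": string;
  [key: string]: unknown;
}

// Keep only the memberships that point at a call which exists and is unterminated.
function activeMemberships(
  memberships: CallMembership[],
  calls: CallState[],
): CallMembership[] {
  const live = new Set(
    calls.filter((call) => !call.terminated).map((call) => call.callId),
  );
  return memberships.filter((m) => live.has(m["m.call_id"]));
}
```

Entries that reference an unknown or terminated call are dropped rather than treated as errors, matching the "should be ignored" wording.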


		This builds on [MSC3077](https://github.com/matrix-org/matrix-spec-proposals/pull/3077), which describes streams in `m.call.*` events via a `sdp_stream_metadata` field.

		TODO: Do we need all of this data? Why would we need it?

-    message this number gets incremented by `1` and it starts at `0`
+    message this number gets incremented by `1` and it starts at `0`.
+    This counter is scoped to the `call_id`, not the `dest_session_id`,
+    e.g. it should be reset to 0 when the `call_id` has changed,
+    for example when retrying a peer connection between two group
+    call members after a `m.call.hangup` has been sent.
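As a rough sketch of the scoping rule in the added lines, the counter below is keyed by `call_id` only, so a fresh `call_id` (for example after a hangup and retry) starts again at `0`. The `SeqCounter` class is hypothetical and only illustrates the bookkeeping, not how any client actually implements it.

```typescript
// Hypothetical per-call sequence counter. The key is the call_id alone;
// the destination session ID deliberately plays no part in the lookup.
class SeqCounter {
  private readonly nextByCall = new Map<string, number>();

  // Return the sequence number for the next message in this call
  // (0 for the first message), then advance the counter by 1.
  next(callId: string): number {
    const seq = this.nextByCall.get(callId) ?? 0;
    this.nextByCall.set(callId, seq + 1);
    return seq;
  }
}
```

Because an unseen `call_id` has no entry in the map, the reset to `0` on a new peer connection happens implicitly.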


		### m.call state event

		The user who wants to initiate a call sends a `m.call` state event into the room to inform the room participants that a call is happening in the room. This effectively becomes the placeholder event in the timeline which clients would use to display the call in their scrollback (including duration and termination reason using `m.terminated`). Its body has the following fields:

nyabinary commented Sep 16, 2024
