11 Jul 2011 17:43
Re: [AVTCORE] Fwd: I-D Action: draft-westerlund-avtcore-multistream-and-simulcast-00.txt
Randell Jesup <randell-ietf <at> jesup.org>
2011-07-11 15:43:39 GMT
On 7/11/2011 3:56 AM, Roni Even wrote:
> 3. In section 1.1 "Todays SDP signalling support for this is basically the
> directionality attribute which indicates an end-point intend to send media
> or not. No indication of how many media streams." My understanding so far
> was that the number of PT in the m-line is the maximum number of streams
> that can be sent in an RTP session.
I've never heard of this (perhaps it's an issue with the meaning of stream in
"number of streams that can be sent in an RTP session"). The number of
distinct payloads, yes. But in theory (though not well supported, if at all)
you could have a mixer providing N "streams", all of the same payload or
different ones (with different SSRCs). I think it's just a terminology issue
(and given the relative positions, I'm probably the one who's wrong).
But I'll note your comment isn't clear to me, and it
should be.
> This goes also to the discussion on
> support of multiple SSRCs. I think that today video conferencing endpoint is
> doing a second offer answer to fix the PT it will use based on the first
> offer answer exchange.
They may be (I mostly work with devices designed for point-to-point), but
fixing the PT that way means that entire conferences either get locked down
to the lowest-common-denominator codec, or the MCU is required to transcode.
I personally prefer to leave multiple payloads/codecs negotiated so that
endpoints can switch payloads to adapt to various things, like network
conditions. In theory, an endpoint could even switch between voice codecs and
better-at-music audio codecs when it detects music, as a random non-network
example. I don't know if the non-network use cases would be compelling, but
the network-adaptation ones are, given the time (and possible failures)
required to re-INVITE.
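To make the payload-switching point concrete, here is a minimal Python sketch of an endpoint choosing among payload types that were all accepted in the original offer/answer, so no re-INVITE is needed to change codec. Only PCMU (PT 0, 64 kbit/s) is a real static assignment; the other codec names, the dynamic PT numbers and the bitrates are assumptions invented for the example, and `pick_payload` is a hypothetical helper.

```python
# Payload types already negotiated for this session:
# PT -> (codec name, approximate bitrate in kbit/s).
# Apart from PCMU, these entries are illustrative assumptions.
NEGOTIATED = {
    0: ("PCMU", 64),
    97: ("low-rate-codec", 15),   # hypothetical dynamic payload
    98: ("mid-rate-codec", 32),   # hypothetical dynamic payload
}


def pick_payload(available_kbps, negotiated=NEGOTIATED):
    """Return the highest-bitrate negotiated PT that fits the estimate.

    If nothing fits, fall back to the lowest-bitrate payload rather
    than stopping media altogether.
    """
    by_rate = sorted((rate, pt) for pt, (_name, rate) in negotiated.items())
    fitting = [pt for rate, pt in by_rate if rate <= available_kbps]
    if fitting:
        return fitting[-1]   # last entry is the highest rate that fits
    return by_rate[0][1]     # nothing fits: use the cheapest payload


if __name__ == "__main__":
    for estimate in (100, 40, 20, 5):
        print(estimate, "kbit/s ->", "PT", pick_payload(estimate))
```

The sender simply starts using the chosen PT in outgoing packets; how the bandwidth estimate is obtained is outside this sketch.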
> 4. The above still does not address the multiple SSRC for the same PT and I
> already mentioned it in the past that I think that a lot of product today
> will not handle it good. I think that this is a rare case today at least in
> centralized video conferencing since the MCU (or RTP mixer if exist) will
> use its own SSRC to send the mixed/switched media.
Some devices reuse the same incoming ports for multiple sessions, both within
a call and across the different calls they're involved in or are bridging as
a conference bridge. As mentioned previously, you're right that many
endpoints don't deal well (or consistently) with multiple SSRCs in a session.
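For readers less familiar with what "multiple SSRCs in a session" means on the wire, the following Python sketch groups packets arriving on a single port by the SSRC field of the 12-byte fixed RTP header. It is an illustration under simplifying assumptions (no CSRC list or header extension handling, no RTCP), and `parse_rtp_header`, `demux_by_ssrc` and `make_packet` are hypothetical helper names, not part of any library.

```python
import struct
from collections import defaultdict


def parse_rtp_header(packet):
    """Return (payload_type, sequence_number, ssrc) from a raw RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    first, second, seq, _timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    return second & 0x7F, seq, ssrc   # low 7 bits of byte 2 are the PT


def demux_by_ssrc(packets):
    """Group packets received on one port into per-SSRC streams."""
    streams = defaultdict(list)
    for packet in packets:
        pt, seq, ssrc = parse_rtp_header(packet)
        streams[ssrc].append((pt, seq))
    return dict(streams)


def make_packet(pt, seq, ssrc, payload=b""):
    """Build a minimal RTP packet for the example (version 2, no CSRCs)."""
    return struct.pack("!BBHII", 0x80, pt & 0x7F, seq, 0, ssrc) + payload


if __name__ == "__main__":
    # Two sources sending the same payload type (96) into one session.
    received = [
        make_packet(96, 1, 0x1111),
        make_packet(96, 1, 0x2222),
        make_packet(96, 2, 0x1111),
    ]
    for ssrc, items in demux_by_ssrc(received).items():
        print(hex(ssrc), items)
```

An endpoint that keys its decoder state on the payload type alone, rather than on the SSRC, would interleave the two sources here, which is one way the inconsistent handling shows up.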
> 5. In section 3.2 second paragraph you talk about the receivers controlling
> what the relay will send but this will make the relay something else than
> RTP translator.
>
> 6. On the bandwidth support. Currently the definition is the b= is a
> receiver capability. I am not sure why you need the sender one, you claim
> that there is no way to control the actual bw but the receiver can use RTCP
> CCM flow control to tell the sender to reduce the rate. As for the syntax,
> my preference is for TIAS semantics and not the one you proposed.
Well, knowing what rate the sender plans to use *could* be useful in deciding
how many resources (memory, CPU/DSP horsepower, etc.) might be needed, and so
help the other side plan or (if it's answering) decide how to answer (number
of streams, codecs to accept, etc.). The fact that it could be useful doesn't
always mean that it's a good idea to complicate the protocol to support it
directly, however.
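As a rough sketch of the planning use described above, an answerer that knew each offered stream's intended send rate could decide which streams to accept against a local budget. Nothing here reflects the draft's actual syntax: the rates are passed in as plain numbers, the budget figure is invented, and `streams_to_accept` is a hypothetical helper.

```python
def streams_to_accept(offered_rates_kbps, budget_kbps):
    """Greedily accept offered streams, in offer order, that fit the budget.

    offered_rates_kbps: declared sender rate for each offered stream.
    budget_kbps: what this endpoint believes it can receive and decode.
    Returns the indexes of the accepted streams.
    """
    accepted = []
    total = 0
    for index, rate in enumerate(offered_rates_kbps):
        if total + rate > budget_kbps:
            continue   # too expensive; a later, cheaper stream may still fit
        accepted.append(index)
        total += rate
    return accepted


if __name__ == "__main__":
    print(streams_to_accept([1500, 500, 300], 2000))
    print(streams_to_accept([1500, 800, 300], 2000))
```

Without a declared sender rate, the answerer would have to guess these numbers or accept everything and rely on later feedback to throttle the sender.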
> 7. In the last sentence of 7.2.1 you say "This enables a simple fallback
> solution to exclude a legacy client from all simulcast versions except one,
> whichever is most suitable for the application". My view is that currently
> when there are several m-lines of the same media type there is no definition
> which one to choose if the answerer do not understand the semantics of the
> offer and can support only one stream.
Agreed - there's no definition; however, almost all clients that don't
support multiple m= lines of the same type will select the first one, or the
first one for which they can find a matching payload.
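The second of those behaviours (first m= line with a matching payload) can be sketched in Python as follows. This is only an illustration of the de facto behaviour described above, not something any specification defines; the sample offer, its addresses and payload numbers are made up, and `pick_media_line` is a hypothetical helper.

```python
def pick_media_line(sdp, media_type, supported_pts):
    """Return the first m= line of media_type that lists a supported payload.

    supported_pts is a set of payload type strings, e.g. {"96"}.
    Returns None when no m= line of that type matches.
    """
    for line in sdp.splitlines():
        if not line.startswith("m="):
            continue
        # m=<media> <port> <proto> <fmt> ...
        fields = line[2:].split()
        if len(fields) < 4 or fields[0] != media_type:
            continue
        if any(fmt in supported_pts for fmt in fields[3:]):
            return line
    return None


# Illustrative offer with two video m= lines (e.g. two simulcast versions).
OFFER = """v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 49170 RTP/AVP 0
m=video 49172 RTP/AVP 98
m=video 49174 RTP/AVP 96 97
"""

if __name__ == "__main__":
    print(pick_media_line(OFFER, "video", {"96"}))
    print(pick_media_line(OFFER, "video", {"96", "98"}))
```

A client of the simpler kind, one that always takes the first m= line of a type regardless of payloads, would pick the port 49172 line in both cases, so an offerer relying on this fallback would want its most broadly compatible version listed first.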