RTP: Real-Time Transport Protocol, Foreign-Language Translation (Part 1)
Appendix
Original English Text
RTP: A Transport Protocol for Real-Time Applications
1 Introduction
This memorandum specifies the real-time transport protocol (RTP), which provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. Those services include payload type identification, sequence numbering, timestamping and delivery monitoring. Applications typically run RTP on top of UDP to make use of its multiplexing and checksum services; both protocols contribute parts of the transport protocol functionality. However, RTP may be used with other suitable underlying network or transport protocols (see Section 10). RTP supports data transfer to multiple destinations using multicast distribution if provided by the underlying network.
Note that RTP itself does not provide any mechanism to ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer services to do so. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence. The sequence numbers included in RTP allow the receiver to reconstruct the sender's packet sequence, but sequence numbers might also be used to determine the proper location of a packet, for example in video decoding, without necessarily decoding packets in sequence.
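As an illustration of how a receiver might restore the sender's packet order from sequence numbers, here is a minimal sketch. It assumes the sequence number is a 16-bit counter that wraps around (the field width comes from the RTP header definition, not from this introduction), and the names `seq_less` and `reorder` are hypothetical helpers, not part of any RTP library.

```python
def seq_less(a: int, b: int) -> bool:
    """Return True if 16-bit sequence number a precedes b, allowing for wraparound."""
    return a != b and ((b - a) & 0xFFFF) < 0x8000


def reorder(packets):
    """Return (sequence_number, payload) pairs sorted into sender order.

    Assumption: all packets lie within a window of fewer than 32768
    sequence numbers, so the wraparound comparison is unambiguous.
    """
    ordered = []
    for pkt in packets:
        # Scan from the back: packets usually arrive nearly in order.
        i = len(ordered)
        while i > 0 and seq_less(pkt[0], ordered[i - 1][0]):
            i -= 1
        ordered.insert(i, pkt)
    return ordered


if __name__ == "__main__":
    arrived = [(65534, "a"), (0, "c"), (65535, "b"), (1, "d")]
    print(reorder(arrived))
```

A plain numeric sort would place sequence number 0 before 65534 here; the modular comparison keeps the packets in the order the sender produced them across the wraparound.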
While RTP is primarily designed to satisfy the needs of multi-participant multimedia conferences, it is not limited to that particular application. Storage of continuous data, interactive distributed simulation, active badge, and control and measurement applications may also find RTP applicable.
This document defines RTP, consisting of two closely-linked parts:
[1]. The real-time transport protocol (RTP), to carry data that has real-time properties.
[2]. The RTP control protocol (RTCP), to monitor the quality of service and to convey information about the participants in an on-going session. The latter aspect of RTCP may be sufficient for "loosely controlled" sessions, i.e., where there is no explicit membership control and set-up, but it is not necessarily intended to support all of an application's control communication requirements. This functionality may be fully or partially subsumed by a separate session control protocol, which is beyond the scope of this document.
RTP represents a new style of protocol following the principles of application level framing and integrated layer processing proposed by Clark and Tennenhouse [1]. That is, RTP is intended to be malleable to provide the information required by a particular application and will often be integrated into the application processing rather than being implemented as a separate layer. RTP is a protocol framework that is deliberately not complete. This document specifies those functions expected to be common across all the applications for which RTP would be appropriate. Unlike conventional protocols in which additional functions might be accommodated by making the protocol more general or by adding an option mechanism that would require parsing, RTP is intended to be tailored through modifications and/or additions to the headers as needed. Examples are given in Sections 5.3 and 6.3.3.
Therefore, in addition to this document, a complete specification of RTP for a particular application will require one or more companion documents (see Section 12):
[1]. A profile specification document, which defines a set of payload type codes and their mapping to payload formats (e.g., media encodings). A profile may also define extensions or modifications to RTP that are specific to a particular class of applications. Typically an application will operate under only one profile. A profile for audio and video data may be found in the companion RFC TBD.
[2]. Payload format specification documents, which define how a particular payload, such as an audio or video encoding, is to be carried in RTP.
A discussion of real-time services and algorithms for their implementation as well as background discussion on some of the RTP design decisions can be found in [2].
Several RTP applications, both experimental and commercial, have already been implemented from draft specifications. These applications include audio and video tools along with diagnostic tools such as traffic monitors. Users of these tools number in the thousands. However, the current Internet cannot yet support the full potential demand for real-time services. High-bandwidth services using RTP, such as video, can potentially seriously degrade the quality of service of other network services. Thus, implementors should take appropriate precautions to limit accidental bandwidth usage. Application documentation should clearly outline the limitations and possible operational impact of high-bandwidth real-time services on the Internet and other network services.
2 RTP Use Scenarios
The following sections describe some aspects of the use of RTP. The examples were chosen to illustrate the basic operation of applications using RTP, not to limit what RTP may be used for. In these examples, RTP is carried on top of IP and UDP, and follows the conventions established by the profile for audio and video specified in the companion Internet-Draft draft-ietf-avt-profile.
2.1 Simple Multicast Audio Conference
A working group of the IETF meets to discuss the latest protocol draft, using the IP multicast services of the Internet for voice communications. Through some allocation mechanism the working group chair obtains a multicast group address and pair of ports. One port is used for audio data, and the other is used for control (RTCP) packets. This address and port information is distributed to the intended participants. If privacy is desired, the data and control packets may be encrypted as specified in Section 9.1, in which case an encryption key must also be generated and distributed. The exact details of these allocation and distribution mechanisms are beyond the scope of RTP.
The audio conferencing application used by each conference participant sends audio data in small chunks of, say, 20 ms duration. Each chunk of audio data is preceded by an RTP header; RTP header and data are in turn contained in a UDP packet. The RTP header indicates what type of audio encoding (such as PCM, ADPCM or LPC) is contained in each packet so that senders can change the encoding during a conference, for example, to accommodate a new participant that is connected through a low-bandwidth link or react to indications of network congestion.
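The framing described above can be sketched as follows. This is an illustration under stated assumptions, not a complete implementation: the 12-byte fixed header layout (version 2, payload type, 16-bit sequence number, 32-bit timestamp, 32-bit source identifier) is taken from the header definition later in the RTP specification rather than from this section, header extensions are ignored, and `build_rtp_packet` and `parse_rtp_packet` are hypothetical helper names.

```python
import struct

RTP_VERSION = 2       # assumption: version field value from the RTP header definition
FIXED_HEADER_LEN = 12


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Prefix an audio chunk with a minimal RTP fixed header (no CSRC list)."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    byte0 = RTP_VERSION << 6                      # padding=0, extension=0, CSRC count=0
    byte1 = (int(marker) << 7) | payload_type
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_packet(packet):
    """Split a packet into header fields and payload (extensions not handled)."""
    if len(packet) < FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the RTP fixed header")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:FIXED_HEADER_LEN])
    csrc_count = byte0 & 0x0F
    header_len = FIXED_HEADER_LEN + 4 * csrc_count  # skip any contributing-source entries
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[header_len:],
    }


if __name__ == "__main__":
    # 20 ms of 8 kHz, 8-bit audio is 160 bytes; payload type 0 is an assumed profile value.
    chunk = b"\x00" * 160
    pkt = build_rtp_packet(0, seq=7, timestamp=160, ssrc=0x1234ABCD, payload=chunk)
    print(len(pkt), parse_rtp_packet(pkt)["payload_type"])
```

The resulting bytes would then be handed to a UDP socket as the datagram payload. Because the payload type travels in every packet, a sender can switch encodings mid-conference simply by changing that one field.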
The Internet, like other packet networks, occasionally loses and reorders packets and delays them by variable amounts of time. To cope with these impairments, the RTP header contains timing information and a sequence number that allow the receivers to reconstruct the timing produced by the source, so that in this example, chunks of audio are contiguously played out the speaker every 20 ms. This timing reconstruction is performed separately for each source of RTP packets in the conference. The sequence number can also be used by the receiver to estimate how many packets are being lost.
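A rough sketch of the two receiver-side calculations mentioned above, under stated assumptions: sequence numbers are 16 bits wide and timestamps 32 bits wide (from the RTP header definition), the audio clock runs at 8000 Hz, and a fixed 60 ms buffering delay is used. The function names and the delay value are illustrative choices, not taken from the specification.

```python
def estimate_loss(seqs):
    """Estimate how many packets were lost, given received 16-bit sequence numbers.

    Sequence numbers are unwrapped into extended values so that a wrap from
    65535 to 0 is treated as a step of one. Duplicates are counted once.
    """
    if not seqs:
        return 0
    first = highest = seqs[0]
    received = set()
    for s in seqs:
        # Signed 16-bit distance from the highest extended number seen so far.
        delta = ((s - highest + 0x8000) & 0xFFFF) - 0x8000
        ext = highest + delta
        received.add(ext)
        highest = max(highest, ext)
        first = min(first, ext)
    expected = highest - first + 1
    return expected - len(received)


def playout_time(timestamp, first_timestamp, first_arrival,
                 clock_rate=8000, buffer_delay=0.06):
    """Map an RTP timestamp to a local playout time in seconds.

    The spacing between playout times depends only on the sender's timestamps,
    so variable network delay does not disturb the reconstructed timing.
    """
    elapsed_ticks = (timestamp - first_timestamp) & 0xFFFFFFFF
    return first_arrival + buffer_delay + elapsed_ticks / clock_rate


if __name__ == "__main__":
    print(estimate_loss([65534, 65535, 1, 2]))   # sequence number 0 never arrived
    gap = playout_time(320, 160, 100.0) - playout_time(160, 160, 100.0)
    print(round(gap, 3))                         # spacing between consecutive chunks
```

With 160 timestamp ticks per chunk at 8000 Hz, consecutive chunks are scheduled 20 ms apart regardless of when they arrived. A receiver would keep one such state per source, matching the per-source reconstruction described in the text.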
Since members of the working group join and leave during the conference, it is useful to know who is participating at any moment and how well they are receiving the audio data. For that purpose, each instance of the audio application in the conference periodically multicasts a reception report plus the name of its user on the RTCP (control) port. The reception report indicates how well the current speaker is being received and may be used to control adaptive encodings. In addition to the user name, other identifying information may also be included subject to control bandwidth limits. A site sends the RTCP BYE packet (Section 6.5) when it leaves the conference.
2.2 Audio and Video Conference
If both audio and video media are used in a conference, they are transmitted as separate RTP sessions and RTCP packets are transmitted for each medium using two different UDP port pairs and/or multicast addresses. There is no direct coupling at the RTP level between the audio and video sessions, except that a user participating in both sessions should use the same distinguished (canonical) name in the RTCP packets for both so that the sessions can be associated.
One motivation for this separation is to allow some participants in the conference to receive only one medium if they choose. Further explanation is given in Section 5.2. Despite the separation, synchronized playback of a source's audio and video can be achieved using timing information carried in the RTCP packets for both sessions.
2.3 Mixers and Translators
So far, we have assumed that all sites want to receive media data in the same format. However, this may not always be appropriate. Consider the case where participants in one area are connected through a low-speed link to the majority of the conference participants who enjoy high-speed network access. Instead of forcing everyone to use a lower-bandwidth, reduced-quality audio encoding, an RTP-level relay called a mixer may be placed near the low-bandwidth area. This mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes these reconstructed audio streams into a single stream, translates the audio encoding to a lower-bandwidth one and forwards the lower-bandwidth packet stream across the low-speed link. These packets might be unicast to a single recipient or multicast on a different address to multiple recipients. The RTP header includes a means for mixers to identify the sources that contributed to a mixed packet so that correct talker indication can be provided at the receivers.
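The mixing step can be sketched as below. This is a simplified illustration that assumes the incoming streams have already been resynchronized and decoded to 16-bit linear PCM frames of equal length; summing with clipping is one common way to mix, not a method prescribed by the text. `mix_frames` and the source identifiers are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames):
    """Mix one time-aligned frame per source into a single frame.

    frames: dict mapping a source identifier to a list of 16-bit PCM samples.
    Returns (mixed_samples, contributing_sources). The list of contributors is
    what a mixer would record in the outgoing packet so that receivers can
    still indicate who is talking.
    """
    if not frames:
        return [], []
    length = len(next(iter(frames.values())))
    mixed = [0] * length
    for samples in frames.values():
        if len(samples) != length:
            raise ValueError("all frames must cover the same interval")
        for i, sample in enumerate(samples):
            mixed[i] += sample
    # Clip the sum back into the 16-bit range to avoid overflow when encoding.
    mixed = [max(INT16_MIN, min(INT16_MAX, value)) for value in mixed]
    return mixed, sorted(frames)


if __name__ == "__main__":
    frames = {0xA: [1000, 30000], 0xB: [2000, 10000]}
    print(mix_frames(frames))
```

After mixing, the mixer would re-encode the combined samples with the lower-bandwidth codec and forward one stream across the slow link instead of one stream per talker.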
Some of the intended participants in the audio conference may be connected with high bandwidth links but might not be directly reachable via IP multicast. For example, they might be behind an application-level firewall that will not let any IP packets pass. For these sites, mixing may not be necessary, in which case another type of RTP-level relay called a translator may be used. Two translators are installed, one on either side of the firewall, with the outside one funneling all multicast packets received through a secure connection to the translator inside the firewall. The translator inside the firewall sends them again as multicast packets to a multicast group restricted to the site's internal network.
Mixers and translators may be designed for a variety of purposes. An example is a video mixer that scales the images of individual people in separate video streams and composites them into one video stream to simulate a group scene. Other examples of translation include the connection of a group of hosts speaking only IP/UDP to a group of hosts that understand only ST-II, or the packet-by-packet encoding translation of video streams from individual sources without resynchronization or mixing. Details of the operation of mixers and translators are given in Section 7.
Chinese Translation
RTP: Real-Time Transport Protocol
1 Introduction
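The real-time transport protocol (RTP) provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video. These services include payload type identification, sequence numbering, timestamping, and delivery monitoring. Applications typically run RTP on top of UDP to use its multiplexing and checksum services; both protocols contribute parts of the transport functionality. RTP may, however, also be used with other suitable underlying network or transport protocols. Where the underlying network provides it, RTP supports delivery to multiple destinations by multicast.
Note, however, that RTP itself cannot guarantee timely delivery or quality of service; it relies on lower-layer services for that. Nor does it guarantee delivery or in-order delivery, and it does not assume that the underlying network is reliable or delivers packets in sequence. The sequence numbers carried in RTP allow the receiver to reconstruct the sender's packet order, and they can also be used to determine the proper position of a packet, for example in video decoding, where packets need not be decoded strictly in sequence.
RTP was designed primarily for multi-participant multimedia conferences, but it is not limited to that application. Storage of continuous data, interactive distributed simulation, and control and measurement applications may also find RTP useful.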
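This document defines RTP as two parts:
[1]. The real-time transport protocol (RTP), which carries data with real-time properties.
[2]. The RTP control protocol (RTCP), which monitors quality of service and conveys information about the participants in an ongoing session. This second function of RTCP may be sufficient for loosely controlled sessions, that is, sessions without explicit membership control and set-up, but it is not necessarily intended to meet all of an application's control communication requirements. That functionality may be partly or wholly covered by a separate session control protocol, which is beyond the scope of this document.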
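RTP represents a new style of protocol that follows the principles of application level framing and integrated layer processing. That is, RTP supplies the information a particular application requires and is often integrated into the application's processing rather than implemented as a separate layer. RTP is deliberately an incomplete protocol framework. This document specifies the functions expected to be common to all applications for which RTP is appropriate. Unlike conventional protocols, in which additional functions are accommodated by making the protocol more general or by adding an option mechanism, RTP is meant to be tailored through changes or additions to its headers. A complete specification of RTP for a particular application therefore requires one or more companion documents (see Section 12):
[1]. A profile specification document, which defines a set of payload type codes and their mapping to payload formats. A profile may also define extensions or modifications to RTP that are specific to a particular class of applications. Typically an application operates under only one profile. A profile for audio and video data can be found in RFC TBD.
[2]. Payload format specification documents, which define how a particular payload, such as an audio or video encoding, is carried in RTP.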
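A discussion of real-time services and the algorithms used to implement them, together with background on some RTP design decisions, can be found in [2].
Several RTP applications, both experimental and commercial, have already moved from draft specifications to working implementations. They include audio and video tools as well as diagnostic tools such as traffic monitors, and their users number in the thousands. The current Internet, however, cannot yet support the full potential demand for real-time services. High-bandwidth services using RTP, such as video, can seriously degrade the quality of other network services. Implementors should therefore take appropriate precautions to limit accidental bandwidth usage. Application documentation should clearly outline these limitations and the possible impact of high-bandwidth real-time services on the Internet and other network services.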
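2 RTP Use Scenarios
This section describes some aspects of the use of RTP. The examples illustrate the basic operation of applications using RTP and are not meant to limit what RTP may be used for. In these examples, RTP is carried on top of IP and UDP and follows the conventions established by the companion profile for audio and video.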
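2.1 Simple Multicast Audio Conference
A working group meets to discuss the latest protocol draft, using the IP multicast services of the Internet for voice communication. Through some allocation mechanism, the working group chair obtains a multicast group address and a pair of ports: one for audio data and one for control packets. The address and port information is distributed to the participants. If privacy is required, the data and control packets may be encrypted (see Section 9.1), in which case an encryption key must also be distributed. The details of these allocation and distribution mechanisms are beyond the scope of RTP.
The audio conferencing application of each participant sends audio data in small chunks of about 20 ms each. Each chunk is preceded by an RTP header, and the RTP header and data are in turn carried in a UDP packet. The RTP header indicates the type of audio encoding (such as PCM, ADPCM, or LPC), so that senders can change the encoding during a conference, for example to accommodate a participant on a low-bandwidth link or to react to network congestion.
The Internet, like other packet networks, occasionally loses packets or delays them by unpredictable amounts of time. To cope with this, the RTP header carries timing information and a sequence number that allow receivers to reconstruct the source's timing, so that in this example a chunk of audio is played out every 20 ms. This timing reconstruction is performed separately for each source of RTP packets in the conference. The sequence number can also be used to estimate how many packets have been lost in transit.
Because working group members join and leave during the conference, it is useful to know who is participating at any moment and how well they are receiving the audio data. Each instance of the audio application therefore periodically sends a reception report, together with its user's name, on the RTCP port. The report indicates how well the current speaker is being received and may be used to control adaptive encodings. Besides the user name, other identifying information may be included, subject to control bandwidth limits. A site sends a BYE packet (see Section 6.5) when it leaves the conference.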
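2.2 Audio and Video Conference
If both audio and video are used in a conference, they are transmitted as separate RTP sessions, and the RTCP packets for the two media use different UDP ports or different multicast addresses.
There is no direct coupling between the audio and video sessions, except that a user taking part in both should use the same canonical name in both so that the sessions can be associated.
Separating the two media lets participants choose to receive only one of them or both. Despite the separation, the two media can be synchronized using the timing information in the RTCP packets.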
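2.3 Mixers and Translators
So far we have assumed that all sites receive media data in the same format, but this is not always appropriate. Consider participants on a low-speed link joining a conference in which most participants have high-speed access. Rather than requiring everyone to use a low-bandwidth, low-quality audio encoding, a mixer can be placed near the low-bandwidth area. The mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes the reconstructed audio streams into a single stream, and translates the encoding into a lower-bandwidth packet stream suited to the low-speed link. These packets may be unicast to a single recipient or multicast on a different address to multiple recipients. The RTP header includes a means for the mixer to identify the contributing sources, so that receivers can still correctly indicate who is speaking even though the packets were mixed.
Some participants in the audio conference may have high-bandwidth connections but still not be directly reachable by IP multicast. For example, they may be behind an application-level firewall that blocks all IP packets. For these sites mixing may not be necessary, and another kind of relay, a translator, can be used instead. A translator is installed on each side of the firewall; the outside one funnels the multicast packets it receives through a secure connection to the inside one, which sends them again as multicast packets on the internal network.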