http://tools.ietf.org/html/draft-pantos-http-live-streaming-12 Informational R. Pantos, Ed. Internet-Draft W. May Intended status: Informational Apple Inc. Expires: April 17, 2014 October 14, 2013 HTTP Live Streaming draft-pantos-http-live-streaming-12 Abstract This document describes a protocol for transferring unbounded streams of multimedia data. It specifies the data format of the files and the actions to be taken by the server (sender) and the clients (receivers) of the streams. It describes version 6 of this protocol. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. This document may not be modified, and derivative works of it may not be created, and it may not be published except as an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on April 17, 2014. Copyright Notice Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. This Informational Internet Draft is submitted as an RFC Editor Pantos & May Expires April 17, 2014 [Page 1] Internet-Draft HTTP Live Streaming October 2013 Contribution and/or non-IETF Document (not as a Contribution, IETF Contribution, nor IETF Document) in accordance with BCP 78 and BCP 79. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. The Playlist file . . . . . . . . . . . . . . . . . . . . . . 4 3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 4 3.2. Attribute Lists . . . . . . . . . . . . . . . . . . . . . 5 3.3. Standard Tags . . . . . . . . . . . . . . . . . . . . . . 6 3.3.1. EXTM3U . . . . . . . . . . . . . . . . . . . . . . . . 6 3.3.2. EXTINF . . . . . . . . . . . . . . . . . . . . . . . . 7 3.4. New Tags . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.4.1. EXT-X-BYTERANGE . . . . . . . . . . . . . . . . . . . 7 3.4.2. EXT-X-TARGETDURATION . . . . . . . . . . . . . . . . . 8 3.4.3. EXT-X-MEDIA-SEQUENCE . . . . . . . . . . . . . . . . . 8 3.4.4. EXT-X-KEY . . . . . . . . . . . . . . . . . . . . . . 9 3.4.5. EXT-X-PROGRAM-DATE-TIME . . . . . . . . . . . . . . . 10 3.4.6. EXT-X-ALLOW-CACHE . . . . . . . . . . . . . . . . . . 11 3.4.7. EXT-X-PLAYLIST-TYPE . . . . . . . . . . . . . . . . . 11 3.4.8. EXT-X-ENDLIST . . . . . . . . . . . . . . . . . . . . 11 3.4.9. EXT-X-MEDIA . . . . . . . . . . . . . . . . . . . . . 11 3.4.9.1. Rendition Groups . . . . . . . . . . . . . . . . . 14 3.4.10. EXT-X-STREAM-INF . . . . . . . . . . . . . . . . . . . 15 3.4.10.1. Alternative Renditions . . . . . . . . . . . . . . 16 3.4.11. EXT-X-DISCONTINUITY . . . . . . . . . . . . . . . . . 17 3.4.12. EXT-X-DISCONTINUITY-SEQUENCE . . . . . . . . . . . . . 18 3.4.13. EXT-X-I-FRAMES-ONLY . . . . . . . . . . . . . . . . . 18 3.4.14. EXT-X-MAP . . . . . . . . . . . . . . . . . . . . . . 19 3.4.15. EXT-X-I-FRAME-STREAM-INF . . . . . . . . . . . . . . . 20 3.4.16. EXT-X-START . . . . . . . . . . . . . . . . . . . . . 20 3.4.17. EXT-X-VERSION . . . . . . . . . . . . . . . . . . . . 21 4. Media segments . . . . . . . . . . . . . . . . . . . . . . . . 22 5. Key files . . . . . . . . . . . . . . . . . . . . . . . . . . 23 5.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 23 5.2. IV for [AES_128] . . . . . . . . . . . . . . . . . . . . . 24 6. Client/Server Actions . . . . . . . . . . . . . . . . . . . . 24 6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 24 6.2. Server Process . . . . . . . . . . . . . . . . . . . . . . 24 6.2.1. Introduction . . . . . . . . . . . . . . . . . . . . . 24 6.2.2. Live Playlists . . . . . . . . . . . . . . . . . . . . 27 6.2.3. Encrypting media segments . . . . . . . . . . . . . . 27 6.2.4. Providing variant streams . . . . . . . . . . . . . . 28 6.3. Client Process . . . . . . . . . . . . . . . . . . . . . . 29 Pantos & May Expires April 17, 2014 [Page 2] Internet-Draft HTTP Live Streaming October 2013 6.3.1. Introduction . . . . . . . . . . . . . . . . . . . . . 29 6.3.2. Loading the Playlist file . . . . . . . . . . . . . . 29 6.3.3. Playing the Playlist file . . . . . . . . . . . . . . 30 6.3.4. Reloading the Playlist file . . . . . . . . . . . . . 31 6.3.5. Determining the next segment to load . . . . . . . . . 31 6.3.6. Decrypting encrypted media segments . . . . . . . . . 32 7. Protocol version compatibility . . . . . . . . . . . . . . . . 32 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 8.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 33 8.2. Simple Media Playlist file . . . . . . . . . . . . . . . . 34 8.3. Live Media Playlist, using HTTPS . . . . . . . . . . . . . 34 8.4. Playlist file with encrypted media segments . . . . . . . 34 8.5. Master Playlist file . . . . . . . . . . . . . . . . . . . 35 8.6. Master Playlist with I-Frames . . . . . . . . . . . . . . 35 8.7. Master Playlist with Alternative audio . . . . . . . . . . 35 8.8. Master Playlist with Alternative video . . . . . . . . . . 36 9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 37 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 11. Security Considerations . . . . . . . . . . . . . . . . . . . 38 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39 12.1. Normative References . . . . . . . . . . . . . . . . . . . 39 12.2. Informative References . . . . . . . . . . . . . . . . . . 41 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 41 Pantos & May Expires April 17, 2014 [Page 3] Internet-Draft HTTP Live Streaming October 2013 1. Introduction This document describes a protocol for transferring unbounded streams of multimedia data. The protocol supports the encryption of media data and allows clients to choose from among different encodings of the presentation. Media data can be transferred soon after it is created, allowing it to be played in near real-time. Data is usually carried over HTTP [RFC2616]. External references that describe related standards such as HTTP are listed in Section 11. 2. Summary A multimedia presentation is specified by a URI [RFC3986] to a Playlist file, which is an ordered list of media URIs and informational tags. The URIs and their associated tags specify a series of media segments. To play the presentation, the client first obtains the Playlist file and then obtains and plays each media segment in the Playlist. It reloads the Playlist file as described in this document to discover additional segments. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. 3. The Playlist file 3.1. Introduction Playlists MUST be Extended M3U Playlist files [M3U]. This document extends the M3U file format further by defining additional tags. An M3U Playlist is a text file that consists of individual lines. Lines are terminated by either a single LF character or a CR character followed by an LF character. Each line is a URI, blank, or starts with the character '#'. Blank lines are ignored. White space MUST NOT be present, except for elements in which it is explicitly specified. A URI line identifies a media segment or a Playlist file (see Section 3.4.10). Each media segment is specified by a media URI and the tags that apply to it. Pantos & May Expires April 17, 2014 [Page 4] Internet-Draft HTTP Live Streaming October 2013 A Playlist is known as a Media Playlist if all URI lines in the Playlist identify media segments. A Playlist is known as a Master Playlist if all URI lines in the Playlist identify Media Playlists. Lines that start with the character '#' are either comments or tags. Tags begin with #EXT. All other lines that begin with '#' are comments and SHOULD be ignored. A URI in a Playlist, whether it is a URI line or part of a tag, MAY be relative. Relative URIs MUST be resolved against the URI of the Playlist file that contains it. The duration of a Media Playlist file is the sum of the durations of the media segments within it. Playlist files whose names end in .m3u8 and/or have the HTTP Content- Type "application/vnd.apple.mpegurl" are encoded in UTF-8 [RFC3629]. Files whose names end with .m3u and/or have the HTTP Content-Type [RFC2616] "audio/mpegurl" are encoded in US-ASCII [US_ASCII]. Playlist files MUST have names that end in .m3u8 and/or have the Content-Type "application/vnd.apple.mpegurl" (if transferred over HTTP), or have names that end in .m3u and/or have the HTTP Content- Type type "audio/mpegurl" (for compatibility). 3.2. Attribute Lists Certain extended M3U tags have values which are Attribute Lists. An Attribute List is a comma-separated list of attribute/value pairs with no whitespace. An attribute/value pair has the following syntax: AttributeName=AttributeValue An AttributeName is an unquoted string containing characters from the set [A..Z] and '-'. An AttributeValue is one of the following: o decimal-integer: an unquoted string of characters from the set [0..9] expressing an integer in base-10 arithmetic. o hexadecimal-integer: an unquoted string of characters from the set [0..9] and [A..F] that is prefixed with 0x or 0X and which expresses an integer in base-16 arithmetic. Pantos & May Expires April 17, 2014 [Page 5] Internet-Draft HTTP Live Streaming October 2013 o decimal-floating-point: an unquoted string of characters from the set [0..9] and '.' which expresses a floating-point number in decimal positional notation. o quoted-string: a string of characters within a pair of double- quotes ("), including Uniform Type Identifiers [UTI]. The set of characters allowed in the string and any rules for escaping special characters are specified by the Attribute definition, but characters within a quoted-string MUST NOT include linefeed (0xA), carriage-return (0xD) or double quote (0x22). Attributes that wish to carry these characters must do so in an escaped form, such as percent-encoded for a URI. o enumerated-string: an unquoted character string from a set which is explicitly defined by the Attribute. An enumerated-string will never contain double-quotes ("), commas (,), or whitespace. o decimal-resolution: two decimal-integers separated by the "x" character. The first integer is a horizontal pixel dimension (width); the second is a vertical pixel dimension (height). The type of the AttributeValue for a given AttributeName is specified by the Attribute definition. A given AttributeName MUST NOT appear more than once in a given Attribute List. An Attribute/value pair with an unrecognized AttributeName MUST be ignored by the client. Any tag containing an attribute/value pair of type enumerated-string whose AttributeName is recognized but whose AttributeValue is not recognized MUST be ignored by the client. 3.3. Standard Tags 3.3.1. EXTM3U An Extended M3U file is distinguished from a basic M3U file by its first line, which MUST be the tag #EXTM3U. This tag MUST be included in both Media Playlists and Master Playlists. Its format is: #EXTM3U Pantos & May Expires April 17, 2014 [Page 6] Internet-Draft HTTP Live Streaming October 2013 3.3.2. EXTINF The EXTINF tag specifies the duration of a media segment. It applies only to the media segment that follows it, and MUST be followed by a media segment URI. Each media segment MUST be preceded by an EXTINF tag. Its format is: #EXTINF:, where duration is an decimal-integer or decimal-floating-point number that specifies the duration of the media segment in seconds. Durations that are reported as integers SHOULD be rounded to the nearest integer. Durations MUST be integers if the protocol version of the Playlist file is less than 3. Durations SHOULD be floating- point if the version is equal to or greater than 3. The remainder of the line following the comma is an optional human-readable informative title of the media segment. 3.4. New Tags This document defines the following new tags: EXT-X-BYTERANGE, EXT-X- TARGETDURATION, EXT-X-MEDIA-SEQUENCE, EXT-X-KEY, EXT-X-PROGRAM-DATE- TIME, EXT-X-ALLOW-CACHE, EXT-X-PLAYLIST-TYPE, EXT-X-STREAM-INF, EXT- X-I-FRAME-STREAM-INF, EXT-X-I-FRAMES-ONLY, EXT-X-MEDIA, EXT-X- ENDLIST, EXT-X-DISCONTINUITY, EXT-X-DISCONTINUITY-SEQUENCE, EXT-X- START, and EXT-X-VERSION. 3.4.1. EXT-X-BYTERANGE The EXT-X-BYTERANGE tag indicates that a media segment is a sub-range of the resource identified by its media URI. It applies only to the next media URI that follows it in the Playlist. Its format is: #EXT-X-BYTERANGE:<n>[@<o>] where n is a decimal-integer indicating the length of the sub-range in bytes. If present, o is a decimal-integer indicating the start of the sub-range, as a byte offset from the beginning of the resource. If o is not present, the sub-range begins at the next byte following the sub-range of the previous media segment. If o is not present, a previous media segment MUST appear in the Playlist file and MUST be a sub-range of the same media resource. A media URI with no EXT-X-BYTERANGE tag applied to it specifies a media segment that consists of the entire resource. The EXT-X-BYTERANGE tag appeared in version 4 of the protocol. It Pantos & May Expires April 17, 2014 [Page 7] Internet-Draft HTTP Live Streaming October 2013 MUST NOT appear in a Master Playlist. 3.4.2. EXT-X-TARGETDURATION The EXT-X-TARGETDURATION tag specifies the maximum media segment duration. The EXTINF duration of each media segment in the Playlist file, when rounded to the nearest integer, MUST be less than or equal to the target duration. This tag MUST appear once in a Media Playlist file. It applies to the entire Playlist file. Its format is: #EXT-X-TARGETDURATION:<s> where s is a decimal-integer indicating the target duration in seconds. The EXT-X-TARGETDURATION tag MUST NOT appear in a Master Playlist. 3.4.3. EXT-X-MEDIA-SEQUENCE Each media segment in a Playlist has a unique integer sequence number. The sequence number of a segment is equal to the sequence number of the segment that preceded it plus one. The EXT-X-MEDIA- SEQUENCE tag indicates the sequence number of the first segment that appears in a Playlist file. Its format is: #EXT-X-MEDIA-SEQUENCE:<number> where number is a decimal-integer. The sequence number MUST NOT decrease. A Media Playlist file MUST NOT contain more than one EXT-X-MEDIA- SEQUENCE tag. If the Media Playlist file does not contain an EXT-X- MEDIA-SEQUENCE tag then the sequence number of the first segment in the playlist SHALL be considered to be 0. A client MUST NOT assume that segments with the same sequence number in different Media Playlists contain matching content. A media URI is not required to contain its sequence number. See Section 6.2.1, Section 6.3.2 and Section 6.3.5 for information on handling the EXT-X-MEDIA-SEQUENCE tag. The EXT-X-MEDIA-SEQUENCE tag MUST NOT appear in a Master Playlist. Pantos & May Expires April 17, 2014 [Page 8] Internet-Draft HTTP Live Streaming October 2013 3.4.4. EXT-X-KEY Media segments MAY be encrypted. The EXT-X-KEY tag specifies how to decrypt them. It applies to every media segment that appears between it and the next EXT-X-KEY tag in the Playlist file with the same KEYFORMAT attribute (or the end of the Playlist file). Two or more EXT-X-KEY tags with different KEYFORMAT attributes MAY apply to the same media segment, in which case they MUST resolve to the same key. Its format is: #EXT-X-KEY:<attribute-list> The following attributes are defined: METHOD The value is an enumerated-string that specifies the encryption method. This attribute is REQUIRED. The methods defined are: NONE, AES-128, and SAMPLE-AES. An encryption method of NONE means that media segments are not encrypted. If the encryption method is NONE, the following attributes MUST NOT be present: URI; IV; KEYFORMAT; KEYFORMATVERSIONS. An encryption method of AES-128 means that media segments are completely encrypted using the Advanced Encryption Standard [AES_128] with a 128-bit key and PKCS7 padding [RFC5652]. If the encryption method is AES-128, the URI attribute MUST be present. The IV attribute MAY be present; see Section 5.2. An encryption method of SAMPLE-AES means that the media segments contain elementary streams of audio, video, or other samples that are encrypted using the Advanced Encryption Standard [AES_128]. How an elementary stream is encrypted depends on the media encoding. The encryption format for H.264 [H_264], AAC [ISO_14496] and AC-3 [AC_3] elementary streams is described by [SampleEnc]. The IV attribute MAY be present; see Section 5.2. A client MUST NOT not attempt to decrypt any segments whose EXT-X-KEY tag has a METHOD attribute that it does not recognize. URI The value is a quoted-string containing a URI [RFC3986] that specifies how to obtain the key. This attribute is REQUIRED unless the METHOD is NONE. Pantos & May Expires April 17, 2014 [Page 9] Internet-Draft HTTP Live Streaming October 2013 IV The value is a hexadecimal-integer that specifies the Initialization Vector to be used with the key. The IV attribute appeared in protocol version 2. See Section 5.2 for when the IV attribute is used. KEYFORMAT The value is a quoted-string that specifies how the key is represented in the resource identified by the URI; see Section 5 for more detail. This attribute is OPTIONAL; its absence indicates, an implicit value of "identity". The KEYFORMAT attribute appeared in protocol version 5. KEYFORMATVERSIONS The value is a quoted-string containing one or more positive integers separated by the "/" character (for example, "1/3"). If more than one version of a particular KEYFORMAT is defined, this attribute can be used to indicate which version(s) this instance complies with. This attribute is OPTIONAL; if it is not present, its value is considered to be "1". The KEYFORMATVERSIONS attribute appeared in protocol version 5. If the Media Playlist file does not contain an EXT-X-KEY tag then media segments are not encrypted. See Section 5 for the format of the key file, and Section 5.2, Section 6.2.3 and Section 6.3.6 for additional information on media segment encryption. 3.4.5. EXT-X-PROGRAM-DATE-TIME The EXT-X-PROGRAM-DATE-TIME tag associates the first sample of a media segment with an absolute date and/or time. It applies only to the next media segment. The date/time representation is ISO/IEC 8601:2004 [ISO_8601] and SHOULD indicate a time zone: #EXT-X-PROGRAM-DATE-TIME:<YYYY-MM-DDThh:mm:ssZ> For example: #EXT-X-PROGRAM-DATE-TIME:2010-02-19T14:54:23.031+08:00 See Section 6.2.1 and Section 6.3.3 for more information on the EXT- Pantos & May Expires April 17, 2014 [Page 10] Internet-Draft HTTP Live Streaming October 2013 X-PROGRAM-DATE-TIME tag. The EXT-X-PROGRAM-DATE-TIME tag MUST NOT appear in a Master Playlist. 3.4.6. EXT-X-ALLOW-CACHE The EXT-X-ALLOW-CACHE tag indicates whether the client MAY or MUST NOT cache downloaded media segments for later replay. It MAY occur anywhere in a Media Playlist file; it MUST NOT occur more than once. The EXT-X-ALLOW-CACHE tag applies to all segments in the playlist. Its format is: #EXT-X-ALLOW-CACHE:<YES|NO> See Section 6.3.3 for more information on the EXT-X-ALLOW-CACHE tag. 3.4.7. EXT-X-PLAYLIST-TYPE The EXT-X-PLAYLIST-TYPE tag provides mutability information about the Playlist file. It applies to the entire Playlist file. It is OPTIONAL. Its format is: #EXT-X-PLAYLIST-TYPE:<EVENT|VOD> Section 6.2.1 defines the implications of the EXT-X-PLAYLIST-TYPE tag. The EXT-X-PLAYLIST-TYPE tag MUST NOT appear in a Master Playlist. 3.4.8. EXT-X-ENDLIST The EXT-X-ENDLIST tag indicates that no more media segments will be added to the Media Playlist file. It MAY occur anywhere in the Playlist file; it MUST NOT occur more than once. Its format is: #EXT-X-ENDLIST The EXT-X-ENDLIST tag MUST NOT appear in a Master Playlist. 3.4.9. EXT-X-MEDIA The EXT-X-MEDIA tag is used to relate Media Playlists that contain alternative renditions of the same content. For example, three EXT- X-MEDIA tags can be used to identify audio-only Media Playlists that contain English, French and Spanish renditions of the same presentation. Or two EXT-X-MEDIA tags can be used to identify video- only Media Playlists that show two different camera angles. Pantos & May Expires April 17, 2014 [Page 11] Internet-Draft HTTP Live Streaming October 2013 The EXT-X-MEDIA tag stands alone, in that it does not apply to a particular URI in the Master Playlist. Its format is: #EXT-X-MEDIA:<attribute-list> The following attributes are defined: TYPE The value is enumerated-string; valid strings are AUDIO, VIDEO, SUBTITLES and CLOSED-CAPTIONS. If the value is AUDIO, the Playlist described by the tag MUST contain audio media. If the value is VIDEO, the Playlist MUST contain video media. If the value is SUBTITLES, the Playlist MUST contain subtitle media. If the value is CLOSED-CAPTIONS, the media segments for the video renditions can include closed captions. This attribute is REQUIRED. URI The value is a quoted-string containing a URI that identifies the Playlist file. This attribute is OPTIONAL; see Section 3.4.10.1. If the TYPE is CLOSED-CAPTIONS, the URI attribute MUST NOT be present. GROUP-ID The value is a quoted-string identifying a mutually-exclusive group of renditions. The presence of this attribute signals membership in the group. See Section 3.4.9.1. This attribute is REQUIRED. LANGUAGE The value is a quoted-string containing an RFC 5646 [RFC5646] language tag that identifies the primary language used in the rendition. This attribute is OPTIONAL. ASSOC-LANGUAGE The value is a quoted-string containing an RFC 5646 [RFC5646] language tag that identifies a language that is associated with the rendition. An associated language is often used in a different role than the language specified by the LANGUAGE attribute (e.g. written vs. spoken, or as a fallback dialect). This attribute is OPTIONAL. NAME The value is a quoted-string containing a human-readable description of the rendition. If the LANGUAGE attribute is present then this description SHOULD be in that language. This attribute is REQUIRED. Pantos & May Expires April 17, 2014 [Page 12] Internet-Draft HTTP Live Streaming October 2013 DEFAULT The value is an enumerated-string; valid strings are YES and NO. If the value is YES, then the client SHOULD play this rendition of the content in the absence of information from the user indicating a different choice. This attribute is OPTIONAL. Its absence indicates an implicit value of NO. AUTOSELECT The value is an enumerated-string; valid strings are YES and NO. This attribute is OPTIONAL. If it is present, its value MUST be YES if the value of the DEFAULT attribute is YES. If the value is YES, then the client MAY choose to play this rendition in the absence of explicit user preference because it matches the current playback environment, such as chosen system language. FORCED The value is an enumerated-string; valid strings are YES and NO. This attribute is OPTIONAL. Its absence indicates an implicit value of NO. The FORCED attribute MUST NOT be present unless the TYPE is SUBTITLES. A value of YES indicates that the rendition contains content which is considered essential to play. When selecting a FORCED rendition, a client should choose the one that best matches the current playback environment (e.g. language). A value of NO indicates that the rendition contains content which is intended to be played in response to explicit user request. INSTREAM-ID The value is a quoted-string that specifies a rendition within the segments in the Media Playlist. This attribute is REQUIRED if the TYPE attribute is CLOSED-CAPTIONS, in which case it MUST have one of the values: "CC1", "CC2", "CC3", "CC4". For all other TYPE values, the INSTREAM-ID SHOULD NOT be specified. CHARACTERISTICS The value is a quoted-string containing one or more Uniform Type Identifiers [UTI] separated by comma (,) characters. This attribute is OPTIONAL. Each UTI indicates an individual characteristic of the rendition. A SUBTITLES rendition MAY include the following characteristics: Pantos & May Expires April 17, 2014 [Page 13] Internet-Draft HTTP Live Streaming October 2013 "public.accessibility.transcribes-spoken-dialog"; "public.accessibility.describes-music-and-sound"; "public.easy-to- read" (which indicates that the subtitles have been edited for ease of reading). An AUDIO rendition MAY include the following characteristics: "public.accessibility.describes-video". The CHARACTERISTICS attribute MAY include private UTIs. The EXT-X-MEDIA tag appeared in version 4 of the protocol. The EXT- X-MEDIA tag MUST NOT appear in a Media Playlist. 3.4.9.1. Rendition Groups A set of EXT-X-MEDIA tags with the same GROUP-ID value forms a group of renditions. Each member of the group MUST represent an alternative rendition of the same content. All EXT-X-MEDIA tags in a Playlist MUST meet the following constraints: o All EXT-X-MEDIA tags in the same group MUST have the same TYPE attribute. o All EXT-X-MEDIA tags in the same group MUST have different NAME attributes. o A group MUST NOT have more than one member with a DEFAULT attribute of YES. o All members of a group whose AUTOSELECT attribute has a value of YES MUST have LANGUAGE [RFC5646] attributes with unique values. A Playlist MAY contain multiple groups of the same TYPE in order to provide multiple encodings of each group. If it does so, each group of the same TYPE MUST contain identical attributes, with the exception of the URI attribute. Each member in a group of renditions MAY have a different sample format. However, any EXT-X-STREAM-INF tag or EXT-X-I-FRAME-STREAMINF tag which references that group MUST have a CODECS attribute that lists every sample format present in any rendition in the group. Please see Section 3.4.10 and Section 3.4.15. Pantos & May Expires April 17, 2014 [Page 14] Internet-Draft HTTP Live Streaming October 2013 3.4.10. EXT-X-STREAM-INF The EXT-X-STREAM-INF tag specifies a variant stream, which is a set of renditions which can be combined to play the presentation. The attributes of the tag provide information about the variant stream. The EXT-X-STREAM-INF tag identifies the next URI line in the Playlist as a rendition of the variant stream. The EXT-X-STREAM-INF tag MUST NOT appear in a Media Playlist. Its format is: #EXT-X-STREAM-INF:<attribute-list> <URI> The following attributes are defined: BANDWIDTH The value is a decimal-integer of bits per second. It MUST be an upper bound of the overall bitrate of each media segment (calculated to include container overhead) that appears or will appear in the Playlist. Every EXT-X-STREAM-INF tag MUST include the BANDWIDTH attribute. CODECS The value is a quoted-string containing a comma-separated list of formats, where each format specifies a media sample type that is present in a media segment in the Media Playlist file. Valid format identifiers are those in the ISO Base Media File Format Name Space defined by RFC 6381 [RFC6381]. Every EXT-X-STREAM-INF tag SHOULD include a CODECS attribute. RESOLUTION The value is a decimal-resolution describing the approximate encoded horizontal and vertical resolution of video within the presentation. The RESOLUTION attribute is OPTIONAL but is recommended if the variant stream includes video. AUDIO The value is a quoted-string. It MUST match the value of the Pantos & May Expires April 17, 2014 [Page 15] Internet-Draft HTTP Live Streaming October 2013 GROUP-ID attribute of an EXT-X-MEDIA tag elsewhere in the Master Playlist whose TYPE attribute is AUDIO. It indicates the set of audio renditions that MAY be used when playing the presentation. See Section 3.4.10.1. The AUDIO attribute is OPTIONAL. VIDEO The value is a quoted-string. It MUST match the value of the GROUP-ID attribute of an EXT-X-MEDIA tag elsewhere in the Master Playlist whose TYPE attribute is VIDEO. It indicates the set of video renditions that MAY be used when playing the presentation. See Section 3.4.10.1. The VIDEO attribute is OPTIONAL. SUBTITLES The value is a quoted-string. It MUST match the value of the GROUP-ID attribute of an EXT-X-MEDIA tag elsewhere in the Master Playlist whose TYPE attribute is SUBTITLES. It indicates the set of subtitle renditions that MAY be used when playing the presentation. See Section 3.4.10.1. The SUBTITLES attribute is OPTIONAL. CLOSED-CAPTIONS The value can be either a quoted-string or an enumerated-string with the value NONE. If the value is a quoted-string, it MUST match the value of the GROUP-ID attribute of an EXT-X-MEDIA tag elsewhere in the Playlist whose TYPE attribute is CLOSED-CAPTIONS, and indicates the set of closed-caption renditions that may be used when playlist the presentation. See Section 3.4.10.1. If the value is the enumerated-string value NONE, all EXT-X-STREAM- INF tags MUST have this attribute with a value of NONE. This indicates that there are no closed captions in any variant stream in the Master Playlist. The CLOSED-CAPTIONS attribute is OPTIONAL. 3.4.10.1. Alternative Renditions When an EXT-X-STREAM-INF tag contains an AUDIO, VIDEO, SUBTITLES, or CLOSED-CAPTIONS attribute, it indicates that alternative renditions of the content are available for playback of that variant stream. Pantos & May Expires April 17, 2014 [Page 16] Internet-Draft HTTP Live Streaming October 2013 When defining alternative renditions, the following constraints MUST be met: o All playable combinations of renditions associated with an EXT-X- STREAM-INF tag MUST have an aggregate bandwidth less than or equal to the BANDWIDTH attribute of the EXT-X-STREAM-INF tag. o If an EXT-X-STREAM-INF tag contains a RESOLUTION attribute and a VIDEO attribute, then every alternative video rendition MUST match the value of the RESOLUTION attribute. o Every alternative rendition associated with an EXT-X-STREAM-INF tag MUST meet the constraints for a variant stream described in Section 6.2.4. The URI attribute is REQUIRED if the media type is SUBTITLES, but OPTIONAL if the media type is VIDEO or AUDIO. If the media type is VIDEO or AUDIO, a missing URI attribute indicates that the media data for this rendition is included in the Media playlist of any EXT-X- STREAM-INF tag referencing this MEDIA tag. If the media type is AUDIO, the audio data for this rendition MUST also be present in any video renditions specified by the EXT-X-STREAM-INF tag. The URI attribute MUST NOT be included if the media type is CLOSED- CAPTIONS. Note that if a client chooses to play renditions of audio and video that are not present in the main Media Playlist described by the EXT- X-STREAM-INF tag, or if the client chooses to play an audio rendition and the main Playlist is audio-only, then the client MAY ignore the main Playlist and its media. 3.4.11. EXT-X-DISCONTINUITY The EXT-X-DISCONTINUITY tag indicates an encoding discontinuity between the media segment that follows it and the one that preceded it. The set of characteristics that MAY change is: o file format o number and type of tracks o encoding parameters o encoding sequence o timestamp sequence Pantos & May Expires April 17, 2014 [Page 17] Internet-Draft HTTP Live Streaming October 2013 Its format is: #EXT-X-DISCONTINUITY See Section 4, Section 6.2.1, and Section 6.3.3 for more information about the EXT-X-DISCONTINUITY tag. The EXT-X-DISCONTINUITY tag MUST NOT appear in a Master Playlist. 3.4.12. EXT-X-DISCONTINUITY-SEQUENCE The EXT-X-DISCONTINUITY-SEQUENCE tag allows synchronization between different renditions of the same variant stream or different variant streams that have EXT-X-DISCONTINUITY tags in their Media Playlists. Its format is: #EXT-X-DISCONTINUITY-SEQUENCE:<number> where number is a decimal-integer. The discontinuity sequence number MUST NOT decrease. A Media Playlist MUST NOT contain more than one EXT-X-DISCONTINUITY- SEQUENCE tag. If the Media Playlist does not contain an EXT-X- DISCONTINUITY-SEQUENCE tag, then the discontinuity sequence number of the first segment in the playlist SHALL be considered to be 0. The EXT-X-DISCONTINUITY-SEQUENCE tag MUST appear before any EXT-X- DISCONTINUITY tag. A media playlist MUST NOT contain a EXT-X-DISCONTINUITY-SEQUENCE if its EXT-X-PLAYLIST-TYPE is VOD or EVENT. An EXT-X-DISCONTINUITY-SEQUENCE tag MUST ONLY appear in a Media Playlist. See Section 6.2.1 and Section 6.2.2 for more information about the EXT-X-DISCONTINUITY-SEQUENCE tag. 3.4.13. EXT-X-I-FRAMES-ONLY The EXT-X-I-FRAMES-ONLY tag indicates that each media segment in the Playlist describes a single I-frame. I-frames (or Intra frames) are encoded video frames whose encoding does not depend on any other frame. Pantos & May Expires April 17, 2014 [Page 18] Internet-Draft HTTP Live Streaming October 2013 The EXT-X-I-FRAMES-ONLY tag applies to the entire Playlist. Its format is: #EXT-X-I-FRAMES-ONLY In a Playlist with the EXT-X-I-FRAMES-ONLY tag, the media segment duration (EXTINF tag value) is the time between the presentation time of the I-frame in the media segment and the presentation time of the next I-frame in the Playlist, or the end of the presentation if it is the last I-frame in the Playlist. Media resources containing I-frame segments MUST begin with either a Transport Stream PAT/PMT or be accompanied by an EXT-X-MAP tag indicating the proper PAT/PMT. The byte range of an I-frame segment with an EXT-X-BYTERANGE tag applied to it (Section 3.4.1) MUST NOT include a PAT/PMT. The EXT-X-I-FRAMES-ONLY tag appeared in version 4 of the protocol. The EXT-X-I-FRAMES-ONLY tag MUST NOT appear in a Master Playlist. 3.4.14. EXT-X-MAP The EXT-X-MAP tag specifies how to obtain header information required to parse the applicable media segments, such as the Transport Stream PAT/PMT or the WebVTT header. It applies to every media segment that appears after it in the Playlist until the next EXT-X-DISCONTINUITY tag, or until the end of the playlist. Its format is: #EXT-X-MAP:<attribute-list> The following attributes are defined: URI The value is a quoted-string containing a URI that identifies a resource that contains segment header information. This attribute is REQUIRED. BYTERANGE The value is a quoted-string specifying a byte range into the resource identified by the URI attribute. This range SHOULD contain only the header information. The format of the byte range is described in Section 3.4.1. This attribute is OPTIONAL; if it is not present, the byte range is the entire resource indicated by the URI. Pantos & May Expires April 17, 2014 [Page 19] Internet-Draft HTTP Live Streaming October 2013 The EXT-X-MAP tag appeared in version 5 of the protocol for use in Media Playlist that contain the EXT-X-I-FRAMES-ONLY tag. In protocol version 6, it may appear in any Media Playlist. The EXT-X-MAP tag MUST NOT appear in a Master Playlist. 3.4.15. EXT-X-I-FRAME-STREAM-INF The EXT-X-I-FRAME-STREAM-INF tag identifies a Media Playlist file containing the I-frames of a multimedia presentation. It stands alone, in that it does not apply to a particular URI in the Master Playlist. Its format is: #EXT-X-I-FRAME-STREAM-INF:<attribute-list> All attributes defined for the EXT-X-STREAM-INF tag (Section 3.4.10) are also defined for the EXT-X-I-FRAME-STREAM-INF tag, except for the AUDIO, SUBTITLES and CLOSED-CAPTIONS attributes. In addition, the following attribute is defined: URI The value is a quoted-string containing a URI that identifies the I-frame Playlist file. Every EXT-X-I-FRAME-STREAM-INF tag MUST include a BANDWIDTH attribute and a URI attribute. The provisions in Section 3.4.10.1 also apply to EXT-X-I-FRAME- STREAM-INF tags with a VIDEO attribute. A Master Playlist that specifies alternative VIDEO renditions and I-frame Playlists SHOULD include an alternative I-frame VIDEO rendition for each regular VIDEO rendition, with the same NAME and LANGUAGE attributes. The EXT-X-I-FRAME-STREAM-INF tag appeared in version 4 of the protocol. Clients that do not implement protocol version 4 or higher MUST ignore it. The EXT-X-I-FRAME-STREAM-INF tag MUST NOT appear in a Media Playlist. 3.4.16. EXT-X-START The EXT-X-START tag indicates a preferred point at which to start playing a Playlist. By default, clients SHOULD start playback at this point when beginning a playback session. It MUST NOT appear more than once in a Playlist. This tag is OPTIONAL. Pantos & May Expires April 17, 2014 [Page 20] Internet-Draft HTTP Live Streaming October 2013 If the EXT-X-START tag appears in a Master Playlist, it indicates the preferred starting point of every Media Playlist in the Master Playlist. If this tag appears in a Media Playlist that is referenced by a Master Playlist, then every other Media Playlist in the Master Playlist MUST also contain an EXT-X-START tag with the same attributes and values. Its format is: #EXT-X-START:<attribute list> The following attributes are defined: TIME-OFFSET The value of TIME-OFFSET is a decimal-floating-point number of seconds. A positive number indicates a time offset from the beginning of the Playlist. A negative number indicates a negative time offset from the end of the last segment in the Playlist. This attribute is REQUIRED. The absolute value of TIME-OFFSET MUST NOT be larger than the Playlist duration. If the Playlist does not contain the EXT-X-ENDLIST tag, the TIME- OFFSET SHOULD NOT be within three target durations of the end of the Playlist file. PRECISE The value is an enumerated-string; valid strings are YES and NO. If the value is YES, clients SHOULD start playback at the segment containing the TIME-OFFSET, but SHOULD NOT render media samples in that segment whose presentation times are prior to the TIME-OFFSET. If the value is NO, clients SHOULD attempt to render every media sample in that segment. This attribute is OPTIONAL. If it is missing, its value should be treated as NO. The EXT-X-START tag appeared in version 6 of the protocol. 3.4.17. EXT-X-VERSION The EXT-X-VERSION tag indicates the compatibility version of the Playlist file. The Playlist file, its associated media, and its server MUST comply with all provisions of the most-recent version of this document describing the protocol version indicated by the tag value. Pantos & May Expires April 17, 2014 [Page 21] Internet-Draft HTTP Live Streaming October 2013 The EXT-X-VERSION tag applies to the entire Playlist file. Its format is: #EXT-X-VERSION:<n> where n is an integer indicating the protocol version. A Playlist file MUST NOT contain more than one EXT-X-VERSION tag. A Playlist file that does not contain an EXT-X-VERSION tag MUST comply with version 1 of this protocol. The EXT-X-VERSION tag MAY appear in either Master Playlist or Media Playlist. It MUST appear in all playlists containing tags or attributes that are not compatible with protocol version 1. 4. Media segments Each media URI in a Playlist file specifies a media segment which is part of the overall presentation. If a media URI has an EXT-X- BYTERANGE tag applied to it, the segment is a sub-range of the media file identified by the URI. Otherwise, the segment is the entire media file. Each media segment MUST be formatted as an MPEG-2 Transport Stream [ISO_13818], an MPEG audio elementary stream [ISO_11172], or a WebVTT [WebVTT] file. Transport Stream segments MUST contain a single MPEG-2 Program. There SHOULD be a Program Association Table (PAT) and a Program Map Table (PMT) at the start of each segment. A segment that contains video SHOULD have at least one key frame and enough information to completely initialize a video decoder. A Transport Stream or audio elementary stream segment MUST be the continuation of the encoded media at the end of the segment with the previous sequence number, where values in a continuous series, such as timestamps and Continuity Counters, continue uninterrupted - unless the media segment was the first ever to appear in the Playlist file or has an EXT-X-DISCONTINUITY tag applied to it. Clients SHOULD be prepared to handle multiple tracks of a particular type (e.g. audio or video). A client with no other preference SHOULD choose the track with the lowest numerical PID that it can play. Clients MUST ignore private streams inside Transport Streams that they do not recognize. Pantos & May Expires April 17, 2014 [Page 22] Internet-Draft HTTP Live Streaming October 2013 Each Elementary Audio Stream segment MUST signal the timestamp of its first sample with an ID3 PRIV tag [ID3] at the beginning of the segment. The ID3 PRIV owner identifier MUST be "com.apple.streaming.transportStreamTimestamp". The ID3 payload MUST be a 33-bit MPEG-2 Program Elementary Stream timestamp expressed as a big-endian eight-octet number, with the upper 31 bits set to zero. The encoding parameters for samples in a media segment and across multiple media segments in a Media Playlist SHOULD remain consistent. However clients SHOULD deal with encoding changes as they are encountered, for example by scaling video content to accommodate a resolution change. Subtitle segments MUST be formatted as WebVTT [WebVTT] files. Each subtitle segment MUST contain all subtitle cues that are intended to be displayed during the period indicated by the segment EXTINF duration. The start time offset and end time offset of each cue MUST indicate the total display time for that cue, even if that time range extends beyond the EXTINF duration. A WebVTT segment MAY contain no cues; this indicates that no subtitles are to be displayed during that period. Each subtitle segment MUST either start with a WebVTT header or have an EXT-X-MAP tag applied to it in the Media Playlist. Within each WebVTT header there MUST be an X-TIMESTAMP-MAP metadata header. This header synchronizes the cue timestamps in the WebVTT file with the MPEG-2 (PES) timestamps in other renditions of the variant stream. Its format is: X-TIMESTAMP-MAP=LOCAL:<cue time>,MPEGTS:<MPEG-2 time> e.g. X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 The cue timestamp in the LOCAL attribute MAY fall outside the range of time covered by the segment. 5. Key files 5.1. Introduction An EXT-X-KEY tag with a URI attribute identifies a Key file. A Key file contains the cipher key that MUST be used to decrypt subsequent media segments in the Playlist. [AES_128] encryption uses 16-octet keys. If the KEYFORMAT of an EXT- X-KEY tag is "identity", the Key file is a single packed array of 16 octets in binary format. Pantos & May Expires April 17, 2014 [Page 23] Internet-Draft HTTP Live Streaming October 2013 5.2. IV for [AES_128] [AES_128] requires the same 16-octet Initialization Vector (IV) to be supplied when encrypting and decrypting. Varying this IV increases the strength of the cipher. If an EXT-X-KEY tag has a KEYFORMAT of "identity" and an IV attribute is present, implementations MUST use the attribute value as the IV when encrypting or decrypting with that key. The value MUST be interpreted as a 128-bit number. If an EXT-X-KEY tag with a KEYFORMAT of "identity" does not have the IV attribute, implementations MUST use the sequence number of the media segment as the IV when encrypting or decrypting that media segment. The big-endian binary representation of the sequence number SHALL be placed in a 16-octet buffer and padded (on the left) with zeros. 6. Client/Server Actions 6.1. Introduction This section describes how the server generates the Playlist and media segments and how the client should download and play them. 6.2. Server Process 6.2.1. Introduction The production of the source media is outside the scope of this document, which simply presumes a source of continuous encoded media containing the presentation. The server MUST divide the source media into individual media segments whose duration is less than or equal to a constant target duration. The server SHOULD attempt to divide the source media at points that support effective decode of individual media segments, e.g. on packet and key frame boundaries. The server MUST create a URI for every media segment that enables its clients to obtain the segment data. If a server supports partial loading of resources (e.g. via HTTP Range requests), it MAY specify segments as sub-ranges of larger resources using the EXT-X-BYTERANGE tag. If WebVTT segments are distributed by HTTP, the server SHOULD support client requests to use the "gzip" Content-Encoding. Pantos & May Expires April 17, 2014 [Page 24] Internet-Draft HTTP Live Streaming October 2013 The server MUST create a Media Playlist file. The Playlist file MUST conform to the format described in Section 3. A URI for each media segment that the server wishes to make available MUST appear in the Media Playlist in the order in which it is to be played. The entire media segment MUST be available to clients if its URI is in the Playlist file. The Media Playlist file MUST contain an EXT-X-TARGETDURATION tag. Its value MUST be equal to or greater than the EXTINF duration of any media segment that appears or will appear in the Playlist file, rounded to the nearest integer. Its value MUST NOT change. A typical target duration is 10 seconds. The Playlist file SHOULD contain one EXT-X-VERSION tag which indicates its compatibility version. Its value MUST be the lowest protocol version with which the server, Playlist file, and associated media segments all comply. Its value MUST NOT change. The server MUST create a URI for the Playlist file that will allow its clients to obtain the file. If the Playlist file is distributed by HTTP, the server SHOULD support client requests to use "gzip" Content-Encoding. Changes to the Playlist file MUST be made atomically from the point of view of the clients. The server MUST NOT change the Media Playlist file, except to: Append lines to it (Section 6.2.1). Remove media segment URIs from the Playlist in the order that they appear, along with any tags that apply only to those segments (Section 6.2.2). Increment the value of the EXT-X-MEDIA-SEQUENCE or EXT-X- DISCONTINUITY-SEQUENCE tags (Section 6.2.2). Add or remove EXT-X-STREAM-INF tags or EXT-X-I-FRAME-STREAM-INF tags (Section 6.2.4). Note that clients are not required to reload Master Playlist files, so changing them may not have immediate effect. Add an EXT-X-ENDLIST tag to the Playlist (Section 6.2.1). Furthermore, the Playlist file MAY contain an EXT-X-PLAYLIST-TYPE tag with a value of either EVENT or VOD. If the tag is present and has a value of EVENT, the server MUST NOT change or delete any part of the Pantos & May Expires April 17, 2014 [Page 25] Internet-Draft HTTP Live Streaming October 2013 Playlist file (although it MAY append lines to it). If the tag is present and has a value of VOD, the Playlist file MUST NOT change. Every media segment in a Playlist MUST have an EXTINF tag applied to it indicating the duration of the media segment. Each segment in a Media Playlist has an integer discontinuity sequence number. The discontinuity sequence number can be used in addition to the timestamps within the media to synchronize media segments across different renditions. A segment's discontinuity sequence number is the value of the EXT-X- DISCONTINUITY-SEQUENCE tag (or zero if none) plus the number of EXT- X-DISCONTINUITY tags in the playlist preceding the URI line of the segment. A Media Playlist that contains an EXT-X-PLAYLIST-TYPE tag with a value of EVENT or VOD MUST NOT contain an EXT-X-DISCONTINUITY- SEQUENCE tag. The server MAY associate an absolute date and time with a media segment by applying an EXT-X-PROGRAM-DATE-TIME tag to it. This defines an informative mapping of the (wall-clock) date and time specified by the tag to the first media timestamp in the segment, which may be used as a basis for seeking, for display, or for other purposes. If a server provides this mapping, it SHOULD apply an EXT- X-PROGRAM-DATE-TIME tag to every segment that has an EXT-X- DISCONTINUITY tag applied to it. If the Media Playlist contains the final media segment of the presentation then the Playlist file MUST contain the EXT-X-ENDLIST tag. If a Media Playlist does not contain the EXT-X-ENDLIST tag, the server MUST make a new version of the Playlist file available that contains at least one new media segment. It MUST be made available relative to the time that the previous version of the Playlist file was made available: no earlier than one-half the target duration after that time, and no later than 1.5 times the target duration after that time. If the server wishes to remove an entire presentation, it MUST make the Playlist file unavailable to clients. It SHOULD ensure that all media segments in the Playlist file remain available to clients for at least the duration of the Playlist file at the time of removal. Pantos & May Expires April 17, 2014 [Page 26] Internet-Draft HTTP Live Streaming October 2013 6.2.2. Live Playlists The server MAY limit the availability of media segments by removing media segments from the Playlist file (Section 6.2.1). If media segments are to be removed, the Playlist file MUST contain exactly one EXT-X-MEDIA-SEQUENCE tag. Its value MUST be incremented by 1 for every media segment that is removed from the Playlist file. Media segments MUST be removed from the Playlist file in the order that they appear in the Playlist. The server MUST NOT remove a media segment from the Playlist file if the duration of the Playlist file minus the duration of the segment is less than three times the target duration. When the server removes a media segment from the Playlist, the corresponding media URI SHOULD remain available to clients for a period of time equal to the duration of the segment plus the duration of the longest Playlist file distributed by the server containing that segment. If the server wishes to remove segments from a Media Playlist containing an EXT-X-DISCONTINUITY tag, the playlist MUST contain an EXT-X-DISCONTINUITY-SEQUENCE tag. If the server removes a EXT-X-DISCONTINUITY tag from the Media Playlist, it MUST increment the value of the EXT-X-DISCONTINUITY- SEQUENCE tag so that the discontinuity sequence numbers of the segments still in the playlist remain unchanged. If a server plans to remove a media segment after it is delivered to clients over HTTP, it SHOULD ensure that the HTTP response contains an Expires header that reflects the planned time-to-live. A Live Playlist MUST NOT contain the EXT-X-PLAYLIST-TYPE tag. 6.2.3. Encrypting media segments If media segments are to be encrypted the server MUST define a URI which will allow authorized clients to obtain a Key file containing a decryption key. The Key file MUST conform to the format described in Section 5. The server MAY set the HTTP Expires header in the key response to indicate that the key may be cached. The server MUST encrypt every media segment in a Playlist according to the EXT-X-KEY tag that applies to its URI in the Playlist file. Pantos & May Expires April 17, 2014 [Page 27] Internet-Draft HTTP Live Streaming October 2013 Media segments with an EXT-X-KEY tag whose METHOD is NONE, or which do not have an EXT-X-KEY tag applied to them, MUST NOT be encrypted. If the encryption METHOD is AES-128 and the Playlist does not contain the EXT-X-I-FRAMES-ONLY tag, AES-128 CBC encryption with PKCS7 padding [RFC5652] SHALL be applied to individual media segments. The entire segment MUST be encrypted. Cipher Block Chaining MUST NOT be applied across media segments. The IV used for encryption MUST be either the sequence number of the media segment or the value of the IV attribute of the EXT-X-KEY tag, as described in Section 5.2. If the encryption METHOD is AES-128 and the Playlist contains an EXT- X-I-FRAMES-ONLY tag, AES-128 CBC encryption with PKCS7 padding [RFC5652] MUST be applied to the entire resource. The entire resource MUST be encrypted. Encryption MAY be restarted on 16-byte block boundaries, unless the first block contains an I-frame. The IV used for encryption MUST be either the sequence number of the media segment or the value of the IV attribute of the EXT-X-KEY tag, as described in Section 5.2. If the encryption METHOD is SAMPLE-AES, certain elementary streams MAY be encrypted prior to encapsulation in a media segment. The encryption format for H.264, AAC and AC-3 elementary streams is described by [SampleEnc]. The server MUST NOT remove an EXT-X-KEY tag from the Playlist file if it applies to any media segment in the Playlist file. 6.2.4. Providing variant streams A server MAY offer multiple Media Playlist files to provide different encodings of the same presentation. If it does so it SHOULD provide a Master Playlist file that lists each variant stream to allow clients to switch between encodings dynamically. Master Playlists MUST contain an EXT-X-STREAM-INF tag or EXT-X-I- FRAME-STREAM-INF tag for each variant stream. If an EXT-X-STREAM-INF tag or EXT-X-I-FRAME-STREAM-INF tag contains the CODECS attribute, the attribute value MUST include every format defined by [RFC6381] that is present in any media segment that is part of the variant stream, including in any rendition. The server MUST meet the following constraints when producing variant streams: Each variant stream MUST present the same content, including EXT- X-DISCONTINUITY tags at the same points in each rendition. Pantos & May Expires April 17, 2014 [Page 28] Internet-Draft HTTP Live Streaming October 2013 Matching content in variant streams MUST have matching timestamps. This allows clients to synchronize the media. Each Media Playlist in each variant stream MUST have the same target duration. The only exception is that SUBTITLES renditions with a EXT-X-PLAYLIST-TYPE of VOD MAY have longer target durations. Content that appears in a Media Playlist of one variant stream but not in another MUST appear either at the beginning or at the end of the Media Playlist file and MUST NOT be longer than the target duration. If any Media Playlist in a Master Playlist contains an EXT-X- PROGRAM-DATE-TIME tag, then all Media Playlists in that Master Playlist MUST contain EXT-X-PROGRAM-DATE-TIME tags with consistent mappings of date and time to media timestamps. In addition, for broadest compatibility, variant streams SHOULD contain the same encoded audio bitstream. This allows clients to switch between variant streams without audible glitching. The rules for variant streams also apply to alternative renditions - see Section 3.4.10.1. 6.3. Client Process 6.3.1. Introduction How the client obtains the URI to the Playlist file is outside the scope of this document; it is presumed to have done so. The client MUST obtain the Playlist file from the URI. If the Playlist file so obtained is a Master Playlist, the client MUST obtain the Media Playlist file from the Master Playlist. This document does not specify the treatment of variant streams by clients. 6.3.2. Loading the Playlist file Every time a Playlist file is loaded or reloaded from the Playlist URI: The client MUST ensure that the Playlist file begins with the EXTM3U tag and that the EXT-X-VERSION tag, if present, specifies a protocol version supported by the client; if not, the client MUST NOT attempt to use the Playlist. Pantos & May Expires April 17, 2014 [Page 29] Internet-Draft HTTP Live Streaming October 2013 The client SHOULD ignore any tags and attributes it does not recognize. The client MUST determine the next media segment to load, as described in Section 6.3.5. If the Media Playlist contains the EXT-X-MEDIA-SEQUENCE tag, the client SHOULD assume that each media segment in it will become unavailable at the time that the Playlist file was loaded plus the duration of the Playlist file. 6.3.3. Playing the Playlist file The client SHALL choose which media segment to play first from the Media Playlist when playback starts. If the EXT-X-ENDLIST tag is not present and the client intends to play the media regularly (i.e. in playlist order at the nominal playback rate), the client SHOULD NOT choose a segment which starts less than three target durations from the end of the Playlist file. Doing so can trigger playback stalls. To achieve regular playback, media segments MUST be played in the order that they appear in the Playlist file. The client MAY present the available media in any way it wishes, including regular playback, random access, and trick modes. The client MUST be prepared to reset its parser(s) and decoder(s) before playing a media segment that has an EXT-X-DISCONTINUITY tag applied to it. The client SHOULD attempt to load media segments in advance of when they will be required for uninterrupted playback to compensate for temporary variations in latency and throughput. If the Playlist file contains the EXT-X-ALLOW-CACHE tag and its value is NO, the client MUST NOT cache downloaded media segments after they have been played. Otherwise the client MAY cache downloaded media segments indefinitely for later replay. The client MAY use the value of the EXT-X-PROGRAM-DATE-TIME tag to display the program origination time to the user. If the value includes time zone information the client SHALL take it into account, but if it does not the client MUST NOT infer an originating time zone. The client MUST NOT depend upon the correctness or the consistency of the value of the EXT-X-PROGRAM-DATE-TIME tag. Pantos & May Expires April 17, 2014 [Page 30] Internet-Draft HTTP Live Streaming October 2013 6.3.4. Reloading the Playlist file The client MUST periodically reload the Media Playlist file unless it contains the EXT-X-ENDLIST tag. However the client MUST NOT attempt to reload the Playlist file more frequently than specified by this section. When a client loads a Playlist file for the first time or reloads a Playlist file and finds that it has changed since the last time it was loaded, the client MUST wait for at least the target duration before attempting to reload the Playlist file again, measured from the last time the client began loading the Playlist file. If the client reloads a Playlist file and finds that it has not changed then it MUST wait for a period of one-half the target duration before retrying. In order to reduce server load, the client SHOULD NOT reload the Playlist files of variant streams or alternate renditions that are not currently being played. If it decides to switch playback to a different variant stream, it SHOULD stop reloading the Playlist of the old variant stream and begin loading the Playlist of the new variant stream. It can use the EXTINF durations and the constraints in Section 6.2.4 to determine the approximate location of corresponding media. Once media from the new variant stream has been loaded, the timestamps in the media segments can be used to synchronize the old and new timelines precisely. A client MUST NOT assume that segments with the same media sequence number in different variant streams or different renditions contain matching content. 6.3.5. Determining the next segment to load The client MUST examine the Media Playlist file every time it is loaded or reloaded to determine the next media segment to load. The first segment to load MUST be the segment that the client has chosen to play first, as described in Section 6.3.3. If the first segment to be played has been loaded and the Playlist file does not contain the EXT-X-MEDIA-SEQUENCE tag then the client MUST verify that the current Playlist file contains the URI of the last loaded media segment at the offset it was originally found at, halting playback if it does not. The next media segment to load MUST be the first media segment following the last-loaded segment in the Playlist. If the first segment to be played has been loaded and the Playlist Pantos & May Expires April 17, 2014 [Page 31] Internet-Draft HTTP Live Streaming October 2013 file contains the EXT-X-MEDIA-SEQUENCE tag then the next media segment to load SHALL be the one with the lowest sequence number that is greater than the sequence number of the last media segment loaded. 6.3.6. Decrypting encrypted media segments If a Media Playlist file contains an EXT-X-KEY tag that specifies a Key file URI, the client MUST obtain that key file and use the key inside it to decrypt all media segments to which that EXT-X-KEY tag applies. A client MUST NOT attempt to use an EXT-X-KEY tag with an unsupported or unrecognized KEYFORMAT attribute. A client SHOULD fail playback if the Playlist contains a media segment to which only EXT-X-KEY tags with unrecognized or unsupported KEYFORMAT attributes are applied. If the encryption METHOD is AES-128, AES-128 CBC decryption SHALL be applied to individual media segments. The entire segment MUST be decrypted. Cipher Block Chaining MUST NOT be applied across media segments. The IV used for decryption MUST be either the sequence number of the media segment or the value of the IV attribute of the EXT-X-KEY tag, as described in Section 5.2. If the encryption METHOD is AES-128 and the media segment is part of an I-frame playlist (Section 3.4.13) special care MUST be taken in loading and decrypting the segment, because the resource identified by the URI is encrypted in 16-byte blocks from the start of the resource (offset 0). The sub-range specified by the EXT-X-BYTERANGE tag MUST be widened to include the 16-byte blocks in which the beginning and end of the sub-range fall. Next, it MUST be widened further to include the previous 16-byte block. That range MUST be loaded and decrypted with AES-128 CBC using an arbitrary IV. The decrypted segment will then be in the original (unwidened) sub-range. If the encryption METHOD is SAMPLE-AES, AES-128 decryption SHALL be applied to encrypted elementary streams within the media segment. The encryption format for H.264, AAC and AC-3 elementary streams is described by [SampleEnc]. An EXT-X-KEY tag with a METHOD of NONE indicates that the media segments it applies to are not encrypted. 7. Protocol version compatibility Clients and servers MUST implement protocol version 2 or higher to use: Pantos & May Expires April 17, 2014 [Page 32] Internet-Draft HTTP Live Streaming October 2013 o The IV attribute of the EXT-X-KEY tag. Clients and servers MUST implement protocol version 3 or higher to use: o Floating-point EXTINF duration values. Clients and servers MUST implement protocol version 4 or higher to use: o The EXT-X-BYTERANGE tag. o The EXT-X-I-FRAME-STREAM-INF tag. o The EXT-X-I-FRAMES-ONLY tag. o The EXT-X-MEDIA tag. o The AUDIO and VIDEO attributes of the EXT-X-STREAM-INF tag. Clients and servers MUST implement protocol version 5 or higher to use: o The KEYFORMAT and KEYFORMATVERSIONS attributes of the EXT-X-KEY tag. o The EXT-X-MAP tag. Clients and servers MUST implement protocol version 6 or higher to use: o The EXT-X-MAP tag in a Media playlist that does not contain EXT-X- I-FRAMES-ONLY. The PROGRAM-ID attribute of the EXT-X-STREAM-INF and the EXT-X-I- FRAME-STREAM-INF tags has been removed in protocol version 6. 8. Examples 8.1. Introduction This section contains several example Playlist files. Pantos & May Expires April 17, 2014 [Page 33] Internet-Draft HTTP Live Streaming October 2013 8.2. Simple Media Playlist file #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:5220 #EXTINF:5219.2, http://media.example.com/entire.ts #EXT-X-ENDLIST 8.3. Live Media Playlist, using HTTPS #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:8 #EXT-X-MEDIA-SEQUENCE:2680 #EXTINF:7.975, https://priv.example.com/fileSequence2680.ts #EXTINF:7.941, https://priv.example.com/fileSequence2681.ts #EXTINF:7.975, https://priv.example.com/fileSequence2682.ts 8.4. Playlist file with encrypted media segments #EXTM3U #EXT-X-VERSION:3 #EXT-X-MEDIA-SEQUENCE:7794 #EXT-X-TARGETDURATION:15 #EXT-X-KEY:METHOD=AES-128,URI="https://priv.example.com/key.php?r=52" #EXTINF:2.833, http://media.example.com/fileSequence52-A.ts #EXTINF:15.0, http://media.example.com/fileSequence52-B.ts #EXTINF:13.333, http://media.example.com/fileSequence52-C.ts #EXT-X-KEY:METHOD=AES-128,URI="https://priv.example.com/key.php?r=53" #EXTINF:15.0, http://media.example.com/fileSequence53-A.ts Pantos & May Expires April 17, 2014 [Page 34] Internet-Draft HTTP Live Streaming October 2013 8.5. Master Playlist file #EXTM3U #EXT-X-STREAM-INF:BANDWIDTH=1280000 http://example.com/low.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=2560000 http://example.com/mid.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=7680000 http://example.com/hi.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=65000,CODECS="mp4a.40.5" http://example.com/audio-only.m3u8 8.6. Master Playlist with I-Frames #EXTM3U #EXT-X-STREAM-INF:BANDWIDTH=1280000 low/audio-video.m3u8 #EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=86000,URI="low/iframe.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=2560000 mid/audio-video.m3u8 #EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=150000,URI="mid/iframe.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=7680000 hi/audio-video.m3u8 #EXT-X-I-FRAME-STREAM-INF:BANDWIDTH=550000,URI="hi/iframe.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=65000,CODECS="mp4a.40.5" audio-only.m3u8 8.7. Master Playlist with Alternative audio In this example, the CODECS attributes have been condensed for space. A '\' is used to indicate that the tag continues on the following line with whitespace removed: Pantos & May Expires April 17, 2014 [Page 35] Internet-Draft HTTP Live Streaming October 2013 #EXTM3U #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac",NAME="English", \ DEFAULT=YES,AUTOSELECT=YES,LANGUAGE="en", \ URI="main/english-audio.m3u8" #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac",NAME="Deutsch", \ DEFAULT=NO,AUTOSELECT=YES,LANGUAGE="de", \ URI="main/german-audio.m3u8" #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac",NAME="Commentary", \ DEFAULT=NO,AUTOSELECT=NO,URI="commentary/audio-only.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=1280000,CODECS="...",AUDIO="aac" low/video-only.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=2560000,CODECS="...",AUDIO="aac" mid/video-only.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=7680000,CODECS="...",AUDIO="aac" hi/video-only.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=65000,CODECS="mp4a.40.5",AUDIO="aac" main/english-audio.m3u8 8.8. Master Playlist with Alternative video In this example, the CODECS attributes have been condensed for space. A '\' is used to indicate that the tag continues on the following line with whitespace removed: Pantos & May Expires April 17, 2014 [Page 36] Internet-Draft HTTP Live Streaming October 2013 #EXTM3U #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="low",NAME="Main", \ DEFAULT=YES,URI="low/main/audio-video.m3u8" #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="low",NAME="Centerfield", \ DEFAULT=NO,URI="low/centerfield/audio-video.m3u8" #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="low",NAME="Dugout", \ DEFAULT=NO,URI="low/dugout/audio-video.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=1280000,CODECS="...",VIDEO="low" low/main/audio-video.m3u8 #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="mid",NAME="Main", \ DEFAULT=YES,URI="mid/main/audio-video.m3u8" #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="mid",NAME="Centerfield", \ DEFAULT=NO,URI="mid/centerfield/audio-video.m3u8" #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="mid",NAME="Dugout", \ DEFAULT=NO,URI="mid/dugout/audio-video.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=2560000,CODECS="...",VIDEO="mid" mid/main/audio-video.m3u8 #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="hi",NAME="Main", \ DEFAULT=YES,URI="hi/main/audio-video.m3u8" #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="hi",NAME="Centerfield", \ DEFAULT=NO,URI="hi/centerfield/audio-video.m3u8" #EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="hi",NAME="Dugout", \ DEFAULT=NO,URI="hi/dugout/audio-video.m3u8" #EXT-X-STREAM-INF:BANDWIDTH=7680000,CODECS="...",VIDEO="hi" hi/main/audio-video.m3u8 #EXT-X-STREAM-INF:BANDWIDTH=65000,CODECS="mp4a.40.5" main/audio-only.m3u8 9. Contributors Significant contributions to the design of this protocol were made by Jim Batson, David Biderman, Bill May, Roger Pantos, Alan Tseng, and Eryk Vershen. 10. IANA Considerations This memo requests that the following MIME type [RFC2046] be registered with the IANA: Type name: "application" Pantos & May Expires April 17, 2014 [Page 37] Internet-Draft HTTP Live Streaming October 2013 Subtype name: "vnd.apple.mpegurl" Required parameters: (none) Optional parameters: (none) Encoding considerations: encoded as text. See Section 3 for more information. Security considerations: See Section 11. Compression: this media type does not employ compression. Interoperability considerations: There are no byte-ordering issues, since files are 7- or 8-bit text. Applications could encounter unrecognized tags, which SHOULD be ignored. Published specification: see Section 3. Applications that use this media type: Multimedia applications such as the iPhone media player in iOS 3.0 and later and QuickTime Player in Mac OS X version 10.6 and later. Additional information: files begin with the magic number #EXTM3U. Filenames normally end with .m3u8 or .m3u (see Section 3). No Macintosh file type codes have been registered. Person & email address to contact for further information: David Singer, singer AT apple.com. Intended usage: LIMITED USE Restrictions on usage: (none) Author: Roger Pantos Change Controller: David Singer 11. Security Considerations Since the protocol generally uses HTTP to transfer data, most of the same security considerations apply. See section 15 of RFC 2616 [RFC2616]. Media file parsers are typically subject to "fuzzing" attacks. Clients SHOULD take care when parsing segments received from a server that non-compliant segments are rejected. Pantos & May Expires April 17, 2014 [Page 38] Internet-Draft HTTP Live Streaming October 2013 Playlist files contain URIs, which clients will use to make network requests of arbitrary entities. Clients SHOULD range-check responses to prevent buffer overflows. See also the Security Considerations section of RFC 3986 [RFC3986]. Clients SHOULD load resources identified by URI lazily to avoid contributing to denial-of-service attacks. HTTP requests often include session state ("cookies"), which may contain private user data. Implementations MUST follow cookie restriction and expiry rules specified by RFC 6265 [RFC6265]. See also the Security Considerations section of RFC 6265, and RFC 2964 [RFC2964]. Encryption keys are specified by URI. The delivery of these keys SHOULD be secured by a mechanism such as HTTP over TLS [RFC5246] (formerly SSL) in conjunction with a secure realm or a session cookie. 12. References 12.1. Normative References [AC_3] Advanced Television Systems Committee, "ATSC Standard: A/52:2010: Digital Audio Compression (AC-3) (E-AC-3) Standard", November 2010, <http://www.atsc.org/cms/standards/a_52-2010.pdf>. [AES_128] U.S. Department of Commerce/National Institute of Standards and Technology, "Advanced Encryption Standard (AES), FIPS PUB 197", November 2001, <http:// csrc.nist.gov/publications/fips/fips197/fips-197.pdf>. [H_264] International Telecommunications Union, "Advanced video coding for generic audiovisual services", January 2012, <http://www.itu.int/rec/T-REC-H.264>. [ISO_11172] International Organization for Standardization, "ISO/IEC International Standard 11172-1; Coding of moving pictures and associated audio for digital storage media -- Part 1: Systems", 1993, <http://www.iso.org/iso/catalogue_detail?csnumber=19180>. [ISO_13818] International Organization for Standardization, "ISO/IEC International Standard 13818; Generic coding of moving Pantos & May Expires April 17, 2014 [Page 39] Internet-Draft HTTP Live Streaming October 2013 pictures and associated audio information", October 2007, <http://www.iso.org/iso/catalogue_detail?csnumber=44169>. [ISO_14496] International Organization for Standardization, "ISO/IEC 14496-3:2009 Information technology -- Coding of audio- visual objects -- Part 3: Audio", 2009, <http://www.iso.org/iso/catalogue_detail?csnumber=53943>. [ISO_8601] International Organization for Standardization, "ISO/IEC International Standard 8601:2004; Data elements and interchange formats -- Information interchange -- Representation of dates and times", December 2004, <http://www.iso.org/iso/catalogue_detail?csnumber=40874>. [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. [RFC2964] Moore, K. and N. Freed, "Use of HTTP State Management", BCP 44, RFC 2964, October 2000. [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, November 2003. [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005. [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008. [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying Languages", BCP 47, RFC 5646, September 2009. [RFC5652] Housley, R., "Cryptographic Message Syntax (CMS)", STD 70, RFC 5652, September 2009. [RFC6265] Barth, A., "HTTP State Management Mechanism", RFC 6265, April 2011. Pantos & May Expires April 17, 2014 [Page 40] Internet-Draft HTTP Live Streaming October 2013 [RFC6381] Gellens, R., Singer, D., and P. Frojdh, "The 'Codecs' and 'Profiles' Parameters for "Bucket" Media Types", RFC 6381, August 2011. [US_ASCII] American National Standards Institute, "ANSI X3.4-1986, Information Systems -- Coded Character Sets 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII)", December 1986. [WebVTT] World Wide Web Consortium (W3C), "WebVTT: The Web Video Text Tracks Format", July 2013, <http://dev.w3.org/html5/webvtt/>. 12.2. Informative References [ID3] ID3.org, "The ID3 audio file data tagging format", <http://www.id3.org/Developer_Information>. [M3U] Nullsoft, Inc., "The M3U Playlist format, originally invented for the Winamp media player", <http://wikipedia.org/wiki/M3U>. [SampleEnc] Apple Inc., "MPEG-2 Stream Encryption Format for HTTP Live Streaming", <https://developer.apple.com/library/ios/ documentation/AudioVideo/Conceptual/ HLS_Sample_Encryption/>. [UTI] Apple Inc., "Uniform Type Identifier", <http:// developer.apple.com/library/ios/#documentation/general/ conceptual/DevPedia-CocoaCore/UniformTypeIdentifier.html>. Authors' Addresses Roger Pantos (editor) Apple Inc. Cupertino, California United States Email: http-live-streaming-review@group.apple.com Pantos & May Expires April 17, 2014 [Page 41] Internet-Draft HTTP Live Streaming October 2013 William May, Jr. Apple Inc. Cupertino, California United States Email: http-live-streaming-review@group.apple.com Pantos & May Expires April 17, 2014 [Page 42]