INFORMATION TECHNOLOGY –
GENERIC CODING OF AUDIO-VISUAL OBJECTS
Part 1: Systems
ISO/IEC 14496-1
Final Committee Draft of International Standard
The Systems part of this Final Committee Draft of International Standard describes a system for communicating
interactive audiovisual scenes. Such scenes consist of:
1. the coded representation of natural or synthetic, 2D or 3D objects that can be manifested audibly and/or visually
(media objects);
2. the coded representation of the spatio-temporal positioning of media objects as well as their behavior in response
to interaction (scene description); and
3. the coded representation of information related to the management of information streams (synchronization,
identification, description and association of stream content).
The overall operation of a system communicating such audiovisual scenes is as follows. At the sending side,
audiovisual scene information is compressed, supplemented with synchronization information, and passed to a
delivery layer that multiplexes it into one or more coded binary streams that are transmitted or stored. At the receiver,
these streams are demultiplexed and decompressed. The media objects are composed according to the scene
description and synchronization information and presented to the end user. The end user may have the option to
interact with the presentation. Interaction information can be processed locally or transmitted to the sender. This
specification defines the semantic and syntactic rules of bitstreams that convey such scene information, as well as the
details of their decoding processes.
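The end-to-end flow described above can be pictured with a small, purely illustrative sketch. It is not the syntax or decoding process defined by this specification: the Packet class, its fields, the functions send, receive and compose, and the use of zlib as a stand-in for media compression are all assumptions made for the example. Sorting packets by timestamp merely stands in for the multiplexing performed by the delivery layer.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical packet layout, not the bitstream syntax of the specification."""
    stream_id: int   # identifies the stream the payload belongs to (assumed field)
    timestamp: int   # synchronization information in arbitrary ticks (assumed field)
    payload: bytes   # compressed data


def send(streams):
    """Sending side: compress each unit, attach timing, interleave into one sequence."""
    packets = []
    for stream_id, units in streams.items():
        for timestamp, data in units:
            packets.append(Packet(stream_id, timestamp, zlib.compress(data)))
    # Interleaving by timestamp is a stand-in for the multiplexing step.
    packets.sort(key=lambda p: (p.timestamp, p.stream_id))
    return packets


def receive(packets):
    """Receiving side: demultiplex by stream id and decompress each payload."""
    streams = {}
    for p in packets:
        unit = (p.timestamp, zlib.decompress(p.payload))
        streams.setdefault(p.stream_id, []).append(unit)
    return streams


def compose(streams, scene):
    """Group decoded units by timestamp, labelled by a toy scene description."""
    timeline = {}
    for stream_id, units in streams.items():
        for timestamp, data in units:
            timeline.setdefault(timestamp, []).append((scene[stream_id], data))
    return dict(sorted(timeline.items()))


# Example data: two streams and a toy scene description mapping stream id to role.
source = {
    1: [(0, b"video frame 0"), (40, b"video frame 1")],
    2: [(0, b"audio block 0")],
}
scene = {1: "background", 2: "soundtrack"}

muxed = send(source)
presentation = compose(receive(muxed), scene)
```

In this sketch, presentation maps each timestamp to the objects due at that instant, which mirrors the idea that composition is driven by both the scene description and the synchronization information. User interaction and the return path to the sender are omitted.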
In particular, the Systems part of this Final Committee Draft of International Standard specifies the following tools:
· a terminal model for time and buffer management;
· a coded representation of interactive audiovisual scene description information (Binary Format for Scenes –
BIFS);
· a coded representation of identification and description of audiovisual streams as well as the logical dependencies
between stream information (Object and other Descriptors);
· a coded representation of synchronization information (Sync Layer – SL);
· a multiplexed representation of individual streams in a single stream (FlexMux); and
· a coded representation of descriptive audiovisual content information (Object Content Information – OCI).
These various elements are described functionally in this clause and specified in the normative clauses that follow.