Mpeg Audio Datafile Format Specification

  • 7/28/2019 Mpeg Audio Datafile Format Specification

    Mpeg Audio Datafile Format Specification

    An MPEG audio datafile (*.m3d) is a database of information gathered from MPEG audio files. You do not need the MPEG audio file itself to get its info if you previously stored that info in the database. This is very useful for cataloguing data when the MPEG audio files are on removable media.

    The main purpose of this file format is to distribute MPEG audio info among applications. It may be used as a personal database or as a catalogue.

    It is currently supported by MPGScript (http://www.datavoyage.com/mpgscript/), an MPEG audio cataloguing application, and MPGTools (http://www.datavoyage.com/mpgscript/mpgtools.htm), a Delphi unit for accessing information from MPEG files.

    The latest version of this document may be found at http://www.datavoyage.com/mpgscript/m3dspecs.htm

    File format specification

    An M3D file is divided into four sections, stored in this order: file signature, application identification, header info, and MPEG data.

    The file signature section marks the file as an MPEG Audio datafile and determines the file version. Version numbers are defined by the author and may not be changed by third parties. You may create files only according to the current version definition. The current version is 1.2, and this document describes it. To read older versions, see the support site of the MPGTools Delphi unit for details. Check that site for future structure updates.

    8 characters  Always contains the characters #9'MP3DATA'. Third-party applications may use this part to recognize the file as an MPEG Audio Datafile.

    1 byte        Contains the file version number (currently 1).

    1 byte        Contains the file subversion number (currently 2).
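    The signature check can be sketched in Python as follows. This is a minimal illustration, not part of the specification: it assumes that #9 is Delphi notation for the character with code 9 (a tab), so that the 8 signature bytes are 0x09 followed by 'MP3DATA'. The name read_signature is hypothetical.

```python
import io

# Assumption: #9'MP3DATA' means the byte 0x09 followed by the ASCII text MP3DATA.
SIGNATURE = b"\tMP3DATA"


def read_signature(stream):
    """Read the 10-byte signature section and return (version, subversion)."""
    raw = stream.read(10)
    if len(raw) != 10 or raw[:8] != SIGNATURE:
        raise ValueError("not an MPEG Audio Datafile")
    return raw[8], raw[9]


# Usage with an in-memory file that claims to be version 1.2.
version, subversion = read_signature(io.BytesIO(SIGNATURE + bytes([1, 2])))
print(f"m3d version {version}.{subversion}")
```

    A reader would normally compare the returned pair against the versions it understands before reading any further sections.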

    The application identification section describes the application used to create the file. It may also contain other, application-specific data. Third-party applications may read data from this section or just skip it.

    1 byte               Contains the length of this section, excluding this length byte.

    1 byte               Contains the number of bytes used for the application name (maximum value is 15).

    up to 15 characters  Contains the application name. It should be used to determine which application created the m3d file. The application name must be registered with the author to avoid the same ID being used by different applications.

    rest of section      Space used by the application. It may contain any additional data the application needs. The author of that application is responsible for publishing the data structure of this block if third parties are meant to use the data; publishing it is optional. The length of this block may be calculated as (section length) - (application name length) - 1.
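    The length arithmetic above can be sketched in Python like this. It is only an illustration under stated assumptions: the name is decoded as Latin-1 (the specification does not name a character set), and read_app_section is a hypothetical helper.

```python
import io


def read_app_section(stream):
    """Return (application_name, application_data) from the section."""
    head = stream.read(2)
    if len(head) != 2:
        raise ValueError("truncated application section")
    section_len, name_len = head  # section_len excludes its own byte
    if name_len > 15 or name_len + 1 > section_len:
        raise ValueError("inconsistent application section lengths")
    # One byte of the section (the name length) has already been consumed.
    body = stream.read(section_len - 1)
    if len(body) != section_len - 1:
        raise ValueError("truncated application section")
    name = body[:name_len].decode("latin-1")  # assumption: Latin-1 text
    app_data = body[name_len:]  # section_len - name_len - 1 bytes
    return name, app_data


# Usage: a 9-character name followed by 3 bytes of application data.
section = bytes([1 + 9 + 3, 9]) + b"MPGScript" + b"\x01\x02\x03"
print(read_app_section(io.BytesIO(section)))
```

    An application that does not recognise the name can ignore the returned data; the stream is then positioned at the start of the header info section.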

    The header info section contains info about the owner of the file, catalogue info, order info, or other data. It may use different structures, so it is divided into three blocks:

    1 byte  Describes the type of header:

        0 - custom header used by the third-party application that created this header (use the application name to see which one). Other applications should just skip the header information. Third-party applications may publish their custom format. In that case, it is advisable to contact the author of m3d to register a dedicated header type.

        1 - catalogue header. Should be used for distributed MPEG catalogues.

            Pascal (Delphi) structure definition:

            TMPEGDataCatalogue = packed record
              Title     : string[30];  { Catalogue title }
              Publisher : string[30];  { Catalogue publisher name }
              City      : string[30];  { Publisher's contact info }
              ZIP       : string[10];
              Country   : string[20];
              Address   : string[30];
              Phone     : string[15];
              Fax       : string[15];
              Email     : string[30];
              WWWURL    : string[30];
            end;

        2 - order header. Should be used for MPEG orders generated from catalogues.

            Pascal (Delphi) structure definition:

            TMPEGDataOrder1v1 = packed record
              CustomerID : string[15];  { Customer's unique ID used by the catalogue publisher }
              Name       : string[30];  { Customer name and }
              City       : string[30];  { other contact data }
              ZIP        : string[10];
              Country    : string[20];
              Address    : string[30];
              Phone      : string[15];
              Fax        : string[15];
              Email      : string[30];
            end;

        Other - we may define other publicly available header types that may be used by all applications. We recommend letting us know about the specific headers you use in your application, so that we may add them to the public header types if they are of general interest.

    2 bytes  Contains the length of the header data. The maximum value is 65535. It may be used to simplify reading the header info and to skip unsupported header types.

    up to 65535 bytes  Contains additional info about the owner and the data in the file. It may have a predefined or custom structure, as described by the header type byte.
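    As a rough Python illustration of the three blocks above, the sketch below reads the type byte, the 2-byte length, and the header data, and decodes a catalogue header. It rests on assumptions the specification does not state: the length is taken to be little-endian (as Delphi would write it on x86), the data of a type 1 header is taken to be exactly one packed TMPEGDataCatalogue record, and text is decoded as Latin-1. A Pascal string[N] occupies N + 1 bytes with the length in the first byte, which is what the "p" format code of Python's struct module decodes. All function names are hypothetical.

```python
import io
import struct

# Assumption: little-endian 16-bit length after the 1-byte header type.
HEADER_PREFIX = struct.Struct("<BH")

# TMPEGDataCatalogue: each string[N] field takes N + 1 bytes ("p" format).
CATALOGUE = struct.Struct("<31p31p31p11p21p31p16p16p31p31p")
CATALOGUE_FIELDS = ("Title", "Publisher", "City", "ZIP", "Country",
                    "Address", "Phone", "Fax", "Email", "WWWURL")


def read_header_section(stream):
    """Return (header_type, header_data); callers may skip unknown types."""
    prefix = stream.read(HEADER_PREFIX.size)
    if len(prefix) != HEADER_PREFIX.size:
        raise ValueError("truncated header section")
    header_type, length = HEADER_PREFIX.unpack(prefix)
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("truncated header data")
    return header_type, data


def decode_catalogue(data):
    """Decode the data of a type 1 (catalogue) header into a dict."""
    if len(data) < CATALOGUE.size:
        raise ValueError("header data too short for a catalogue record")
    values = CATALOGUE.unpack_from(data)
    return {name: value.decode("latin-1")
            for name, value in zip(CATALOGUE_FIELDS, values)}


# Usage: build a catalogue header in memory and read it back.
record = CATALOGUE.pack(b"My catalogue", b"Publisher", b"City", b"11000",
                        b"Country", b"Street 1", b"", b"", b"a@b.c", b"")
blob = HEADER_PREFIX.pack(1, len(record)) + record
header_type, data = read_header_section(io.BytesIO(blob))
if header_type == 1:
    print(decode_catalogue(data)["Title"])
```

    Because the length is stored explicitly, a reader that does not understand a header type can call read_header_section and simply discard the returned data.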

    The MPEG data section contains an unlimited number of MPEG data records.

    This is the Pascal (Delphi) record structure (note that String3, String30, String4, String255 and String20 are actually string[3], string[30], string[4], string[255] and string[20]):

    TMPEGData1v2 = packed record
      Header          : String3;    { Should contain "TAG" if header is correct }
      Title           : String30;   { Song title }
      Artist          : String30;   { Artist name }
      Album           : String30;   { Album }
      Year            : String4;    { Year }
      Comment         : String30;   { Comment }
      Genre           : Byte;       { Genre code }
      Track           : Byte;       { Track number on Album }
      Duration        : Word;       { Song duration }
      FileLength      : LongInt;    { File length }
      Version         : Byte;       { MPEG audio version index (1 - Version 1,
                                      2 - Version 2, 3 - Version 2.5,
                                      0 - unknown) }
      Layer           : Byte;       { Layer (1, 2, 3, 0 - unknown) }
      SampleRate      : LongInt;    { Sampling rate in Hz }
      BitRate         : LongInt;    { Bit rate }
      BPM             : Word;       { Beats per minute - for future use }
      Mode            : Byte;       { Number of channels (0 - Stereo,
                                      1 - Joint-Stereo, 2 - Dual-channel,
                                      3 - Single-Channel) }
      Copyright       : Boolean;    { Copyrighted? }
      Original        : Boolean;    { Original? }
      ErrorProtection : Boolean;    { Error protected? }
      Padding         : Boolean;    { If frame is padded }
      FrameLength     : Word;       { Total frame size including CRC }
      CRC             : Word;       { 16 bit File CRC (without TAG).
                                      Not implemented yet. }
      FileName        : String255;  { MPEG audio file name }
      FileDateTime    : LongInt;    { File last modification date and time
                                      in DOS internal format }
      FileAttr        : Word;       { File attributes }
      VolumeLabel     : String20;   { Disk label }
      Selected        : Word;       { If this field's value is greater than
                                      zero then file is selected. Value
                                      determines order of selection. }
      Reserved        : array[1..45] of Byte;  { For future use }
    end;
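    For readers working outside Delphi, the record above can be mirrored with Python's struct module. This is a sketch under assumptions, not a reference reader: integers are taken to be little-endian (Delphi on x86), text is decoded as Latin-1, and the DOS date and time are taken to be packed with the date in the high 16 bits and the time in the low 16 bits. Under these assumptions one packed record is 492 bytes. The names RECORD, decode_record, iter_records and dos_datetime are hypothetical.

```python
import io
import struct
from datetime import datetime

# Packed layout of TMPEGData1v2; "Np" is a Pascal string[N - 1].
RECORD = struct.Struct(
    "<4p31p31p31p5p31p"  # Header, Title, Artist, Album, Year, Comment
    "BBHi"               # Genre, Track, Duration, FileLength
    "BBiiHB"             # Version, Layer, SampleRate, BitRate, BPM, Mode
    "????HH"             # Copyright..Padding, FrameLength, CRC
    "256piH21pH45s"      # FileName, FileDateTime, FileAttr, VolumeLabel,
)                        # Selected, Reserved
FIELDS = ("Header", "Title", "Artist", "Album", "Year", "Comment",
          "Genre", "Track", "Duration", "FileLength", "Version", "Layer",
          "SampleRate", "BitRate", "BPM", "Mode", "Copyright", "Original",
          "ErrorProtection", "Padding", "FrameLength", "CRC", "FileName",
          "FileDateTime", "FileAttr", "VolumeLabel", "Selected", "Reserved")


def dos_datetime(value):
    """Convert a packed DOS date/time (date in the high word) to datetime."""
    date, time = (value >> 16) & 0xFFFF, value & 0xFFFF
    return datetime(1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
                    time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2)


def decode_record(raw):
    """Decode one packed record into a dict keyed by field name."""
    record = dict(zip(FIELDS, RECORD.unpack(raw)))
    for name, value in record.items():
        if isinstance(value, bytes) and name != "Reserved":
            record[name] = value.decode("latin-1")  # assumption: Latin-1
    return record


def iter_records(stream):
    """Yield records until the stream ends; a partial tail is ignored."""
    while True:
        raw = stream.read(RECORD.size)
        if len(raw) < RECORD.size:
            return
        yield decode_record(raw)


# Usage: write one record to memory and read it back.
stamp = (((1998 - 1980) << 9 | 6 << 5 | 15) << 16) | (12 << 11 | 30 << 5 | 5)
raw = RECORD.pack(b"TAG", b"Song", b"Artist", b"Album", b"1998", b"", 12, 1,
                  240, 4000000, 1, 3, 44100, 128000, 0, 1, True, False,
                  False, False, 417, 0, b"song.mp3", stamp, 0x20, b"DISK1",
                  0, bytes(45))
for rec in iter_records(io.BytesIO(raw)):
    print(rec["Title"], rec["Layer"], dos_datetime(rec["FileDateTime"]))
```

    The units of Duration and BitRate are not stated in the record comments, so the sample values above are placeholders only.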

    This document and the file specification are copyrighted by Predrag Supurovic (http://pedja.supurovic.net/), (c) 1998. You may use them freely.

    Package Name

    DvmMpeg

    Description

    Dali supports reading and processing of MPEG-1 Video, Audio and System streams.

    Encoding is currently supported only for Video.

    MPEG Video

    The format of MPEG-1 Video is as follows. An MPEG-1 Video stream starts with an arbitrary number of bytes, followed by a sequence header, followed by zero or more alternating GOP (Group of Pictures) headers and GOPs, followed by a sequence end marker. A GOP is a series of pictures (frames), each of which consists of a picture header and the actual picture data. A picture can be of type I (intracoded), P (predicted) or B (bidirectionally predicted). Each GOP must have at least one I frame.
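    The layout described above can be explored with a small Python scanner. This sketch does not use Dali; it only relies on the MPEG-1 Video start codes, which are the byte prefix 00 00 01 followed by a code byte (B3 for a sequence header, B8 for a GOP header, 00 for a picture header, B7 for the sequence end marker). The function name find_start_codes is hypothetical.

```python
# Code bytes that follow the 00 00 01 prefix in MPEG-1 Video.
START_CODES = {
    0xB3: "sequence header",
    0xB8: "GOP header",
    0x00: "picture header",
    0xB7: "sequence end",
}


def find_start_codes(data):
    """Yield (offset, name) for each sequence, GOP, picture or end code."""
    pos = data.find(b"\x00\x00\x01")
    while pos != -1 and pos + 3 < len(data):
        name = START_CODES.get(data[pos + 3])
        if name is not None:  # slice and other start codes are ignored
            yield pos, name
        pos = data.find(b"\x00\x00\x01", pos + 3)


# Usage on a tiny synthetic stream with leading junk bytes.
stream = (b"junk"
          + b"\x00\x00\x01\xb3" + b"\x55" * 8
          + b"\x00\x00\x01\xb8" + b"\x55" * 4
          + b"\x00\x00\x01\x00" + b"\x00\x0f\xff\xf8"
          + b"\x00\x00\x01\xb7")
for offset, name in find_start_codes(stream):
    print(offset, name)
```

    Skipping ahead to the first sequence header is how the "arbitrary number of bytes" at the start of the stream is tolerated.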

    Dali provides an abstraction for each of these headers: MpegSeqHdr, MpegGopHdr and MpegPicHdr. Each header type supports the following basic primitives: find, dump, skip, parse, encode. A picture is decoded into three ScImages and zero (I), one (P) or two (B) VectorImages.

    To encode an MPEG-1 Video sequence, convert each RGB image frame to YUV ByteImages; perform a motion vector search on the Y ByteImage (for P and B frames); compress the ByteImages to ScImages; and finally encode the ScImages into a Bitstream.

    Dali supports frame extraction through an MpegVideoIndex abstraction. An MpegVideoIndex stores information about each frame: the frame type, the length, the number of reference frames (0 for I, 1 for P, or 2 for B), and the offsets needed to get to these frames.
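    The kind of information such an index holds can be sketched in Python. This is not Dali's MpegVideoIndex and does not reproduce its layout: FrameInfo and build_index are hypothetical names, and the frame length is assumed here to run from the picture start code to the next picture, GOP, sequence header or sequence end code. The picture type is read from the 3-bit picture_coding_type field that follows the 10-bit temporal reference in the picture header (1 = I, 2 = P, 3 = B).

```python
from dataclasses import dataclass

# picture_coding_type -> (frame type, number of reference frames)
PICTURE_TYPES = {1: ("I", 0), 2: ("P", 1), 3: ("B", 2)}
# Code bytes that end a picture: picture, sequence header, sequence end, GOP.
BOUNDARY_CODES = {0x00, 0xB3, 0xB7, 0xB8}


@dataclass
class FrameInfo:
    offset: int      # byte offset of the picture start code
    frame_type: str  # "I", "P" or "B"
    length: int      # bytes up to the next boundary start code
    num_refs: int    # 0 for I, 1 for P, 2 for B


def build_index(data):
    """Return a list of FrameInfo, one per picture found in data."""
    boundaries = []
    pos = data.find(b"\x00\x00\x01")
    while pos != -1 and pos + 3 < len(data):
        if data[pos + 3] in BOUNDARY_CODES:
            boundaries.append(pos)
        pos = data.find(b"\x00\x00\x01", pos + 3)
    boundaries.append(len(data))

    index = []
    for start, end in zip(boundaries, boundaries[1:]):
        if data[start + 3] != 0x00 or start + 6 > len(data):
            continue  # not a picture, or header cut short
        coding_type = (data[start + 5] >> 3) & 0x07
        if coding_type in PICTURE_TYPES:
            frame_type, refs = PICTURE_TYPES[coding_type]
            index.append(FrameInfo(start, frame_type, end - start, refs))
    return index


def picture(coding_type, payload):
    """Build a synthetic picture: start code, 2 header bytes, payload."""
    return b"\x00\x00\x01\x00" + bytes([0x00, coding_type << 3]) + payload


# Usage on a synthetic stream: sequence header, GOP header, I, P, B, end.
stream = (b"\x00\x00\x01\xb3" + b"\x55" * 8 + b"\x00\x00\x01\xb8" + b"\x55" * 4
          + picture(1, b"\xff" * 10) + picture(2, b"\xff" * 5)
          + picture(3, b"\xff" * 3) + b"\x00\x00\x01\xb7")
for frame in build_index(stream):
    print(frame)
```

    With offsets and reference counts at hand, a reader can seek to a frame and work out how many earlier frames it must decode first.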

    MPEG Audio

    An MPEG-1 Audio stream is a sequence of frames. Each frame has a header and a body. The body encodes some number of samples for the current frame. Samples from the next few frames can also be encoded in the current frame if space permits.
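    To make the frame header concrete, here is a minimal Python sketch that decodes the 32-bit header of an MPEG-1 audio frame. It does not use Dali's MpegAudioHdr; the function name parse_audio_header and the dictionary keys are hypothetical. It handles MPEG-1 only, and rejects free-format and reserved values rather than interpreting them.

```python
# Bit rates in kbit/s for MPEG-1, indexed by the 4-bit bitrate index (1..14).
BITRATES_KBPS = {
    1: [0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448],
    2: [0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384],
    3: [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320],
}
SAMPLE_RATES = [44100, 48000, 32000]  # MPEG-1 sampling rates in Hz
MODES = ["stereo", "joint stereo", "dual channel", "single channel"]


def parse_audio_header(raw):
    """Decode the first 4 bytes of an MPEG-1 audio frame into a dict."""
    if len(raw) < 4:
        raise ValueError("need 4 header bytes")
    word = int.from_bytes(raw[:4], "big")
    if word >> 19 != 0x1FFF:  # 11 sync bits plus version bits 11 (MPEG-1)
        raise ValueError("no MPEG-1 audio sync word")
    layer_bits = (word >> 17) & 0x3
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 0x3
    if layer_bits == 0 or bitrate_index in (0, 15) or rate_index == 3:
        raise ValueError("reserved or free-format header value")
    layer = 4 - layer_bits  # 11 = Layer 1, 10 = Layer 2, 01 = Layer 3
    bitrate = BITRATES_KBPS[layer][bitrate_index] * 1000
    sample_rate = SAMPLE_RATES[rate_index]
    padding = (word >> 9) & 0x1
    if layer == 1:  # Layer 1 frames are counted in 4-byte slots
        frame_length = (12 * bitrate // sample_rate + padding) * 4
    else:
        frame_length = 144 * bitrate // sample_rate + padding
    return {
        "layer": layer,
        "error_protection": (word >> 16) & 0x1 == 0,  # bit clear means CRC
        "bitrate": bitrate,
        "sample_rate": sample_rate,
        "padding": bool(padding),
        "mode": MODES[(word >> 6) & 0x3],
        "copyright": bool((word >> 3) & 0x1),
        "original": bool((word >> 2) & 0x1),
        "frame_length": frame_length,
    }


# Usage: a common header for 128 kbit/s, 44.1 kHz, Layer 3 stereo.
print(parse_audio_header(bytes.fromhex("fffb9000")))
```

    Most of the decoded values correspond to fields of the same meaning in the m3d data record (Layer, SampleRate, BitRate, Mode, Copyright, Original, ErrorProtection, Padding, FrameLength).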

    We have four main abstractions for an MPEG-1 Audio file: one for the MPEG audio header (MpegAudioHdr), and three for the encoded audio frames (one for each layer of encoding: MpegAudioL1, MpegAudioL2, MpegAudioL3). The audio frame abstractions for layer 1 and layer 2 contain compressed data for one channel of audio. For layer 3 stereo audio, the encoding of the left and right channels depends on each other, so the audio data contains compressed data for both channels.

    The encoding of each audio frame depends on the previous frames. To store these dependencies, we have two auxiliary structures: MpegSynData and MpegGraData. Each corresponds to the dependencies of one channel of an audio stream and is updated during each decoding. This makes direct access into the middle of a stream impossible at the moment. Note: direct access would still be possible if these dependencies were precomputed and stored in files, but this feature is not available right now.

    MPEG System

    An MPEG-1 System stream is a multiplex of video and audio streams. It consists of a sequence of packets, each of which contains raw compressed data from one of the streams. Dali provides abstractions for all three header structures that exist in MPEG-1 System streams: system header, pack header, and packet header. Dali also allows creation of an MpegSysToc, a table-of-contents structure that stores the offset and length of each packet, and the timestamp from each packet.
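    A table of contents of this kind can be approximated in a few lines of Python. The sketch below is not Dali's MpegSysToc: build_toc is a hypothetical name, timestamps are left out for brevity, and the stored length is the value of the 16-bit packet length field (the number of bytes that follow it). It relies on the MPEG-1 system layer start codes: 00 00 01 BA for a pack header (12 bytes in MPEG-1), 00 00 01 BB for a system header, 00 00 01 B9 for the end code, and code bytes from BC upwards for packets, where the code byte is the stream ID.

```python
PACK_START = 0xBA     # pack header, fixed 12 bytes in MPEG-1
SYSTEM_HEADER = 0xBB  # system header, followed by a 16-bit length
END_CODE = 0xB9       # end of the system stream


def build_toc(data):
    """Return [(offset, stream_id, packet_length), ...] for each packet."""
    toc = []
    pos = 0
    while pos + 4 <= len(data):
        if data[pos:pos + 3] != b"\x00\x00\x01":
            pos += 1  # resynchronise byte by byte
            continue
        code = data[pos + 3]
        if code == END_CODE:
            break
        if code == PACK_START:
            pos += 12
        elif code >= SYSTEM_HEADER:
            # System headers and packets both carry a big-endian length.
            if pos + 6 > len(data):
                break
            length = int.from_bytes(data[pos + 4:pos + 6], "big")
            if code != SYSTEM_HEADER:
                toc.append((pos, code, length))
            pos += 6 + length
        else:
            pos += 4  # a start code that is not part of the system layer
    return toc


# Usage: pack header, system header, one video and one audio packet.
stream = (b"\x00\x00\x01\xba" + b"\x21" * 8
          + b"\x00\x00\x01\xbb\x00\x06" + b"\x80" * 6
          + b"\x00\x00\x01\xe0\x00\x05" + b"\x11" * 5
          + b"\x00\x00\x01\xc0\x00\x03" + b"\x22" * 3
          + b"\x00\x00\x01\xb9")
for offset, stream_id, length in build_toc(stream):
    print(offset, hex(stream_id), length)
```

    Stream IDs C0 to DF denote audio streams and E0 to EF video streams, so the table can be filtered to follow a single elementary stream.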

    MPEG-1 Data Structures

    The ISO/IEC 11172 specification defines the audio, video and multiplexing standards collectively and colloquially referred to as the MPEG-1 (Moving Picture Experts Group) compression standard. The data structures for the various components in an encoded bitstream are given in a pseudo-C syntax and are extensively discussed. However, it is difficult to get the big picture from reading the spec. More practically, in order to parse an MPEG-1 bitstream, it is necessary to know the byte offsets within each structure. To make this information more readily accessible, we have condensed it into graphic form. Of course, this is no substitute for the original spec. Where more information is required than can be squeezed into the diagram, references to the spec are provided.

    The Big Picture

    A multiplexed MPEG-1 stream is composed of distinct Packs. Each Pack consists of a Pack header and any number of Packets. Within those Packets is either video or audio data. These structures above the video or audio level are called the system layer. Video or audio data is divided into Packets without regard to lower-level structures: Groups, Pictures, etc. may break across Packet boundaries.

    Video information is composed of individual Pictures. We will not discuss the substructures of Pictures. Pictures themselves are of three types: I (intra), P (predictive), and B (bidirectional). I Pictures are self-contained, compressing the image using Discrete Cosine Transform (DCT) processing. P Pictures use less data and are predicted from the preceding I or P Picture. B Pictures use the least data and are interpolated using information from the surrounding P and I Pictures. Pictures are organized into Groups of (typically) 15 or so Pictures. If a Group is preceded by a Sequence header, its first Picture is called an entry point.

    Audio information is composed of Frames. We will not discuss the substructure of Frames. There are no higher-level audio structures.
