Mpeg Audio Datafile Format Specification

7/28/2019 Mpeg Audio Datafile Format Specification


An MPEG audio datafile (*.m3d) is a database of information gathered from MPEG audio files. You do not need the MPEG audio file itself to get its info if you previously stored that info in the database. This is very useful for cataloguing data when the MPEG audio files are on removable media.

The main purpose of this file format is the distribution of MPEG audio info among applications. It may be used as a personal database or as a catalogue.

It is currently supported by MPGScript, an MPEG audio cataloguing application, and MPGTools, a Delphi unit for accessing information from MPEG files.

The latest version of this document may be found at http://www.datavoyage.com/mpgscript/m3dspecs.htm

File format specification

Generally, an M3D file is divided into four sections: file signature, application identification, header info, and MPEG data. They appear in the file in that order.

The file signature section marks the file as an MPEG audio datafile and determines the file version. Version numbers are defined by the author and may not be changed by third parties. You are entitled to create files only according to the current version definition. The current version is 1.2, and this document describes it. If you want to read older versions, see the support site of the MPGTools Delphi unit for details. Check that site for future structure updates.

8 characters   Always contains the characters #9'MP3DATA'. This part may be used by third-party applications to recognize the file as an MPEG audio datafile.
1 byte         Contains the file version number (currently 1).
1 byte         Contains the file subversion number (currently 2).
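The signature check described above can be sketched in a few lines of Python. This is only an illustration, not code from MPGTools: the function name `read_signature` is hypothetical, and it assumes that #9 denotes the single byte with value 9 (as in Pascal character literals), so the signature is 8 bytes long.

```python
import struct

# Assumption: #9 is the byte with value 9 (tab), followed by the 7 letters.
M3D_SIGNATURE = b"\tMP3DATA"


def read_signature(data):
    """Return (version, subversion), or raise ValueError if this is not an M3D file."""
    if len(data) < 10:
        raise ValueError("too short to hold the file signature section")
    if data[:8] != M3D_SIGNATURE:
        raise ValueError("M3D signature not found")
    # Two single bytes follow the signature: version, then subversion.
    version, subversion = struct.unpack_from("BB", data, 8)
    return version, subversion


if __name__ == "__main__":
    print(read_signature(M3D_SIGNATURE + bytes([1, 2])))
```

A reader would typically refuse to continue when the returned version is newer than the one it understands.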

The application identification section describes the application used to create the file. It may also contain other, application-specific data. Third-party applications may read data from this section or just skip it.

1 byte                Contains the length of this section, excluding this byte.
1 byte                Contains the number of bytes used for the application name (maximum value is 15).
up to 15 characters   Contains the application name. It should be used to determine which application created the m3d file. The application name must be registered with the author to avoid the same ID being used by different applications.
variable              Space used by the application. It may contain any additional data the application needs. The author of that application is responsible for publishing the data structure of this block if he wants third parties to use his data; he does not have to do that. The length of this block may be calculated as the section length minus the application name length minus 1.
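As a rough sketch of how a reader might walk this section, the Python function below reads the two length bytes, the name, and the application data block. The function name `read_app_id` is hypothetical, the caller supplies the section offset (10 if the section directly follows the 10-byte signature section, which is an assumption), and decoding the name as Latin-1 is also an assumption, since the specification does not name a character set.

```python
def read_app_id(data, offset):
    """Return (app_name, app_data, next_offset) for the section starting at offset."""
    if offset + 2 > len(data):
        raise ValueError("truncated application identification section")
    section_len = data[offset]    # length of the section, excluding this byte
    name_len = data[offset + 1]   # number of bytes used by the application name
    if name_len > 15:
        raise ValueError("application name longer than 15 bytes")
    if name_len + 1 > section_len:
        raise ValueError("name does not fit inside the section")
    if offset + 1 + section_len > len(data):
        raise ValueError("truncated application identification section")
    name_start = offset + 2
    # Assumption: names are single-byte text; Latin-1 never fails to decode.
    app_name = data[name_start:name_start + name_len].decode("latin-1")
    # Application data length = section length - name length - 1.
    app_data_len = section_len - name_len - 1
    data_start = name_start + name_len
    app_data = data[data_start:data_start + app_data_len]
    return app_name, app_data, offset + 1 + section_len
```

An application that does not understand the application data can ignore the second return value and continue reading at `next_offset`.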

The header info section contains information about the owner of the file, catalogue info, order info, or something else. It may use different structures, therefore it is divided into three blocks:

1 byte   Describes the type of header:

0 - custom header used by the third-party application that created this header (use the application name to see which one). Other applications should just skip the header information. Third-party applications may publish their custom format. In that case, it is advisable to contact the author of m3d to register that type.

1 - catalogue header. Should be used for distributed MPEG catalogues.

Pascal (Delphi) structure definition:

TMPEGDataCatalogue = packed record
  Title     : string[30];   { Catalogue title }
  Publisher : string[30];   { Catalogue publisher name }
  City      : string[30];   { Publisher's contact info }
  ZIP       : string[10];
  Country   : string[20];
  Address   : string[30];
  Phone     : string[15];
  Fax       : string[15];
  Email     : string[30];
  WWWURL    : string[30];
end;



2 - order header. Should be used for MPEG orders generated from catalogues.

Pascal (Delphi) structure definition:

TMPEGDataOrder1v1 = packed record
  CustomerID : string[15];   { Customer unique ID used by catalogue publisher }
  Name       : string[30];   { Customer name and }
  City       : string[30];   { other contact data }
  ZIP        : string[10];
  Country    : string[20];
  Address    : string[30];
  Phone      : string[15];
  Fax        : string[15];
  Email      : string[30];
end;

Other - we may define other publicly available header types that may be used by all applications. It is recommended to let us know about specific headers you use for your application, so we may add them to the public header types if they are of general interest.

2 bytes             Contains the length of the header data. Maximum value is 65535. It may be used to simplify reading header info and to skip unsupported header types.
up to 65535 bytes   Contains additional info about the owner and the data in the file. It may have a predefined or custom structure, which is described by the header type byte.
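The three blocks above (type byte, 2-byte length, data) lend themselves to a "read or skip" loop. The Python sketch below is one possible reading, under these assumptions: the 2-byte length is stored little-endian (as Delphi on x86 would write a Word), a type 1 header body is exactly the packed TMPEGDataCatalogue record, and text is Latin-1. The function names are hypothetical. A Pascal string[N] occupies N + 1 bytes: one length byte followed by N bytes of storage.

```python
import struct

# Field names and maximum lengths taken from TMPEGDataCatalogue.
CATALOGUE_FIELDS = [
    ("Title", 30), ("Publisher", 30), ("City", 30), ("ZIP", 10),
    ("Country", 20), ("Address", 30), ("Phone", 15), ("Fax", 15),
    ("Email", 30), ("WWWURL", 30),
]


def read_short_string(buf, pos, max_len):
    """Decode a Pascal string[max_len] at pos; return (text, position after it)."""
    length = min(buf[pos], max_len)
    text = buf[pos + 1:pos + 1 + length].decode("latin-1")
    return text, pos + 1 + max_len


def read_header(data, offset):
    """Return (header_type, header_body, next_offset)."""
    # Assumption: 1-byte type followed by a little-endian 2-byte length.
    header_type, header_len = struct.unpack_from("<BH", data, offset)
    body = data[offset + 3:offset + 3 + header_len]
    if len(body) != header_len:
        raise ValueError("truncated header data")
    return header_type, body, offset + 3 + header_len


def parse_catalogue(body):
    """Decode a type 1 (catalogue) header body into a dict of strings."""
    needed = sum(max_len + 1 for _, max_len in CATALOGUE_FIELDS)
    if len(body) < needed:
        raise ValueError("catalogue header too short")
    fields, pos = {}, 0
    for name, max_len in CATALOGUE_FIELDS:
        fields[name], pos = read_short_string(body, pos, max_len)
    return fields
```

Because the length is stored separately from the type, a reader that meets an unknown header type can still jump straight to `next_offset` and continue with the MPEG data section.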

The MPEG data section contains an unlimited number of MPEG data records.

This is the Pascal (Delphi) record structure (note that String3, String30, String4, String255 and String20 are actually string[3], string[30], string[4], string[255] and string[20]):

TMPEGData1v2 = packed record
  Header          : String3;     { Should contain "TAG" if header is correct }
  Title           : String30;    { Song title }
  Artist          : String30;    { Artist name }
  Album           : String30;    { Album }
  Year            : String4;     { Year }
  Comment         : String30;    { Comment }
  Genre           : Byte;        { Genre code }
  Track           : Byte;        { Track number on album }
  Duration        : Word;        { Song duration }
  FileLength      : LongInt;     { File length }
  Version         : Byte;        { MPEG audio version index (1 - Version 1, 2 - Version 2, 3 - Version 2.5, 0 - unknown) }
  Layer           : Byte;        { Layer (1, 2, 3, 0 - unknown) }
  SampleRate      : LongInt;     { Sampling rate in Hz }
  BitRate         : LongInt;     { Bit rate }
  BPM             : Word;        { Beats per minute - for future use }
  Mode            : Byte;        { Channel mode (0 - Stereo, 1 - Joint-Stereo, 2 - Dual-channel, 3 - Single-Channel) }
  Copyright       : Boolean;     { Copyrighted? }
  Original        : Boolean;     { Original? }
  ErrorProtection : Boolean;     { Error protected? }
  Padding         : Boolean;     { If frame is padded }
  FrameLength     : Word;        { Total frame size including CRC }
  CRC             : Word;        { 16-bit file CRC (without TAG). Not implemented yet. }
  FileName        : String255;   { MPEG audio file name }
  FileDateTime    : LongInt;     { File last modification date and time in DOS internal format }
  FileAttr        : Word;        { File attributes }
  VolumeLabel     : String20;    { Disk label }
  Selected        : Word;        { If this field's value is greater than zero then the file is selected. The value determines the order of selection. }
  Reserved        : array[1..45] of Byte;   { For future use }
end;
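Since the record is packed, its fields can be read with a single fixed layout. The Python sketch below mirrors TMPEGData1v2 field by field using the standard `struct` module. It assumes little-endian integers (Delphi on x86), Latin-1 text, and the usual DOS packed date/time layout with the date in the high 16 bits; the helper names are hypothetical and are not part of MPGTools.

```python
import struct
from datetime import datetime

# One struct code per record field; "<" means little-endian and no padding.
# A string[N] is stored as N + 1 bytes (length byte plus N bytes of text).
RECORD = struct.Struct("<4s31s31s31s5s31sBBHiBBiiHB????HH256siH21sH45s")

FIELD_NAMES = (
    "Header", "Title", "Artist", "Album", "Year", "Comment", "Genre", "Track",
    "Duration", "FileLength", "Version", "Layer", "SampleRate", "BitRate",
    "BPM", "Mode", "Copyright", "Original", "ErrorProtection", "Padding",
    "FrameLength", "CRC", "FileName", "FileDateTime", "FileAttr",
    "VolumeLabel", "Selected", "Reserved",
)
STRING_FIELDS = ("Header", "Title", "Artist", "Album", "Year", "Comment",
                 "FileName", "VolumeLabel")


def short_string(raw):
    """Decode the raw bytes of a Pascal short string (length byte first)."""
    length = min(raw[0], len(raw) - 1)
    return raw[1:1 + length].decode("latin-1")


def dos_datetime(value):
    """Convert a DOS packed date/time to datetime, or None if it is not a valid date."""
    date, time = (value >> 16) & 0xFFFF, value & 0xFFFF
    try:
        return datetime(1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
                        time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2)
    except ValueError:
        return None


def parse_record(data, offset=0):
    """Decode one TMPEGData1v2 record starting at offset into a dict."""
    record = dict(zip(FIELD_NAMES, RECORD.unpack_from(data, offset)))
    for name in STRING_FIELDS:
        record[name] = short_string(record[name])
    record["FileDateTime"] = dos_datetime(record["FileDateTime"])
    return record
```

With this layout, `RECORD.size` gives the number of bytes to advance to reach the next record, so the data section can be read as a simple loop until the end of the file.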

This document and file specification are copyrighted by Predrag Supurovic (c) 1998. You may use them freely.

Package Name

DvmMpeg

Description

Dali supports reading and processing of MPEG-1 Video, Audio and System.

Currently, only Video encoding is supported.

MPEG Video

The format of MPEG-1 Video is as follows. An MPEG-1 Video stream starts with an arbitrary number of bytes, followed by a sequence header, followed by zero or more alternating GOP (Group of Pictures) headers and GOPs, followed by a sequence end marker. A GOP is a series of pictures (frames), each of which consists of a picture header and the actual picture data. A picture can be of type I (intracoded), P (predicted) or B (bidirectionally predicted). Each GOP must have at least one I frame.
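To make the layout above concrete, the following Python sketch scans a byte string for the start codes that introduce each structure. It does not use the Dali API; the function name is hypothetical. It relies on the MPEG-1 video start code values: a three-byte prefix 00 00 01 followed by B3 (sequence header), B8 (GOP header), 00 (picture) or B7 (sequence end). Other start codes, such as slices, are skipped.

```python
START_CODE_PREFIX = b"\x00\x00\x01"
START_CODE_NAMES = {
    0xB3: "sequence header",
    0xB8: "GOP header",
    0x00: "picture",
    0xB7: "sequence end",
}


def find_start_codes(data):
    """Yield (offset, name) for each sequence, GOP, picture and sequence-end code."""
    pos = data.find(START_CODE_PREFIX)
    # Stop when no prefix is left or the prefix has no code byte after it.
    while pos != -1 and pos + 3 < len(data):
        name = START_CODE_NAMES.get(data[pos + 3])
        if name is not None:
            yield pos, name
        pos = data.find(START_CODE_PREFIX, pos + 3)
```

The arbitrary leading bytes mentioned above are handled naturally, because the scan simply ignores everything before the first start code.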

Dali provides an abstraction for each of these headers: MpegSeqHdr, MpegGopHdr and MpegPicHdr. Each header type supports the following basic primitives: find, dump, skip, parse, encode. A picture is decoded into three ScImages and zero (I), one (P) or two (B) VectorImages.

To encode an MPEG-1 Video sequence, convert each RGB image frame to YUV ByteImages; perform a motion vector search on the Y ByteImage (for P and B frames); compress the ByteImages to ScImages; and finally encode the ScImages into a Bitstream.

Dali provides support for frame extraction through the use of an MpegVideoIndex abstraction. An MpegVideoIndex stores information about each frame: the frame type, the length, the number of reference frames (0 for I, 1 for P, or 2 for B), and the offsets needed to get to these frames.
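The kind of information such an index holds can be illustrated with a small Python sketch. This is not Dali's MpegVideoIndex; `FrameEntry` and `build_frame_index` are hypothetical names. The sketch relies on the MPEG-1 picture header layout, in which the picture start code (00 00 01 00) is followed by a 10-bit temporal reference and a 3-bit picture coding type (1 = I, 2 = P, 3 = B). As a simplification, a frame's length is measured up to the next picture start code, so it also covers any GOP or sequence header in between.

```python
from dataclasses import dataclass

PICTURE_START = b"\x00\x00\x01\x00"
PICTURE_TYPES = {1: "I", 2: "P", 3: "B"}
REFERENCE_COUNT = {"I": 0, "P": 1, "B": 2}


@dataclass
class FrameEntry:
    offset: int       # byte offset of the picture start code
    length: int       # bytes up to the next picture start code (simplified)
    frame_type: str   # "I", "P", "B", or "?" for other coding types
    references: int   # number of reference frames the picture depends on


def build_frame_index(data):
    """Return a list of FrameEntry, one per picture start code found in data."""
    offsets = []
    pos = data.find(PICTURE_START)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(PICTURE_START, pos + 4)
    index = []
    for i, start in enumerate(offsets):
        if start + 6 > len(data):
            break  # picture header is cut off
        end = offsets[i + 1] if i + 1 < len(offsets) else len(data)
        # The coding type sits in bits 5..3 of the second byte after the start code.
        code = (data[start + 5] >> 3) & 0x07
        frame_type = PICTURE_TYPES.get(code, "?")
        index.append(FrameEntry(start, end - start, frame_type,
                                REFERENCE_COUNT.get(frame_type, 0)))
    return index
```

Extracting a single B frame with such an index would still require decoding the two frames it references, which is why the reference count is worth storing next to the offsets.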

MPEG Audio



MPEG-1 Audio is a sequence of frames. Each frame has a header and a body. The body encodes some number of samples for the current frame. Samples from the next few frames can also be encoded in the current frame if space permits.
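For illustration, the 4-byte frame header can be decoded as in the Python sketch below. It handles MPEG-1 only (12 sync bits followed by the MPEG-1 ID bit), rejects the "free format" bitrate index, and is not Dali's MpegAudioHdr; the function name and the returned dictionary keys are hypothetical. The bitrate and sampling rate tables are the standard MPEG-1 ones.

```python
# Bitrates in kbit/s for bitrate index 1..14, per MPEG-1 layer.
BITRATES_KBPS = {
    1: (32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448),
    2: (32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384),
    3: (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320),
}
SAMPLE_RATES = (44100, 48000, 32000)
MODES = ("Stereo", "Joint-Stereo", "Dual-channel", "Single-Channel")


def parse_audio_header(header):
    """Decode the first 4 bytes of an MPEG-1 audio frame into a dict."""
    if len(header) < 4:
        raise ValueError("a frame header is 4 bytes long")
    word = int.from_bytes(header[:4], "big")
    if word >> 19 != 0x1FFF:  # 12 sync bits plus the MPEG-1 ID bit
        raise ValueError("not an MPEG-1 audio frame header")
    layer_bits = (word >> 17) & 3
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 3
    if layer_bits == 0 or bitrate_index in (0, 15) or rate_index == 3:
        raise ValueError("reserved or unsupported header value")
    layer = 4 - layer_bits  # bits 11 -> layer 1, 10 -> layer 2, 01 -> layer 3
    bitrate = BITRATES_KBPS[layer][bitrate_index - 1] * 1000
    sample_rate = SAMPLE_RATES[rate_index]
    padding = (word >> 9) & 1
    if layer == 1:
        frame_length = (12 * bitrate // sample_rate + padding) * 4
    else:
        frame_length = 144 * bitrate // sample_rate + padding
    return {
        "layer": layer,
        "error_protection": not (word >> 16) & 1,  # bit cleared means CRC present
        "bitrate": bitrate,
        "sample_rate": sample_rate,
        "padding": bool(padding),
        "mode": MODES[(word >> 6) & 3],
        "copyright": bool((word >> 3) & 1),
        "original": bool((word >> 2) & 1),
        "frame_length": frame_length,
    }
```

The computed frame length tells a reader where the next frame header should start, which is the usual way to confirm that a sync pattern found in the data really is a frame boundary.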

We have four main abstractions for an MPEG-1 Audio file: one for the MPEG audio header (MpegAudioHdr), and three for the encoded audio frames (one for each layer of encoding: MpegAudioL1, MpegAudioL2, MpegAudioL3). The audio frame abstractions for layer 1 and layer 2 contain compressed data for one channel of audio. For layer 3 stereo audio, since the encodings of the left and right channels depend on each other, the audio data contains compressed data for both channels.

The encoding of each audio frame depends on the previous frames. To store these dependencies, we have two auxiliary structures: MpegSynData and MpegGraData. Each corresponds to the dependencies of a channel from an audio stream, and is updated during each decoding. This makes direct access into the middle of a stream impossible at the moment. Note: direct access would still be possible if these dependencies were precomputed and stored in files, but this feature is not available right now.

MPEG System

An MPEG-1 System stream is a multiplex of video and audio streams. It consists of a sequence of packets, each of which contains raw compressed data from one of the streams. Dali provides abstractions for all three header structures that exist in MPEG-1 System streams: system header, pack header, and packet header. Dali also allows creation of an MpegSysToc, a table-of-contents structure that stores the offset and length of each packet, and the timestamp from each packet.
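A table of contents of this kind can be approximated with the Python sketch below. It is not Dali's MpegSysToc; `build_toc` is a hypothetical name, and timestamp extraction is left out to keep the example short. It relies on the MPEG-1 system layer layout: a pack header is a start code (00 00 01 BA) plus 8 more bytes, while a system header (00 00 01 BB) and every packet (stream IDs C0 to DF for audio, E0 to EF for video) carry a big-endian 2-byte length right after the start code. The stream ends with the code 00 00 01 B9.

```python
import struct


def stream_kind(stream_id):
    """Classify a packet by its stream ID byte."""
    if 0xC0 <= stream_id <= 0xDF:
        return "audio"
    if 0xE0 <= stream_id <= 0xEF:
        return "video"
    return "other"


def build_toc(data):
    """Return a list of (offset, length, kind) tuples, one per packet."""
    toc = []
    pos = 0
    while pos + 4 <= len(data):
        if data[pos:pos + 3] != b"\x00\x00\x01":
            raise ValueError("expected a start code at offset %d" % pos)
        code = data[pos + 3]
        if code == 0xB9:        # end of the system stream
            break
        if code == 0xBA:        # pack header: fixed 12 bytes in MPEG-1
            pos += 12
            continue
        if pos + 6 > len(data):
            raise ValueError("truncated header at offset %d" % pos)
        (length,) = struct.unpack_from(">H", data, pos + 4)
        if code != 0xBB:        # system headers are skipped, packets are recorded
            toc.append((pos, 6 + length, stream_kind(code)))
        pos += 6 + length
    return toc
```

The recorded length covers the whole packet, including its 6-byte start code and length field, so `offset + length` is the offset of the next structure in the stream.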

MPEG-1 Data Structures

The ISO/IEC 11172 specification defines the audio, video and multiplexing standards collectively and colloquially referred to as the MPEG-1 (Moving Picture Experts Group) compression standard. The data structures for the various components of an encoded bitstream are given in a pseudo-C syntax and are extensively discussed. However, it is difficult to get the big picture from reading the spec. More practically, in order to parse an MPEG-1 bitstream, it is necessary to know the byte offsets within each structure. To make this information more readily accessible, we have condensed it into graphic form. Of course, this is no substitute for the original spec. Where more information is required than can be squeezed into the diagram, references to the spec are provided.

The Big Picture

A multiplexed MPEG-1 stream is composed of distinct Packs. Each Pack consists of a Pack header and any number of Packets. Each Packet holds either video or audio data. These structures above the video or audio level are called the system layer. Video or audio data is divided into Packets without regard to lower-level structures: Groups, Pictures, etc. may break across Packet boundaries. Video information is composed of individual Pictures. We will not discuss the substructures of Pictures. Pictures themselves are of three types: I (intra), P (predictive), and B (bidirectional). I Pictures are self-contained, compressing the image using Discrete Cosine Transform (DCT) processing. P Pictures use less data and are predicted from the preceding I or P Picture. B Pictures use the least data and are interpolated using information from surrounding P and I Pictures. Pictures are organized into Groups of (typically) 15 or so Pictures. If a Group is preceded by a Sequence header, its first Picture is called an entrypoint. Audio information is composed of Frames. We will not discuss the substructure of Frames. There are no higher-level audio structures.

