Audio Video Graphics Spec 5fR2

Jump to: navigation, search

Draft 0.3


Audio, video, and graphics processing is at the core of many CE products. The AVG requirements for CE devices are different than those for PCs/Servers, notably with respect to footprint, input devices, interlacing, streaming, etc.. Multiple graphics planes and video planes may be combined using, e.g., alpha blending and animation.


No single default/standard interfaces exist for AVG. Having a well defined, well supported interface for AVG devices will reduce fragmentation of solutions and encourage the CE community to develop solutions that apply to conforming interfaces, so that they can be deployed across a wider range of systems.


Acronyms and terms

Acronyms and Terms
Term Definition
ALSA Advanced Linux Sound Architecture -- functional level audio API, now standard in 2.6 Linux kernels, replacing OSS.
API Application Programmers Interface
ARIB Association of Radio Industries and Businesses. Most relevant to AVG is the proposed graphics architecture proposed for High Definition TV Broadcast (the 5-plane model).
ATSC Advanced Television Systems Committee. American standard body for digital television broadcasting.
Back-end Scaler A Scaler which manipulates the graphics planes and data, but does not allow the host processor access to the (blended) end result, mainly for efficiency reasons.
CCIR 601 In 1982 CCIR 601 established a digital video standard, which uses the Y, Cr, Cb color space (often incorrectly referred to as YUV). Unlike YUV, Cr,Cb range [-0.5, -0.5]. A full conversion matrix is included below (*)
CE Consumer Electronics: a class of devices used in the home or on the move. Includes DVD, DVR, PVR, PDA, TV, set-top box, cellular phones, etc.
DVB Digital Video Broadcast: European standards body for digital television broadcasting.
DVD Digital Versatile Disc: high capacity multimedia data storage medium.
DVR Digital Video Recorder: a consumer electronic device.
FB,Framebuffer Abstraction of video-out hardware with a low level (ioctl) API. Standard in >2.4 Linux kernel (see the /usr/src/linux/Documentation/fb kernel tree directory for more information).
Front-end Scaler A Scaler which manipulates the graphics planes and data and allows the host processor access the (in-between and) end results.
HDTV High Definition Television: provides a higher quality television broadcast, with progressive and interlaced ( 720p to 1080i ) video and support for 16:9 aspect (movie) ratio.
JPEG Joint Photographic Experts Group: (lossy) still image compression standard.
MHP Multimedia Home Platform: an API used together with MPEG-2 transmissions.
MIME Multipurpose Internet Mail Extension: a standard for identifying the type of data contained in a file. MIME is an Internet protocol that allows sending binary files across the Internet as attachments to e-mail messages. This includes graphics, photos, sound, video files, and formatted text documents.
MP3 MPEG-1 Audio Layer 3: a popular audio compression standard.
MPEG-1/2/4 Moving Picture Experts Group: a compression standard for digital audio & video with varying levels of complexity and achievable compression ratios.
NTSC National Television Systems Committee: American standard for analog television broadcasting.
PAL Phase Alternating Line: American standard for analog television broadcasting.
PNG Portable Network Graphics: (lossless) still image compression standard.
PVR Personal Video Recorder: a consumer electronic device.
RGB[A] Colorspace representation commonly used in computer graphics. It uses three orthogonal components -- Red, Green and Blue -- to represent colors in to human visible spectrum, e.g. by combining red and green as additive colors it can fool the eye into seeing "yellow" light. An optional A at the end denotes the presence of per-pixel alpha. See also CCIR 601.
Scaler Graphics hardware accelerator which may scale and reformat (e.g. convert from YCC to RGB) graphics data and merge multiple independent graphics planes for final display.
V4L Video for Linux: low level (ioctl) video input and overlay API, standard in 2.4. Originally designed for control of analog video capture and tuner cards, as well as parallel port and USB video cameras. Incorporated in many other higher level APIs such as DirectFB.
!V4L2 Video for Linux, second version, made to be more flexible and extensible. Added specifications for digital tuner control and capture.
YC'bCr[A] Colorspace representation commonly used in analog and digital video broadcasts, and video compression technologies such as MPEG. It uses three orthogonal components, one for luminance (Y) and two for the color-difference signals (Cr,Cb). Since the eye is less sensitive to color than luminance, the color difference signals often get a smaller bandwidth allocated (or lower pixel resolution in the digital domain). An optional A at the end denotes the presence of per-pixel alpha. See also CCIR 601.
YIQ Colorspace representation commonly used in North American TV broadcast and is similar to YUV (see definition of YUV). The relation with YUV is: I == 0.74 V - 0.27 U and Q == 0.48 V + 0.41 U
YUV Colorspace representation commonly used in European TV broadcast. It is similar to YC'bCr and often meant to be the same (incorrectly) with U referring to Cb and V referring to Cr. With Y (luminance) defined as Y==0.299 R + 0.587 G + 0.114 B, by definition, U==B-Y, thus U represents colors from blue (U>0) to yellow (U<0). Likewise V==R-Y, thus V represents colors from magenta (V>0) to Cyan (blue green) (V<0).

RGB to YCbCr conversion matrix
Y 0.299 0.587 0.114 R
Cr == 0.500 -0.419 -0.081 G
Cb -0.169 -0.331 0.500 B

Compliance classifiers

Terminology conventions are adopted here as they are defined in IETF RFC 2119, "Key words for use in RFCs to Indicate Requirement Levels" (by S. Bradner, March 1997). A compliance classifier from the following set may be used:

[M]ust, Required, Shall
This is the minimum set of requirements. The CELF based products are expected to comply with these requirements when expressed in unconditional form. A conditional requirement expressed in the form, "If X, then Y must be implemented", means that the requirement "Y" must be met when the conditional aspect "X" applies to a given implementation.
[S]hould, Recommended
Recommended items are optional items that are strongly recommended for inclusion in CELF based products. The difference between "recommended" items and "optional" items, below, is one of priority. When considering features for inclusion in a product, recommended items should be included first.
[O]ptional, May
Optional items are suggestions for features that will enhance the user experience or are offered as a less preferred choice relative to another recommended feature. If optional features are included, they should comply with the requirement to ensure interoperability with other implementations.
E[X]pressly Forbidden
This term means that an item must not be incorporated in a CELF based product.


[O] Three target platforms are used or under consideration:

  • Renesas SH4 host with SM501 graphics (See SzwgPlatform3)
  • TI OMAP (See SzwgPlatform1)
  • X86 generic with Matrox G450/550

For the first two, the System Size Spec page has a full description under "Definition - Platform".

Audio Specification

[O] No additional Audio specifications have been defined. ALSA, defined in kernel 2.6, may be used. Further evaluation is required before it can be considered for recommendation (see work in progress). Future extensions relate to AV streaming and synchronization.

Video-in/Capture Specification

[O] No additional Video input (capture) specifications have been defined. !V4L2, as defined in kernel 2.6, may be used.

[O] Proprietary solutions may also be used for video capture and digital tuners if !V4L2 does not suffice.

[O] DirectFB may be used as a higher level API.

Note: Video output can be seen as an (interlaced) sub-set of graphics. See graphics specification below for more details.

Video-out/Graphics Specification

[S] The standard Framebuffer is recommended for use in embedded CE devices.

[O] DirectFB may also be used in combination with the framebuffer.

Extensions to both are under consideration (see work in progress).

Graphics formats

[O] The framebuffer supports CLUT, RGB and RGBA packet data formats, but not YC'bCr[A]. Hardware capable of accelerating the display YC'bCr[A] packed data may develop their own extensions to the framebuffer for now.

[O] Also, the DirectFB framework which supports these formats may be used.

Multi-plane support

[O] Graphics hardware capable of multiple planes may be implemented with a single or multiple device drivers with one device per plane e.g. /dev/fb0, /dev/fb1,.../dev/fb5 for a 5 plane capable device. Front-end based scalers are recommended to use the DirectFB framework.

[O] Back-end scalers may add ioctl's to their framebuffer drivers.

Work in progress

Both DirectFB and the Framebuffer can be extended with YCbCr formats and multi-plane blending features commonly found in embedded CE devices. However, it is likely that only one of them will be supported in the future.

Framebuffer specification

YCbCr Format

Resolution Support

The recommended formats are:

  • 4:4:4 Equal number of samples of Y, Cb and Cr.
  • 4:2:2 Cb/Cr are subsampled by a factor of two horizontally.
  • 4:2:0 Cb/Cr are subsampled by a factor of two in both directions.
  • 4:1:1 Cb/Cr are subsampled by a factor of four horizontally (used in DV).

If any of these formats are used, the CCIR 601 standard must be used. It defines how the data is interleaved and the relative positions of the Cb/Cr samples in relation to the Y samples.

Memory representation

YCbCr may be stored in e.g a framebuffer in various ways:

  • packed YC'bCrA 4:4:4 : 32-bit unit containing one pixel with alpha
  • packed YC'bCr 4:2:2 : 16-bit unit, two successive units contain two horizontally adjacent pixels, no alpha
  • planar YC'bCr[A] 4:2:2 : three [four] arrays, one for each component
  • semi-planar YC'bCr 4:2:2 : two arrays, one with all Ys, one with Cb and Cr.
  • planar YC'bCr[A] 4:2:0 : three [four] arrays, one for each component
  • semi-planar YC'bCr 4:2:0 : two arrays, one with all Ys, one with U and Vs

Following CCIR601, only the packed formats are recommended, with the possible exception of a separate alpha plane in some cases (see ARIB [O6] proposal).

Font rendering

  • freetype [O5]

Basic 2D acceleration

  • lines (horz./vert. vs. anti-aliased lines)
  • rectangles (fill and copy)
  • pixmaps (bitblt, scaling)

Video format control

  • resolution
  • interlaced/progressive

Multi-plane support

  • Each plane is represented by... [/dev/fb0, /dev/fb1,...]
  • Additional API (ioctl) calls... [display order, placement, scaling,...]

DirectFB specification

DirectFB overview [G2] provides a list of currently supported features, summarized below.

Important Terminology

Memory region physically reserved for rendering pixels. Surfaces are used for regular rendering of pixels, sprites and so on.
Sub-region of surface. No physical memory allocated.
Primary Surface
Visible screen in full screen mode.
Each layer is different video memory. They are alpha-blended and displayed.
Each layer may have multiple window. Windowstack is a stack of windows. Each window has surface. Their locations and orders may be changed.

YCbCr Format

Resolution Support

Supported formats are:

  • 4:2:2 Cb/Cr are subsampled by a factor of two horizontally.
  • 4:2:0 Cb/Cr are subsampled by a factor of two in both directions.
Memory representation
  • packed YCbCr 4:2:2 : 16-bit unit, two successive units contain two horizontally adjacent pixels, no alpha
  • planar YCbCr 4:2:0 : three arrays, one for each component
Font rendering
  • DirectFB bitmap font
  • !TrueType (using !FreeType2)
  • No bold or italics support other than by specifying a different typeface from the same font family. For example, 'Times New Roman Regular' and 'Times New Roman Italic' correspond to two different faces.

Basic 2D acceleration

  • lines (anti-aliased)
  • rectangles (fill and copy)
  • triangle (fill and copy)
  • pixmaps (bitblt, scaling)
  • Per-pixel alpha blending (a.k.a. texture alpha)
  • Per-plane alpha blending (a.k.a. alpha modulation)
  • Colorizing (a.k.a. color modulation)
  • Source and destination color keying

Video format control

  • resolution
  • interlaced/progressive support

Multi-plane support

  • DirectFB layers (not surfaces) support the concept of planes.
  • Layer API is provided through IDirectFBD'isplayLayer Interface.
  • Opacity is available through IDirectFBD'isplayLayer::!SetOpacity.
  • IDirectFBD'isplayLayer::!SetScreenLocation() controls scaling of the plane. Back-End Scaler(BES) is used, for instance for Matrox. It requires hardware support.
  • Explicit Front-End Scaler(FES) is not available. Thus, stretched blit to the primary surface should be used.
  • To execute a specific graphics operation (e.g. blitting of a surface), the DirectFB driver will access the memory mapped io ports of the graphics hardware to submit the command to the acceleration engine (actual hardware acceleration is done entirely from user space).

GFX Card Driver

  • DirectFB abstracts the video driver through GFX driver.
  • Graphic operation is executed through IDirectFBSurface interface. The interface calls appropriate callback routine in gfxcard driver(src/core/gfxcard.c). The callback routine decides whether the video device has hardware acceleration capability or not, and invokes appropriate functions.
  • The following is the model used in the core gfxcard driver. Blit, !DrawLine,

DrawRect and similar operations are implemented in this way:

void dfb_gfxcard_OPERATION() 
	bool hw == false;
	/* check if acceleration is available, and then acquire  */
	if (hardware_accel_available(OPERATION) && hardware_accel_acquire(OPERATION)) {
		hw == card->funcs.OPERATION();
	/* if hardware acceleration is not available */
	if (!hw) {

DirectFb benchmarks

You can refer 'DirectFB' benchmark on various environment from Benchmark section of Evaluate Direct Fb Task Page


G - Graphics/Video out: 1 Framebuffer - (1) KD26/fb - - 2 DirectFB (uses 1) - - 3 NanoX - 4 SDL - 5 Gstreamer - 6 OpenGL (OpenML) - -

V – Video in: 1 V4L[2] - (1) KD26/video4linux - 2 OpenML - 3 LinuxTV (DVB API) -

A – Audio in/out: 1 OSS - (1) KD26/sound/oss - 2 ALSA - (1) KD26/sound/alsa - 3 OpenAL -

U – Users of AVG: 1 V'ideoLan - 2 Freevo - 3 LinuxTV - 4 MythTV - 5 DVR - 6 OpenPVR - -

O – Other: 1 TV Linux Alliance - 2 TV Anytime - 3 Digital Home Working Group - 4 B'ootSplash - 5 F'reeType - 6 ARIB architecture -

Note (1) - KD26 refers to the Linux 2.6.X kernel tree, which has a "Documentation" sub-directory.

Remaining Issues

See Work in progress.