7.3. Encoding with the libavcodec codec family


libavcodec provides simple encoding to a lot of interesting video and audio formats. You can encode to the following codecs (more or less up to date):

7.3.1. `libavcodec`'s video codecs

Video codec name	Description
mjpeg	Motion JPEG
ljpeg	lossless JPEG
jpegls	JPEG LS
targa	Targa image
gif	GIF image
bmp	BMP image
png	PNG image
h261	H.261
h263	H.263
h263p	H.263+
mpeg4	ISO standard MPEG-4 (DivX, Xvid compatible)
msmpeg4	pre-standard MPEG-4 variant by MS, v3 (AKA DivX3)
msmpeg4v2	pre-standard MPEG-4 by MS, v2 (used in old ASF files)
wmv1	Windows Media Video, version 1 (AKA WMV7)
wmv2	Windows Media Video, version 2 (AKA WMV8)
rv10	RealVideo 1.0
rv20	RealVideo 2.0
mpeg1video	MPEG-1 video
mpeg2video	MPEG-2 video
huffyuv	lossless compression
ffvhuff	FFmpeg modified huffyuv lossless
asv1	ASUS Video v1
asv2	ASUS Video v2
ffv1	FFmpeg's lossless video codec
svq1	Sorenson video 1
flv	Sorenson H.263 used in Flash Video
flashsv	Flash Screen Video
dvvideo	Sony Digital Video
snow	FFmpeg's experimental wavelet-based codec
zmbv	Zip Motion Blocks Video
dnxhd	AVID DNxHD

The first column contains the codec names that should be passed to the vcodec option, like: -lavcopts vcodec=msmpeg4

An example with MJPEG compression:

mencoder dvd://2 -o title2.avi -ovc lavc -lavcopts vcodec=mjpeg -oac copy

7.3.2. `libavcodec`'s audio codecs

Audio codec name	Description
ac3	Dolby Digital (AC-3)
adpcm_*	Adaptive PCM formats - see supplementary table
flac	Free Lossless Audio Codec (FLAC)
g726	G.726 ADPCM
libamr_nb	3GPP Adaptive Multi-Rate (AMR) narrow-band
libamr_wb	3GPP Adaptive Multi-Rate (AMR) wide-band
libfaac	Advanced Audio Coding (AAC) - using FAAC
libgsm	ETSI GSM 06.10 full rate
libgsm_ms	Microsoft GSM
libmp3lame	MPEG-1 audio layer 3 (MP3) - using LAME
mp2	MPEG-1 audio layer 2 (MP2)
pcm_*	PCM formats - see supplementary table
roq_dpcm	Id Software RoQ DPCM
sonic	experimental FFmpeg lossy codec
sonicls	experimental FFmpeg lossless codec
vorbis	Vorbis
wmav1	Windows Media Audio v1
wmav2	Windows Media Audio v2

The first column contains the codec names that should be passed after the acodec option, like: -lavcopts acodec=ac3

An example with AC-3 compression:

mencoder dvd://2 -o title2.avi -oac lavc -lavcopts acodec=ac3 -ovc copy

Unlike libavcodec's video codecs, its audio codecs do not make good use of the bits they are given, because they lack the psychoacoustic model (or have only a minimal one) that most other codec implementations feature. However, all these audio codecs are very fast and work out of the box wherever MEncoder has been compiled with libavcodec (which is the case most of the time), and they do not depend on external libraries.

7.3.2.1. PCM/ADPCM format supplementary table

PCM/ADPCM codec name	Description
pcm_s32le	signed 32-bit little-endian
pcm_s32be	signed 32-bit big-endian
pcm_u32le	unsigned 32-bit little-endian
pcm_u32be	unsigned 32-bit big-endian
pcm_s24le	signed 24-bit little-endian
pcm_s24be	signed 24-bit big-endian
pcm_u24le	unsigned 24-bit little-endian
pcm_u24be	unsigned 24-bit big-endian
pcm_s16le	signed 16-bit little-endian
pcm_s16be	signed 16-bit big-endian
pcm_u16le	unsigned 16-bit little-endian
pcm_u16be	unsigned 16-bit big-endian
pcm_s8	signed 8-bit
pcm_u8	unsigned 8-bit
pcm_alaw	G.711 A-LAW
pcm_mulaw	G.711 μ-LAW
pcm_s24daud	signed 24-bit D-Cinema Audio format
pcm_zork	Activision Zork Nemesis
adpcm_ima_qt	Apple QuickTime
adpcm_ima_wav	Microsoft/IBM WAVE
adpcm_ima_dk3	Duck DK3
adpcm_ima_dk4	Duck DK4
adpcm_ima_ws	Westwood Studios
adpcm_ima_smjpeg	SDL Motion JPEG
adpcm_ms	Microsoft
adpcm_4xm	4X Technologies
adpcm_xa	Philips Yellow Book CD-ROM eXtended Architecture
adpcm_ea	Electronic Arts
adpcm_ct	Creative 16->4-bit
adpcm_swf	Adobe Shockwave Flash
adpcm_yamaha	Yamaha
adpcm_sbpro_4	Creative VOC SoundBlaster Pro 8->4-bit
adpcm_sbpro_3	Creative VOC SoundBlaster Pro 8->2.6-bit
adpcm_sbpro_2	Creative VOC SoundBlaster Pro 8->2-bit
adpcm_thp	Nintendo GameCube FMV THP
adpcm_adx	Sega/CRI ADX

7.3.3. Encoding options of libavcodec

Ideally, you would probably want to be able to just tell the encoder to switch into "high quality" mode and move on. That would probably be nice, but unfortunately hard to implement as different encoding options yield different quality results depending on the source material. That is because compression depends on the visual properties of the video in question. For example, Anime and live action have very different properties and thus require different options to obtain optimum encoding. The good news is that some options should never be left out, like mbd=2, trell, and v4mv. See below for a detailed description of common encoding options.

Options to adjust:

vmax_b_frames: 1 or 2 is good, depending on the movie. If the encode must be decodable by DivX5, you need to enable closed GOP support with libavcodec's cgop option. This in turn requires disabling scene detection, which is not ideal because it hurts encoding efficiency a bit.
vb_strategy=1: helps in high-motion scenes. On some videos, vmax_b_frames may hurt quality, but vmax_b_frames=2 along with vb_strategy=1 helps.
dia: motion search range. Bigger is better and slower. Negative values are a completely different scale. Good values are -1 for a fast encode, or 2-4 for slower.
predia: motion search pre-pass. Not as important as dia. Good values are 1 (default) to 4. Requires preme=2 to really be useful.
cmp, subcmp, precmp: Comparison function for motion estimation. Experiment with values of 0 (default), 2 (hadamard), 3 (dct), and 6 (rate distortion). 0 is fastest, and sufficient for precmp. For cmp and subcmp, 2 is good for Anime, and 3 is good for live action. 6 may or may not be slightly better, but is slow.
last_pred: Number of motion predictors to take from the previous frame. 1-3 or so help at little speed cost. Higher values are slow for no extra gain.
cbp, mv0: Control the selection of macroblocks. Small speed cost for small quality gain.
qprd: adaptive quantization based on the macroblock's complexity. May help or hurt depending on the video and other options. This can cause artifacts unless you set vqmax to some reasonably small value (6 is good, maybe as low as 4); vqmin=1 should also help.
qns: very slow, especially when combined with qprd. This option will make the encoder minimize noise due to compression artifacts instead of making the encoded video strictly match the source. Do not use this unless you have already tweaked everything else as far as it will go and the results still are not good enough.
vqcomp: Tweaks ratecontrol. Which values are good depends on the movie. You can safely leave this alone if you want. Reducing vqcomp puts more bits on low-complexity scenes, increasing it puts them on high-complexity scenes (default: 0.5, range: 0-1, recommended range: 0.5-0.7).
vlelim, vcelim: Sets the single coefficient elimination threshold for luminance and chroma planes. These are encoded separately in all MPEG-like algorithms. The idea behind these options is to use some good heuristics to determine when the change in a block is less than the threshold you specify, and in such a case, to just encode the block as "no change". This saves bits and perhaps speeds up encoding. vlelim=-4 and vcelim=9 seem to be good for live movies, but seem not to help with Anime; when encoding animation, you should probably leave them unchanged.
qpel: Quarter pixel motion estimation. MPEG-4 uses half pixel precision for its motion search by default, therefore this option comes with an overhead as more information will be stored in the encoded file. The compression gain/loss depends on the movie, but it is usually not very effective on Anime. qpel always incurs a significant cost in CPU decode time (+25% in practice).
psnr: does not affect the actual encoding, but writes a log file giving the type/size/quality of each frame, and prints a summary of PSNR (Peak Signal to Noise Ratio) at the end.
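
The options above are combined into a single colon-separated string for -lavcopts. A minimal sketch of how such a string might be assembled follows; join_lavcopts is a hypothetical helper (not part of MEncoder), the file names are placeholders, and the particular option values are only one plausible starting point taken from the descriptions above.

```shell
# join_lavcopts is a hypothetical helper, not part of MEncoder.
# It joins its arguments with ':' by running in a subshell with IFS set to ':'.
join_lavcopts() (
    IFS=:
    printf '%s\n' "$*"
)

# One possible selection of the options described above; tune per source.
opts=$(join_lavcopts vcodec=mpeg4 mbd=2 trell v4mv \
    vmax_b_frames=2 vb_strategy=1 dia=2 last_pred=3)

# Print the command instead of running it, so it can be reviewed first.
echo "mencoder input.avi -o output.avi -oac copy -ovc lavc -lavcopts $opts"
```

Keeping each option as a separate argument makes it easier to add or drop one while experimenting.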

Options not recommended to play with:

vme: The default is best.
lumi_mask, dark_mask: Psychovisual adaptive quantization. You do not want to play with those options if you care about quality. Reasonable values may be effective in your case, but be warned this is very subjective.
scplx_mask: Tries to prevent blocky artifacts, but postprocessing is better.

7.3.4. Encoding setting examples

The following settings are examples of different encoding option combinations that affect the speed vs quality tradeoff at the same target bitrate.

All the encoding settings were tested on a 720x448 @30000/1001 fps video sample, the target bitrate was 900kbps, and the machine was an AMD-64 3400+ at 2400 MHz in 64-bit mode. Each encoding setting lists the measured encoding speed (in frames per second) and the PSNR loss (in dB) compared to the "very high quality" setting. Please understand that depending on your source, your machine type and development advancements, you may get very different results.

Description	Encoding options	Speed (in fps)	Relative PSNR loss (in dB)
Very high quality	`vcodec=mpeg4:mbd=2:mv0:trell:v4mv:cbp:last_pred=3:predia=2:dia=2:vmax_b_frames=2:vb_strategy=1:precmp=2:cmp=2:subcmp=2:preme=2:qns=2`	6fps	0dB
High quality	`vcodec=mpeg4:mbd=2:trell:v4mv:last_pred=2:dia=-1:vmax_b_frames=2:vb_strategy=1:cmp=3:subcmp=3:precmp=0:vqcomp=0.6:turbo`	15fps	-0.5dB
Fast	`vcodec=mpeg4:mbd=2:trell:v4mv:turbo`	42fps	-0.74dB
Realtime	`vcodec=mpeg4:mbd=2:turbo`	54fps	-1.21dB
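
If you switch between these settings often, a small wrapper may be convenient. The sketch below is only an illustration: lavc_preset and its preset names are hypothetical, the option strings are copied from the table above, the file names are placeholders, and vbitrate=900 mirrors the target bitrate used in the test.

```shell
# lavc_preset is a hypothetical helper; the option strings come from the table above.
lavc_preset() {
    case "$1" in
        very-high)
            echo "vcodec=mpeg4:mbd=2:mv0:trell:v4mv:cbp:last_pred=3:predia=2:dia=2:vmax_b_frames=2:vb_strategy=1:precmp=2:cmp=2:subcmp=2:preme=2:qns=2" ;;
        high)
            echo "vcodec=mpeg4:mbd=2:trell:v4mv:last_pred=2:dia=-1:vmax_b_frames=2:vb_strategy=1:cmp=3:subcmp=3:precmp=0:vqcomp=0.6:turbo" ;;
        fast)
            echo "vcodec=mpeg4:mbd=2:trell:v4mv:turbo" ;;
        realtime)
            echo "vcodec=mpeg4:mbd=2:turbo" ;;
        *)
            echo "unknown preset: $1" >&2
            return 1 ;;
    esac
}

# Print the command for the "fast" setting at the 900 kbit/s target bitrate.
echo "mencoder input.avi -o output.avi -oac copy -ovc lavc -lavcopts $(lavc_preset fast):vbitrate=900"
```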

7.3.5. Custom inter/intra matrices

With this feature of libavcodec you can set custom intra (I-frames/keyframes) and inter (P-frames/predicted frames) matrices. Many of the codecs support it; mpeg1video and mpeg2video are reported as working.

A typical usage of this feature is to set the matrices preferred by the KVCD specifications.

The KVCD "Notch" Quantization Matrix:

Intra:

 8  9 12 22 26 27 29 34
 9 10 14 26 27 29 34 37
12 14 18 27 29 34 37 38
22 26 27 31 36 37 38 40
26 27 29 36 39 38 40 48
27 29 34 37 38 40 48 58
29 34 37 38 40 48 58 69
34 37 38 40 48 58 69 79

Inter:

16 18 20 22 24 26 28 30
18 20 22 24 26 28 30 32
20 22 24 26 28 30 32 34
22 24 26 30 32 32 34 36
24 26 28 32 34 34 36 38
26 28 30 32 34 36 38 40
28 30 32 34 36 38 42 42
30 32 34 36 38 40 42 44

Usage:

mencoder input.avi -o output.avi -oac copy -ovc lavc \
    -lavcopts inter_matrix=...:intra_matrix=...

mencoder input.avi -ovc lavc -lavcopts \
vcodec=mpeg2video:intra_matrix=8,9,12,22,26,27,29,34,9,10,14,26,27,29,34,37,\
12,14,18,27,29,34,37,38,22,26,27,31,36,37,38,40,26,27,29,36,39,38,40,48,27,\
29,34,37,38,40,48,58,29,34,37,38,40,48,58,69,34,37,38,40,48,58,69,79\
:inter_matrix=16,18,20,22,24,26,28,30,18,20,22,24,26,28,30,32,20,22,24,26,\
28,30,32,34,22,24,26,30,32,32,34,36,24,26,28,32,34,34,36,38,26,28,30,32,34,\
36,38,40,28,30,32,34,36,38,42,42,30,32,34,36,38,40,42,44 -oac copy -o svcd.mpg
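
Typing 64 comma-separated values by hand is error-prone. The following sketch shows one way to generate the lists from the 8x8 layout used above; flatten_matrix is a hypothetical helper built on awk, and the command is only printed so that it can be checked before use.

```shell
# flatten_matrix is a hypothetical helper: it reads a whitespace-separated
# matrix on stdin and prints all values, row by row, as one comma-separated list.
flatten_matrix() {
    awk '{ for (i = 1; i <= NF; i++) printf "%s%s", (n++ ? "," : ""), $i } END { print "" }'
}

intra=$(flatten_matrix <<'EOF'
 8  9 12 22 26 27 29 34
 9 10 14 26 27 29 34 37
12 14 18 27 29 34 37 38
22 26 27 31 36 37 38 40
26 27 29 36 39 38 40 48
27 29 34 37 38 40 48 58
29 34 37 38 40 48 58 69
34 37 38 40 48 58 69 79
EOF
)

inter=$(flatten_matrix <<'EOF'
16 18 20 22 24 26 28 30
18 20 22 24 26 28 30 32
20 22 24 26 28 30 32 34
22 24 26 30 32 32 34 36
24 26 28 32 34 34 36 38
26 28 30 32 34 36 38 40
28 30 32 34 36 38 42 42
30 32 34 36 38 40 42 44
EOF
)

# Print the resulting command instead of running it.
echo "mencoder input.avi -ovc lavc -lavcopts vcodec=mpeg2video:intra_matrix=${intra}:inter_matrix=${inter} -oac copy -o svcd.mpg"
```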

7.3.6. Example

So, you have just bought your shiny new copy of Harry Potter and the Chamber of Secrets (widescreen edition, of course), and you want to rip this DVD so that you can add it to your Home Theatre PC. This is a region 1 DVD, so it is NTSC. The example below will still apply to PAL, except you will omit -ofps 24000/1001 (because the output framerate is the same as the input framerate), and of course the crop dimensions will be different.

After running mplayer dvd://1, we follow the process detailed in the section How to deal with telecine and interlacing in NTSC DVDs and discover that it is 24000/1001 fps progressive video, which means that we need not use an inverse telecine filter, such as pullup or filmdint.

Next, we want to determine the appropriate crop rectangle, so we use the cropdetect filter:

mplayer dvd://1 -vf cropdetect

Make sure you seek to a fully filled frame (such as a bright scene, past the opening credits and logos), and you will see in MPlayer's console output:

crop area: X: 0..719  Y: 57..419  (-vf crop=720:362:0:58)

We then play the movie back with this filter to test its correctness:

mplayer dvd://1 -vf crop=720:362:0:58

And we see that it looks perfectly fine. Next, we ensure the width and height are a multiple of 16. The width is fine, however the height is not. Since we did not fail 7th grade math, we know that the nearest multiple of 16 lower than 362 is 352.

We could just use crop=720:352:0:58, but it would be nice to take a little off the top and a little off the bottom so that we retain the center. We have shrunk the height by 10 pixels, but we do not want to increase the y-offset by 5 pixels since that is an odd number and will adversely affect quality. Instead, we will increase the y-offset by 4 pixels:

mplayer dvd://1 -vf crop=720:352:0:62
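
The arithmetic behind these numbers can be sketched in a few lines of shell. fit_crop is a hypothetical helper, not an MPlayer feature; it assumes the y-offset reported by cropdetect is already even, and it leaves the width and x-offset untouched.

```shell
# fit_crop is a hypothetical helper.
# Arguments: width, height, x-offset, y-offset (as reported by cropdetect).
# It rounds the height down to a multiple of 16 and moves the y-offset down
# by an even number of pixels so that the crop stays roughly centered.
fit_crop() {
    width=$1
    height=$2
    x_offset=$3
    y_offset=$4
    new_height=$(( height / 16 * 16 ))      # 362 -> 352
    removed=$(( height - new_height ))      # 362 - 352 = 10
    shift_px=$(( removed / 2 / 2 * 2 ))     # half of 10 is 5, rounded down to even: 4
    echo "crop=${width}:${new_height}:${x_offset}:$(( y_offset + shift_px ))"
}

# Print the test command for the values found above.
echo "mplayer dvd://1 -vf $(fit_crop 720 362 0 58)"
```

For the values in this example the helper prints crop=720:352:0:62, matching the filter used above.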

Another reason to shave pixels from both the top and the bottom is that we ensure we have eliminated any half-black pixels if they exist. Note that if your video is telecined, make sure the pullup filter (or whichever inverse telecine filter you decide to use) appears in the filter chain before you crop. If it is interlaced, deinterlace before cropping. (If you choose to preserve the interlaced video, then make sure your vertical crop offset is a multiple of 4.)

If you are really concerned about losing those 10 pixels, you might prefer instead to scale the dimensions down to the nearest multiple of 16. The filter chain would look like:

-vf crop=720:362:0:58,scale=720:352

Scaling the video down like this will mean that some small amount of detail is lost, though it probably will not be perceptible. Scaling up will result in lower quality (unless you increase the bitrate). Cropping discards those pixels altogether. It is a tradeoff that you will want to consider for each circumstance. For example, if the DVD video was made for television, you might want to avoid vertical scaling, since the line sampling corresponds to the way the content was originally recorded.

On inspection, we see that our movie has a fair bit of action and high amounts of detail, so we pick 2400Kbit for our bitrate.

We are now ready to do the two pass encode. Pass one:

mencoder dvd://1 -ofps 24000/1001 -oac copy -o Harry_Potter_2.avi -ovc lavc \
    -lavcopts vcodec=mpeg4:vbitrate=2400:v4mv:mbd=2:trell:cmp=3:subcmp=3:autoaspect:vpass=1 \
    -vf pullup,softskip,crop=720:352:0:62,hqdn3d=2:1:2

And pass two is the same, except that we specify vpass=2:

mencoder dvd://1 -ofps 24000/1001 -oac copy -o Harry_Potter_2.avi -ovc lavc \
    -lavcopts vcodec=mpeg4:vbitrate=2400:v4mv:mbd=2:trell:cmp=3:subcmp=3:autoaspect:vpass=2 \
    -vf pullup,softskip,crop=720:352:0:62,hqdn3d=2:1:2
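
Because the two passes differ only in the vpass value, they can be generated from one definition, which avoids the options drifting apart when you edit them. This is just a sketch: two_pass_commands is a hypothetical helper, and it prints the commands rather than running them.

```shell
# Options shared by both passes, copied from the commands above.
lavcopts="vcodec=mpeg4:vbitrate=2400:v4mv:mbd=2:trell:cmp=3:subcmp=3:autoaspect"
filters="pullup,softskip,crop=720:352:0:62,hqdn3d=2:1:2"

# two_pass_commands is a hypothetical helper that prints one command per pass.
two_pass_commands() {
    for pass in 1 2; do
        echo "mencoder dvd://1 -ofps 24000/1001 -oac copy -o Harry_Potter_2.avi -ovc lavc -lavcopts ${lavcopts}:vpass=${pass} -vf ${filters}"
    done
}

two_pass_commands
```

Once the printed commands look right, the output can be piped to sh to run both passes in order.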

The options v4mv:mbd=2:trell will greatly increase the quality at the expense of encoding time. There is little reason to leave these options out when the primary goal is quality. The options cmp=3:subcmp=3 select a comparison function that yields higher quality than the defaults. You might try experimenting with this parameter (refer to the man page for the possible values) as different functions can have a large impact on quality depending on the source material. For example, if you find libavcodec produces too many blocky artifacts, you could try selecting the experimental NSSE as comparison function via *cmp=10.

For this movie, the resulting AVI will be 138 minutes long and nearly 3GB. If file size does not matter to you, this is a perfectly acceptable size. However, if you want it smaller, you could try a lower bitrate. Increasing the bitrate has diminishing returns, so while we might clearly see an improvement from 1800Kbit to 2000Kbit, it might not be so noticeable above 2000Kbit. Feel free to experiment until you are happy.
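
A rough size estimate before encoding helps when choosing the bitrate. The sketch below is an approximation under stated assumptions: estimate_size_mb is a hypothetical helper, the 448 kbit/s audio bitrate is an assumed value for the copied DVD audio track (check the real one for your disc), and container overhead is ignored.

```shell
# estimate_size_mb is a hypothetical helper.
# Arguments: video bitrate (kbit/s), audio bitrate (kbit/s), running time (minutes).
# Prints the estimated size in megabytes (10^6 bytes), ignoring container overhead.
estimate_size_mb() {
    total_kbit=$(( ($1 + $2) * $3 * 60 ))   # kbit/s times seconds
    echo $(( total_kbit / 8 / 1000 ))       # kbit -> kbyte -> megabyte
}

# 448 kbit/s is an assumption for the audio track, not a value from this guide.
estimate_size_mb 2400 448 138
estimate_size_mb 1800 448 138
```

With these assumptions the first call prints 2947, which agrees with the "nearly 3GB" figure, and the second prints 2326 for the 1800Kbit alternative.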

Because we passed the source video through a denoise filter, you may want to add some noise back during playback. This, along with the spp post-processing filter, drastically improves the perception of quality and helps eliminate blocky artifacts in the video. With MPlayer's autoq option, you can vary the amount of post-processing done by the spp filter depending on available CPU. Also, at this point, you may want to apply gamma and/or color correction to best suit your display. For example:

mplayer Harry_Potter_2.avi -vf spp,noise=9ah:5ah,eq2=1.2 -autoq 3
