Discussion:
[FFmpeg-user] 2 pass CBR or VBR not really fixing the bitrate?
Manuel Tiglio
2017-07-29 20:16:07 UTC
Hi, I was wondering if anyone has seen similar things.

Trying 2-pass CBR or VBR in ffmpeg:

ffmpeg -i <input> -c:v libx264 -pass 1 -f mp4 /dev/null
ffmpeg -i <input> -c:v libx264 -b:v avg -maxrate max -minrate min -bufsize buf -pass 2 <output>

appears to give bitrates whose peak deviations from the average are at least 40%. I tried making maxrate equal to minrate, or 110% of it; removing minrate; removing maxrate; and playing with different bufsize values (1 sec, 2 secs, 1/2 sec). Decreasing bufsize makes the fluctuations smaller, but even at 1/2 sec the deviations are large, and at 1 sec they are up to 2x the average.
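For anyone who wants to reproduce this, one way to measure the per-second video bitrate is the sketch below (it assumes the video is stream v:0 and that every packet has pts_time set; each output line is one 1-second window and its size in bits):

ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,size -of csv=p=0 <output> \
  | awk -F, '{ bits[int($1)] += $2 * 8 } END { for (t in bits) print t, bits[t] }' \
  | sort -n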

I tried multiple versions of ffmpeg and different clips. Is this a known issue?

Can anyone post an example of a case in which ffmpeg really achieves CBR or, say, 110% VBR?

Thanks!


Manuel Tiglio
2017-07-31 01:52:08 UTC
Hi there,

I apologize for the delay in replying; I found that the mailing list messages were being sent to my junk folder :-(
Post by Manuel Tiglio
ffmpeg -i <input> -c:v libx264 -pass 1 -f mp4 /dev/null
ffmpeg -i <input> -c:v libx264 -b:v avg -maxrate max -minrate min -bufsize buf -pass 2 <output>
Have you tried using the identical video encoding settings for pass 1
and pass 2? I seem to recall that that is important.
Yes, I have tried all kinds of combinations. Interestingly enough, different options do make a difference, in the sense that the results are slightly different, but still no luck in controlling the bitrate.
Post by Manuel Tiglio
Can anyone post an example of a case in which ffmpeg really gets CBR or say 110% VBR?
I can't, but I can point out that ffmpeg mostly does ABR, not CBR.
Thanks for sharing! But apparently I cannot even do 110% VBR; it somewhat controls the bitrate fluctuations, but far from what it should. I would imagine that people have already gone through this, so I am kind of puzzled.

Thanks again so much.

Manuel
Manuel Tiglio
2017-07-31 19:14:47 UTC
Hi Jonathan,
This helps
Hi,
ffmpeg -i input.mp4 -c:v libx264 -x264-params "nal-hrd=cbr" -b:v 1M -minrate 1M -maxrate 1M -bufsize 2M
Source: https://trac.ffmpeg.org/wiki/Encode/H.264
Will give it a try. I had seen other forums discussing this same issue and saying that this should work, but "nal-hrd=cbr" only works for producing TS files. I was wondering if one could also control the bitrate before that, at the MP4 level.
Notice that in the documentation that you quote, just below this example, it says
Constrained encoding (VBV / maximum bit rate) <https://trac.ffmpeg.org/wiki/Encode/H.264#ConstainedencodingVBVmaximumbitrate>
ffmpeg -i input -c:v libx264 -crf 23 -maxrate 1M -bufsize 2M output.mp4
This will effectively "target" -crf 23, but if the output were to exceed 1 MBit/s, the encoder would increase the CRF to prevent bitrate spikes. However, be aware that libx264 does not strictly control the maximum bit rate as you specified (the maximum bit rate may be well over 1M for the above file). To reach a perfect maximum bit rate, use two-pass.
ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1M -bufsize 2M -pass 1 -f mp4 /dev/null
ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1M -bufsize 2M -pass 2 output.mp4
And that does not seem to work as advertised. That is, the two-pass encoding still gives a max rate which is far from the specified target (1M in the above example).
Somebody must have gone through this before.
Of course it needs a bit of modification to fit your two-pass encoding.
That would be fine. But ideally I would like to have the mp4 files with bitrate control before transmuxing, and keep those mp4 files in case they need to be reused for a number of reasons.
Are you just worried about HLS, or do you really need the bitrate to stay this close to the target?
Because we do live streaming with crf and maxrate/bufsize over HLS and have never had problems with it.
Interesting, as in the above? I.e.
ffmpeg -i input.mp4 -c:v libx264 -x264-params "nal-hrd=cbr" -b:v 1M -minrate 1M -maxrate 1M -bufsize 2M
What does seem to work in ffmpeg is capped CRF (in the sense that the obtained peak bitrate is close to the specified cap value), but then the lower bitrates are far below that cap and there is degradation in quality.
Ideally I would like to do 110% VBV, which, modifying the 2-pass example, should be achieved by
ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1M -bufsize 2M -pass 1 -f mp4 /dev/null
ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1.1M -bufsize 2M -pass 2 output.mp4
Or with a bufsize of 1M (one sec)
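Spelled out, and using identical rate-control settings on both passes (which, as was suggested earlier in this thread, is supposed to matter), the one-second-buffer variant I have in mind would be something like:

ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1.1M -bufsize 1M -pass 1 -f mp4 /dev/null
ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1.1M -bufsize 1M -pass 2 output.mp4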
Thanks Jonathan, would you mind following up on my comments above? I really appreciate it.
Manuel
Hi Manuel,
I don't believe nal-hrd=cbr is only for TS, because it is an x264 parameter, not an ffmpeg parameter. And as far as I know, x264 doesn't know about TS; TS is just the container, or not?
There is also a mode, nal-hrd=vbr; maybe this fits your need better, because you don't want 100% constant.
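For example (untested, just adapting the wiki command from above to the vbr mode):

ffmpeg -i input.mp4 -c:v libx264 -x264-params "nal-hrd=vbr" -b:v 1M -maxrate 1.1M -bufsize 1M output.mp4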
Thanks Jon, let me try that out and report back to you guys. You are right, I'd prefer 110%.

I had read that those parameters are for TS; in fact, in the example that you sent from that ffmpeg page, it actually reads

ffmpeg -i input.mp4 -c:v libx264 -x264-params "nal-hrd=cbr" -b:v 1M -minrate 1M -maxrate 1M -bufsize 2M output.ts

But I will give it a shot.


Manuel
Andy Furniss
2017-07-31 20:19:05 UTC
Post by Manuel Tiglio
ffmpeg -i input.mp4 -c:v libx264 -x264-params "nal-hrd=cbr" -b:v 1M -minrate 1M -maxrate 1M -bufsize 2M output.ts
But I will give it a shot.
Also see x264 --fullhelp WRT nal-hrd=cbr; it may not be what you need: no mp4, and possible "filler" padding.

x264 --fullhelp | grep nal-hrd -A 3
      --nal-hrd <string>      Signal HRD information (requires vbv-bufsize)
                              - none, vbr, cbr (cbr not allowed in .mp4)
      --filler                Force hard-CBR and generate filler
                              (implied by --nal-hrd cbr)


Manuel Tiglio
2017-07-31 01:58:00 UTC
Hi guys,
Have you tried using the identical video encoding settings for pass 1
and pass 2? I seem to recall that that is important.
Interesting that you say that. I use "-an" for the first pass and then
"-codec:a libmp3lame -b:a 128k" for the second pass when doing two-pass
recording, and it works fine. My impression was that audio was
irrelevant for the first pass, since you were only building a set of
compression guide data for the video compressor to use in pass two.
Please note that I wrote "identical video encoding settings". Indeed,
the docs say it's okay to omit the audio on first pass, while other
sources say it makes a difference, as does the target container.
The parameters above work for me. I tried VBR and could never get audio
that was synchronized properly with the video, using LAME. Since 128k
CBR gives acceptable sound, to my ear, and good compression, I wasn't
inclined to pursue it further.
Are you talking audio? Yes, libmp3lame hits the target bitrate (be it
ABR or CBR) pretty much spot on. But Manuel obviously asked about video.
You are right Moritz, I am talking about video.
ffmpeg -i <input> -c:v libx264 -pass 1 -f mp4 /dev/null
ffmpeg -i <input> -c:v libx264 -b:v avg -maxrate max -minrate min -bufsize buf -pass 2 <output>
And x264 and x265 (or possibly H.264 and H.265) are known to be tricky.
I guess so, but has anyone had "luck" trying to control the video bitrate fluctuations? I am more than happy to share my experience and details with some CC video, such as Sintel, or any other; I haven't done that yet so as not to spam everyone, but it is kind of interesting, I'd say.

Decreasing the buffer size does reduce the fluctuations, but going down to 0.5 secs already seems like a stretch.

Cheers

Manuel
Nicolas George
2017-07-31 21:52:01 UTC
Correct. I tried first fixing the distance between keyframes to 2
seconds and then 2-pass VBV, and the other way around, and also not
fixing the distance between keyframes.
I think I see your point now. To answer your question, the average is
taken over the entire video, so on a much longer timescale than the
distance between keyframes. By looking at the data (I can send you
some plots) I'd say that decreasing the averaging time would not
change much, but I can try. Any recommendations on that?
Can you give a typical example of what you mean by that? CBR is
essentially 100% constrained VBR (i.e. ideally no fluctuations of the
bitrate from its average).
Except it does not mean anything practical.
What do you mean?
Ok, I will try to explain one last time.

You never have "100% constrained VBR". That would mean every single
frame has exactly the same size. There is only one class of codecs that
does that: uncompressed video. Yes, with uncompressed video, every full-HD
image will use exactly 6220800 octets (assuming no chroma sub-sampling
and 8-bit depth), and every minute of movie will fill a dual-layer DVD.
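(That is 1920 x 1080 pixels x 3 octets per pixel = 6220800 octets per image; at 25 frames per second, that comes to roughly 9.3 giga-octets per minute.)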

But compressed codecs are very different. I already mentioned I-frames,
which are much larger than P- and B-frames. But that is not all. Simple
scenes, like a shot of the sky, need much less data than complex scenes,
like water or foliage. Still scenes require much less data than fast
scenes.

For actual streaming, the client has a buffer, with a certain capacity
that amounts to a certain duration. If you want the streaming to go
smoothly, the constraint is that the average bit rate over a window with
the same width as the buffer is never beyond the bandwidth.

So, to achieve smooth streaming, you need to know the size of the buffer
of your clients, or at least a lower bound for it, and set your encoder
accordingly.
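For instance (the numbers here are made up, only the structure matters): if your clients are known to buffer at least 2 seconds and the link is 1 Mbit/s, the corresponding encoder settings would be:

ffmpeg -i input -c:v libx264 -b:v 1M -maxrate 1M -bufsize 2M output.mp4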

Now, about two-pass encoding.

Remember that your viewers will always notice the parts with the worst
quality. Therefore, the most pleasing result will be achieved when the
quality is as constant as possible. If you have no constraint at all,
then set a constant quality (crf for x264).

But if you have bit rate constraints, it is the same as having a budget:
you have to make sacrifices, and you had better choose the sacrifices
that will be the most painless.

Let us take an example: you must encode one minute into 4 mega-octets.
If the minute is made of 30 seconds fast complex content and 30 seconds
slow simple content, then you want to invest 3 mega-octets into the
first part and only 1 into the second part. But if the content is fast
and complex for the whole minute, then you need to invest 2 mega-octets
for both halves; the quality will be lower, but you cannot do anything
about it if you are on a budget.
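(In numbers: 4 mega-octets over 60 seconds is an average of roughly 533 kbit/s; the 3/1 split gives roughly 800 kbit/s for the first half and 267 kbit/s for the second, while the even split gives 533 kbit/s for both halves.)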

That is where two-pass encoding comes into play: when the codec is
processing the first 30 seconds, it does not know about the contents of
the second half, so it cannot decide whether to cut 50%-50% or 75%-25%
or anything else. So you make a first run to collect statistics about
the complexity of the video, and then a second pass to allocate your
budget according to these statistics.

But encoders also have a buffer. They do not encode one frame at a time,
they encode many at once and make global decisions on them. If the
window of the encoder is larger than the window of the decoder, then
two-pass encoding is completely useless.
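(If you want to experiment with this, I believe the relevant encoder window in x264 is mainly its rate-control lookahead; something like "-x264-params rc-lookahead=25" should shrink it, but check the x264 documentation, this is from memory.)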

As for how to apply this to your specific problem, only you can know the
constraints.

Regards,
--
Nicolas George
Manuel Tiglio
2017-07-31 22:12:56 UTC
Post by Nicolas George
[...]
So, to achieve smooth streaming, you need to know the size of the buffer
of your clients, or at least a lower bound for it, and set your encoder
accordingly.
[...]
Thanks Nicolas, so your point seems to be: do not worry about peak bitrates as such, only about average bitrates over the buffer window and about buffer sizes. Is that a more or less accurate summary?
Manuel Tiglio
2017-07-31 22:14:05 UTC
Post by Nicolas George
[...]
That is where two-pass encoding comes into play: when the codec is
processing the first 30 seconds, it does not know about the contents of
the second half, so it cannot decide whether to cut 50%-50% or 75%-25%
or anything else. So you make a first run to collect statistics about
the complexity of the video, and then a second pass to allocate your
budget according to these statistics.
Sure, but that gets us back to my original question: can anyone control the maxrate in ffmpeg for x264?
Andy Furniss
2017-07-31 19:02:54 UTC
1.23. For VOD content the peak bit rate SHOULD be no more than
200% of the average bit rate.
Without a proper definition of "peak bit rate", this sentence is
meaningless.
Are you saying that Apple’s authoring requirements for HLS are
meaningless?
A bit rate is a quantity of information divided by a time. Since it
is discrete, there is no way of doing calculus, and therefore there
is no notion of instantaneous bit rate. Thus, without saying the
period of time on which the bit rate is computed, it does not mean
anything.
When doing streaming you typically send segments of 6-10 secs, so over
that interval the peak (i.e. maximum) bitrate should not exceed the
average bitrate of that interval by more than 10% (110% constrained
VBR) or 100% (200% constrained VBR).
But the exact value of that length of time is irrelevant.
The fact that it is a discrete series is also irrelevant; you can
compute discrete bitrates in the same way that you compute finite
differences (for example).
This is really standard for anyone working on streaming. What is not
standard is that ffmpeg shows such large fluctuations between the peak
and average bitrate.
You are encoding with libx264, not ffmpeg; ffmpeg is just passing the parameters along.

What are you using to measure these "large fluctuations" and over what
timescale?

I see in the guide you linked that chunks are required to start with an
I-frame - these, at the same quality, are usually far bigger than B- or
P-frames, so if you measure over a small time window there is bound to
be high variation.
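One quick way to see the effect of the measurement window (a sketch; assumes the video is stream v:0 and that pts_time is present) is to bucket packet sizes into windows of w seconds and compare peak to average:

ffprobe -v error -select_streams v:0 -show_entries packet=pts_time,size -of csv=p=0 output.ts \
  | awk -F, -v w=6 '{ b[int($1/w)] += $2 * 8 / w }
      END { for (t in b) { s += b[t]; n++; if (b[t] > m) m = b[t] };
            printf "avg %.0f bit/s peak %.0f bit/s ratio %.2f\n", s/n, m, m/(s/n) }'

Try w=1 versus w=6 - the peak/average ratio will usually be much larger over 1-second windows, for exactly the I-frame reason above.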

FWIW, the player buffer size constraint, as I understand it, is meant to
absorb this sort of thing when the stream is fed at a constant/limited
rate. It is not a parameter that makes the bitstream constant-rate at a
small scale; it means that a player using that buffer size at a certain
"feed" rate will not underflow or overflow when it meets short-term
fluctuations. If you set it too small in an effort to make the encoder
avoid these fluctuations, you will likely end up with I-frame pulsing.