Fmiser | 8 Sep 2011 10:47

Re: To normalize volume level in a file

> Safiye Turgut (Garanti Teknoloji) wrote:

> Two methods are appropriate for us. But we choose the second
> method that you said turn up the quiet portions but leave the
> loud parts the same.  All of parts level should be same.
> 
> How can I do it? 

With the compand effect.  The man page and "soxexam" have
information.

The "compand" effect can be set up as a compressor, an
expander, or an AGC (automatic gain control).

Technically, an AGC is a compressor with make-up gain whose
controls are "turned around": you set the target level rather
than the trigger point.  The compand effect can do the same
thing.

I usually use the "play" command to fiddle with the values.

The general form is:

Command [infile] [effect] [attack time],[release time]
[knee:][in dB],[out dB],... [gain] [initial level] [delay]

To even out the level with slow changes, try:

play file.wav compand 1,4 6:-80,-80,-75,-25,0,0 -5 -30 1

This will make a very slow change - maybe too slow. And it
won't help at all for short, quick sounds - like if someone
bumps the mic.

So, to clamp down on the short-duration noises:

play file.wav compand .03,.2 -80,-80,-15,-15,0,-15 -15 -40 .1

Those strings of numbers look daunting, but don't let them
scare you.

The first two are attack and release, in seconds - like any
compressor would have. The smaller the number, the "faster"
the response time. So to keep the gain adjustment from being
affected by a single hand clap or a book dropping, set the
attack slower. Release is then how quickly the gain adjustment
"lets go" of the signal.
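
To see why a slow attack ignores a hand clap, here is a rough
Python sketch of a generic level follower with separate attack
and release times. It is only an illustration, not SoX's code:
the function name, the 1000 Hz "sample rate" and the 10 ms clap
are all assumptions made up for the example.

```python
import math

def follow_level(levels, attack_s, release_s, rate_hz, start=0.0):
    """Smooth input levels (linear amplitude) with attack/release times."""
    # One-pole smoothing coefficients; smaller time -> larger step per sample.
    attack = 1.0 - math.exp(-1.0 / (rate_hz * attack_s))
    release = 1.0 - math.exp(-1.0 / (rate_hz * release_s))
    estimate = start
    followed = []
    for level in levels:
        # Rising input uses the attack time, falling input the release time.
        coeff = attack if level > estimate else release
        estimate += (level - estimate) * coeff
        followed.append(estimate)
    return followed

RATE_HZ = 1000  # assumed rate of the level measurements
# Silence, then a 10 ms "clap" at full level, then silence again.
clap = [0.0] * 100 + [1.0] * 10 + [0.0] * 100

slow = follow_level(clap, attack_s=1.0, release_s=4.0, rate_hz=RATE_HZ)
fast = follow_level(clap, attack_s=0.03, release_s=0.2, rate_hz=RATE_HZ)

print("slow attack peak estimate:", round(max(slow), 4))
print("fast attack peak estimate:", round(max(fast), 4))
```

With the 1 second attack the follower barely notices the clap,
so the gain hardly moves. With the 0.03 second attack it reacts
strongly, which is what you want when the goal is to clamp down
on short noises.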

The next set of comma-separated numbers is a map of input
levels to output levels, in dB. In the first command, the first
pair, "-80,-80", means "if the input level is -80, make the
output level -80". The next pair is "-75,-25", and the last is
"0,0". From -75 up to 0 the input covers 75 dB while the output
covers only 25 dB, which is 3:1 compression. Until there is
another number pair, all the levels in between are treated with
the same ratio. Below is another way to see it; the numbers in
parentheses are implied.

    0      0
  (-3     -1)
  (-6     -2)
 (-12     -4)
 (-18     -6)
 (-30    -10)
 (-45    -15)
 (-60    -20)
  -75    -25
  -80    -80
  IN     OUT
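
The implied values in that table can be checked with a few
lines of Python. This is only a sketch of straight-line
interpolation between the pairs. It ignores the "6:" soft knee,
and it simply clamps inputs outside the listed range, which is
an assumption of the sketch rather than a statement about what
SoX does there. The function name is made up.

```python
def compand_output_db(in_db, points):
    """Interpolate an output level (dB) from (in_db, out_db) pairs."""
    pts = sorted(points)
    # Assumption: clamp inputs that fall outside the listed pairs.
    x = min(max(in_db, pts[0][0]), pts[-1][0])
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            # Straight line between the two neighbouring pairs.
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# The pairs from the first command: -80,-80,-75,-25,0,0
POINTS = [(-80, -80), (-75, -25), (0, 0)]

for level in (0, -3, -6, -12, -18, -30, -45, -60, -75, -80):
    print(f"{level:>5} -> {compand_output_db(level, POINTS):6.1f}")
```

Every 3 dB of input change above -75 becomes 1 dB of output
change, which is the 3:1 ratio.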

After the pairs is an overall gain, in dB.

That is followed by a "starting position" gain level. Since
the whole point of the effect is to respond to the input
signal over time, it takes time to adjust at the very start of
the file. This lets _you_ guess what the incoming audio level
will be to give the effect a head start.

The last number is the delay: how far behind the level
measurement the gain adjustment is applied. Having it match
the "attack" value is nearly always best.

With the first command, the compand part is adding gain. In
the second command I turn that around. I set the attack and
release much faster. Then, instead of turning up the output
compared to the input, I turn down the output.

After processing the audio with both of these, I would apply
some form of "normalizing". SoX can only do this with peak
values - which do _not_ correspond to how our ears will hear
it.  There is a Linux tool called "normalize-audio" that bases
its gain adjustment on RMS values, not peak, which makes it
a much better match to what we hear.
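
To illustrate the peak versus RMS point, here is a small Python
sketch. The two test signals are made up: a steady tone, and a
much quieter tone with one loud click in it. They have about
the same peak, so a peak normalizer would treat them alike, but
the RMS values show the large difference you would actually
hear.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 20 * math.log10(math.sqrt(mean_square))

RATE = 8000  # one second of made-up audio at 8 kHz

# A steady 440 Hz tone at half of full scale.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]

# The same tone 20 dB quieter, with a single loud click added.
quiet_with_click = [0.1 * s for s in tone]
quiet_with_click[100] = 0.5

for name, sig in (("tone", tone), ("quiet + click", quiet_with_click)):
    print(f"{name:>14}: peak {peak_db(sig):6.1f} dB, rms {rms_db(sig):6.1f} dB")
```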

Better yet would be what's called "replay-gain".  This uses RMS,
statistical analysis, and EQ filters to make it an even better
match to how human ears would hear it.

I have used the Linux tool wavegain as well as vorbisgain and
metaflac.  I think wavegain is the only one that will actually
change the file.  The others just create a tag so the playback
software can adjust the level accordingly.

-- Philip
