Human voice editing
I am attaching an example of my voice (full of imperfections) for advice on how to improve it. It was recorded with a Tascam DR-40 and has not been edited.
User Michi provided generic settings for an Equalizer:
Frequency    Gain       Bandwidth   Type
100 Hz       -Inf.      12          Low Shelf
230 Hz       -5.0 dB    2.0         Band
500 Hz       -3.0 dB    2.0         Band
2500 Hz      +5.0 dB    3.0         Band
10000 Hz     -Inf.      12          High Shelf
Using the LV2_Calf_EQ_5_Band plugin I got a good improvement, although the voice may now sound too much like a "telephone".
Again Michi provided an example suitable for my voice, using CinGG's native plugin: EQ Graphic (see image1).
This leads to even better results.
As a complete beginner I wanted to ask some theoretical/practical questions about the use of Equalizers:
1- Do you examine the frequency spectrum of the entire audio file and intervene by trying to level out peaks and valleys?
2- If I lower the bass frequencies in the graphic EQ, at the same time the high frequencies are lowered, as if there is a low-pass filter. Is this normal? (see image2)
3- I've tried some de-essers but don't know how to make them work. Any advice?
4- Is it better to do sound editing inside CinGG or use external programs (Ardour, Audacity, etc.)?
EDIT: changed audio file.
Is CinGG self-sufficient for sound editing, or do you have to use external programs?
I have made several feature films. All the video editing is done under cinelerraGG and the audio via ardour. Then the video and audio mixing is done in avidemux.
This works very well but I realised that any changes to the video editing had to be reflected in ardour, which quickly becomes complicated.
Moreover, using several applications is not without risk either.
So a month ago I decided to do everything in cinelerraGG.
Currently the video editing of my last film is finished so I started editing the audio with all the corresponding filters in ardour.
I have to say that for now:
1- Editing the video afterwards is simplified, everything is done in cinelerraGG.
2- In my opinion, the audio filters are comparable to Ardour's but less intuitive.
3- Mixing the audio tracks is more complicated than in ardour.
So for now I have to say that I am not disappointed but I am not finished yet.
(Copied over from MantisBT).
I loaded your voice from your video into Cinelerra and applied the EQ Graphic. See screenshot. Try it out. Of course, the goal is not to have the peaks of the volume distribution all smoothed out, but you can see where there are a lot of valleys and mountains. Try it out and see where you like your voice best.
Oh, how wonderful! Yes, I would like to follow the approach of fary54 as well: Do everything with one application, that is a video editor. My desire is to work with Cinelerra. I'm willing to give up that last bit of quality. (Otherwise, I guess I would have had to stick with MS Windows and Vegas Pro).
Unfortunately, the audio effects are often not that intuitive to use. Some are even horrible.
But I imagine the three of us can achieve a lot together. And everyone else who wants to use Cinelerra can benefit from it, too.
My suggestion is that we first identify a few features that are necessary to improve the sound of a video. I want to start from a proper audio recording. Bad, noisy or clipped recordings are a special case and need to be considered separately.
For me, it is currently quite enough to improve the sound a bit with an equalizer and a compressor. How is it with you?
As far as @Michi's proposal on audio enhancement is concerned, I am not against participating according to my knowledge.
Besides, as I said in my previous messages, I am looking for a reliable and viable solution to do everything via CinelerraGG.
At the moment, based on @Michi's advice I am busy discovering the F_acompressor filter.
I will come back to give my impressions later.
What is the position of @andreapaz on this partnership? Does he perhaps have a more suitable solution?
Looking forward to hearing from you.
Translated with www.DeepL.com/Translator (free version)
Unfortunately I have no experience in sound editing. In the videos I made for YouTube I used Audacity only for denoising, compression and normalization. With equalization I made the voice worse instead of better. I am now running tests, starting from the parameters given by Michi (Hans in the forum). But I lack the theoretical basis.
The creator of Cinelerra is a musician and the first version was only for audio; only later was the video part added. But Adam created the tools for his own use and for a beginner like me it's difficult to understand how they work. Maybe using Jack and a low-latency kernel would make things better.
In this regard I must say that the part of the manual on audio plugins is the most neglected and lacking, simply for lack of competence. If you have pointers on how best to explain the various plugins, you are welcome. Probably a rewrite from the beginning would be better....
Glen is a musician and creator of the sound mixing oriented linux distribution called AVLinux. He is an early Cinelerra user. I remember that he thinks the same as you about CinGG: it has several tools but not very usable. That's why he uses Ardour for his productions. I think it is the same for another musician user: Dan Kinzelman. It would be nice to hear their opinion.
Can your productions be found anywhere (Amazon, etc)?
Hello @fary54 and @andreapaz
Is it not possible to make decent videos with Cinelerra without having to edit the audio externally? That would be a bit disillusioning for me.
My experience so far: For me it now works well with CIN. I have very high demands on the sound! However, they are good recordings (Rode NTG) of only one spoken voice.
As I have already mentioned, I switched from Windows and the very good Vegas Pro to Manjaro Linux. So I would like to ask you directly: what is the best way to make videos with Linux? It would dash a very big hope if it did not work at all.
Hello, here is my attempt to solve the compressor problem.
The F_acompressor is probably the best in Cinelerra. At the beginning I had big problems with it because I didn't understand the values. Now I know how to understand them:
dB    Amplitude ratio
-3    0.708 ≈ √(1/2)
-6    0.501 ≈ 1/2
But maybe everything is not so complicated. There is a certain basic rule that helps well:
My preferred settings for spoken voice are:
Threshold: approx. -10 dB to -30 dB, depending on the sound
Ratio: 3:1 or 4:1, sometimes more
Attack: approx. 7-10 ms
Release: approx. 20-30 ms
Compression: approx. -5 dB to -6 dB
Output gain: as much as the material was compressed
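To make the relationship between threshold, ratio and the resulting compression more concrete, here is a minimal sketch of a static compressor curve. It assumes an idealized hard-knee compressor and ignores attack and release entirely, so it is not how F_acompressor is actually implemented. The function names and the example levels are hypothetical.

```python
def compressor_output_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Output level (dB) of an idealized hard-knee compressor.

    Assumption: levels below the threshold pass unchanged; above it,
    every `ratio` dB of input produces only 1 dB more output.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


def gain_reduction_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """How many dB the compressor takes away at this input level (<= 0)."""
    return compressor_output_db(input_db, threshold_db, ratio) - input_db


# Hypothetical example: threshold -20 dB, ratio 4:1.
for level in (-30.0, -20.0, -12.0):
    out = compressor_output_db(level, -20.0, 4.0)
    red = gain_reduction_db(level, -20.0, 4.0)
    print(f"in {level:6.1f} dB -> out {out:6.1f} dB (reduction {red:5.1f} dB)")
```

With these example values, a peak at -12 dB is reduced by 6 dB, which is in the range of compression mentioned above. The output gain (makeup) would then be raised by roughly the same amount.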
Now a graphical display of the compressor would be very useful. You would have to be able to see how much volume is lost through compression and compensate for it through the makeup. But we have to do without this help here. The normal compressor (single band compressor) in CIN shows that. That's why I was looking for it in the beginning - but unfortunately I didn't get along with it, and I think it only worsens the audio.
Hans = Michi
"Can your productions be found anywhere (Amazon, etc)?"
My productions are only for private use and are not visible elsewhere.
A) "I didn't understand the values"
To convert a dB value to an amplitude ratio, you can simply use the formula 10^(dB/20).
For example, the amplitude ratio for -20 dB is 10^(-20/20) = 10^(-1) = 0.1.
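For anyone who prefers to let the computer do the conversion, here is a small sketch of the formula in both directions. The function names are made up for illustration; only the formula itself comes from this thread.

```python
import math


def db_to_amplitude(db: float) -> float:
    """Convert a level in dB to an amplitude ratio: 10^(dB/20)."""
    return 10 ** (db / 20)


def amplitude_to_db(ratio: float) -> float:
    """Inverse conversion: 20 * log10(ratio). The ratio must be > 0."""
    return 20 * math.log10(ratio)


# Print a small lookup table for typical threshold values.
for db in (-3, -6, -10, -20, -22):
    print(f"{db:>4} dB -> amplitude ratio {db_to_amplitude(db):.3f}")
```

This reproduces the values 0.708 for -3 dB and 0.501 for -6 dB quoted earlier, and gives 0.100 for -20 dB.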
B) "You would have to be able to see how much volume is lost through compression and compensate for it through the makeup".
The principle is simple: you only need to know the volume before and after the F_acompressor effect to calculate the real loss of volume.
To do this, and I think this is the easiest way:
1) I place a SoundLevel effect before and after the compressor
2) I start playing the film
3) I read the RMS value of my two SoundLevel effects and calculate the difference, which gives me the loss in dB.
Calculation: RMS before F_acompressor - RMS after F_acompressor.
4) And then you can compensate the loss by placing this value in the makeup of the compressor.
What do you think, is this the right approach ?
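The calculation in steps 3 and 4 can be sketched as follows. The two RMS readings are hypothetical example values, not measurements from the attached file. The sketch also assumes that the makeup field of F_acompressor expects an amplitude ratio rather than dB, as the other F_acompressor values discussed in this thread do, so the dB loss is converted with 10^(dB/20).

```python
def makeup_from_rms(rms_before_db: float, rms_after_db: float) -> tuple[float, float]:
    """Return (loss in dB, makeup as amplitude ratio).

    Assumption: both readings are RMS levels in dB taken from the
    SoundLevel effects placed before and after the compressor.
    """
    loss_db = rms_before_db - rms_after_db
    makeup_ratio = 10 ** (loss_db / 20)  # dB -> amplitude ratio
    return loss_db, makeup_ratio


# Hypothetical readings: -18.0 dB before, -23.0 dB after the compressor.
loss, makeup = makeup_from_rms(-18.0, -23.0)
print(f"loss: {loss:.1f} dB, makeup ratio: {makeup:.3f}")
```

With these example readings the loss is 5 dB, which corresponds to a makeup ratio of about 1.78. If the plugin turns out to take makeup in dB instead, the loss value would be used directly.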
Yes, that's a good approach. Unfortunately, I can't interpret the formula 10^(-20/20) = 0.1. What does the sign ^ mean? I will be able to cope with it, especially because I usually need values between -10 dB and -22 dB. I just have to remember these decimal numbers.
But allow me to ask again very clearly: if CIN offers these effects, why is it questioned whether it can be used to edit audio professionally?
And one more question: fary54, you obviously know Cinelerra very well: is it good enough to get seriously into it? Do you use it for most of your work or do you also edit videos with other software?
1) What does this sign ^ mean ?
The sign ^ means exponentiation: 10^x is 10 raised to the power of x.
2) Why is it questioned whether it can be used to edit audio professionally?
I think that the lack of documentation on the audio filters and the fact that they are not intuitive are holding back potential users. I also created my audio in Ardour before I decided to try to do everything via CinelerraGG.
3) Is it good enough to get seriously into it?
Absolutely, but I can also confirm that the approach of the programme is complex because it is unlike any other. However, in the end, the effort made to understand it is worthwhile.
I see nothing that CinelerraGG cannot do.
4) Do you use it for most of your work or do you also edit videos with other software?
I have used it to make more than 40 films of one hour each and I never use any other programme except Ardour which I hope to leave soon to do everything via CinelerraGG.
Very good, then together we will manage to get to know even the old-fashioned audio filters to use them successfully.
Michi asked a pro (thanks Gerald!) to analyze the audio file; here are his suggestions:
"There is definitely too much in the lower mids; a high-pass filter should be applied there
(at approx. 200-250 Hz, slope: -12 dB or even -18 dB).
There is too little in the upper mids (where speech intelligibility lies)! There I recommend
an EQ boost of approx. +4 dB to +5 dB at 1,500-2,000 Hz and at approx. 4,000 Hz, with a wide Q factor.
He should take care that this does not make the "S" sounds too prominent.
Otherwise, use a de-esser."
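To get a feeling for what such a high-pass filter does to the low frequencies, here is a rough sketch. It assumes that the slope figures are meant per octave and that the filter has a Butterworth response (order 2 for 12 dB/octave, order 3 for 18 dB/octave). The EQ Graphic in CinGG is drawn by hand and may behave differently, so this is only an illustration; the function name is hypothetical.

```python
import math


def highpass_attenuation_db(freq_hz: float, cutoff_hz: float, order: int = 2) -> float:
    """Gain in dB of an ideal Butterworth high-pass at freq_hz.

    Assumption: |H(f)|^2 = 1 / (1 + (fc/f)^(2*order)),
    which falls off at about 6 * order dB per octave below the cutoff.
    """
    return -10 * math.log10(1 + (cutoff_hz / freq_hz) ** (2 * order))


# Cutoff at 200 Hz, comparing the two suggested slopes.
for f in (50, 100, 200, 400, 1000):
    a12 = highpass_attenuation_db(f, 200, order=2)
    a18 = highpass_attenuation_db(f, 200, order=3)
    print(f"{f:>5} Hz: {a12:7.2f} dB (12 dB/oct)  {a18:7.2f} dB (18 dB/oct)")
```

At the cutoff itself the level is down by about 3 dB, one octave below it by roughly 12 dB or 18 dB depending on the slope, while the speech range above 400 Hz is left almost untouched.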
On this basis, Michi took the original audio and processed it in CinGG. The resulting file is available here for comparison:
Michi used EQ graphic (see image) and F_acompressor with the following parameters:
The result clearly improves the voice.
Do you have any other advice?
Forgot to mention that Gerald analyzed the voice and sent a clarifying image.
Michi provided the project (xml) with which he did his sound editing (requires the file "raw-audio-voice.wav" in the first post, which must be renamed to "raw-audio-voice 2.wav"):
I would scale back my improvements in CIN a little bit:
1. The compressor is too strong: a makeup of 2.000 is too much; 1.00 would be better.
2. EQ Graphic: To keep the voice from sounding quite so thin, I would no longer cut the low end up to 250 Hz, but only up to 200 Hz, so that more bass remains.
Everyone, especially Andrea, has to decide for himself how he likes it best.
But I think you can achieve a lot with the two effects F_acompressor and EQ Graphic.
If I have the audio file independent of the video, as here with this example, then you don't necessarily have to edit it with CIN. Sometimes it's easier with other tools. But if the audio is connected to the video, then I really appreciate it when the video editor also enhances the audio without having to export, import and then sync it.
I have once again refined the settings to improve the sound. The difference is not great from my previous one. But this time I used a different compressor: The SC4. Surely it is not better than the F_acompressor, because it has fewer settings. But there is one big advantage: It shows the threshold level in dB. This is much easier to imagine than the abstract numbers of the F_acompressor, the amplitude ratio. To use the raw-audio-voice 2 Michi SC4.xml in CIN GG, the audio file from Andrea has to be renamed to raw-audio-voice 2.wav.
My setting can certainly be improved further, but it is already a good start. In the end, the person who wants to make the video decides how he needs the sound.
Audacity is good, Ardour is even better. But for me it's important that sound editing can also happen in the video editor. Because if I later realize how the sound could sound even better, then I just want to improve the settings in the equalizer or compressor, and not have to export the sound, edit it there, and then try to paste it back in sync with the video.