Audio Processing
Narratorr can process audiobook files after import — merging multiple files into a single audiobook file and converting between formats. This is powered by ffmpeg.
Processing is not an automatic step in the import pipeline. Files are imported as-is, and you then trigger processing manually: a per-book Merge from the book page or a bulk Convert job. Processing runs only when an ffmpeg path is configured — there is no master on/off switch.
Requirements
- Docker: ffmpeg is pre-installed in the production image. No setup needed.
- Manual install: Install ffmpeg and configure its path in Settings > Post Processing > ffmpeg Path.
Settings
Configure audio processing in Settings > Post Processing.
| Setting | Description |
|---|---|
| ffmpeg Path | Path to the ffmpeg binary. Auto-detected in Docker. Processing runs only when this is set. |
| Output Format | Target format: m4b (chapters supported) or mp3 (no chapter support). Default: m4b. Applies to both Merge and bulk Convert output. |
| Keep Original Bitrate | Re-encode at (up to) the source bitrate instead of the Target Bitrate. Files are still re-encoded, not copied. Default: on. |
| Target Bitrate | Target bitrate in kbps (32–512). Default: 128. Ignored when Keep Original Bitrate is enabled. |
| Merge Behavior | When the bulk Convert job merges multiple files: Always / Only when multiple files / Never. Default: “Only when multiple files”. (The per-book Merge action always merges — see note below.) |
| Max Concurrent Jobs | Maximum number of manual merge jobs that run at once (1–8). Default: 1. |
| Post-Processing Script | Absolute path to a script run after each successful import. Receives NARRATORR_BOOK_TITLE, NARRATORR_BOOK_AUTHOR, NARRATORR_IMPORT_PATH, and NARRATORR_IMPORT_FILE_COUNT env vars. |
| Script Timeout | Maximum seconds the post-processing script may run. Default: 300. Required when a script is set. |
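As an illustration of the Post-Processing Script setting, here is a minimal sketch of a script that reads the four environment variables listed above and prints a one-line summary. The function name `describe_import` and the fallback values are assumptions made for this example; how Narratorr treats the script's output or exit code is not covered here.

```python
#!/usr/bin/env python3
"""Hypothetical post-processing script: prints a summary of an import."""
import os
from typing import Mapping


def describe_import(env: Mapping[str, str]) -> str:
    """Build a summary line from the variables passed to the script."""
    title = env.get("NARRATORR_BOOK_TITLE", "(unknown title)")
    author = env.get("NARRATORR_BOOK_AUTHOR", "(unknown author)")
    path = env.get("NARRATORR_IMPORT_PATH", "(unknown path)")
    # Environment values are always strings, so convert the count explicitly.
    count = int(env.get("NARRATORR_IMPORT_FILE_COUNT", "0"))
    noun = "file" if count == 1 else "files"
    return f"Imported '{title}' by {author}: {count} {noun} in {path}"


if __name__ == "__main__":
    print(describe_import(os.environ))
```

Keep scripts like this short-running so they finish well inside the configured Script Timeout.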
Tag Embedding
The same Settings > Post Processing page configures Tag Embedding, which writes metadata into the audio file’s tags on import. It offers a Tag Embedding toggle, a Tag Mode (populate missing tags only, or overwrite existing tags), and an Embed Cover Art option. Like the rest of audio processing, tag embedding depends on ffmpeg.
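The difference between the two Tag Modes can be sketched as a small merge over tag dictionaries. This is only an illustration: the mode names `populate_missing` and `overwrite`, the function `merge_tags`, and the choice to treat an empty tag value as missing are all assumptions, not Narratorr's actual implementation.

```python
from typing import Dict


def merge_tags(existing: Dict[str, str], incoming: Dict[str, str], mode: str) -> Dict[str, str]:
    """Combine a file's existing tags with new metadata under a tag mode."""
    if mode == "overwrite":
        # New metadata wins; tags without a new value are kept as they are.
        return {**existing, **incoming}
    if mode == "populate_missing":
        result = dict(existing)
        for key, value in incoming.items():
            # Assumption: an absent or empty tag counts as missing.
            if not result.get(key):
                result[key] = value
        return result
    raise ValueError(f"unknown tag mode: {mode}")
```

For example, with an existing title of "Old" and an empty artist, `populate_missing` keeps the title and fills in the artist, while `overwrite` replaces both.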
Merge Behavior
The Merge Behavior setting controls only the bulk Convert job. The per-book Merge action always merges (it ignores this setting).
| Option | Convert-job behavior |
|---|---|
| Always merge | Combine all audio files into a single output file with chapter markers |
| Only when multiple files | Merge only when a book has more than one audio file; single-file books are just converted to the target format |
| Never (convert only) | Convert each file to the target format individually, without merging |
The per-book Merge action runs only on books that have two or more top-level audio files; single-file books are skipped or rejected. Many audiobook releases ship as 20–50 individual chapter files, and merging them produces a single file (M4B by default) with chapter markers that is easier to manage.
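The rules in this section can be summarised as two small predicates. This is a sketch only: the option keys `always`, `multiple`, and `never` and both function names are hypothetical stand-ins for the three Merge Behavior options and the per-book Merge check.

```python
def convert_job_merges(merge_behavior: str, audio_file_count: int) -> bool:
    """Return True when the bulk Convert job would merge a book's files."""
    if merge_behavior == "always":
        return True
    if merge_behavior == "multiple":
        # "Only when multiple files": single-file books are just converted.
        return audio_file_count > 1
    if merge_behavior == "never":
        return False
    raise ValueError(f"unknown merge behavior: {merge_behavior}")


def per_book_merge_allowed(top_level_audio_file_count: int) -> bool:
    """The per-book Merge action ignores Merge Behavior but needs 2+ files."""
    return top_level_audio_file_count >= 2
```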
How It Works
Import places the downloaded files into the library folder as-is; no processing happens automatically. You then trigger processing on demand:
- From a book’s page, start a Merge (requires 2+ audio files). The job is queued and runs under a concurrency limit, with progress reporting.
- Merge copies the files to a staging directory and runs ffmpeg to combine them in order.
- The output is encoded to the configured Output Format (M4B by default) at the Target Bitrate (unless Keep Original Bitrate is on).
- Narratorr verifies the output, then swaps it into the book folder and deletes the originals.
A bulk Convert job (from settings) applies the same processing across multiple books.
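To make the merge step more concrete, the sketch below builds the input list and command line for ffmpeg's concat demuxer. It is an illustration under stated assumptions, not Narratorr's actual invocation: the function names are hypothetical, files are ordered with a simple natural sort on their names, the encoder is picked from the output extension, and chapter-marker generation, staging, and output verification are left out.

```python
import re
from pathlib import Path
from typing import List


def natural_key(name: str) -> list:
    """Sort key that orders '2.mp3' before '10.mp3'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def concat_list(files: List[Path]) -> str:
    """Render the text file read by ffmpeg's concat demuxer."""
    lines = []
    for f in sorted(files, key=lambda p: natural_key(p.name)):
        # Inside single quotes, a literal quote is written as '\''.
        escaped = f.as_posix().replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def merge_command(ffmpeg: str, list_file: Path, output: Path, bitrate_kbps: int) -> List[str]:
    """Build an ffmpeg argument list that merges and re-encodes the inputs."""
    # Assumption: AAC for .m4b output and LAME for .mp3 output.
    codec = {".m4b": "aac", ".mp3": "libmp3lame"}[output.suffix.lower()]
    return [
        ffmpeg,
        "-f", "concat", "-safe", "0", "-i", list_file.as_posix(),
        "-vn",  # drop video streams such as embedded cover images
        "-c:a", codec,
        "-b:a", f"{bitrate_kbps}k",
        output.as_posix(),
    ]
```

The returned list can be passed to `subprocess.run` once the list file has been written next to the staged copies.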
Bitrate Guidance
| Bitrate | Quality | File Size |
|---|---|---|
| 32 kbps | Low (talk radio) | Smallest |
| 64 kbps | Acceptable for speech | Small |
| 128 kbps | Good for audiobooks (default) | Moderate |
| 192 kbps | High quality | Larger |
| 320+ kbps | Diminishing returns for speech | Large |
For spoken word, 64–128 kbps is the sweet spot. Higher bitrates increase file size without perceptible quality improvement for narration.
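The file-size column follows directly from the arithmetic: size is bitrate multiplied by duration. The sketch below estimates the audio payload for a constant-bitrate encode; the function name is hypothetical, and container overhead and embedded cover art are ignored.

```python
def estimated_size_mb(bitrate_kbps: int, duration_hours: float) -> float:
    """Approximate audio size in megabytes (1 MB = 1,000,000 bytes)."""
    total_bits = bitrate_kbps * 1000 * duration_hours * 3600
    return total_bits / 8 / 1_000_000


# A 10-hour audiobook at the bitrates from the table above.
for kbps in (32, 64, 128, 192):
    print(f"{kbps} kbps: about {estimated_size_mb(kbps, 10):.0f} MB")
```

At 128 kbps a 10-hour book comes to roughly 576 MB, and halving the bitrate to 64 kbps halves that to roughly 288 MB.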
When Keep Original Bitrate is enabled, the Target Bitrate setting is ignored. This does not mean a lossless copy: files are still re-encoded, but at (up to) the source bitrate rather than the target. When it is disabled, the effective bitrate is the lower of the source and target bitrates, so a file is never re-encoded above its source bitrate. Enable Keep Original Bitrate when your source files are already at your preferred quality and you don’t want to force a lower target bitrate.
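One reading of the bitrate rules above can be written as a single function. This is a sketch under assumptions: with Keep Original Bitrate on, the source bitrate is used, and otherwise the lower of source and target applies. The function name is hypothetical, and the real encoder may round or clamp values differently.

```python
def effective_bitrate_kbps(source_kbps: int, target_kbps: int, keep_original: bool) -> int:
    """Pick the bitrate to encode at, never exceeding the source bitrate."""
    if keep_original:
        # Target Bitrate is ignored; the file is still re-encoded.
        return source_kbps
    # Taking the lower value avoids encoding above the source bitrate.
    return min(source_kbps, target_kbps)
```

Under this reading, a 64 kbps source with a 128 kbps target is encoded at 64 kbps, and a 192 kbps source with the same target is encoded at 128 kbps unless Keep Original Bitrate is on.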
Troubleshooting
- “ffmpeg not found”: install ffmpeg or verify the path in Settings > Post Processing > ffmpeg Path.
- Processing takes a long time: merging large audiobooks (20+ hours) is CPU-intensive. This is normal.
- Output file is much larger or smaller than expected: check your bitrate setting. A high bitrate on a low-quality source won’t improve quality but will increase size.
- “Unsupported format”: the source file uses a codec ffmpeg can’t decode. Check ffmpeg’s supported formats (for example, by running ffmpeg -decoders).
- Disk space errors: processing needs temporary space for the output file. Ensure sufficient free space in the library directory.
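The first and last items can be checked before starting a job. The sketch below uses only the Python standard library; the function names are hypothetical, and how much free space a job actually needs is an assumption you have to supply yourself.

```python
import shutil
from typing import Optional


def resolve_ffmpeg(configured_path: Optional[str]) -> Optional[str]:
    """Return a usable ffmpeg executable path, or None if none is found."""
    # Falls back to searching PATH for "ffmpeg" when no path is configured.
    return shutil.which(configured_path or "ffmpeg")


def has_free_space(directory: str, needed_bytes: int) -> bool:
    """Check that the filesystem holding `directory` has enough free bytes."""
    return shutil.disk_usage(directory).free >= needed_bytes
```

A result of `None` from `resolve_ffmpeg` corresponds to the “ffmpeg not found” case.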