Is there an elegant way to split a file by chapter using ffmpeg?
On this page, Albert Armea shares code to split videos by chapter using ffmpeg. The code is straightforward, but not very elegant.
ffmpeg -i "$SOURCE.$EXT" 2>&1 |
grep Chapter |
sed -E "s/ *Chapter #([0-9]+\.[0-9]+): start ([0-9]+\.[0-9]+), end ([0-9]+\.[0-9]+)/-i \"$SOURCE.$EXT\" -vcodec copy -acodec copy -ss \2 -to \3 \"$SOURCE-\1.$EXT\"/" |
xargs -n 11 ffmpeg
Is there an elegant way to do this job?
Solution 1:[1]
ffmpeg -i "$SOURCE.$EXT" 2>&1 | # get metadata about the file
grep Chapter | # search for Chapter in the metadata and pass the results on
sed -E "s/ *Chapter #([0-9]+.[0-9]+): start ([0-9]+.[0-9]+), end ([0-9]+.[0-9]+)/-i \"$SOURCE.$EXT\" -vcodec copy -acodec copy -ss \2 -to \3 \"$SOURCE-\1.$EXT\"/" | # rewrite each chapter line as an ffmpeg argument list with explicit start and end timecodes
xargs -n 11 ffmpeg # call ffmpeg once for each group of 11 arguments
Your command parses the file's metadata and reads out the timecode markers for each chapter. You could do this manually for each chapter:
ffmpeg -i ORIGINALFILE.mp4 -acodec copy -vcodec copy -ss 0 -t 00:15:00 OUTFILE-1.mp4
Or you can read out the chapter markers and loop over them with this bash script, which is a little easier to read:
#!/bin/bash
# Author: http://crunchbang.org/forums/viewtopic.php?id=38748#p414992
# m4bronto
# Chapter #0:0: start 0.000000, end 1290.013333
# first _ _ start _ end
while [ $# -gt 0 ]; do
    # ffmpeg prints the file metadata on stderr
    ffmpeg -i "$1" 2> tmp.txt
    while read -r first _ _ start _ end; do
        if [[ $first = Chapter ]]; then
            read # discard line with Metadata:
            read _ _ chapter
            # ${start%?} strips the trailing comma; 128k means 128 kbit/s (a bare 128 is read as bits per second)
            ffmpeg -vsync 2 -i "$1" -ss "${start%?}" -to "$end" -vn -ar 44100 -ac 2 -ab 128k -f mp3 "$chapter.mp3" </dev/null
        fi
    done <tmp.txt
    rm tmp.txt
    shift
done
Or you can use HandBrakeCLI, as originally mentioned in this post. This example extracts chapter 3 to 3.mkv:
HandBrakeCLI -c 3 -i originalfile.mkv -o 3.mkv
Another tool, mkvmerge, is mentioned in this post:
mkvmerge -o output.mkv --split chapters:all input.mkv
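The grep/sed pipeline relies on ffmpeg printing one line per chapter, in the form shown in the script comment (Chapter #0:0: start 0.000000, end 1290.013333). As a rough sketch of the same parsing step in Python, assuming that line format holds for your ffmpeg build (the name parse_chapter_lines is made up for illustration):

```python
import re

# Matches lines such as "    Chapter #0:0: start 0.000000, end 1290.013333".
# The question's regex expects "#0.0" while the script comment shows "#0:0",
# so both separators are accepted here.
CHAPTER_RE = re.compile(
    r"\s*Chapter #(\d+[.:]\d+): start (\d+\.\d+), end (\d+\.\d+)"
)


def parse_chapter_lines(text):
    """Return (chapter_id, start_seconds, end_seconds) for each chapter line."""
    chapters = []
    for line in text.splitlines():
        match = CHAPTER_RE.match(line)
        if match:
            chapter_id, start, end = match.groups()
            chapters.append((chapter_id, float(start), float(end)))
    return chapters


if __name__ == "__main__":
    # Hypothetical excerpt of what "ffmpeg -i file" writes to stderr.
    sample = (
        "    Chapter #0:0: start 0.000000, end 1290.013333\n"
        "    Metadata:\n"
        "      title           : Chapter 1\n"
    )
    print(parse_chapter_lines(sample))
```

Lines that are not chapter lines (Metadata:, title, stream info) are simply skipped, which is what grep Chapter does in the shell version.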
Solution 2:[2]
A version of the original shell code with:
- improved efficiency by
  - using ffprobe instead of ffmpeg
  - splitting the input rather than the output
- improved reliability by avoiding xargs and sed
- improved readability by using multiple lines
- carrying over of multiple audio or subtitle streams
- remove chapters from output files (as they would be invalid timecodes)
- simplified command-line arguments
#!/bin/sh -efu
input="$1"
ffprobe \
    -print_format csv \
    -show_chapters \
    "$input" |
cut -d ',' -f '5,7,8' |
while IFS=, read start end chapter
do
    ffmpeg \
        -nostdin \
        -ss "$start" -to "$end" \
        -i "$input" \
        -c copy \
        -map 0 \
        -map_chapters -1 \
        "${input%.*}-$chapter.${input##*.}"
done
To prevent it from interfering with the loop, ffmpeg is instructed not to read from stdin.
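For anyone who prefers to assemble the same call from a program, here is a minimal sketch of the per-chapter ffmpeg argument list in Python. It only mirrors the options used in the shell loop; the function name build_split_command and the sample values are hypothetical, and os.path.splitext is used as a rough stand-in for the ${input%.*} and ${input##*.} expansions.

```python
import os


def build_split_command(input_path, start, end, chapter):
    """Build the ffmpeg argument list for one chapter.

    Mirrors the shell loop: -ss/-to come before -i, all streams are
    copied, and chapter metadata is dropped from the output.
    """
    base, ext = os.path.splitext(input_path)  # e.g. ("movie", ".mkv")
    return [
        "ffmpeg", "-nostdin",
        "-ss", start, "-to", end,
        "-i", input_path,
        "-c", "copy",
        "-map", "0",
        "-map_chapters", "-1",
        f"{base}-{chapter}{ext}",
    ]


if __name__ == "__main__":
    # Hypothetical values; real ones would come from ffprobe -show_chapters.
    print(build_split_command("movie.mkv", "0.000000", "135.000000", "Intro"))
```

The resulting list can be passed to subprocess.run(cmd, check=True); because it is a list rather than a shell string, spaces in the file name or chapter title need no extra quoting.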
Solution 3:[3]
A little simpler than extracting the data with sed: use ffprobe's JSON output with jq.
#!/usr/bin/env bash
# For systems where "bash" is not in "/bin/"
set -efu
videoFile="$1"
ffprobe -hide_banner \
    "$videoFile" \
    -print_format json \
    -show_chapters \
    -loglevel error |
jq -r '.chapters[] | [ .id, .start_time, .end_time | tostring ] | join(" ")' |
while read chapter start end; do
    ffmpeg -nostdin \
        -ss "$start" -to "$end" \
        -i "$videoFile" \
        -map 0 \
        -map_chapters -1 \
        -c copy \
        -metadata title="$chapter" \
        "${videoFile%.*}-$chapter.${videoFile##*.}";
done
I use the tostring jq function because chapters[].id is an integer.
Solution 4:[4]
I modified Harry's script to use the chapter name for the filename. It outputs into a new directory with the name of the input file (minus extension). It also prefixes each chapter name with "1 - ", "2 - ", etc in case there are chapters with the same name.
#!/usr/bin/env python3
import os
import re
import pprint
import subprocess as sp
from os.path import basename
from optparse import OptionParser

def parseChapters(filename):
    chapters = []
    command = ["ffmpeg", '-i', filename]
    output = ""
    m = None
    m1 = None
    title = None
    chapter_match = None
    try:
        # ffmpeg requires an output file and so it errors
        # when it does not get one so we need to capture stderr,
        # not stdout.
        output = sp.check_output(command, stderr=sp.STDOUT, universal_newlines=True)
    except sp.CalledProcessError as e:
        output = e.output
    num = 1
    for line in output.splitlines():
        x = re.match(r".*title.*: (.*)", line)
        print("x:")
        pprint.pprint(x)
        print("title:")
        pprint.pprint(title)
        if x is None:
            m1 = re.match(r".*Chapter #(\d+:\d+): start (\d+\.\d+), end (\d+\.\d+).*", line)
            title = None
        else:
            title = x.group(1)
        if m1 is not None:
            chapter_match = m1
        print("chapter_match:")
        pprint.pprint(chapter_match)
        # A chapter is complete once both its timecode line and its title line have been seen.
        if title is not None and chapter_match is not None:
            m = chapter_match
            pprint.pprint(title)
        else:
            m = None
        if m is not None:
            chapters.append({"name": str(num) + " - " + title, "start": m.group(2), "end": m.group(3)})
            num += 1
    return chapters

def getChapters():
    parser = OptionParser(usage="usage: %prog [options] filename", version="%prog 1.0")
    parser.add_option("-f", "--file", dest="infile", help="Input File", metavar="FILE")
    (options, args) = parser.parse_args()
    if not options.infile:
        parser.error('Filename required')
    chapters = parseChapters(options.infile)
    path, file = os.path.split(options.infile)
    newdir, fext = os.path.splitext(basename(options.infile))
    # os.path.join also works when the input file is in the current directory (empty path)
    outdir = os.path.join(path, newdir)
    os.mkdir(outdir)
    for chap in chapters:
        chap['name'] = chap['name'].replace('/', ':')
        chap['name'] = chap['name'].replace("'", "\'")
        print("start:" + chap['start'])
        chap['outfile'] = os.path.join(outdir, re.sub("[^-a-zA-Z0-9_.():' ]+", '', chap['name']) + fext)
        chap['origfile'] = options.infile
        print(chap['outfile'])
    return chapters

def convertChapters(chapters):
    for chap in chapters:
        print("start:" + chap['start'])
        print(chap)
        command = [
            "ffmpeg", '-i', chap['origfile'],
            '-vcodec', 'copy',
            '-acodec', 'copy',
            '-ss', chap['start'],
            '-to', chap['end'],
            chap['outfile']]
        output = ""
        try:
            # Stop with a readable error if ffmpeg exits with a non-zero status
            output = sp.check_output(command, stderr=sp.STDOUT, universal_newlines=True)
        except sp.CalledProcessError as e:
            output = e.output
            raise RuntimeError("command '{}' return with error (code {}): {}".format(e.cmd, e.returncode, e.output))

if __name__ == '__main__':
    chapters = getChapters()
    convertChapters(chapters)
This took a good bit to figure out since I'm definitely NOT a Python guy. It's also inelegant, as there were many hoops to jump through because it processes the metadata line by line (i.e., the title and the chapter data are found in separate iterations of the loop over the metadata output).
But it works and it should save you a lot of time. It did for me!
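The naming scheme described above (a directory named after the input file, a numeric prefix, and a character whitelist) can be looked at in isolation. This is only a sketch of that one step, reusing the whitelist from the script; chapter_output_path is a hypothetical helper, not part of the original answer.

```python
import os
import re


def chapter_output_path(input_path, number, title):
    """Return '<dir>/<input name without extension>/<number> - <title><ext>'."""
    directory, filename = os.path.split(input_path)
    stem, ext = os.path.splitext(filename)
    # Prefix the counter so chapters with identical titles get distinct names.
    name = f"{number} - {title}".replace("/", ":")
    # Same whitelist as the script: drop anything outside these characters.
    name = re.sub(r"[^-a-zA-Z0-9_.():' ]+", "", name)
    return os.path.join(directory, stem, name + ext)


if __name__ == "__main__":
    # Hypothetical input file and chapter title.
    print(chapter_output_path("books/story.m4b", 1, "Intro/Part?"))
```

Note that the whitelist silently drops non-ASCII letters, so titles in other scripts may end up as just the numeric prefix; widening the character class would be a reasonable tweak.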
Solution 5:[5]
I wanted a few extra things like:
- extracting the cover
- using the chapter name as filename
- prefixing a counter to the filename with leading zeros, so alphabetical ordering will work correctly in every software
- making a playlist
- modifying the metadata to include the chapter name
- outputting all the files to a new directory based on metadata (year author - title)
Here's my script (I used the hint with ffprobe json output from Harry)
#!/bin/bash
input="input.aax"
EXT2="m4a"
json=$(ffprobe -activation_bytes secret -i "$input" -loglevel error -print_format json -show_format -show_chapters)
title=$(echo "$json" | jq -r ".format.tags.title")
count=$(echo "$json" | jq ".chapters | length")
target=$(echo "$json" | jq -r ".format.tags | .date + \" \" + .artist + \" - \" + .title")
mkdir "$target"
# extract the cover image
ffmpeg -activation_bytes secret -i "$input" -vframes 1 -f image2 "$target/cover.jpg"
echo "[playlist]
NumberOfEntries=$count" > "$target/0_Playlist.pls"
for i in $(seq -w 1 "$count");
do
    j=$((10#$i)) # counter without leading zeros
    n=$((j-1)) # zero-based index into .chapters
    start=$(echo "$json" | jq -r ".chapters[$n].start_time")
    end=$(echo "$json" | jq -r ".chapters[$n].end_time")
    name=$(echo "$json" | jq -r ".chapters[$n].tags.title")
    # -acodec must be followed directly by "copy"; -map_chapters -1 drops the chapter list from each output file
    ffmpeg -activation_bytes secret -i "$input" -vn -acodec copy -map_chapters -1 -ss "$start" -to "$end" -metadata title="$title $name" "$target/$i $name.$EXT2"
    echo "File$j=$i $name.$EXT2" >> "$target/0_Playlist.pls"
done
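The zero-padded counter and the playlist are the two parts of this script that are easy to get subtly wrong, so here is a small Python sketch of just that logic. It assumes the same .pls layout the script writes ([playlist], NumberOfEntries, then one FileN line per chapter); build_playlist and the sample titles are made up for illustration.

```python
def build_playlist(chapter_titles, ext="m4a"):
    """Return (filenames, pls_text) for a list of chapter titles."""
    # Pad to the width of the largest counter, like "seq -w 1 $count" does.
    width = len(str(len(chapter_titles)))
    filenames = [
        f"{index:0{width}d} {title}.{ext}"
        for index, title in enumerate(chapter_titles, start=1)
    ]
    lines = ["[playlist]", f"NumberOfEntries={len(filenames)}"]
    lines += [f"File{i}={name}" for i, name in enumerate(filenames, start=1)]
    return filenames, "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical chapter titles.
    names, pls = build_playlist(["Opening Credits", "Chapter 1", "Chapter 2"])
    print(pls)
```

With ten or more chapters the file names become 01, 02, ... so that plain alphabetical sorting keeps them in order, while the FileN keys in the playlist stay unpadded.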
Solution 6:[6]
I was trying to split an .m4b audiobook myself the other day, and stumbled over this thread and others, but I couldn't find any examples using batch-script. I don't know python or bash, and I am no expert in batch at all, but I tried to read up on how one might do it, and came up with the following which seems to work.
This exports MP3 files, numbered by chapter, to the same folder as the source file:
@echo off
setlocal enabledelayedexpansion
for /f "tokens=2,5,7,8 delims=," %%G in ('c:\ffmpeg\bin\ffprobe -i %1 -print_format csv -show_chapters -loglevel error 2^> nul') do (
set padded=00%%G
"c:\ffmpeg\bin\ffmpeg" -ss %%H -to %%I -i %1 -vn -c:a libmp3lame -b:a 32k -ac 1 -metadata title="%%J" -id3v2_version 3 -write_id3v1 1 -y "%~dpnx1-!padded:~-3!.mp3"
)
For your video file, I have changed it to the following, which handles both video and audio data by straight copying. I don't have a video file with chapters, so I can't test it, but I hope it works.
@echo off
setlocal enabledelayedexpansion
for /f "tokens=2,5,7,8 delims=," %%G in ('c:\ffmpeg\bin\ffprobe -i %1 -print_format csv -show_chapters -loglevel error 2^> nul') do (
set padded=00%%G
"c:\ffmpeg\bin\ffmpeg" -ss %%H -to %%I -i %1 -c:v copy -c:a copy -metadata title="%%J" -y "%~dpnx1-!padded:~-3!.mkv"
)
Solution 7:[7]
In Python:
#!/usr/bin/env python3
import sys
import os
import subprocess
import shlex

def split_video(pathToInputVideo):
    command="ffprobe -v quiet -print_format csv -show_chapters "
    args=shlex.split(command)
    args.append(pathToInputVideo)
    output = subprocess.check_output(args, stderr=subprocess.STDOUT, universal_newlines=True)
    cpt=0
    for line in iter(output.splitlines()):
        dec=line.split(",")
        st_time=dec[4]
        end_time=dec[6]
        name=dec[7]
        command="ffmpeg -i _VIDEO_ -ss _START_ -to _STOP_ -vcodec copy -acodec copy"
        args=shlex.split(command)
        args[args.index("_VIDEO_")]=pathToInputVideo
        args[args.index("_START_")]=st_time
        args[args.index("_STOP_")]=end_time
        filename=os.path.basename(pathToInputVideo)
        words=filename.split(".");
        l=len(words)
        ext=words[l-1]
        cpt+=1
        filename=" ".join(words[0:l-1])+" - "+str(cpt)+" - "+name+"."+ext
        args.append(filename)
        subprocess.call(args)

for video in sys.argv[1:]:
    split_video(video)
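One thing to watch with line.split(","): a chapter title that itself contains a comma would be cut in two. A possible workaround, sketched below, is to let Python's csv module do the splitting. This assumes ffprobe quotes such fields in its CSV output (which is my understanding of its default escaping, but check your build) and assumes the column layout used by the scripts above (chapter, id, time_base, start, start_time, end, end_time, title). The name parse_chapter_csv is hypothetical.

```python
import csv
import io


def parse_chapter_csv(text):
    """Return (start_time, end_time, title) tuples from ffprobe CSV output."""
    chapters = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 7 or row[0] != "chapter":
            continue  # skip blank or unrelated rows
        title = row[7] if len(row) > 7 else ""  # chapters may lack a title
        chapters.append((row[4], row[6], title))
    return chapters


if __name__ == "__main__":
    # Hypothetical ffprobe output with a comma inside a quoted title.
    sample = 'chapter,0,1/1000,0,0.000000,135000,135.000000,"Intro, part one"\n'
    print(parse_chapter_csv(sample))
```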
Solution 8:[8]
Naive solution in NodeJS / JavaScript
const cp = require('child_process');

// Excerpt from a larger object: self._options and self.logger are provided by that object.
const probe = function (fpath, debug) {
    var self = this;
    return new Promise((resolve, reject) => {
        var loglevel = debug ? 'debug' : 'error';
        const args = [
            '-v', 'quiet',
            '-loglevel', loglevel,
            '-print_format', 'json',
            '-show_chapters',
            '-show_format',
            '-show_streams',
            '-i', fpath
        ];
        const opts = {
            cwd: self._options.tempDir
        };
        const cb = (error, stdout) => {
            if (error)
                return reject(error);
            try {
                const outputObj = JSON.parse(stdout);
                return resolve(outputObj);
            } catch (ex) {
                self.logger.error("probe failed %s", ex);
                return reject(ex);
            }
        };
        console.log(args)
        cp.execFile('ffprobe', args, opts, cb)
            .on('error', reject);
    });
}//probe
The raw JSON output object will contain a chapters array with the following structure:
{
    "chapters": [{
        "id": 0,
        "time_base": "1/1000",
        "start": 0,
        "start_time": "0.000000",
        "end": 145000,
        "end_time": "135.000000",
        "tags": {
            "title": "This is Chapter 1"
        }
    }]
}
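In ffprobe's output, start and end are integer tick counts in units of time_base, while start_time and end_time give the same positions in seconds (so with a time_base of 1/1000, an end of 135000 corresponds to an end_time of 135.000000). A short Python sketch of that conversion, with a hypothetical helper name and sample values:

```python
from fractions import Fraction


def chapter_seconds(chapter):
    """Convert a chapter's integer start/end ticks to seconds via time_base."""
    time_base = Fraction(chapter["time_base"])  # e.g. "1/1000"
    return float(chapter["start"] * time_base), float(chapter["end"] * time_base)


if __name__ == "__main__":
    # Hypothetical chapter entry, shaped like one element of the chapters array.
    chapter = {"id": 0, "time_base": "1/1000", "start": 0, "end": 135000}
    print(chapter_seconds(chapter))
```

For splitting with ffmpeg, the start_time and end_time strings can be passed to -ss and -to directly; the conversion is mainly useful as a sanity check or when only the tick values are at hand.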
Solution 9:[9]
This is the PowerShell version
$filePath = 'C:\InputVideo.mp4'
$file = Get-Item $filePath
$json = ConvertFrom-Json (ffprobe -i $filePath -print_format json -show_chapters -loglevel error | Out-String)
foreach($chapter in $json.chapters)
{
    # $file.Extension already includes the leading dot
    ffmpeg -loglevel error -i $filePath -c copy -ss $chapter.start_time -to $chapter.end_time "$($file.DirectoryName)\$($chapter.id)$($file.Extension)"
}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Community |
| Solution 2 | Tarulia |
| Solution 3 | akostadinov |
| Solution 4 | |
| Solution 5 | |
| Solution 6 | |
| Solution 7 | Pepin55i5 |
| Solution 8 | loretoparisi |
| Solution 9 | Milton Carranza |
