For obvious copyright reasons, TV series episodes cannot be freely distributed.
Instead, we provide scripts to reproduce the corpus locally from your own legal copy of the official DVD sets.
Reproduction scripts are mostly wrappers around open-source command-line tools that can be cumbersome to install. We therefore encourage you to use the provided Docker image, which comes with all dependencies pre-installed:
$ docker pull tvddataset/create
or create your own from source:
$ git clone http://www.github.com/tvd-dataset/tvd
$ cd tvd && docker build -t tvddataset/create .
If you would rather not use Docker, have a look at the Dockerfile for details on how to install all dependencies for Ubuntu 14.04 LTS.
The following steps (2 and 3) assume that you are reproducing the GameOfThrones subset:
$ export TVD_CORPUS='/path/to/tvd_corpus'
$ export TVD_PLUGIN='GameOfThrones'
tvd.create dump copies DVDs to your hard drive once and for all.
$ export SEASON=1
$ export DISC=1
$ export DVD=/dev/dvd
$ docker run -v $DVD:/dvd -v $TVD_CORPUS:/tvd tvddataset/create dump /tvd $TVD_PLUGIN $SEASON $DISC
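Each disc needs its own dump invocation, so a small loop can help when a season spans several discs. The sketch below is not part of TVD: `tvd_dump_cmd` is a hypothetical helper that only prints the command shown above for a given season and disc, and `NUM_DISCS` is an assumed value that you would adjust to your own DVD set.

```shell
# Hypothetical helper (not part of TVD): print the dump command for one disc.
# $1 = season number, $2 = disc number
tvd_dump_cmd() {
    printf 'docker run -v %s:/dvd -v %s:/tvd tvddataset/create dump /tvd %s %s %s\n' \
        "$DVD" "$TVD_CORPUS" "$TVD_PLUGIN" "$1" "$2"
}

# Fall back to the values used in this README when the variables are unset.
DVD=${DVD:-/dev/dvd}
TVD_CORPUS=${TVD_CORPUS:-/path/to/tvd_corpus}
TVD_PLUGIN=${TVD_PLUGIN:-GameOfThrones}
SEASON=${SEASON:-1}
NUM_DISCS=2   # assumption: set this to the number of discs in the season

# Print one dump command per disc; nothing is executed here.
disc=1
while [ "$disc" -le "$NUM_DISCS" ]; do
    tvd_dump_cmd "$SEASON" "$disc"
    disc=$((disc + 1))
done
```

Printing the commands rather than running them leaves time to swap discs: run each printed line once the corresponding disc is in the drive.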
tvd.create rip extracts audio, video and subtitles.
$ export SEASON=1
$ docker run -v $TVD_CORPUS:/tvd tvddataset/create rip /tvd $TVD_PLUGIN $SEASON
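The rip command only mounts the corpus directory, so it presumably works from the discs dumped earlier, and it can be worth checking that a season has been dumped before ripping it. This is a minimal sketch, assuming the dump step leaves entries named SeasonNN.DiscNN under dvd/dump in the plugin directory, as in the GameOfThrones layout. `season_is_dumped` is a hypothetical helper, not a TVD command.

```shell
# Hypothetical helper (not part of TVD): succeed if at least one dumped disc
# exists for the given season. Assumes the SeasonNN.DiscNN naming.
# $1 = season number
season_is_dumped() {
    season_dir=$(printf 'Season%02d' "$1")
    for entry in "$TVD_CORPUS/$TVD_PLUGIN/dvd/dump/$season_dir".Disc*; do
        # When nothing matches, the pattern stays literal and this test fails.
        [ -e "$entry" ] && return 0
    done
    return 1
}

# Example use (commented out so that nothing runs by accident):
# if season_is_dumped "$SEASON"; then
#     docker run -v "$TVD_CORPUS":/tvd tvddataset/create rip /tvd "$TVD_PLUGIN" "$SEASON"
# else
#     echo "Season $SEASON has not been dumped yet" >&2
# fi
```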
tvd.create metadata copies metadata provided by the plugin.
$ docker run -v $TVD_CORPUS:/tvd tvddataset/create metadata /tvd $TVD_PLUGIN
After these steps, the corpus directory should be organized as follows:
/path/to/tvd_corpus/GameOfThrones
├── dvd
│   ├── dump
│   │   ├── Season01.Disc01
│   │   ├── Season01.Disc02
│   │   └── ...
│   └── rip
│       ├── video
│       │   ├── GameOfThrones.Season01.Episode01.mkv
│       │   ├── GameOfThrones.Season01.Episode02.mkv
│       │   ├── GameOfThrones.Season01.Episode03.mkv
│       │   ├── GameOfThrones.Season01.Episode04.mkv
│       │   └── ...
│       ├── audio
│       │   ├── GameOfThrones.Season01.Episode01.en.wav
│       │   ├── GameOfThrones.Season01.Episode02.fr.wav
│       │   └── ...
│       └── subtitles
│           ├── GameOfThrones.Season01.Episode01.en.srt
│           ├── GameOfThrones.Season01.Episode02.fr.srt
│           └── ...
└── metadata
    ├── transcript
    │   ├── GameOfThrones.Season01.Episode01.json
    │   ├── GameOfThrones.Season01.Episode02.json
    │   └── ...
    ├── scenes
    │   ├── GameOfThrones.Season01.Episode01.json
    │   ├── GameOfThrones.Season01.Episode02.json
    │   └── ...
    └── ...
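File names in this layout follow a regular pattern, so paths can be built rather than typed by hand. The following sketch assumes the zero-padded Season/Episode naming shown in the tree. `rip_video_path` and `report_missing_videos` are hypothetical helpers, not part of TVD, and the number of episodes is something you supply yourself.

```shell
# Hypothetical helper (not part of TVD): print the expected path of a ripped
# episode video. $1 = season number, $2 = episode number
rip_video_path() {
    printf '%s/%s/dvd/rip/video/%s.Season%02d.Episode%02d.mkv\n' \
        "$TVD_CORPUS" "$TVD_PLUGIN" "$TVD_PLUGIN" "$1" "$2"
}

# Hypothetical helper: list expected episode videos that are not on disk.
# $1 = season number, $2 = number of episodes to check (supplied by you)
report_missing_videos() {
    episode=1
    while [ "$episode" -le "$2" ]; do
        path=$(rip_video_path "$1" "$episode")
        [ -f "$path" ] || echo "missing: $path"
        episode=$((episode + 1))
    done
}
```

For example, `report_missing_videos 1 10` would list any of the first ten episodes of season 1 whose video file is absent; the count of ten is only an illustration.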
Why don't you try and have fun with TVD?