AI no jimaku gumi is a CLI utility that helps you translate videos and generate subtitles for them.
To get started with AI no jimaku gumi, follow these steps:
- Clone the repository:

```shell
git clone https://github.com/Inokinoki/ai-no-jimaku-gumi.git
```

- Navigate to the project directory:

```shell
cd ai-no-jimaku-gumi
```

- Install build dependencies:
Using Homebrew (macOS):

```shell
brew install cmake ffmpeg
```

Ubuntu:

```shell
apt-get install -y clang cmake make pkg-config \
    libavcodec-dev libavdevice-dev libavfilter-dev libavformat-dev \
    libavutil-dev libpostproc-dev libswresample-dev libswscale-dev
```

Fedora:

```shell
dnf install clang cmake ffmpeg-free-devel make pkgconf-pkg-config
```

Arch Linux:

```shell
pacman -S clang cmake ffmpeg make pkgconf
```

If your distribution is not listed above, look for the clang, cmake, make, pkgconfig, and ffmpeg packages in its repositories.
You might need to install additional packages to enable GPU/NPU acceleration.
TODO
- Build with cargo:

```shell
cargo build
```

- Download a Whisper model (other models are available at https://huggingface.co/ggerganov/whisper.cpp):

```shell
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin
```

- Run it, passing your video path to `--input-video-path` and the target language to `--target-language`.
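A minimal end-to-end run might look like the sketch below. The file names `input.mp4` and `output.srt` are placeholders; `ggml-tiny.bin` is the model downloaded in the previous step.

```shell
# Hypothetical example: transcribe a Japanese video and save English SRT subtitles.
./target/debug/ainojimakugumi \
    --input-video-path input.mp4 \
    --source-language ja \
    --target-language en \
    --ggml-model-path ggml-tiny.bin \
    --subtitle-output-path output.srt
```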
To use AI no jimaku gumi, refer to this help text:

```
aI NO jimaKu gumI, a subtitle maker using AI.

Usage: ainojimakugumi [OPTIONS] --input-video-path <INPUT_VIDEO_PATH>

Options:
  -i, --input-video-path <INPUT_VIDEO_PATH>
          Path to the input video
      --source-language <SOURCE_LANGUAGE>
          Which language to translate from (default: "ja") (possible values: "en", "es", "fr", "de", "it", "ja", "ko", "pt", "ru", "zh") (example: "ja") [default: ja]
      --target-language <TARGET_LANGUAGE>
          Which language to translate to (default: "en") (possible values: "en", "es", "fr", "de", "it", "ja", "ko", "pt", "ru", "zh") (example: "en") [default: en]
      --start-time <START_TIME>
          Video start time (not used yet) [default: 0]
      --end-time <END_TIME>
          Video end time (not used yet) [default: 0]
      --subtitle-source <SUBTITLE_SOURCE>
          Subtitle source (default: "audio") (possible values: "audio", "container", "ocr") (example: "audio") (long_about: "Subtitle source to use") [default: audio]
      --ggml-model-path <GGML_MODEL_PATH>
          ggml model path (default: "ggml-tiny.bin") (example: "ggml-tiny.bin", "ggml-small.bin") (long_about: "Path to the ggml model") [default: ggml-tiny.bin]
      --only-extract-audio
          Only extract the audio (default: false) (long_about: "Only extract the audio, if subtitle source is audio, but do not transcribe (Debug purpose)") (example: true)
      --only-transcript
          Only save the transcripted subtitle (default: false) (long_about: "Only save the transcripted subtitle but do not translate (Debug purpose)") (example: true)
      --original-subtitle-path <ORIGINAL_SUBTITLE_PATH>
          Original subtitle SRT file path (default: "") (example: "origin.srt") (long_about: "Original subtitle path to save the transcripted subtitle as SRT") [default: ]
      --only-translate
          Only translate the subtitle (default: false) (long_about: "Only translate the subtitle but do not export (Debug purpose)")
  -s, --subtitle-backend <SUBTITLE_BACKEND>
          Subtitle backend (default: "srt") (possible values: "srt", "container", "embedded") (example: "srt") (long_about: "Subtitle backend to use") [default: srt]
      --subtitle-output-path <SUBTITLE_OUTPUT_PATH>
          Subtitle output path (default: "None") (example: "output.srt") (long_about: "Subtitle output path (if srt) or video output path (if container or embedded)")
  -t, --translator-backend <TRANSLATOR_BACKEND>
          Translator backend (default: "deepl") (possible values: "deepl", "google", "llm", "whisper") (example: "google") (long_about: "Translator backend to use") [default: deepl]
      --llm-model-name <LLM_MODEL_NAME>
          Model name (if llm) (default: "gpt-4o") (example: "gpt-4o") (long_about: "Model name (if using llm for translation)") [default: gpt-4o]
      --llm-api-base <LLM_API_BASE>
          API base (if llm) (default: "https://api.openai.com") (example: "https://api.openai.com") (long_about: "API base used in `genai` crate (if using llm for translation)") [default: https://api.openai.com]
      --llm-prompt <LLM_PROMPT>
          Prompt (if llm) (default: "") (example: "Translate the following text to English") (long_about: "Prompt (if using llm for translation)") [default: ]
  -h, --help
          Print help
  -V, --version
          Print version
```
We currently support only the deepl, llm, and whisper translator backends, and SRT export.
You might need to follow backend-specific instructions to use a translator backend:
- `deepl` (default): provide your own DeepL API key in the `DEEPL_API_KEY` environment variable, and set `DEEPL_API_URL=https://api.deepl.com` if you are using the paid API version.
- `llm`: if you are using LLM translation, refer to the rust-genai repository for more details. An example:

```shell
export CUSTOM_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxx
./target/debug/ainojimakugumi --input-video-path one.webm \
    --translator-backend llm \
    --llm-api-base https://sssss.com/v1/ \
    --llm-prompt 'translate this to English' \
    --llm-model-name 'gpt-4o-mini' \
    --ggml-model-path ggml-small.bin
```
- `whisper` (experimental): use whisper.cpp to output translated subtitles directly from the audio (audio source only, English target only).
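For the `deepl` backend described above, a minimal environment setup might look like this sketch (the key value is a placeholder):

```shell
# Placeholder value: substitute your real DeepL API key.
export DEEPL_API_KEY="your-deepl-api-key"
# Only needed if you are on the paid API plan:
export DEEPL_API_URL="https://api.deepl.com"
```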