projectNMT
projectNMT focuses on practical neural machine translation from English to German and from English to French, using a research-style codebase built on fairseq components.
What it does
- Trains translation models using encoder-decoder architectures.
- Runs batch generation and interactive translation.
- Supports preprocessing/tokenization workflows (including SentencePiece utilities).
- Packages translation models for local service workflows.
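As a rough illustration of the training workflow, the sketch below assembles a command line for the training script. It assumes the script accepts fairseq's standard training flags (`--task`, `--source-lang`, `--target-lang`, `--arch`, `--max-tokens`, `--save-dir`). The helper name `build_train_command` and the paths `data-bin/en_de` and `checkpoints/en_de` are hypothetical and not taken from this project.

```python
import shlex
from typing import List


def build_train_command(
    data_dir: str,
    src: str,
    tgt: str,
    save_dir: str,
    max_tokens: int = 4096,
) -> List[str]:
    """Assemble (but do not run) a training command as an argument list.

    Hypothetical helper: the flags are fairseq's standard ones, and the
    values chosen here are placeholders rather than project defaults.
    """
    return [
        "python", "train.py", data_dir,   # binarized data directory (assumed path)
        "--task", "translation",
        "--source-lang", src,
        "--target-lang", tgt,
        "--arch", "transformer",          # an encoder-decoder architecture
        "--max-tokens", str(max_tokens),  # batch size measured in tokens
        "--save-dir", save_dir,           # where checkpoints are written
    ]


cmd = build_train_command("data-bin/en_de", "en", "de", "checkpoints/en_de")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting issues.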
How it is built
- Includes a full fairseq code tree with task/model/criterion registries.
- Uses dedicated scripts for training (train.py), generation (generate.py), and interactive decoding.
- Adds utility scripts for binarized dataset inspection and score analysis.
- Provides project-level instructions for deploying trained checkpoints for EN-DE and EN-FR use cases.
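The registry idea mentioned above can be sketched in a few lines of plain Python. This is a simplified illustration of the general pattern (a decorator that maps a name to a class), not fairseq's actual implementation. The names `MODEL_REGISTRY`, `register_model`, `build_model`, and `ToyEncoderDecoder` are chosen for this example only.

```python
from typing import Callable, Dict

# Simplified, illustrative registry: maps a string name to a model class.
MODEL_REGISTRY: Dict[str, type] = {}


def register_model(name: str) -> Callable[[type], type]:
    """Return a class decorator that records the class under `name`."""
    def decorator(cls: type) -> type:
        if name in MODEL_REGISTRY:
            raise ValueError(f"Cannot register duplicate model: {name}")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("toy_encoder_decoder")
class ToyEncoderDecoder:
    """Placeholder class standing in for a real encoder-decoder model."""

    def __init__(self, hidden_size: int = 8) -> None:
        self.hidden_size = hidden_size


def build_model(name: str, **kwargs):
    """Look up a registered class by name and instantiate it."""
    try:
        cls = MODEL_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown model: {name}") from None
    return cls(**kwargs)


model = build_model("toy_encoder_decoder", hidden_size=16)
print(type(model).__name__, model.hidden_size)
```

With this pattern, a command-line option such as an architecture name can select a class without the training script importing it explicitly. Tasks and criteria can follow the same scheme with their own registries.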
Tech stack
Python, PyTorch, fairseq, SentencePiece, sacrebleu and related MT tooling.