Company: EchoVerse Labs
Employment Type: Contract (6 months) · Fully Remote · Competitive day rate
Role Overview
Design, train, and optimize deep‑learning models that power our real‑time voice‑cloning and noise‑suppression SDK.
Core Duties
- Build data pipelines to ingest, clean, and augment multi‑speaker audio corpora (> 5 TB).
- Experiment with transformer‑based architectures (Conformer, Whisper derivatives) in PyTorch/Lightning.
- Quantize and distill models for on‑device inference (TensorRT, ONNX, Core ML on iOS).
- Collaborate with acoustics researchers to evaluate perceptual audio quality and intelligibility (PESQ, STOI, MOS).
Required Background
- 4+ years of ML experience with a focus on speech/audio.
- Proficiency in PyTorch and at least one experiment‑tracking tool (e.g., Weights & Biases, MLflow).
- Strong grasp of signal‑processing fundamentals (STFT, mel‑spectrograms).
- Published work or OSS contributions in ASR, TTS, or voice conversion are a plus.