Senior Applied Machine Learning Scientist @Sanas AI Private Limited
Posted in BITS Pilani

Hey Guys, I have an opening to share with you.


Please find the Job Description below.

Sanas is revolutionizing the way we communicate with the world’s first real-time algorithm designed to modulate accents, eliminate background noise, and magnify speech clarity. Pioneered by seasoned startup founders with a proven track record of creating and steering multiple unicorn companies, our groundbreaking GDP-shifting technology sets a gold standard.


Sanas is a 200-strong team, established in 2020. In this short span, we’ve successfully secured over $100 million in funding. Our innovations have been supported by the industry’s leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, Quiet Capital, and others. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you’re not just adopting a product; you’re investing in the future of communication.


We are seeking a Senior Applied Machine Learning Scientist with deep expertise in foundational modeling and large-scale speech AI systems. In this role, you will lead the development of advanced models that push the boundaries of speech processing, including self-supervised learning, large-scale pretraining, and multimodal architectures. Your focus will be on scaling models efficiently while ensuring real-time performance, robustness, and adaptability to diverse environments.


This position requires a strong foundation in ML techniques, an innovative mindset, and a deep commitment to continuous improvement of deployed systems.

Key Responsibilities:

1. Advanced Model Development & Scaling
  • Architect, train, and optimize large-scale speech AI models, including speech-to-speech, speech restoration, and speech translation.
  • Leverage self-supervised learning, contrastive learning, and transformer-based architectures (e.g., wav2vec, Whisper, GPT-style models) to improve model accuracy and adaptability.
  • Develop efficient model distillation and quantization strategies to deploy large models with low-latency inference.
  • Innovate on cross-lingual and multilingual speech processing using large-scale pretraining and fine-tuning.

2. Data-Driven Model Optimization
  • Curate and scale massive, diverse, multilingual, and multimodal datasets for robust model training.
  • Apply active learning, domain adaptation, and synthetic data generation to overcome data limitations.
  • Lead efforts in data quality assessment, augmentation, and curation for large-scale training pipelines.

3. Scaling ML Infrastructure & Performance Monitoring
  • Develop distributed training strategies for large-scale models using cloud-based and on-prem GPU clusters.
  • Design and implement scalable model evaluation frameworks, tracking WER, MOS, and latency across diverse scenarios.
  • Optimize real-time inference pipelines to ensure high-throughput, low-latency speech processing.

4. Research & Thought Leadership
  • Stay ahead of advancements in foundational models, generative AI, and large-scale speech modeling.
  • Collaborate with academia, open-source communities, and research partners to drive innovation.

5. Cross-Team Collaboration & Deployment
  • Work closely with MLOps, Data Engineering, and Product teams to deploy scalable AI systems.
  • Ensure seamless integration of foundational models with edge devices, real-time applications, and cloud platforms.
  • Translate cutting-edge research into production-grade models that power real-world communication.

Must have qualifications:

  • Bachelor’s, Master’s, or Ph.D. in Computer Science, Electrical Engineering, or a related field with a focus on Machine Learning, Deep Learning, or Speech Processing.
  • 5+ years of hands-on experience in developing and deploying large-scale models for:
      • Speech-to-text (ASR)
      • Text-to-speech (TTS)
      • Voice conversion & speech enhancement
      • Speech translation & multimodal learning
  • Strong proficiency in transformer-based architectures (e.g., wav2vec 2.0, Whisper, GPT, BERT).
  • Expertise in deep learning frameworks such as PyTorch and TensorFlow, and in large-scale training techniques.
  • Experience with distributed training and optimization across multi-GPU clusters.
  • Strong understanding of self-supervised learning, contrastive learning, and generative modeling for speech AI.
  • Hands-on experience with cloud-based AI platforms (AWS, GCP, Azure) and model deployment.

Preferred qualifications:

  • Experience in developing multimodal AI models integrating speech, text, and vision.
  • Track record of publishing in top-tier AI/ML conferences.
  • Experience optimizing large models for real-time inference on edge devices.
  • Proficiency with MLOps best practices for deploying and monitoring models in production.
  • Familiarity with open-source ASR/TTS toolkits.


If this role interests you, please share your resume when applying. Feel free to reach out with any questions.


Thanks,

Naveenkumar T
