Update: I will be joining the faculty at IIIT Hyderabad in July 2025. Please consider applying for any relevant open positions detailed here.
My interests lie in machine learning and its applications to various perceptual and behavioral data, including vision, language, speech, robotics and search. I am with Amazon; in the past I worked with a startup (Cube26, later acquired by PayTM) and taught at a university (IIT-BHU, Varanasi). Before that I worked with SIERRA/INRIA in Paris and with MLS/XRCE in Grenoble towards a doctoral degree, which I defended in February 2014, graduating from the University of Paris. I was advised by Cedric, Francis and Guillaume, and even earlier worked with Edmond for my masters. Short bio.
Education
- Doctorate, Universite de Paris VI / INRIA Paris, 2014
- Masters, Institut Polytechnique de Grenoble / INRIA Grenoble, 2010
- Bachelors, International Institute of Information Technology Hyderabad, 2008
Publications (and the unpublished)
- ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations with Neil Shah, Saiteja Kosgi, Vishal Tambrahallia, Neha Sahipjohn, Niranjan Pedanekar, and Vineet Gandhi in EACL Conference 2024. On arXiv.
- Empathic machines: using intermediate features as levers to emulate emotions in text-to-speech system with Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar, Vineet Gandhi in NAACL Conference 2022. On ACL.
- Interactive post-editing for verbosity controlled translation with Prabhakar Gupta, Anil Nelakanti, Grant M. Berry, Abhishek Sharma, in COLING Conference, 2022. On ACL.
- Adapting neural machine translation for automatic post-editing with Abhishek Sharma, Prabhakar Gupta, in Conference on Machine Translation (WMT) 2021. On ACL.
- Object-level context modeling for scene classification with Context-CNN with Syed Ashar Javed in CVPR Workshop 2017. On arXiv.
- Structured penalties for log-linear language models with Cedric Archambeau, Julien Mairal, Francis Bach and Guillaume Bouchard, in EMNLP Conference 2013. On ACL. Oral slides.
- Tree learning strategies for large-scale taxonomies with Cedric Archambeau, Francis Bach and Guillaume Bouchard. Draft.
- Generalized linear language models with Cedric Archambeau, Francis Bach and Guillaume Bouchard. Draft.
- Planar scene modeling from quasiconvex subproblems with Visesh Chari, Chetan Jakkoju, C.V. Jawahar in ACCV 2009. On ACM.
- Path planning for visual servoing and navigation using convex optimization with Abdul Hafez, C.V. Jawahar, in the Journal of Robotics and Automation, 2014. On web.
- Path planning approach to visual servoing: convex optimization based solution with Abdul Hafez, C.V. Jawahar, in Proceedings of the Intelligent Robots and Systems (IROS), 2008. On IEEE.
Patents
Filed and granted
- Emotion mismatch detection for autodubs, granted in US to Amazon Technologies.
- Song generation using neural network, granted in US to Amazon Technologies.
- Voice content selection for video content, filed in US to Amazon Technologies.
- Automated quality assessment of translations, granted in US to Amazon Technologies.
- Language model with structured penalty, granted in US and EU to Xerox Corp.
Filed and pending
- Salient region detection in digital entertainment content, filed in US.
- Audio-lip movement correlation measurement for dubbed content, filed in US.
- Language agnostic song detection and identification, filed in US.