My interests lie in machine learning and its applications to various perceptual and behavioral data, including vision, language, speech, robotics and search. I am with Amazon. In the past I worked with a startup (Cube26, later acquired by PayTM) and taught at a university (IIT-BHU, Varanasi). Before that I worked with SIERRA/INRIA in Paris and with MLS/XRCE in Grenoble towards a doctoral degree, which I defended in February 2014, graduating from the University of Paris. I was advised by Cedric, Francis and Guillaume, and even earlier worked with Edmond for my masters.
Education
- Doctorate, Universite de Paris VI / INRIA Paris, 2014
- Masters, Institut Polytechnique de Grenoble / INRIA Grenoble, 2010
- Bachelors, International Institute of Information Technology Hyderabad, 2008
Publications (and the unpublished)
- ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations with Neil Shah, Saiteja Kosgi, Vishal Tambrahallia, Neha Sahipjohn, Niranjan Pedanekar, and Vineet Gandhi in EACL Conference 2024. On arXiv.
- Empathic machines: using intermediate features as levers to emulate emotions in text-to-speech system with Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar and Vineet Gandhi in NAACL Conference 2022. On ACL.
- Interactive post-editing for verbosity controlled translation with Prabhakar Gupta, Anil Nelakanti, Grant M. Berry, Abhishek Sharma, in COLING Conference, 2022. On ACL.
- Adapting neural machine translation for automatic post-editing with Abhishek Sharma, Prabhakar Gupta, in Conference on Machine Translation (WMT) 2021. On ACL.
- Object-level context modeling for scene classification with Context-CNN with Syed Ashar Javed in CVPR Workshop 2017. On arXiv.
- Structured penalties for log-linear language models with Cedric Archambeau, Julien Mairal, Francis Bach and Guillaume Bouchard, in EMNLP Conference 2013. On ACL. Oral slides.
- Tree learning strategies for large-scale taxonomies with Cedric Archambeau, Francis Bach and Guillaume Bouchard. Draft.
- Generalized linear language models with Cedric Archambeau, Francis Bach and Guillaume Bouchard. Draft.
- Planar scene modeling from quasiconvex subproblems with Visesh Chari, Chetan Jakkoju, C.V. Jawahar in ACCV 2009. On ACM.
- Path planning for visual servoing and navigation using convex optimization with Abdul Hafez, C.V. Jawahar, in the Journal of Robotics and Automation, 2014. On web.
- Path planning approach to visual servoing: convex optimization based solution with Abdul Hafez, C.V. Jawahar, in Proceedings of the Intelligent Robots and Systems (IROS), 2008. On IEEE.
Patents
Filed and granted
- Emotion mismatch detection for automated dubbing, granted in US to Amazon Technologies.
- Song generation using neural network, granted in US to Amazon Technologies.
- Voice content selection for video content, filed in US to Amazon Technologies.
- Automated quality assessment of translations, granted in US to Amazon Technologies.
- Language model with structured penalty, granted in US and EU to Xerox Corp.
Filed and pending
- Salient region detection in digital entertainment content, filed in US.
- Audio-lip movement correlation measurement for dubbed content, filed in US.
- Language agnostic song detection and identification, filed in US.