My interests lie in machine learning and its applications to perceptual and behavioral data, including vision, language, speech, robotics and search. I am with Amazon; in the past I worked with a startup (Cube26, later acquired by PayTM) and taught at a university (IIT-BHU, Varanasi). Before that I worked with SIERRA/INRIA in Paris and with MLS/XRCE in Grenoble towards a doctoral degree, which I defended in February 2014, graduating from the University of Paris. I was advised by Cedric, Francis and Guillaume, and even earlier worked with Edmond for my masters.

Education

  • Doctorate, Universite de Paris VI / INRIA Paris, 2014
  • Masters, Institut Polytechnique de Grenoble / INRIA Grenoble, 2010
  • Bachelors, International Institute of Information Technology Hyderabad, 2008

Publications (and the unpublished)

  • ParrotTTS: Text-to-speech synthesis exploiting disentangled self-supervised representations with Neil Shah, Saiteja Kosgi, Vishal Tambrahallia, Neha Sahipjohn, Niranjan Pedanekar, and Vineet Gandhi in EACL Conference 2024. On arXiv.
  • Empathic machines: using intermediate features as levers to emulate emotions in text-to-speech system with Saiteja Kosgi, Sarath Sivaprasad, Niranjan Pedanekar, and Vineet Gandhi in NAACL Conference 2022. On ACL.
  • Interactive post-editing for verbosity controlled translation with Prabhakar Gupta, Anil Nelakanti, Grant M. Berry, and Abhishek Sharma in COLING Conference 2022. On ACL.
  • Adapting neural machine translation for automatic post-editing with Abhishek Sharma and Prabhakar Gupta in Conference on Machine Translation (WMT) 2021. On ACL.
  • Object-level context modeling for scene classification with Context-CNN with Syed Ashar Javed in CVPR Workshop 2017. On arXiv.
  • Structured penalties for log-linear language models with Cedric Archambeau, Julien Mairal, Francis Bach and Guillaume Bouchard, in EMNLP Conference 2013. On ACL. Oral slides.
  • Tree learning strategies for large-scale taxonomies with Cedric Archambeau, Francis Bach and Guillaume Bouchard. Draft.
  • Generalized linear language models with Cedric Archambeau, Francis Bach and Guillaume Bouchard. Draft.
  • Planar scene modeling from quasiconvex subproblems with Visesh Chari, Chetan Jakkoju, C.V. Jawahar in ACCV 2009. On ACM.
  • Path planning for visual servoing and navigation using convex optimization with Abdul Hafez, C.V. Jawahar, in the Journal of Robotics and Automation, 2014. On web.
  • Path planning approach to visual servoing: convex optimization based solution with Abdul Hafez, C.V. Jawahar, in Proceedings of the Intelligent Robots and Systems (IROS), 2008. On IEEE.

Patents

Filed and granted

  • Emotion mismatch detection for automated dubbing, granted in US to Amazon Technologies.
  • Song generation using neural network, granted in US to Amazon Technologies.
  • Voice content selection for video content, filed in US to Amazon Technologies.
  • Automated quality assessment of translations, granted in US to Amazon Technologies.
  • Language model with structured penalty, granted in US and EU to Xerox Corp.

Filed and pending

  • Salient region detection in digital entertainment content, filed in US.
  • Audio-lip movement correlation measurement for dubbed content, filed in US.
  • Language agnostic song detection and identification, filed in US.