Resume

Email | Website | Github

Research Interests:

Natural Language Generation, Reinforcement Learning, Imitation Learning, AI Safety and Alignment, Toxicity and Bias in Language Generation, and Code Generation.

EDUCATION:

PhD, Computer Science
McGill University, Montreal
Advisors: Jackie CK Cheung and Doina Precup

January 2018 - Present
GPA: 4.0/4.0

Master of Science, Computer Engineering
University of Florida, Gainesville, FL
Master's Thesis: Compositional Language Modeling

Aug 2013 - Dec 2015
GPA: 3.74/4.0

B. Tech, Electronics and Communication Engineering
Motilal Nehru National Institute of Technology, Allahabad

July 2006 - May 2010
GPA: 7.57/10

PUBLICATIONS:

Kushal Arora, T. J. O'Donnell, D. Precup, J. Weston, J. Cheung, "The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation", Under Submission, ICML 2023.

J. Xu, M. Ung, M. Komeili, Kushal Arora, Y. Boureau, J. Weston, "Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback", arXiv:2208.03270, Under Submission, ACL 2023.

P. Banerjee, S. Mahajan, Kushal Arora, C. Baral, O. Riva, "Lexi: Self-Supervised Learning of the UI Language", Findings of EMNLP, 2022.

K. Shuster, J. Xu, M. Komeili, D. Ju, E.M. Smith, S. Roller, M. Ung, M. Chen, Kushal Arora, J. Lane, M. Behrooz, W. Ngan, S. Poff, N. Goyal, A. Szlam, Y. Boureau, M. Kambadur, J. Weston, "BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage.", arXiv:2208.03188.

Kushal Arora, K. Shuster, S. Sukhbaatar, J. Weston, "DIRECTOR: Generator-Classifiers For Supervised Language Modeling.", arXiv:2206.07694. AACL 2022.

Kushal Arora, L. El Asri, H. Bahuleyan, J. Cheung, "Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation.", Findings of ACL, 2022.

Kushal Arora*, A. Chakraborty*, and J. Cheung, "Learning Lexical Subspaces in the Distributional Vector Space", Transactions of the Association for Computational Linguistics (presented at ACL 2020).

S. Thakur, H. Van Hoof, Kushal Arora, D. Precup, and D. Meger, "Sample Efficient Learning From Demonstrations on Multiple Tasks using Bayesian Neural Networks", Imitation Learning and its Challenges in Robotics Workshop, NeurIPS 2018.

Kushal Arora and A. Rangarajan, "A Compositional Approach to Language Modeling.", arXiv:1604.00100, 2016.

Kushal Arora and A. Rangarajan, "Contrastive Entropy: A new evaluation metric for unnormalized language models.", arXiv:1601.00248, 2016.

S. Grover, Kushal Arora, and S. K. Mitra, "Text extraction from document images using edge information.", 2009 Annual IEEE India Conference (INDICON), IEEE, 2009.

PROFESSIONAL EXPERIENCE:

Meta AI, FAIR, Remote
Research Scientist Intern

Jan 2022 - July 2022
Host: Jason Weston

Proposed and implemented the DIRECTOR model for supervised language modeling. The proposed model is competitive with standard language models in training and decoding speed while alleviating issues such as toxicity, contradiction, and repetition, and maintaining generation quality.

This work was accepted at AACL 2022 and was incorporated into Meta AI's BlenderBot 3 model.

Microsoft Research, Remote
Research Scientist Intern

Feb 2021 - May 2021
Host: Oriana Riva

Designed and implemented a new approach to UI representation with the goal of building autonomous natural-language-based web agents.

This work was accepted at Findings of EMNLP, 2022.

Borealis AI, Montreal
Research Scientist Intern

Sept 2019 - Feb 2020
Host: Layla El Asri

Proposed a way to quantify the impact of error accumulation due to exposure bias on language generation by posing language generation as an imitation learning problem.

This work was accepted at Findings of ACL, 2022.

Amazon
Software Engineer, Alexa Algorithms

Jun 2016 - Aug 2017

I was part of the team that developed the in-house deep learning library used within Alexa.

Major contributions:
  • Implemented an MPI-based distributed CRF algorithm for Alexa’s Named Entity Recognition system that achieved linear speedup with respect to the number of machines.
  • Designed an asynchronous layerwise gradient update approach that improved the horizontal scaling of distributed training in Alexa’s deep learning library.

Amazon
Software Engineer, Alexa Machine Learning Platform

Sept 2015 - Jun 2016

Designed and implemented a pipeline to build, validate, and release a supplemental language model for runtime augmentation of the static global language model. This model is used for pronunciation hotfixes and for adjusting weights in the global language model.

Amazon
Software Engineering Intern, Transactional Risk Management Services

May 2014 - Aug 2014

Analyzed the counterfeit-spike problem for high-volume items on Amazon’s third-party marketplace. Also designed a generic framework that flags and blocks the sale of dubious products based on rule-based criteria, with the rules derived from the counterfeit-spike analysis above.

Chatimity
Software Engineer

Sept 2011 - June 2013

First employee at Chatimity. Along with the two founders, helped build a scalable pseudo-anonymous chat-based social network that handled millions of messages per day. Responsible for a wide variety of projects, including user recommendation, community detection, search ranking, and indexing. Chatimity was later acquired by Freshdesk.

ST-Ericsson
System Software Engineer, Multimedia Audio Team

Aug 2010 - Sept 2011

Developed OpenMAX IL layer components for the Audio 3D Mixer and AAC Encoder, and implemented HTTP streaming, buffering, and seek features at the framework level.

TECHNICAL SKILLS:

Languages: C, C++, Java, Python, JavaScript, MySQL.

ML Frameworks: PyTorch, TensorFlow, Chainer, Theano.

Tools: Git, GDB, MongoDB, Hadoop, Makefiles, SLURM.