Resume

Email | Website | Github

Research Interests:

Natural Language Generation, Reinforcement Learning, Imitation Learning, AI Safety and Alignment, Toxicity and Bias in Language Generation, and Code Generation.

EDUCATION:

PhD, Computer Science
McGill University, Montreal
Advisors: Jackie CK Cheung and Doina Precup

January 2018 - Present
GPA: 4.0/4.0

Master of Science, Computer Engineering
University of Florida, Gainesville, FL
Master's Thesis: Compositional Language Modeling

Aug 2013 - Dec 2015
GPA: 3.74/4.0

B. Tech, Electronics and Communication Engineering
Motilal Nehru National Institute of Technology, Allahabad

July 2006 - May 2010
GPA: 7.57/10

PUBLICATIONS:

Kushal Arora, T. J. O'Donnell, D. Precup, J. Weston, J. Cheung, "The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation", Under Submission, ICML 2023.

J. Xu, M. Ung, M. Komeili, Kushal Arora, Y. Boureau, J. Weston, "Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback", arXiv:2208.03270, Under Submission, ACL 2023.

P. Banerjee, S. Mahajan, Kushal Arora, C. Baral, O. Riva, "Lexi: Self-Supervised Learning of the UI Language", Findings of EMNLP, 2022.

K. Shuster, J. Xu, M. Komeili, D. Ju, E.M. Smith, S. Roller, M. Ung, M. Chen, Kushal Arora, J. Lane, M. Behrooz, W. Ngan, S. Poff, N. Goyal, A. Szlam, Y. Boureau, M. Kambadur, J. Weston, "BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage.", arXiv:2208.03188.

Kushal Arora, K. Shuster, S. Sukhbaatar, J. Weston, "DIRECTOR: Generator-Classifiers For Supervised Language Modeling.", arXiv:2206.07694. AACL 2022.

Kushal Arora, L. El Asri, H. Bahuleyan, J. Cheung, "Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation.", Findings of ACL, 2022.

Kushal Arora*, A. Chakraborty*, and J. Cheung, "Learning Lexical Subspaces in the Distributional Vector Space", Transactions of the Association for Computational Linguistics (presented at ACL 2020).

S. Thakur, H. Van Hoof, Kushal Arora, D. Precup, and D. Meger, "Sample Efficient Learning From Demonstrations on Multiple Tasks using Bayesian Neural Networks", Imitation Learning and its Challenges in Robotics Workshop, NeurIPS 2018.

Kushal Arora and A. Rangarajan, "A Compositional Approach to Language Modeling.", arXiv:1604.00100, 2016.

Kushal Arora and A. Rangarajan, "Contrastive Entropy: A new evaluation metric for unnormalized language models.", arXiv:1601.00248, 2016.

S. Grover, Kushal Arora, and S. K. Mitra, "Text extraction from document images using edge information.", 2009 Annual IEEE India Conference (INDICON), IEEE, 2009.

PROFESSIONAL EXPERIENCE:

Meta AI, FAIR, Remote
Research Scientist Intern

Jan 2022 - July 2022
Host: Jason Weston

Proposed and implemented the DIRECTOR model for supervised language modeling. The proposed model is competitive with standard language models in training and decoding speed while alleviating issues such as toxicity, contradiction, and repetition, and maintaining generation quality.

This work was accepted at AACL 2022 and was incorporated into Meta AI's BlenderBot 3 model.

Microsoft Research, Remote
Research Scientist Intern

Feb 2021 - May 2021
Host: Oriana Riva

Designed and implemented a new approach to UI representation with the goal of building autonomous natural-language-based web agents.

This work was accepted at Findings of EMNLP, 2022.

Borealis AI, Montreal
Research Scientist Intern

Sept 2019 - Feb 2020
Host: Layla El Asri

Proposed a way to quantify the impact of error accumulation due to exposure bias on language generation by posing language generation as an imitation learning problem.

This work was accepted at Findings of ACL, 2022.

Amazon
Software Engineer, Alexa Algorithms

Jun 2016 - Aug 2017

I was part of the team that developed the in-house deep learning library used within Alexa.

Major contributions:
  • Implemented an MPI-based distributed CRF algorithm for Alexa’s Named Entity Recognition system that achieved linear speedup with respect to the number of machines.
  • Designed an asynchronous layerwise gradient update approach that improved the horizontal scaling of distributed training in Alexa’s deep learning library.

Amazon
Software Engineer, Alexa Machine Learning Platform

Sept 2015 - Jun 2016

Designed and implemented a pipeline to build, validate, and release a supplemental language model for runtime augmentation of the static global language model. This model is used for pronunciation hotfixes and for adjusting weights in the global language model.

Amazon
Software Engineering Intern, Transactional Risk Management Services

May 2014 - Aug 2014

Analyzed the counterfeit-spike problem for high-volume items on Amazon’s third-party marketplace. Also designed a generic framework that flags and blocks the sale of dubious products based on rule-based criteria, with the rules derived from the counterfeit-spike analysis above.

Chatimity
Software Engineer

Sept 2011 - June 2013

First employee at Chatimity. Along with the two founders, helped build a scalable pseudo-anonymous chat-based social network that handled millions of messages per day. Responsible for a wide variety of projects, including user recommendation, community detection, search ranking, and indexing. Chatimity was later acquired by Freshdesk.

ST-Ericsson
System Software Engineer, Multimedia Audio Team

Aug 2010 - Sept 2011

Developed OpenMAX IL layer components for the Audio 3D Mixer and AAC Encoder, and implemented HTTP streaming, buffering, and seek features at the framework level.

TECHNICAL SKILLS:

Languages: C, C++, Java, Python, JavaScript, MySQL.

ML Frameworks: PyTorch, TensorFlow, Chainer, Theano.

Tools: Git, GDB, MongoDB, Hadoop, Makefiles, SLURM.