Resume
Research Interests:
Natural Language Generation, Reinforcement Learning, Imitation Learning, AI Safety and Alignment, Toxicity and Bias in Language Generation, and Code Generation.
EDUCATION:
PhD, Computer Science | January 2018 - Present |
Master of Science, Computer Engineering | Aug 2013 - Dec 2015 |
B. Tech, Electronics and Communication Engineering | July 2006 - May 2010 |
PUBLICATIONS:
Kushal Arora, T. J. O'Donnell, D. Precup, J. Weston, J. Cheung, "The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation", Under Submission, ICML 2023.
J. Xu, M. Ung, M. Komeili, Kushal Arora, Y. Boureau, J. Weston, "Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback", arXiv:2208.03270, Under Submission, ACL 2023.
P. Banerjee, S. Mahajan, Kushal Arora, C. Baral, O. Riva, "Lexi: Self-Supervised Learning of the UI Language", Findings of EMNLP, 2022.
K. Shuster, J. Xu, M. Komeili, D. Ju, E.M. Smith, S. Roller, M. Ung, M. Chen, Kushal Arora, J. Lane, M. Behrooz, W. Ngan, S. Poff, N. Goyal, A. Szlam, Y. Boureau, M. Kambadur, J. Weston, "BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage.", arXiv:2208.03188.
Kushal Arora, K. Shuster, S. Sukhbaatar, J. Weston, "DIRECTOR: Generator-Classifiers For Supervised Language Modeling.", arXiv:2206.07694, 2022. AACL 2022.
Kushal Arora, L. El Asri, H. Bahuleyan, J. Cheung, "Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation.", Findings of ACL, 2022.
Kushal Arora*, A. Chakraborty*, and J. Cheung. “Learning Lexical Subspaces in the Distributional Vector Space”, Transactions of the Association for Computational Linguistics (presented at ACL 2020).
S. Thakur, H. Van Hoof, Kushal Arora, D. Precup, and D. Meger, "Sample Efficient Learning From Demonstrations on Multiple Tasks using Bayesian Neural Networks", Imitation Learning and its Challenges in Robotics Workshop, NeurIPS 2018.
Kushal Arora and A. Rangarajan. “A Compositional Approach to Language Modeling.” arXiv:1604.00100, 2016.
Kushal Arora and A. Rangarajan. “Contrastive Entropy: A new evaluation metric for unnormalized language models.” arXiv:1601.00248, 2016.
S. Grover, Kushal Arora, and S. K. Mitra. "Text extraction from document images using edge information." India Conference (INDICON), 2009 Annual IEEE. IEEE, 2009.
PROFESSIONAL EXPERIENCE:
Meta AI, FAIR, Remote | Jan 2022 - July 2022 |
Proposed and implemented the DIRECTOR model for supervised language modeling. The model is competitive with standard language modeling in training and decoding speed while alleviating issues such as toxicity, contradiction, and repetition, and maintaining generation quality. This work was accepted at AACL 2022 and is part of Meta AI's BlenderBot 3 model.
Microsoft Research, Remote | Feb 2021 - May 2021 |
Designed and implemented a new approach to UI representation, with the goal of building autonomous natural-language-based web agents. This work was accepted to Findings of EMNLP 2022.
Borealis AI, Montreal | Sept 2019 - Feb 2020 |
Proposed a way to quantify the impact of error accumulation due to exposure bias on language generation by posing language generation as an imitation learning problem. This work was accepted to Findings of ACL 2022.
Amazon | Jun 2016 - Aug 2017 |
I was part of the team that developed the in-house deep learning library used within Alexa. Major contributions:
Amazon | Sept 2015 - Jun 2016 |
Designed and implemented a pipeline to build, validate, and release a supplemental language model for runtime augmentation of the static global language model. This model is used for pronunciation hotfixes and for adjusting weights in the global language model.
Amazon | May 2014 - Aug 2014 |
Analyzed the counterfeit spike problem for high-volume items on Amazon’s third-party marketplace. Also designed a generic framework that flags and blocks the sale of dubious products based on rule-based criteria. The rules for the framework were derived from the counterfeit spike analysis.
Chatimity | Sept 2011 - June 2013 |
First employee at Chatimity. Along with the two founders, I helped build a scalable, pseudo-anonymous, chat-based social network that handled millions of messages per day. I was responsible for a wide variety of projects, including user recommendation, community detection, search ranking, and indexing. Chatimity was later acquired by Freshdesk.
ST-Ericsson | Aug 2010 - Sept 2011 |
Developed OpenMAX IL layer components for the Audio 3D Mixer and AAC Encoder, and implemented framework-level features such as HTTP streaming, buffering, and seek.
TECHNICAL SKILLS:
Languages: C, C++, Java, Python, JavaScript, MySQL.
ML Frameworks: PyTorch, TensorFlow, Chainer, Theano.
Tools: Git, GDB, MongoDB, Hadoop, Makefiles, SLURM.