Kalpesh Krishna

I am a Staff Research Scientist at Google DeepMind in the Gemini team. I work in the model quality team in the post-training phase for Gemini, with a focus on SFT, RLHF and evaluation of Gemini’s instruction following capabilities.

I completed my PhD in 2023 in Computer Science at UMass Amherst advised by Prof. Mohit Iyyer in the UMass NLP lab. My research at UMass was supported by the Google PhD Fellowship. Before UMass, I received my undergraduate degree at IIT Bombay. During my PhD and undergrad, I also did some fun internships at Google DeepMind (Summer 2019 - Spring 2022), Allen AI (Summer 2022), Toyota Technological Institute at Chicago (Summer 2017) and Mozilla (Summer 2016).

I maintain a list of my publications under the Research tab. I also blog every now and then compiling my personal experiences.

CV / Resume, Google Scholar, LinkedIn
Email ID: kalpeshk2011@gmail.com

Updates

Dec 2025:	check out Gemini 3.0 Flash and Gemini 3.0 Pro: state-of-the-art models for multimodality, reasoning, and LMArena, at the Pareto frontier of speed and performance! (flash blogpost, pro blogpost)
Nov 2025:	one new preprint on interleaved reasoning in LLMs with planning steps (paper link)
Mar 2025:	check out Gemini 2.5 Pro, a very strong reasoning model, and #1 on the LMArena! (official blogpost, preprint)
Jan 2025:	one paper to appear at NAACL 2025: FRAMES, a challenging factual reasoning QA benchmark (dataset link)! FRAMES has recently been used as a long-context evaluation benchmark in the DeepSeek-v3 / DeepSeek-R1 papers.
Dec 2024:	check out Gemini 2.0 Flash, a stronger model than Gemini 1.5 Pro, but at 2x the speed! (official blogpost)
Sept 2024:	two papers to appear at EMNLP 2024, on foundational autoraters (FLAMe) and posthoc watermarking of language models.
July 2024:	check out our new paper on foundational autoraters, the best performing generative model on RewardBench trained solely on publicly available data!
Apr 2024:	check out the Gemini 1.5 Pro API, a top-tier LLM on the LMSys leaderboard! (technical report, tweet)
Apr 2024:	talk at Georgia Tech on LLM evaluation
Feb 2024:	check out Gemini Advanced, our most capable Bard model powered by Gemini Ultra! (tech report, blogpost)
Nov 2023:	our paper on inferring LLM decoding algorithms won a Distinguished Paper award at CCS 2023!
Nov 2023:	talk at the University of Texas at Dallas on LLM evaluation.
Nov 2023:	talks at the University of Pittsburgh on LLM evaluation and AI-generated text detection.
Oct 2023:	one paper to appear at EMNLP 2023 on fine-grained automatic evaluation of long-form text generation. Check out our PIP package too!
Sept 2023:	one paper to appear in NeurIPS 2023 on paraphrase attacks on AI-generated text detection and defending against these attacks using retrieval. Our model, data and code are available here.
Sept 2023:	talk at NLP with Friends on AI-generated text detection.
Aug 2023:	I joined the Google Bard team as a Research Scientist.
Aug 2023:	I defended my PhD thesis on long-form text generation!
July 2023:	talk at IBM Research on long-form text generation evaluation.
July 2023:	talk at University of Toronto on AI-generated text detection.
May 2023:	Happy to receive an outstanding paper award (tweet) at EACL 2023 for LongEval, our paper on human evaluation of long-form summarization!
Mar 2023:	appeared on CBS Boston and WWLP for a photograph of the northern lights!
Jan 2023:	two papers to appear at EACL 2023, on better human evaluation of long-form summarization (LongEval), and guidelines for coreference annotation (ezCoref).
Oct 2022:	three papers to appear at EMNLP 2022, on improving text generation (RankGen), a benchmark to evaluate Chinese language models (SLING), and a dataset for document-level literary translations (Par3).
Sept 2022:	talk at the University of Washington on RankGen (slides).
June 2022:	started my summer internship at Allen AI where I will be working with Kyle Lo, Arman Cohan and Pradeep Dasigi.
Apr 2022:	new preprint: RankGen: Improving Text Generation with Large Ranking Models. The code and model checkpoints have been added here.
Feb 2022:	two papers to appear in ACL 2022, on few-shot multilingual style transfer and a new retrieval benchmark on literary text.
Oct 2021:	started a part-time student researcher position at Google AI Language, where I will be working with John Wieting.
Sept 2021:	received the Google PhD Fellowship for 2021! (list of recipients)
June 2021:	started my summer internship at Google Research India where I will be working with Partha Talukdar and Bidisha Samanta
May 2021 - July 2021:	talks at Google Research (slides), University of Texas at Austin (slides), University of Southern California (slides, video) on text generation and perils of its evaluation.
Mar 2021:	new paper on long-form question answering on ELI5 to appear in NAACL 2021! Read more in our Google AI blogpost.
Dec 2020:	passed my PhD candidacy with distinction!
Sep 2020:	I am excited to share a new bird photography webpage! Check the Birding tab.
Sep 2020:	new paper on paraphrasing for unsupervised style transfer to appear at EMNLP 2020. Check out a live demo and the codebase here.
May 2020:	started my summer internship at Google Brain, where I will be working with Aurko Roy
Apr 2020:	talk at IBM Research on model extraction attacks on BERT (slides)
Apr 2020:	new blogpost with Nicolas Papernot on our ICLR 2020 paper on model extraction attacks on BERT.
Jan 2020:	I am co-organizing the Machine Learning and Friends Lunch at UMass Amherst with Neha Nayak Kennard. If you have speaker recommendations, fill them here!
Dec 2019:	new paper on model extraction attacks on BERT-based models to appear at ICLR 2020.
Oct 2019:	new blog surveying twelve recent NLP PhD applicants on the graduate school admission process! Also an Insight IITB article on my personal experience.
Aug 2019:	lightning talk at the AllenNLP Summit 2019 on using AllenNLP for education. Check out the AllenNLP homework I designed for our grad NLP class here!
Jul 2019:	presented papers on QA generation and faster transformer decoding at ACL 2019. Check out our web demo on hierarchical QA generation!
Jul 2019:	awarded the ACL 2019 Student Scholarship and the Victor Lesser Graduate Scholarship
Jun 2019:	Purva Tendulkar won the Best Presentation Award for our paper in ICCC 2019!
May 2019:	started summer internship at Google AI Language in New York
Apr 2019:	talk at the UMass Data Science Research Symposium 2019
Apr 2019:	new paper on thematic doodle generation to appear in ICCC 2019
Nov 2018:	presented paper on logic rules for sentiment classification at EMNLP 2018 (slides)
Sep 2018:	started my PhD in Computer Science at UMass Amherst
Aug 2018:	graduated from IIT Bombay, receiving the Sharad Maloo Memorial Gold Medal
Jul 2018:	new preprint on hierarchical multitask learning for speech recognition
Jun 2018:	new blogs on grad resources, IIT Bombay CS opportunities and crowdsourcing
Apr 2018:	presented paper on CNNs for end-to-end speech recognition at ICASSP 2018 (poster)