
LLM (ML) Job Interviews - Resources

A collection of all the resources I used to prepare for my ML/LLM research science/engineering focused interviews in Fall 2024.


Navigation

This post has two parts:

  1. Job Search Mechanics (including context, applying, and industry information), which you can read at LLM (ML) Job Interviews (Fall 2024) - Process, and,
  2. Preparation Material and Overview of Questions, which you can continue reading below.
Disclaimer

Last Updated: Feb 24, 2025

This is my personal process, which may work differently for others depending on their circumstances. I'm writing this in December 2024, based on my experience during Fall 2024. While the field of LLMs evolves rapidly and specific details may become outdated, the core principles should remain useful.

I want to be clear: this isn't a definitive guide or preparation manual, nor is it a guaranteed path to success. It's simply a record of what I did, without any claims about its effectiveness. If you find it helpful, wonderful—if not, I'd love to hear what worked for you instead. I'm sharing my experience, not prescribing a universal approach.

I typically avoid rewriting my personal posts using LLMs, but this content has been heavily edited, not generated (primarily using Notion AI, Claude 3.5 Sonnet and GPT-4o) to keep it concise and professional while maintaining my authentic voice—more or less.

Database View and Contributions

Some readers just want a simple list without the detailed explanations and context—and that's perfectly fine.

I've included two toggles below. The first shows these same resources in a Notion database view that you can freely explore—expand it, open it in a new page, or browse as needed.

The second toggle contains a contribution form for a separate database. I personally review all submissions to filter out spam, and I'll move valuable contributions to the main database. Contributing is entirely optional—while I don't expect anyone to do so, I welcome any worthwhile additions.

The collection currently lacks resources in certain areas, particularly multimodal and speech-based content. Though I can't write extensively about these topics myself, I'm happy to serve as a curator. If you have relevant resources to share, please use the form—I'll review submissions and incorporate valuable ones into the main database.

To view this content as a searchable database, expand this toggle. You can explore the embedded database by expanding it or opening it in a new page. (Last Updated: Feb 6, 2025)
To contribute to this collection, expand this toggle to find the submission form. I review all entries to filter out spam and add valuable contributions to the main database.

Content Storage

I use Notion pages for content storage. Though Capacities would have been ideal, its lack of a reliable output API meant I'd need to copy content into Notion anyway for this post. Additionally, Capacities doesn't support multi-user collaboration—a feature I occasionally need—and I prefer keeping my content in one place rather than scattered across multiple apps. I've ruled out Obsidian since I prefer rich-text block-based systems. Capacities does have one standout feature: the ability to mark individual blocks as database objects with properties while keeping the block content embedded in place. This feature is particularly valuable for learning, and I highly recommend it. This isn't meant to be a comprehensive overview of my content/knowledge management practices—just a brief explanation of my organizational preferences for interview preparation.

Classification

I maintained 7 major sections in my interview preparation page: Statistical Knowledge, ML Knowledge, ML Design, DSA Coding, ML Coding, Behavioral, and “I know of these papers”. Each section (other than the papers one) had a Question subpage with 4 categories: "Aced it," "Took time," "Didn't get it," and "Just saw it somewhere."

As mentioned in my LLM (ML) Job Interviews (Fall 2024) - Process post, I meticulously documented and categorized every interview question in the question bank, along with my performance evaluation. For those familiar with JEE preparation, this approach might sound familiar—it's my tried and tested method (though I failed at JEE), and it's the only system I know (fingers crossed for better results this time). Whenever questions repeated across interviews, I added comments in Notion noting their previous appearances and updated their categories based on my latest performance.

Leetcode

Let me walk you through my LeetCode interview preparation approach. Companies consistently test candidates with LeetCode questions during interviews—yes, even startups. Here's the preparation strategy that worked for me.

Most companies don't allow code execution during interviews. While some Microsoft and Apple teams did permit running code (which was helpful), I learned to spot and prevent common mistakes without needing to test the code.
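
For instance, binary search is notorious for off-by-one mistakes that a careful read catches without ever running the code. Here's a minimal sketch of the pattern I drilled, with the usual traps called out in comments (my own practice notes, not any company's expected solution):

```python
def binary_search(nums: list[int], target: int) -> int:
    """Return the index of target in sorted nums, or -1 if absent."""
    lo, hi = 0, len(nums) - 1          # inclusive bounds: hi starts at len(nums) - 1
    while lo <= hi:                    # <=, not <, or the last candidate is skipped
        mid = lo + (hi - lo) // 2      # avoids overflow in fixed-width languages
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            lo = mid + 1               # must move past mid, or the loop can hang
        else:
            hi = mid - 1
    return -1
```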

I tackled all questions from both recommended lists systematically. My first pass focused on finding solutions—whether brute force or optimized. After at least two days, I'd return to each question to implement an optimized solution. When successful, I'd move on. When stuck, I'd consult ChatGPT for the algorithm, implement it myself, and submit the solution. To maintain accurate progress tracking, I'd then intentionally submit an incorrect version to keep the question unmarked. After another two-day break following failed attempts, I'd try implementing the optimized version again.

I continued this process until I mastered all questions on the Grind 75 list, completed all easy and medium questions, and solved 50% of hard questions on the NeetCode 150 list.

ML/DL/LLM Coding

While these concepts can be coded given sufficient time and debugging capabilities, interview settings present unique challenges. You'll typically have just 25-35 minutes, need flawless execution, and must maintain precise matrix dimensions throughout. That's why practicing implementation is crucial, even if you're already comfortable with the concepts. To help you prepare, I've compiled repositories showcasing interview-style code examples, categorized into basic ML and LLMs. Following my Contextual interview, I expanded this collection by implementing various RAG/LLM inference papers—a category I found particularly engaging.
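
To make the dimension-tracking point concrete, here's the style of shape-annotated scratch code I practiced writing: a layer normalization forward pass in NumPy with every tensor shape spelled out in comments. This is a sketch of my practice habit, not a template any interviewer asked for:

```python
import numpy as np

def layer_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    # x:     (batch, seq_len, d_model)
    # gamma: (d_model,)  learned scale
    # beta:  (d_model,)  learned shift
    mean = x.mean(axis=-1, keepdims=True)        # (batch, seq_len, 1)
    var = x.var(axis=-1, keepdims=True)          # (batch, seq_len, 1)
    x_hat = (x - mean) / np.sqrt(var + eps)      # (batch, seq_len, d_model)
    return gamma * x_hat + beta                  # broadcasts over the last dim

x = np.random.randn(2, 4, 8)
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))
assert out.shape == x.shape
```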

Statistics

I originally hadn't planned to include statistics in my preparation, but after being caught off guard by an unexpected Amazon interview loop early in my job search, I quickly realized I needed to. I then set out to find and practice statistics questions, using the resources listed below.

I'll be candid: there were many times when I struggled to remember concepts or needed to better understand the reasoning behind material I had simply memorized in college. I turned to ChatGPT and Claude to help clarify these topics. Though I always fact-checked their responses and ensured I had proper context, this approach actually improved my ability to explain concepts clearly and systematically (even if I still tended to ramble too much).

Some linear algebra concepts from here (I know it's not statistics, but it's still relevant)

For data science question preparation, I used this resource:

I found Chapter 5 of Chip Huyen's book particularly useful:

ML Knowledge

I organized the Machine Learning knowledge section into two parts: (a) ML fundamentals and (b) Large Language Models (LLMs) and Transformers. Looking back, I should have created three parts by making NLP fundamentals its own section. Depending on your specialization, this third section could instead cover Computer Vision with Vision Language Models, or other cutting-edge architectures.

ML Fundamentals

I began my preparation with this ML cheatsheet, which covers the fundamentals:

Next, I worked through chapters 7 and 8 of the ML interviews book:

I primarily used these two resources in my preparation. For each question I encountered, I documented it and explored potential follow-up questions. My study method was straightforward: I'd first attempt to answer the question independently. If successful, I'd use ChatGPT to generate follow-up questions and practice those too. When I couldn't answer a question, I'd start a fresh chat to research the topic, document my findings with notes and comments in Notion, and categorize it under the relevant toggle section.

Overall, your list should include:

  • General breadth topics:
    • Machine Learning Fundamentals: Supervised vs. unsupervised learning, logistic regression for classification, bag-of-words, bagging vs. boosting, decision trees/random forests, confusion matrix, regularization methods, clustering (k-means/k-medians), distance measures, convexity, loss functions.
    • Statistics and Probability: Bias-variance tradeoff, imputation techniques, handling different dataset sizes, causal inference, statistical significance, regression analysis, Bayesian analysis, time series analysis, active learning, hypothesis testing, probability calculations.
    • Models and Performance: Overfitting vs. underfitting, model performance monitoring, online/offline evaluation metrics.
    • Dimensionality Reduction: Feature selection methods, matrix factorization, autoencoders, curse of dimensionality, PCA vs. t-SNE vs. MDS vs. autoencoders.
    • Time Series Analysis: Stationary time series, frequency domain analysis, forecasting models.
    • Neural Networks: Attention mechanisms, ReLU activation, deep neural networks (DNNs), recurrent neural networks (RNNs) including LSTMs, convolutional neural networks (CNNs), transformer models.
    • Algorithms:
      • Clustering: K-means (see the sketch after this list), LDA, word2vec, hierarchical clustering, Mean-Shift, DBSCAN, expectation-maximization using GMM.
      • Classification: Logistic regression, Naïve Bayes, k-nearest neighbors (KNN), decision trees, support vector machines (SVMs).
      • Optimization: Unconstrained continuous optimization, gradient descent methods (batch vs. stochastic), maximizing continuous functions.
    • Other Topics: Probability distributions and limit theorems, discriminative vs. generative models, methodology (online vs. offline learning, cross-validation), online experimentation (A/B testing), binary and multi-class classification, hyperparameters, feature engineering.
  • Area-specific topics:
    • Personalization/Recommendation Systems: Collaborative filtering, transformer architecture, active learning, self-attention, multiple attention heads, pooling methods, visualizing attention layers, ranking, multi-armed bandits (MABs), evaluation metrics, cold-start recommenders.
    • Deep Learning: Transformers, dropout, normalization, maxout, softmax, word embeddings.
    • Natural Language Processing: Latent Dirichlet Allocation (LDA), language modeling, handling sparse data, text normalization techniques, statistical language modeling, information retrieval, word2vec, CNNs, transformers.
    • Computer Vision: Image and signal processing, linear algebra, image retrieval, CNNs, residual networks etc.
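
As an example of the level of from-scratch fluency these topics demand, here's the minimal k-means I'd be ready to sketch on request (a bare-bones NumPy version; real interviews may also probe initialization strategies like k-means++, which this sketch omits):

```python
import numpy as np

def kmeans(X: np.ndarray, k: int, iters: int = 100, seed: int = 0):
    # X: (n_samples, n_features)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # (k, n_features)
    for _ in range(iters):
        # assign each point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)  # (n, k)
        labels = dists.argmin(axis=1)                                         # (n,)
        # recompute centers; keep the old center if a cluster goes empty
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return centers, labels
```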

LLMs

I organized the material into 5 key sections:

  • Architecture and Training: Covered transformer architecture fundamentals, pre-training versus fine-tuning approaches, training objectives (with emphasis on next-token prediction), tokenization methods, scaling laws, and parameter-efficient techniques like LoRA and Prefix Tuning.
  • Generation Control: Explored methods for controlling model outputs through temperature and top-p sampling (a minimal sampling sketch follows this list), effective prompt engineering, few-shot and in-context learning techniques, chain-of-thought prompting for complex reasoning, and strategies to minimize hallucinations.
  • LLM Evaluation: Examined key metrics (including perplexity, ROUGE, and BLEU), standard benchmarks like MMLU and BigBench, approaches to human evaluation, and tools for bias detection and mitigation.
  • Optimization and Deployment: Studied practical aspects like quantization, model distillation, inference optimization, prompt caching, load balancing, and cost-effective deployment strategies.
  • Safety and Ethics: Focused on essential safeguards including content filtering, output sanitization, protection against jailbreaking attempts, and data privacy protocols for responsible LLM deployment.
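
The sampling sketch promised under Generation Control: a minimal NumPy implementation of temperature plus top-p (nucleus) sampling over a single next-token distribution. This is my own illustration of the standard technique, not code from any particular library:

```python
import numpy as np

def sample_token(logits: np.ndarray, temperature: float = 1.0,
                 top_p: float = 0.9, rng=None) -> int:
    # logits: (vocab_size,) raw model scores for the next token
    rng = rng or np.random.default_rng()
    scaled = logits / temperature                     # temperature reshapes the distribution
    probs = np.exp(scaled - scaled.max())             # stable softmax
    probs /= probs.sum()
    # top-p: keep the smallest set of tokens whose cumulative mass >= top_p
    order = np.argsort(probs)[::-1]                   # token ids, most likely first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1   # include the boundary token
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()            # renormalize the nucleus
    return int(rng.choice(keep, p=kept))
```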

One of the best analogies I learned came from Kevin Murphy's PML book, as quoted in Yuan's blog post (Attention as Soft Dictionary Lookup).

"We can think of attention as a soft dictionary look up, in which we compare the query q to each key k_i, and then retrieve the corresponding value v_i." - Chapter 15, p. 513

While this architecture is widely known, I found it valuable as a framework for organizing my review of architecture-related papers and questions.
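
The dictionary analogy translates almost line-for-line into code. Here's a minimal NumPy sketch of single-head scaled dot-product attention, with the query-key comparison and value retrieval labeled to match the quote (illustrative only; it omits masking, batching, and multiple heads):

```python
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    # Q: (seq_q, d_k), K: (seq_k, d_k), V: (seq_k, d_v)
    scores = Q @ K.T / np.sqrt(K.shape[-1])             # compare each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax: a "soft" dictionary match
    return weights @ V                                  # retrieve a weighted mix of values
```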

Differences between Transformer and LLaMA architectures, as found in the Medium post "Unlocking Low-Resource Language Understanding"

Here are the key resources and materials I studied:

ML (System) Design

While I didn't extensively prepare for these interviews, I was fortunate that the companies I targeted mainly focused on research experiment design rather than general system design questions. I encountered a few questions about recommendation systems and sentiment analysis design, which I handled as well as I could. Working with primarily research-based design questions was a relief.

When I first started interviewing, I made the mistake of hesitating to ask for clarification. After finding two helpful blog posts and a YouTube video about asking better questions, I created a ChatGPT prompt to practice. I would pose questions like "How would you set up a research question design for LLaMA Guard fine-tuning?" and use ChatGPT's suggested clarifying questions for practice. This helped me realize that it's completely fine to spend up to 10 minutes asking clarifying questions.

Another concern I had, which I've mentioned before, is that many people now default to using large language models for everything—even for tasks that simpler models like BERT could handle effectively, or when BM25 would work better than text embeddings with cosine similarity for information retrieval problems.
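
On the BM25 point: part of the argument for reaching for it first is how little code it takes. Here's a minimal Okapi-style scorer from scratch (using the common non-negative idf variant and the conventional defaults k1=1.5, b=0.75; a sketch, not a drop-in replacement for a tuned library like rank_bm25):

```python
import math
from collections import Counter

def bm25_scores(query_tokens: list[str], docs_tokens: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query; higher means more relevant."""
    N = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / N
    df = Counter()                          # document frequency per term
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)  # non-negative idf
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)       # length normalization
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores
```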

The key question is: When should we use these powerful generative models versus smaller discriminative models? After discussing this with my friends (Yash and June), they helped me craft the perfect response, which I've since memorized: "There are two ways I can approach this problem. One uses a discriminative method when optimization and latency are priorities, though it's more restrictive in its outputs. The other uses a generative method throughout, which may introduce latency but offers more flexibility. I can briefly discuss the pros and cons of both approaches, and then you can guide which path you'd prefer me to explore."

I really appreciate them suggesting this approach. I now consistently use this method when tackling any design questions.

Behavioral

You've probably heard of the STAR method by now. Stick to it—I did in my process too. I used natural transition phrases to avoid sounding rehearsed while keeping my responses structured. For example, I'd say "When this happened" or "For some context" as a personal reminder to describe the situation. Then I'd say "I had to do" to signal the task portion—these were cues for myself, not for the interviewer. I kept an eye on the clock to ensure I wasn't spending more than 30 seconds on any one part. Then I'd transition with phrases like "Here's what I did" or "I chose to do this," followed by "This was the result." I made sure the core response fit within two minutes. After that, I'd elaborate with additional details, often without prompting, but those first two minutes were always laser-focused.

I used this question bank set for behavioral questions and had stories for all of them.

Companies vary in their interview formats. While some may ask just a few behavioral questions per round, others—like Meta, with its extensive multi-round sessions containing up to 10 questions—can catch candidates unprepared.

When interviewing with multiple companies, you'll need distinct examples for each question within a single interview loop. Repeatedly telling the same stories during your job search can make your responses sound mechanical. Here's a useful tip from my advisor: take a sip of water after each paragraph to stay present and keep your responses natural.

This principle extends to your introduction. Using the same introduction repeatedly can make you appear stiff and disconnected. Though interviewers expect candidates to prepare their responses, a natural delivery creates better engagement with your interviewer.

I Know of These Papers

I deliberately chose the word "know" for an important reason—it captures multiple levels of familiarity: deeply reading papers, skimming them, finding them through social media, learning about them from others' presentations or discussions, and implementing them firsthand. This knowledge extends beyond academic papers to include Twitter threads, implementation guides, verified code examples, blog posts, and other sources. This broad scope is why I kept the heading intentionally general.

When interviewers ask open-ended questions, I make a point to cite my sources, saying things like "I learned this from a blog post" or "I've seen this discussed widely on Twitter." I maintain a broad collection of references and always stay transparent about my depth of understanding for each paper. As someone who's constantly online, I use Zotero and Raindrop to organize papers, related discussions, and emerging research. I skim every paper before adding it to my Zotero library, categorizing them by potential interview questions—which is exactly why I created this page.

I tracked open-ended questions in this section and researched relevant papers I may have overlooked, regularly expanding the list. I also maintained a collection of papers worth mentioning for specific topics, which proved invaluable. I'll share some resources to showcase recent papers that caught my attention. I should admit that my paper knowledge comes primarily from my somewhat excessive social media use rather than formal newsletters or collections. These resources might help you build your own reference library for handling open-ended questions.

Here are some resources I use to stay on top of the vast number of research papers in my field:

Finally

Interview experiences can vary greatly depending on the luck of the draw. A skilled interviewer makes the process flow naturally, and while strategically steering conversations toward your strengths is valuable, some interviewers may be less flexible in their approach. Since resource quality and relevance can differ widely, it's crucial to prioritize materials that directly support your specific career aspirations.

I'm eager to learn from others who have successfully juggled machine learning studies with their other life responsibilities. If you know of any useful interview preparation resources, please share them—I'll gladly add them here.

Good luck with your preparation!