This post has two parts:
- Job Search Mechanics (including context, applying, and industry information), which you can continue reading below, and,
- Preparation Material and Overview of Questions, which you can read at LLM (ML) Job Interviews - Resources
Last Updated: Dec 24, 2024
This is the process I used, which may work differently for you depending on your circumstances. I am writing this in December 2024, and the process occurred during Fall 2024. Given how rapidly the field of LLMs evolves, this information might become outdated quickly, but the general principles should remain relevant.
I usually don't rewrite my personal posts with LLMs, but this content has been heavily edited (not generated), mostly using Notion AI, Claude 3.5 Sonnet, and GPT-4o, to keep it succinct and professional while (kinda) maintaining my style and tone of voice.
There are many resources, such as posts and Slack groups, for people seeking faculty positions, but shared experiences about industry job searches, particularly for research-focused roles, are less common. While the job landscape and research have evolved significantly over the past five years, the core job search process remains largely unchanged. Despite variations in compensation, roles, and interview processes across companies, having readily accessible information is always valuable.
Expand to read a summary (less than 450 words) generated by Claude. It can save you time and API tokens. I promise I used an effective prompt to create it.
Interview Processes and Stages:
- Startups: Interview processes are often unique and reflect the company's growth stage. Candidates may face 5-6 rounds, including coding challenges (often Leetcode), ML coding, ML fundamentals, and cultural fit interviews. Startups may require in-person interviews, multi-day work trials, or extensive presentations. The process is less standardized, and roles may encompass a broad range of responsibilities.
- Unicorns (e.g., Anthropic, OpenAI, Scale AI): These companies have more structured processes but still vary individually. Candidates might undergo coding interviews (not always Leetcode-based), ML design, LLM-focused discussions, and presentations. The number of rounds can be extensive, sometimes combining multiple interviews if applying for different teams simultaneously.
- Big Tech Companies (e.g., Meta, Amazon, Apple, Google, Microsoft): Processes are rigorous and structured, often involving multiple rounds over 1.5 to 2.5 months. Candidates should expect Leetcode-style coding interviews, ML system design, LLM research design, presentations, and behavioral assessments. Interviews may include both general and role-specific technical questions.
Common Interview Components:
- Coding Challenges: Proficiency in data structures and algorithms is tested through coding exercises, making Leetcode practice essential.
- ML System and Design Interviews: Assess understanding of ML concepts, system architecture, and ability to design solutions.
- Presentations: Candidates may present previous work or research, highlighting their expertise and communication skills.
- Behavioral Interviews: Evaluate cultural fit and problem-solving approach.
Key Differences Between Company Types:
While startups offer less predictable processes and may prioritize candidates willing to take on diverse tasks, unicorns focus on specialized skills relevant to their cutting-edge projects. Big Tech companies maintain formalized, multi-stage processes assessing a wide range of technical and soft skills. Each company type presents unique challenges and opportunities, requiring candidates to adapt their preparation and expectations accordingly.
Timeline Expectations:
The interview process typically spans several weeks to months, with potential delays during holidays or peak hiring seasons. Candidates should be prepared for short offer acceptance windows, often around seven days, which may require quick decision-making or negotiating extensions. Planning for overlapping processes and managing multiple timelines becomes crucial for successful navigation of the job search.
Practical Tips for Candidates:
Successful candidates should prepare thoroughly through extensive practice of coding challenges, while maintaining honest communication about their experience and capabilities. Creating an ATS-friendly resume and leveraging professional networks for referrals can improve opportunities. Organizing applications meticulously, developing strong presentation skills, and strategic interview scheduling are essential. Understanding company cultures and staying current with ML/LLM developments helps demonstrate relevance. Throughout the process, maintaining work-life balance and managing stress are crucial for optimal performance in interviews. Candidates should approach each opportunity with thorough preparation while remaining adaptable to different company requirements and interview formats.
I am looking for:
- People who can share their experiences and compare them with mine, providing additional insights.
- People who want to share their interview experiences from different companies, as well as those from diverse backgrounds with varying publication records, majors, and experience levels.
What I am not interested in:
- Comments suggesting I'm making poor choices, lectures about the importance of risk-taking in the LLM era, or comparisons about $1M compensation packages.
- Broader discussions about tech industry success stories or failures, social media metrics like GitHub activity, or unrelated social and political debates.
Context for Job Search
I started looking for a new position in October 2024 for a variety of professional and personal reasons. More importantly, here is how I approached the process.
For context, if you do not want to read through my profile: I have recently published papers on LLMs, along with multiple publications in speech processing and encoder-based (BERT-era) NLP. I maintain a blog featuring case studies and ablation analyses rather than reiterating known concepts, and have submitted/accepted work at relevant ML conference workshops. My experience and interests are well-defined within this field and were supported by one year of full-time experience (overlapping with one year of independent research) and approximately one year of combined internship experience.
Additionally, I have unpublished experience that I ended up highlighting through slides and emails where appropriate, as my current publication record doesn’t fully capture my work, with much of my LLM evaluation and RAG efforts still in progress or under submission. A recurring theme throughout this process was my hesitation to reach out to people. I felt uneasy reaching out without a perfect profile, largely because a lot of my work remained unpublished, but ultimately, not reaching out was a poor choice, and when I did reach out, most people were immensely supportive.
Priorities and Preferences
- I prefer focusing on work that reaches broader audiences, such as general LLM applications or advancing model research science and engineering. Client-specific customizations and problem statements often don't align with the kind of impact I aim to achieve in my career.
- I didn't prioritize AI engineering or LLM engineering roles (more on that later); my focus was on product-driven, product-inspired, or curiosity-driven research engineering and science roles in companies of any size.
- I sought a team with strong ML/LLM research experience, as I thrive on collaboration, brainstorming, and intellectual discussions with colleagues. I prefer working in a research-oriented environment where I can tackle complex problems alongside others, rather than working in isolation.
- I wanted to work on projects that align with the team's main goals and have timelines longer than 3-6 months before production deployment. While shorter timelines can be exciting and fast-paced, they often limit the opportunity for robust experimentation, which is something I deeply value.
- I wanted some semblance of work-life balance. For me, this means having the flexibility to engage in personal projects like fun AI experiments and writing blog posts, while also ensuring I have time to recharge and play with my cats.
- I'm based in Seattle with my partner. While I prefer to stay in the area, I would consider relocating for a role that exceptionally aligns with my career goals and personal situation.
- This point addresses why I declined to pursue certain opportunities. I chose not to join teams whose primary focus is existential AI safety, as their approaches don't align with my current interests and focus areas. Additionally, I opted not to prioritize some companies (though I wish I had been even more selective) where there were extreme misalignments between our values and beliefs. I weighed these personal factors alongside professional considerations when evaluating how well each opportunity matched my goals.
- I require H-1B change-of-employer (CoE) sponsorship and want to begin the green card application process as soon as possible. These are essential requirements for any offer I consider.
Applying for Jobs
Preparation Context
My learning style is unique—I learn best through hands-on experience while doing the actual work. This approach has served me well in research, helping me adapt quickly to new areas. The downside is that I become nervous and anxious during interviews, particularly when they take unexpected turns (like encountering ML coding questions in what I thought would be a general coding round, or A/B testing questions when I expected ML breadth). I need to delve into something directly to truly grasp it. My situation is especially tricky because most of my PhD peers entered the industry two to four years ago, when tech hiring looked very different. Today's job market is more competitive, with entirely new expectations and requirements. Without a clear reference point for what to expect, and given my reluctance to reach out to others, I had no choice but to learn by going through the interview process step by step.
Initial Applications
I applied broadly to companies that matched my priorities, finding most positions through LinkedIn and Twitter. While I submitted many applications through company web portals, I also proactively reached out to hiring managers and recruiters on LinkedIn. For some companies, I secured employee referrals (more on that later). My approach varied by company type—for startups, direct connections with founders on LinkedIn proved most successful, while for traditional companies, I primarily engaged with hiring managers and recruiters. Throughout my search, I maintained a steady stream of cold applications as well.
Preparing Materials
Tracking
I tracked my applications using a simple to-do list in Notion rather than a full database, as I find incomplete databases more bothersome than a basic system. This tracking was crucial because some application platforms, particularly Greenhouse, when not configured properly, sometimes let you reapply to positions that have already rejected you. Even a basic list of links helps prevent such mistakes. For each application, I took screenshots of job requirements and key details to prepare for initial conversations, which helped me understand each company's interview focus. My tracking system was straightforward: a to-do list with four columns (Applied, In Process, Waiting, and Rejected/Offered). To keep things manageable, I only added interviews to my calendar after they were officially scheduled.
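If a Notion list feels too loose, the same idea fits in a few lines of code. Here's a minimal sketch (company names and fields are hypothetical) of a tracker with the four statuses above and the duplicate-link check that saved me from reapplying:

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    APPLIED = "Applied"
    IN_PROCESS = "In Process"
    WAITING = "Waiting"
    CLOSED = "Rejected/Offered"

@dataclass
class Application:
    company: str
    url: str          # the posting link, kept to avoid accidental reapplication
    status: Status = Status.APPLIED
    notes: str = ""   # e.g., path to the screenshot of the job requirements

tracker: list[Application] = [Application("ExampleCorp", "https://example.com/jobs/123")]

def already_applied(url: str) -> bool:
    """Check before applying again, since some portals won't stop you."""
    return any(app.url == url for app in tracker)

print(already_applied("https://example.com/jobs/123"))  # True
```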
Application Questions
I made sure to write all my application answers myself, without using AI. I would speak my responses aloud and only used AI tools to fix basic spelling and transcription errors—never for generating content. You'll notice that companies often ask similar questions throughout the application process, which helps you identify your key talking points. Speaking your answers out loud makes the process more natural and serves as great practice for interviews. I kept all my previous responses stored in Notion since the same questions tend to come up across different companies.
Resume
You can access my resume (created with LaTeX) on my homepage. One of the major issues with my resume is its layout—it uses a two- to three-column format, with headings in the left column and details like names and timelines split into two columns on the right. This format isn't properly parsed by ATS resume scanners, which is a significant problem. If I were to redo it, I'd use Typst or another tool to create an ATS-friendly resume format, since manually editing entries for each application is tedious. I'll document this process next month and share the resume template I create, if anyone's interested. Keep your LinkedIn profile updated too—just in case the HR Gods decide to shine on you—since you can use LinkedIn to auto-fill those lengthy applications.
A Short Blurb
You'll need a short blurb—a brief paragraph about yourself. Over 12 weeks, I refined my self-presentation through about 10 iterations before settling on the blurb below. While it should ideally be shorter (my struggles with brevity are obvious from this post), the blurb effectively highlights my expertise, experience, and qualifications. When reaching out to connections, I'd personalize my introduction to show why I suited their specific role. My typical message read: "Hello [Name], I came across your posting and believe I'm a strong fit because [specific reasons]. I've attached a brief introduction about myself and my resume." Rather than rewriting the entire blurb each time, I focused on tailoring the initial message to emphasize my relevance to each role.
Expand to read the blurb I finally landed on
My work in LLMs focuses on three key areas: synthetic data generation and curation, LLM orchestration/agents, and robust, grounded evaluation and benchmarking.
For synthetic data generation, I focus on creating high-quality, template-driven data samples for low-resource settings to improve SLM performance using fine-tuning, particularly for domain-specific tasks. For LLM orchestration and agents, I develop methods and fine-tune SLMs to generate executable, schema-driven plans that guide model chaining behavior, optimize execution trajectories, diagnose failure points, dynamically route across multiple LLMs, and auto-select appropriate sampling parameters for inference. In my role at Norm, I worked on improving agentic retrieval-augmented generation (RAG) workflows for regulatory-compliant tasks, focusing on plan creation, chunking, relevance scoring, metadata tagging, and domain-specific fine-tuning of retrieval/reranking models. For evaluation, I focus on two parts: robust evaluation of LLMs through prompt ensembling, example-based calibration, and prompt decomposition; and fine-tuning SLMs as evaluator models that behave like human agents, designing interpretable scoring models, and simulating human reading comprehension, especially with incomplete or fuzzy context.
My foundational NLP experience (during my PhD) includes LSTMs, BERT-based models, and multimodal applications, such as combining BERT with audio models like wav2vec. I’m actively involved in open-source projects and publications on production-grade LLM evaluation systems, integrating robust human feedback for scalable assessments.
Presentation
A surprising part of my job search was the requirement to give presentations, despite not interviewing for pure research science roles at places like FAIR or DeepMind. These presentations were required across various positions—Machine Learning Engineer (MLE), Research Engineer (RE), Research Scientist (RS), and Applied Scientist (AS) roles. Tencent was the first to request a presentation, which helped me create a template for future ones. While I can't share the specific content due to unpublished material, I learned that presentations should follow a clear structure: 40 minutes for content, 5 minutes for setup, and 15 minutes for Q&A—with flexibility for questions throughout.
Creating my presentation posed a challenge since most of my work was unpublished. I aimed for a visually engaging approach, though I'm still developing those skills (shoutout to my Facebook Research intern manager in 2020, Ahmad, who taught me most of what I know). I used Recraft to create some engaging visuals (Recraft ai for Presentation Graphics). Typically, I would connect two published works, but instead, I had to weave together various components into a cohesive project. While I felt good about the result, the interviewers may have seen it differently. PhD candidates presenting their current research can use their defense slides, but since I defended my PhD before my LLM work, that wasn't an option for me.
Other Materials
I only needed to prepare one additional document: a research proposal for a specific Microsoft position. You likely won't need one unless you're applying for a similar role. Given time constraints, I skipped writing cover letters for my applications.
Scheduling Interviews
I packed my schedule with multiple interviews rather than spending time on extensive preparation, since I wasn't certain what to prepare for. Though I should have reached out to others for advice, the wide variety of job openings meant everyone's experiences would likely have been different. As I'll discuss later, I encountered varying interview processes even within the same company. I decided to dive in headfirst, relying on my work experience to carry me through—though I did practice LeetCode and review other preparation materials, which I'll cover later. For scheduling, I compressed everything as tightly as possible, interviewing about 6 hours a day on weekdays for 12 weeks. Recruiters typically ask for availability two weeks out, and I learned it's better to schedule interviews with more lead time rather than rushing to fit them in—especially if you're working with a deadline like I (sort of) was.
Industry Information
Feel free to skip this section if you're already familiar with the industry landscape. While I had some knowledge going in, I learned quite a bit during my job search that I'd like to share for others in a similar position. Rather than covering every company, I'll focus on notable experiences—both positive and negative (keeping the less favorable ones anonymous for obvious reasons). I'm focusing on the interview processes themselves, not the outcomes of applications or offers. Since interview structures aren't covered by NDAs, I'll provide general insights while avoiding specific questions.
As someone without high-profile papers or "big name" research status, my job search experience likely differs from more prominent candidates. Rather than leveraging my network, I chose to contact recruiters and respond to job posts on LinkedIn and Twitter. The information I share may be most helpful for candidates with average to above-average profiles who can target positions strategically. It will be less relevant for those with star status or strong internal referrals, since I'm not the type of candidate companies compete over—those individuals likely have very different experiences.
Job offers can vary significantly based on your background and how you approach the search. While I won't disclose specific details, I can share how the offers compared. Since compensation often correlates with published conference papers, my offers—given my unpublished work—likely differed from what candidates with stronger profiles might receive. Though I received multiple offers, they weren't the eye-popping salaries you might see online (not even close). Still, they were sufficient to let me focus on work I truly enjoy.
Startups
Interview Process for Startups
Every startup seems to have its own quirky process, which often varies based on their growth stage. Some startups required: (a) in-person interviews—either single or multi-day sessions—which I opted out of since remote options exist (thank you, post-Covid world), (b) multi-day work partnerships, which were problematic due to visa constraints and required significant time commitment to a single opportunity, and (c) references. For the latter, I hesitated mainly because asking my advisor for non-academic job references isn't standard practice.
Most startups I interviewed with required 5-6 rounds beyond the initial recruiter or founder call. These typically included general coding (yes, LeetCode is popular for startups too), ML coding, ML fundamentals, and a cultural fit interview. I deliberately avoided startups that promoted 6/7-day workweeks or consistent 12-hour workdays.
Here's where things get tricky: job titles like "AI Research Engineer" or "AI Research Scientist" often have misleading descriptions. When a posting requires just a bachelor's degree plus experience, you can usually infer it's an AI engineering or LLM engineering role. However, even when positions explicitly require a PhD and published papers—suggesting research work—they often aren't research roles at all. During interviews, you discover they actually want someone to productionize existing research or customize client solutions, not conduct original research. While I enjoy product-driven research, there's a crucial distinction between implementing existing research and conducting novel research with product goals. Many companies either aren't clear about their needs or, more frustratingly, know they want someone to productionize research but frame it as an innovative research position.
Case in point: Galileo sent an email that I appreciated for its honesty, even if the position naming was still a bit off (or maybe I need to adjust how I interpret titles across companies).
We hope this email finds you well. We wanted to thank you for taking the time to apply for the Senior ML Research Scientist position at Galileo. We genuinely appreciate your interest in joining our team. We've re-aligned our team needs and decided to close the Senior ML Research Scientist position. Instead, we are now hiring for a Senior Applied AI Researcher. This position will be focused on shipping products and running experiments. This person will be exploring new model architectures and implementing ML algorithms from scratch to launch products at scale. If you think this new position could be a fit based on your skills, please review the updated role here.
Specifics from My Startup Interviews
Below are some notable examples that stood out during my interviews, though this isn't an exhaustive list. I'm happy to remove any company-specific details if requested.
The good
- Salient [Offer]: The process was reasonable and direct, and the company was very clear about the team's needs and expectations.
- Resolve AI: Excellent interview experience. The recruiter and team were highly professional. The coding rounds effectively focused on ML design, with discussions centered on practical, real-world work.
- Haize: Though their six-day workweek culture and work-life balance don't align with my preferences, I appreciate their commitment to producing genuine research—a rare find in this space.
- Descript: They conducted just one interview round, but it was efficiently focused on work experience and clearly outlined their product roadmap.
- Contextual AI: Their coding challenge stood out—implementing a research paper with a well-structured, self-contained setup. I valued this approach, despite not advancing further.
- Sierra AI: With one of OpenAI's GPT paper authors at the helm, their research team had an impressive interview pipeline: a one-hour hiring manager chat, a 45-minute session (30-minute presentation plus 15-minute Q&A), an hour-long research discussion, a two-hour ML coding challenge (building models and running experiments), and direct time with a founder. While I enjoyed the initial conversation, I had to pause the process due to their inflexible onsite-only requirement (seriously, why no virtual option?).
The not so good
- Several companies disappeared during the interview process: one ghosted after three rounds following their LinkedIn outreach, another after two rounds post-online application, five more after initial interviews, and one even vanished after completing the final round.
- A company advertising on LinkedIn required candidates to complete a timed take-home test using cloud-based AI models but wouldn't provide the necessary API key—they expected candidates to use their own.
- One company put me through four interview rounds and assigned a four-day take-home project with a two-hour presentation requirement. Their final "cultural fit" round included irrelevant questions. They ultimately rejected me citing "insufficient experience"—information readily available on my resume from the start.
- Two companies had interviewers who seemed to have predetermined I wasn't a fit, using the interview time to explain their reasoning.
- Two particularly swift rejections stood out: one company cited my "communication style" as incompatible immediately after what I considered a professional interaction. In another case, after a hiring manager referred me to a different team, that team's manager rejected me within five minutes of our interview, offering no explanation.
Referrals at Startups
The most effective way to connect with startups is by directly messaging people who post job openings on LinkedIn or Twitter. You can also work with contracted recruiters, who either reach out through LinkedIn or post general job listings—many of which turn out to be for seed-stage or stealth startups.
Offers and Work Policies at Startups
Most startups are concentrated in New York or San Francisco and rarely offer remote work. Base compensation typically ranges from $150k–$250k with 0.2%–0.5% equity grants. Since these positions are in high-cost-of-living areas like the Bay Area and New York, state and city taxes take a bigger bite out of net income compared to my current location. While most startups offer healthcare benefits, 401k matching is inconsistent. They often advertise "unlimited PTO," though the intense work culture makes this benefit questionable in practice. Relocation assistance varies by company—some provide it, others don't.
Unicorns
I interviewed with three unicorn companies: Anthropic (back in April 2024), OpenAI, and Scale. While I had interviewed with Figma in Fall 2023 without success, these were the only unicorn startups I engaged with during this process.
Interview Process at Unicorns
At Anthropic, I went through three object-oriented programming coding rounds, followed by a virtual onsite with seven rounds that covered LLM-related coding, ML fundamentals, discussions, and culture fit. The process was excellent despite not advancing: the questions were fair, and interviewers respected my areas of expertise. I'd highly recommend their interview process.
OpenAI started uniquely with math/LLM coding rather than general coding for their Research Scientist/Research Engineer position. After not moving forward, they suggested an MLE role in San Francisco, but I declined since it would have required another filter round plus 4–5 virtual onsite interviews.
Scale's process included a preliminary interview and two follow-up rounds, with a potential final round. The interviewers were engaged and professional throughout all three rounds.
Interestingly, none of the unicorn companies used LeetCode-style questions, and most allowed reference materials during coding (though LLMs were prohibited). OpenAI lets candidates pursue multiple interview tracks simultaneously, while Anthropic combines all potential team interviews into one comprehensive final round, which explains the high number of interviews in my process.
Specifics from My Unicorn Interviews
The good
- All three companies had excellent interviewers, with Anthropic's interviews being particularly well-targeted and relevant.
- OpenAI showed initiative by suggesting alternative positions that might be a better fit.
- The Scale AI interviewer went above and beyond during my ML coding round to ensure a productive session.
The not so good
- Anthropic's interview process was quite extensive. While the interviews themselves were well-conducted, the sheer number of rounds was overwhelming.
- At Scale, the external recruiter's communication was inconsistent—I had to follow up multiple times during a 2–3 week silence after my penultimate interview.
- OpenAI, despite having a Seattle office, strictly required San Francisco-based work. They were transparent about this requirement from the start, which I appreciated, though I was sad about it.
Referrals at Unicorns
I submitted direct applications through the websites for both Anthropic and OpenAI, making sure to provide detailed responses to all application questions. For Scale, I came in through an external recruiter. During the process, all three companies inquired whether I had internal referrals or connections—a factor that can greatly boost your application.
Offers and Work Policies at Unicorns
Since I didn't receive offers from these three companies, I can't share specific numbers, though their compensation ranges are available online. Anthropic and Scale offer positions in Seattle, while OpenAI primarily hires in San Francisco. These companies typically provide 401k matching and healthcare benefits. While unlimited PTO is a common perk, it's rarely used to its full extent. Additional benefits like parental leave and planning assistance vary by company.
Established Companies, aka, Big Tech
I interviewed with several tech companies: Meta (Llama team), Amazon (Alexa LLM and Ads [the latter wasn't by choice, more details below]), Apple (across four AIML teams, including a contractual position), Google (Gemini), Tencent, ServiceNow, Microsoft (Bing, Office, and two MSR teams), PayPal, TikTok (back in July 2024), and Netflix (Content & Studio) [stopped the Netflix process due to time constraints]. Due to time constraints and priorities, I didn't pursue opportunities at Adobe, Huawei, LG, Samsung, Bloomberg, or Sony, though they all have excellent ML/LLM teams worth considering.
Interview Process at Big Tech
Let me break down how different companies handle their interview processes. Each company has its own unique approach, so here's what you need to know about interviewing multiple times:
Google, Meta, Amazon, PayPal, and Tencent limit you to one interview loop. Apple allows concurrent interview loops with different teams, but you can only discuss an offer with one team. Amazon tries to add extra team interviews to your existing loop rather than starting a new one. Microsoft is the most flexible—you can interview with multiple teams and apply to multiple positions simultaneously. At Meta, while you initially interview for one team, they may consider you for other teams once your interview packet is complete. Understanding these policies is crucial for planning your interview strategy.
One final consideration is scheduling. Each company moves at its own pace, typically taking 1.5-2.5 months to complete their process—especially during the holiday season and NeurIPS week. Plan your interviews strategically to ensure they conclude around the same time.
Now, let's discuss what these interviews actually cover. First and foremost, you'll need to master data structures and algorithms coding (LeetCode)—there's no avoiding it. Every company, startups included, requires it. I've compiled all the resources in LLM (ML) Job Interviews - Resources. In short, I worked through all 150 NeetCode questions plus any additional questions I encountered during interviews, as many questions were repeated across companies. Beyond coding, expect presentations, ML system design, LLM research design, LLM system design, ML coding from scratch, ML breadth, LLM depth, and behavioral rounds.
The following section details my interview process in depth. Since it's quite lengthy, here's a tl;dr summary (generated using GPT-4o) for those who prefer a quick overview.
Expand to read the tl;dr and then skip to the next section
Meta: Interviewed for unspecified research roles through referral. Process included hiring manager discussion, presentation, three ML design rounds (area-specific and LLM-focused), coding with two LeetCode questions, and behavioral assessment.
Amazon: Process consisted of initial filtering discussion, coding, presentation, and six rounds covering ML design, breadth, depth, LLMs, and system design, though notably without ML-specific coding.
Apple: Team 3: Two filtering rounds (hiring manager and ML coding), followed by four on-site rounds (behavioral, ML knowledge, ML coding, data structures). Team 1: Extensive 12-round process, including recruiter screen, hiring manager discussion, ML-focused coding/design, presentation, and leadership discussions. Team was promising but lacked Seattle location option.
Google: Bypassed filtering, faced unexpected personality questionnaire, gave presentation, completed five rounds (two LeetCode, two research, one behavioral).
Tencent: Completed filtering, presentation, and six one-on-one interviews covering breadth, research depth, math/stats, and LeetCode.
Microsoft: Team 1: Filtering plus five rounds (three with LeetCode). Team 2: Hiring manager interview plus four ML-focused rounds (one coding). MSR AI (Team 2): AI Frontiers required prototype/demo presentation (screening plus five rounds).
PayPal: Two filtering rounds followed by three rounds (coding, ML discussion, behavioral).
Netflix: Completed hiring manager round; declined to continue further process: filtering plus four rounds (coding, ML coding, ML breadth/depth, behavioral).
Now you can skip to the next section.
For Meta [Offer], I interviewed for both research scientist and research engineering positions through a direct referral into their system, without a specific job title assigned. The process began with a hiring manager discussion followed by a presentation. Next came three machine learning design rounds—one focused on my area of expertise and two covering general LLM use cases. I also completed a general coding round with two LeetCode questions and a behavioral round. Though the original team filled their headcount before making a decision, my recruiter worked hard to match me with another relevant team in the same organization.
For Amazon [Offer], I completed one filtering discussion round and a coding interview. This was followed by a presentation and six rounds covering ML design, ML breadth, ML depth, LLMs, and system design. While each round included some coding components, none focused specifically on ML coding.
At Apple, for Team 1 [Offer], I went through an extensive 12-round process. It started with a recruiter screen, followed by a hiring manager screen and a filtering round that covered ML breadth and coding. Next came a presentation and five technical rounds covering ML depth, ML breadth, general coding, ML coding, and system design. Just when I thought we were done, there were additional rounds: another conversation with the hiring manager, one with their manager, a specialized LLM discussion, and finally a round with the Proactive Intelligence team's director. I really liked this team—they're working on impactful consumer-oriented goals—though I was disappointed that they wouldn't allow remote work from Seattle or much publishing.
For Apple Team 2, I had one filtering round focused purely on LeetCode, which I didn't pass. For Team 3 in Seattle (referred by my Team 2 recruiter), I had two filtering rounds: a conversation with the hiring manager and an ML coding session with a team member. This led to four on-site rounds covering behavioral questions, general ML knowledge, ML coding, and data structures. Separately, I interviewed for a contractual position through four one-hour research-focused conversations, including one ML coding round.
For Google, thanks to some assistance and timing, they waived the filtering round. Instead, I completed an unexpected personality questionnaire, followed by a presentation and five rounds: two LeetCode coding sessions, two research interviews, and one behavioral assessment (with no ML coding).
For Tencent, after one filtering round and a presentation, I had six one-on-one interviews with team members. Each interviewer focused on different areas: general breadth, research depth, math and statistics, and LeetCode programming.
For Microsoft Team 1, I had an initial filtering round with the hiring manager, then five general rounds, three involving LeetCode coding. For Team 2, there was one hiring manager round followed by four ML-focused rounds, including one coding session. For MSR AI Team 2, after the hiring manager round, I completed a five-round interview loop (including a presentation, one OOP coding round, and three ML/LLM research interviews), though I didn't receive an offer despite feeling confident about my performance. They uniquely required a prototype demo in the screening round. For MSR AI Team 1, I completed the hiring manager round but was rejected after a two-month wait.
For PayPal [Offer], I completed two filtering rounds followed by three rounds: coding, ML discussion, and behavioral. For TikTok, I had a recruiter conversation, two combined coding and general ML rounds, and a final ML/system design round with the hiring manager, which I didn't pass. For Netflix, I only completed the hiring manager round, but the full process appears to include another filtering round plus four on-site rounds (coding, ML coding, ML breadth/depth, and behavioral).
Specifics from My Big Tech Interviews
The good
- Despite some irrelevant and overly complex LeetCode questions, most interview rounds were excellent, featuring relevant questions and interviewers who created a comfortable environment.
- The interviews were completely free of hazing. I especially appreciated when interviewers were transparent, saying "I'll keep asking you probing questions until you cannot answer them. This is not a reflection of you as a candidate, just the way I interview."
- The process felt particularly welcoming because almost everyone had clearly reviewed my resume and work beforehand. Honestly, all big tech interviews were pleasant—the interviewers were knowledgeable, asked questions about my expertise, and worked to make me feel at ease. This was notably different from my startup interviews.
- Microsoft Team 2 and MSR AI Team 2 conducted particularly strong interviews, and I appreciated how the MSR team specifically focused on my areas of strength.
The not so good
- Three companies ghosted me—one before interviews began and two after I'd completed six rounds of virtual onsite interviews.
- Having twelve interview rounds was excessive, even for an excellent company with impactful work.
- The offers came with seven-day expiration windows—too short in my view—forcing me to request extensions to make well-informed decisions.
- Google rescheduled my interviews twice, and Microsoft Team 1 moved me between positions mid-process while reorganizing the final three virtual interviews during the onsite day.
- At Amazon, I began interviewing for an LLM and emotion-focused team, but they quietly transferred me to a sponsored ads A/B testing ML team after the screening round, resulting in irrelevant questions throughout my virtual onsite interviews.
Referrals at Big Tech
I initially applied to Meta without a referral about 8 months ago. Despite contacting recruiters, I made no progress. Later, I reached out to someone whose research paper I had referenced—I had previously interviewed with them after my PhD but had to pause the process when I took another job. They had since moved to Meta, which helped me secure an interview. Though I eventually got an employee referral, this was after the process had already begun.
At Amazon, my September interview was initially with a relevant team, but midway through the process, they reassigned me to a different team without notification. Though Amazon typically allows only one interview loop, they made an exception when I contacted a recruiter and explained how I'd been shifted to a non-LLM A/B testing team—an area outside my expertise. While I had an employee referral, I'm unsure if it helped, but it did support my case for a second interview loop.
For Apple, despite having two referrals for about four positions, I heard nothing back. Ironically, I received responses from positions where I had no referrals. After being rejected from the search science team, the recruiter recommended me to the AIML measurement team. I also pursued a contractual position by contacting someone on LinkedIn. While the team was excellent, I learned that contract positions, even with H-1B and green card sponsorship options, weren't as financially attractive as expected due to the lack of RSUs.
With Google, my three applications with employee referrals went nowhere. Success came when I connected with one of my advisor's former lab members who worked on the Gemini team—they helped advance my application.
For Microsoft, despite numerous applications with employee referrals (including direct recommendations to hiring managers), I received responses from unexpected teams, including one whose member I'd contacted on Twitter. With Netflix, I applied directly first, then sought a referral when I got no response. Though the referral wasn't officially linked to my application, I finally heard back.
My Tencent interview process began through a combination of Twitter outreach and email contact. For ServiceNow, I both reached out on LinkedIn and applied online to different positions, receiving responses in both cases. PayPal was straightforward—I applied online and a recruiter contacted me.
Offers and Work Policies at Big Tech
Most of these teams and companies have locations in the Seattle area, as well as in SF and New York. I initially assumed that if a company had a Seattle office, they would let me work from there—but this wasn't always the case, which was a lesson learned.
Regarding compensation, averaged over 4 years including bonuses and refreshers, I received offers ranging from $350k to $430k across these companies. This includes RSUs, potential target bonuses, refresher stocks, and averaged sign-on bonuses, but excludes relocation. Only PayPal and Apple's Proactive Intelligence teams were outside Seattle at the offer stage, and PayPal didn't offer relocation packages. Surprisingly, my assumptions about compensation were wrong—I'd thought Meta paid the best and Apple paid less, but Apple actually offered better compensation than Meta and Amazon, partly because they counted my 12-month internship as one year of experience, unlike the others.
Microsoft and Apple offer Employee Stock Purchase Programs on top of their RSUs, and Apple's RSU package was comparable to other companies. Amazon structures compensation differently—they rely on sign-on bonuses for the first two years rather than regular bonuses or refreshers, unless their stock falls below your target salary. Meta's unique advantage is early stock vesting with no cliff period, unlike other companies.
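To make the "averaged over 4 years" framing concrete, here is the arithmetic with entirely hypothetical numbers (these are not my offers, and stock movement, refreshers, and taxes are ignored):

```python
# Hypothetical offer, for illustration only.
base = 200_000          # annual base salary
rsu_grant = 400_000     # 4-year RSU grant at face value, equal annual vesting
sign_on = 100_000       # total sign-on, often split over the first 1-2 years
bonus_rate = 0.15       # target bonus as a fraction of base, not guaranteed

four_year_total = 4 * base + rsu_grant + sign_on + 4 * base * bonus_rate
print(f"Average annual compensation: ${four_year_total / 4:,.0f}")
# Average annual compensation: $355,000
```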
When negotiating, I succeeded by being transparent about competing offers. I shared specific numbers from other companies, avoiding outdated 2021 market data. While I wasn't a superstar candidate, I maintained clarity about what I did and didn't know.
Regarding work arrangements: Meta and Apple require three days in office weekly, while Amazon mandates five days unless you receive special manager approval (though the implementation varies). Netflix and some Microsoft teams provide remote work options—worth considering if you value flexibility.
For benefits, I focus mainly on health insurance and 401k matching. I was surprised by Meta's healthcare program—despite its reputation for excellence, it doesn't always offer zero-premium plans for employees. Many plans include coinsurance costs that can accumulate significantly. This was unexpected, and I'm still processing how I feel about it. Other notable benefits to consider include pet insurance, fertility programs, and related support services.
Hindsight and Tidbits
Let me share some helpful tips I've gathered along the way—I'll keep updating this as I remember more.
- Don't hesitate to reach out to people, even if your work isn't published yet. I shared my work through blog posts and found that people are generally kind and receptive when you contact them.
- Keep your presentation to 40 minutes rather than 45. I learned this the hard way—setup and introductions eat into your time, and I had to rush through my content. It took three iterations to perfect my presentation by adding PCC values, data points, and workflow diagrams.
- Plan your interview schedule strategically to handle multiple potential offers. This is especially important when dealing with short-deadline ("exploding") offers, as it helps with negotiations and minimizes the need for extensions. While I was always transparent with recruiters about my timeline and ongoing processes, this became challenging during the holiday season when scheduling flexibility was limited.
- I was transparent about my experience, openly sharing that I worked with continually fine-tuned models between 0.5-1B parameters and primarily focused on LoRA-based fine-tuning using peft/bitsandbytes/axolotl. I made it clear that I hadn't pretrained models and lacked experience with model or data parallelism. I also directly stated that most of my work involved supervised fine-tuning, with only limited experience in RLHF/RLAIF, and acknowledged my unfamiliarity with state space models. While this candor may have cost me some opportunities, I knew that trying to bluff my way through unfamiliar topics wouldn't serve anyone well.
- Many interviewers provided helpful guidance during discussions. For example, when discussing constrained decoding at Apple, I openly acknowledged my limited expertise, explaining that while I understood the concept (structured JSON outputs and state-machine representations verifiable by regex, as implemented in outlines' CFG-guided generation), I hadn't personally implemented it. There's a toy sketch of this idea after this list.
- I was transparent about how practical constraints like cost, model size, and training resources influenced our design decisions and research outcomes. This honesty was particularly important since Google's results might have differed from what I observed when fine-tuning Qwen2.5-0.5B.
- It's fine to have technology preferences as long as you remain adaptable. I'm upfront about not using Langchain/LlamaIndex in my RAG and agent work, even though I have experience with them. I explain that I chose alternative solutions because modifying specific components proved challenging.
- I maintain a conversational approach in interviews and am honest about my confidence level. I tend to say "right?" when seeking validation and openly say "I hope I'm on the right track" when unsure. When I need hints to solve problems, I acknowledge it directly: "I apologize for missing this direction and needing a hint." Then I provide any additional relevant information to complete my answer. I've noticed I say "in general" frequently, and after many interviews, I caught myself saying "I think I had one final question" even when I didn't.
- Learn the exact implementations of binary search, graph connected components, and attention blocks—you'll need them throughout the process (reference implementations follow this list).
- In design interviews, I often weigh traditional versus modern approaches, like choosing between conventional recommendation systems and RAG-based relevance scores, or between BERT classification and generative model outputs. I've learned to present both options clearly, then suggest a preferred approach: "Given situation X, option A might be more suitable. What are your thoughts?" I've also grown comfortable spending time on clarifying questions during design interviews, rather than rushing to conclusions as I used to.
- I verbally acknowledge when I'm writing things down or looking things up, which helps build my confidence. Once, when I didn't do this, I became so nervous that I blanked out for 5 minutes. Though the interviewer offered a simpler version of the question, I eventually solved the original problem after that silent period.
- When discussing technical developments, start with fundamentals before moving to cutting-edge solutions. For instance, when asked about improving transformer efficiency, begin with basic approaches like grouped query attention before advancing to Longformer, subquadratic attention, or state space models (a minimal grouped-query attention sketch follows this list).
- Structure behavioral responses using the STAR framework (Situation, Task, Action, Result) with specific phrases: "For some context" introduces the situation, "we/I needed to" describes the task, "I ended up doing/having" covers the action, and "with that, it resulted in" presents the outcome. This organization helps, though remember that interview formats vary—Meta, for example, uses a 10-turn conversation instead of the traditional 3-round approach, which caught me off guard.
- I tend to be a nervous talker, and even after nearly 400 interviews, I still struggle with conciseness. Though I'm learning to be more brief and avoid excessive background information, I recognize that I need to be more direct. I still rambled a lot—A LOT. Even in my last interview, I over-explained things and gave cluttered answers. I wish I could develop a STAR-like system for these responses (and I'm working on it). When nervous, my interview performance suffers, which is frustrating. That's why I typically do better in the latter half of interviews, once I've settled down.
- After each interview, I documented all questions and practiced them before my next round. This was my learning approach - whenever I encountered something unfamiliar, I'd research it or consult an LLM afterward, organize it under a toggle in my notes, mark it for review, and later test my recall. I kept this system simple, using just Notion pages rather than complex tools like Anki.
- Speaking of organization, I recommend creating Gmail labels for "Interview" and "Interview/Scheduling" to track your process. I skipped this step and now regret it—my inbox has become practically impossible to search.
- I stay current with research through regular online reading. I'm always transparent about my depth of knowledge—if I've only skimmed something, I say so while sharing what I learned. When I have deeper knowledge, I share it upon request. I make sure to credit papers and research that inspire my ideas rather than presenting them as my own.
- Having a blog (like this page) has proven incredibly valuable. I strongly recommend creating a website (What is the cheapest and easiest way to create AND maintain a personal academic website?) and keeping your Google Scholar profile current. Even without blog posts, it's helpful to have a space where people can quickly learn about you in two minutes or less.
- Before joining any interview, fully close all video tabs—don't just pause them. I've accidentally pressed play instead of the asterisk multiple times, causing Stardew Valley or Mario Party sounds to burst through while I frantically searched for the offending tab.
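On the constrained-decoding point above: here's a toy sketch of the state-machine idea, assuming a fixed JSON template. Real libraries (e.g., outlines) compile a regex or grammar into an automaton and mask the model's token logits at each step; the hand-written states below are only for illustration, not any library's API.

```python
# Toy constrained decoding: the "model" may only emit tokens the state
# machine allows, so the output always matches the target JSON template.
STATES = {
    "start": ['{"sentiment": '],
    "value": ['"positive"', '"negative"'],
    "close": ["}"],
}
NEXT = {"start": "value", "value": "close", "close": "done"}

def constrained_decode(pick):
    """pick(allowed) stands in for the LM sampling among allowed tokens."""
    out, state = "", "start"
    while state != "done":
        tok = pick(STATES[state])   # logits outside this set would be masked
        out += tok
        state = NEXT[state]
    return out

# Dummy model that always takes the first allowed token:
print(constrained_decode(lambda allowed: allowed[0]))
# {"sentiment": "positive"}
```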
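And the reference implementations mentioned above—these are the standard versions, worth being able to write without hesitation:

```python
from collections import deque

def binary_search(nums: list[int], target: int) -> int:
    """Return the index of target in sorted nums, or -1 if absent."""
    lo, hi = 0, len(nums) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if nums[mid] == target:
            return mid
        if nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def connected_components(n: int, edges: list[tuple[int, int]]) -> int:
    """Count connected components in an undirected graph with nodes 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, components = set(), 0
    for start in range(n):
        if start in seen:
            continue
        components += 1          # BFS from each unvisited node
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return components

assert binary_search([1, 3, 5, 7], 5) == 2
assert connected_components(5, [(0, 1), (1, 2), (3, 4)]) == 2
```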
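Finally, the "fundamentals first" answer for transformer efficiency, as a sketch: grouped-query attention in PyTorch, where several query heads share each key/value head. Shapes and weights below are illustrative; a real block adds causal masking, RoPE, and an output projection.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(x, wq, wk, wv, n_heads=8, n_kv_heads=2):
    """GQA: groups of query heads share K/V heads (n_heads % n_kv_heads == 0)."""
    B, T, D = x.shape
    hd = D // n_heads                                         # per-head dim
    q = (x @ wq).view(B, T, n_heads, hd).transpose(1, 2)      # (B, H, T, hd)
    k = (x @ wk).view(B, T, n_kv_heads, hd).transpose(1, 2)   # (B, Hkv, T, hd)
    v = (x @ wv).view(B, T, n_kv_heads, hd).transpose(1, 2)
    group = n_heads // n_kv_heads
    k = k.repeat_interleave(group, dim=1)                     # expand KV heads
    v = v.repeat_interleave(group, dim=1)                     # to (B, H, T, hd)
    att = (q @ k.transpose(-2, -1)) / hd**0.5                 # (B, H, T, T)
    out = F.softmax(att, dim=-1) @ v                          # (B, H, T, hd)
    return out.transpose(1, 2).reshape(B, T, D)

D, n_heads, n_kv_heads = 64, 8, 2
x = torch.randn(1, 10, D)
wq = torch.randn(D, D)
wk = torch.randn(D, D // (n_heads // n_kv_heads))   # KV projections are smaller
wv = torch.randn(D, D // (n_heads // n_kv_heads))
print(grouped_query_attention(x, wq, wk, wv).shape)  # torch.Size([1, 10, 64])
```

The savings come from the smaller K/V projections and KV cache: here the key/value weights are a quarter the size of the query weights.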
Finally
Once interviews are scheduled, some factors remain outside your control. Luck plays a role (as with my Meta second team match), and managing timelines and relocation can be challenging. However, thorough preparation can significantly improve your chances. While preparation methods vary by individual, I discuss my approach in LLM (ML) Job Interviews - Resources.
Elsewhere on The Internet
I extend an open invitation to anyone willing to share their interview process. If you provide your experience, I'll organize and include it here. My hope is to help others who, like me, might be hesitant to ask for guidance. Having easily accessible public information can make all the difference. If you'd like to contribute and be credited, please reach out via my blog, direct message, or whatever communication method you prefer. I wish I had had access to such resources rather than struggling through the process alone due to my own hesitations, insecurities, and anxieties.
- Kyunghyun discusses hiring expectations and experiences, which strongly resonated with me: from their perspective, the job market suddenly asks them to show their credentials in terms of innovating in a much narrower domain of large-scale language models and their variants and to work on directly contributing to these products built on top of these large-scale models. there are much fewer opportunities if they do not want to work on productizing large-scale language models, and these positions are disappearing quickly.
One advantage I had was my interest in working with LLMs, particularly in areas like post-training evaluation and knowledge work—which are commonly found in product implementation teams.
- Peter Sobot comments on that feeling: Kyunghyun is correct in saying that "the current crop of PhD students [...] are being left out" of the modern AI hiring pipeline, but it's even worse than that. Those who do get hired are often left unsatisfied that they're expected to build products instead of doing research.
- Chris Olah, on the other hand, offers a different perspective: So if you're an academic considering industry research roles, I'd offer the following questions and frame:
(1) Would you enjoy working in the team science / focused bet model?
(2) What bets would you be excited to be a part of?
- Gowthami talks about return offers here: There are a lot more internships than actual RS roles in these companies. So if you are in final year, try to optimize for company that actually gives you a job later, not someone that’ll make you interview from scratch along with other candidates.
- Some experiences from Nathan Lambert's job search process, right after grad school in 2022: A large undercurrent of this post is how helped I am by having the @berkeley.edu email address. My job search would've been very different without this. I think there is an optimum for every candidate on the spectrum of how many emails they should send versus how likely a response is. If you don't expect a response, you probably should spend more time building each one up (by doing more background, building, or networking).
AI is certainly a community driven by prestige.
and for a job switch in 2023:
Structurally, the places where open research and science are the priorities are exceedingly rare. Even if someone joins as an academically minded research scientist, realities will always pull people into the business needs (especially at startups).
- Some perspective about MLE interviewing and deciding on a career path from Yuan Meng here and here: If your resume is strong enough, you may get a date or two (e.g., recruiter call/HM chat/phone screen), but for the company to say "yes" (the offer), the expertise you bring to the table has to fit their roadmap and the person that you are must fit the company/team culture — otherwise, the marriage will be painful (and short). Everyone dreams of working for a company that offers competitive compensation, rapid career growth, excellent work-life balance, high job security, and hopefully visa sponsorship. A company meeting all 6 criteria may exist, but only during the 2021-2022 tech boom or in your dreams. If there is such a place right now, nobody wants to leave, so no headcount can be created 😅. Something's gotta give.
- Abhradeep Guha discusses the differences between industry and faculty hiring from an interviewer's perspective here. Research: The hiring team wants to see the depth of the candidate’s research and not the breadth. The candidate will be mostly talking to domain experts from very aligned areas of the candidate. The job talk is of a very different flavor than academic one. The best industry job talk is the one that focuses on one problem and goes arbitrarily deep into it. Most industry labs interviews nowadays have a coding component (LeetCode type or otherwise). If the candidate fails in those, the chances of getting an offer is fairly low, even if the candidate is stellar researchwise. The reason is labs typically want folks who are generally strong, and can pivot to different areas as needed, and basic general purpose skills are mandatory requirements.