skip to content
Site header image Mimansa Jaiswal

Random Research Ideas On Social Media That I Liked

Sometimes I come across random research ideas across the twitter and social media universe that really resonate with me at that moment. Often they get lost in doom scrolling, so I am considering compiling those into a running log.

Last Updated:

Sometimes I come across random research ideas across the twitter and social media universe that really resonate with me at that moment. Often they get lost in doom scrolling, so I am considering compiling those into a running log.

2024

April 2024

🔗 Stella Biderman · @BlancheMinerva · 05:49 PM · Apr 09, 2024 UTC

Partition the Pile into subsets A and B and train a model on A for two epochs and a model on B for two epochs. How do their behaviors differ from the Pythia models? Can you use model merging techniques to recover something similar to the Pythia model from the 1-epoch checkpoints?

💬 ❤️

🔗 Florian Mai · @_florianmai · 09:31 PM · Apr 09, 2024 UTC

I'm skeptical that Chatbot Arena is really as informative as people make it out to be, but I'd be glad to learn that I am wrong:

1. Different chatbots have really distinct talking styles. Isn't it easy to tell whether something comes from GPT-4 or Grok? Then it's not really…
Show more

💬 ❤️

🔗 lmarena.ai (formerly lmsys.org) · @lmarena_ai · 09:30 AM · Apr 09, 2024 UTC

Exciting news - the latest Arena result are out!

@cohere's Command R+ has climbed to the 6th spot, matching GPT-4-0314 level by 13K+ human votes! It's undoubtedly the **best** open model on the leaderboard now🔥

Big congrats to @cohere's incredible work & valuable contribution… pic.x.com/5PzpPolC9F

💬

🔗 Stella Biderman · @BlancheMinerva · 08:46 PM · Apr 01, 2024 UTC

It's known that finetuning can incidentally remove RLHF guards arxiv.org/abs/2310.03693. Can you solve this by including examples with refusals mixed into the data? Does it matter if those refusals are in-distribution for the original RLHF? Does the domain of the FT task matter?

💬 ❤️

March 2024

🔗 Stella Biderman · @BlancheMinerva · 04:12 PM · Mar 05, 2024 UTC

Do pretraining-frequency studies in multilingual models get higher or lower explanatory power if you count data in another language from the test language? Maybe it depends on how close the two languages are? This seems like an important aspect of crosslingual transfer.

💬 ❤️

🔗 Ofir Press · @OfirPress · 04:19 PM · Mar 04, 2024 UTC

The only two numbers worth looking at here are GPQA and HumanEval

On GPQA the result is very impressive. On HumanEval, they compare to GPT-4's perf at launch. GPT-4 is now much better- see the EvalPlus leaderboard, where it gets 88.4

I bet OpenAI will respond with GPT-4.5 soon

💬 ❤️

🔗 Anthropic · @AnthropicAI · 02:07 PM · Mar 04, 2024 UTC

Today, we're announcing Claude 3, our next generation of AI models.

The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision. pic.x.com/TqDuqNWDoM

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 03:56 PM · Mar 11, 2024 UTC

@yasaman_razeghi @kandpal_nikhil @yanaiela Do RLHF'd behaviors transfer between languages? Can we align a LM the norms of one culture by aligning an LLM in their language and have those norms reflected in a different language? Can we simultaneously align a single LM to multiple different cultures in different languages?

💬 ❤️

🔗 max · @maxbittker · 10:29 PM · Mar 16, 2024 UTC

I'm super interested in heatmap vizualization of LLMs' per-token attention, for the sake of building intuition when prompt-building.

(Which previous tokens influenced each output token the most / which tokens are ignored)

Who is working on this type of tool? Any pointers? pic.x.com/7VAMZE065o

💬 ❤️

🔗 Xiao Ma · @infoxiao · 07:24 PM · Mar 13, 2024 UTC

It's wild examples like this exist in the commonsenseQA:

huggingface.co/datasets/tau/c…

huggingface.co/datasets/tau/c… pic.x.com/AhuttVyuxA

💬 ❤️

🔗 Simon Willison · @simonw · 12:33 AM · Mar 14, 2024 UTC

This is just the table of contents for his 2014 OS X Yosemite review: arstechnica.com/gadgets/2014/1…

Imagine if someone put that much skill and dedication into a review of Anthropic's Claude 3 Opus pic.x.com/lUgiNiI91O

💬 ❤️

Feb 2024

🔗 Charlie Snell · @sea_snell · 05:19 PM · Feb 24, 2024 UTC

Does anyone have a favorite task where gpt-4 has near chance accuracy when zero or few-shot prompted? I’m looking for recommendations for tasks like this

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 09:02 PM · Feb 26, 2024 UTC

Build a benchmark that measures and contrasts translation (exactly mapping content between langs) and localization (mapping content onto corresponding concepts in other langs). This would be useful for evaluating models, but even more than that for evaluating benchmarks.

💬 ❤️

🔗 François Chollet · @fchollet · 10:53 PM · Feb 23, 2024 UTC

Bias in ML systems can come from bias in the training data. But that's only one possible source of bias among many. Arguably, prompt engineering can be an even worse source of bias.

Literally any part of your system can introduce biases. Even non-model parts, like your…
Show more

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 05:35 PM · Feb 05, 2024 UTC

@Wetassprior @daphneipp Is there a post-hoc correlation that can be applied to a scaling laws study done Kaplan et al.-style to get one done Hoffman et al.-style? Note that this would be very high value because it means you would need to do far fewer training runs to do scaling laws calculations.

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 08:44 PM · Feb 20, 2024 UTC

In the Pythia paper we find intervening on the training data to change the model’s gender bias late in training is only effective for large models. Is this because the small ones are converged in bias space?

arxiv.org/abs/2304.01373 pic.x.com/rnbJP7HBuw

💬 ❤️

🔗 jack morris · @jxmnop · 11:17 PM · Feb 15, 2024 UTC

random research idea: Latent Text Tansformer (LTT)

in a nutshell: replace sequence of *vectors* as hidden states of the transformer with sequences of *tokens*, so we can read the model's "thoughts" directly 🤖

then train a transformer that uses longer sequences of *discrete… pic.x.com/mN9suPzcig

💬 ❤️

🔗 (((ل()(ل() 'yoav))))👾 · @yoavgo · 05:16 PM · Feb 26, 2024 UTC

here's a research question (which i am not going to work on but will be happy if others): is there a function, computed from model weights, that allows to estimate the number of parameter updates the model received? and what if we allow input to be 2 or 3 snapshots of same model?

💬 ❤️

🔗 Luca Soldaini 🎀 · @soldni · 07:57 PM · Feb 25, 2024 UTC

is anyone doing research on out-of-distribution/unnatural prompts and how aligned models respond to them?

Something clearly went wrong in Gemini training, but no one should be ashamed! would be super cool if they wrote a post-mortem that researches how this behavior arises 🙏

💬 ❤️

🔗 Frantastic — e/acc · @Frantastic_7 · 04:09 PM · Feb 25, 2024 UTC

every single person who worked on this should take a long hard look in the mirror.

absolutely appalling. pic.x.com/hII1DmMhJn

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 12:54 AM · Feb 13, 2024 UTC

FID is largely meaningless for T2I models because it completely ignores the pairing of the prompt and the output. Develop a "multimodal FID" that is aware of both text and image embeddings based on this paper
arxiv.org/abs/2103.11521

💬 ❤️

Jan 2024

🔗 Stella Biderman · @BlancheMinerva · 03:52 AM · Jan 23, 2024 UTC

@Wetassprior @daphneipp In the Pythia paper we explore the effect of term frequency on fact learning over the course of training. If you squint at Fig. 4, it seems like there is weak evidence that the curves are converging. Is that correct? Maybe log-space the checkpoints?

arxiv.org/abs/2304.01373 pic.x.com/9dhuN77HwI

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 05:37 PM · Jan 29, 2024 UTC

@Wetassprior @daphneipp Is “train the model with gradient ascent on bad data” an effective technique for machine unlearning? The extent to which the answer is "no" is a measure of how non-exchangeable the training process is. Is that a useful measurement for anything?

en.m.wikipedia.org/wiki/Exchangea…

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 07:13 PM · Jan 15, 2024 UTC

@Wetassprior @daphneipp Do machine unlearning techniques make the resulting models similar to models trained from scratch but without the data that was unlearned? AFAIK, no LLM machine unlearning technique has ever been validated by comparing to the same model trained without the unlearned data

💬 ❤️

🔗 jack morris · @jxmnop · 12:16 AM · Jan 17, 2024 UTC

As an exercise in open science, gonna tweet the research problem I’m stuck on:

i want to align two text embedding spaces in an unsupervised way. The motivation is that in my previous vec2text work, we have to know the embedding model and be able to query it. this is fine in…
Show more

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 02:55 PM · Jan 08, 2024 UTC

@Wetassprior @daphneipp Can you tell the difference between SFT DPO and PPO models that had the same base model and are identical up to the algorithm? How much access do you need to make this feasible?

What about in a verifiable computing context where the model provider helps by providing "proof"?

💬 ❤️

🔗 Bill Yuchen Lin · @billyuchenlin · 03:33 AM · Jan 05, 2024 UTC

Do you still remember the 𝗖𝗼𝗺𝗺𝗼𝗻𝗚𝗲𝗻 task? Given a few concepts (nouns/verbs), an LM needs to generate a sentence describing a common scenario covering all given concepts. How well do LLMs perform? Will they outperform humans? I curated a subset and test some popular… pic.x.com/6Sot23fwrp

💬 ❤️

🔗 John Schulman · @johnschulman2 · 09:49 PM · Jan 07, 2024 UTC

Coming soon to your favorite word processor
Ctrl-alt-V: "paste and paraphrase"
also, "paste and match writing style"

💬 ❤️

🔗 Sara Hooker · @sarahookr · 05:23 PM · Jan 01, 2024 UTC

Research Idea 1 2024:

the little prince has been translated into more languages than any other book except the bible (505 languages).

and the book has entered the public domain -- I am surprised no-one has structured it into a instruction style translation dataset?

💬 ❤️

🔗 Tyler Angert · @tylerangert · 08:44 PM · Jan 05, 2024 UTC

Just like fundamentally photoshop is a pixel editor, and we use game engines to edit interactive scenes, we’ll have the equivalent for text editors (and audio / other media) : “game engines” for writing that operate at higher levels of abstraction than editing individual…
Show more

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 09:20 PM · Jan 01, 2024 UTC

Starting off on a good note 😅 @Wetassprior tells me Yiming Zheng and @daphneipp did this already!

Here's a new problem: how much pretraining is required to make a LLM fall into a particular loss basin? In particular, until its path-independent?

arxiv.org/abs/2307.06865

💬 ❤️

🔗 Stella Biderman · @BlancheMinerva · 04:12 PM · Jan 01, 2024 UTC

And now, for the first question:

Do a serious study of prompt extraction attacks by writing prompts for publicly released models and then checking how reliably they can be stolen in a blackbox setting.

💬 ❤️

🔗 Sara Hooker · @sarahookr · 03:30 PM · Jan 03, 2024 UTC

Important research direction 2024 — efficient adaptation of English pretrained models to serve other languages.

How to efficiently adapt embeddings without requiring continued pretraining?

If anyone is currently working on this, share your work. Would be fun to collaborate.

💬 ❤️

2023

Dec 2023

🔗 Dhruv Agarwal · @agdhruv · 10:43 AM · Dec 21, 2023 UTC

Current LLMs reflect a Western (pop) culture. How dare you not know who Rahul and Anjali are? 😡😡😡 pic.x.com/RKU9ytfXLl

💬 ❤️

🔗 Linus · @thesephist · 09:08 AM · Dec 29, 2023 UTC

Wow, I just got @AnthropicAI's sparse autoencoder-based feature decomposition technique to work* for text embeddings 🎆

Screenshot below. In order, this output shows:
1. max-activating examples for that feature from the Minipile dataset
2. min-activating examples from the same… pic.x.com/5aUBtMglzX

💬 ❤️

🔗 Martin Signoux · @MartinSignoux · 08:32 AM · Dec 11, 2023 UTC

We need more people building better evaluation tools, rather than just optimising models for existing ones.

Evaluation isn’t as sexy as building models, but it’ll ultimately drive of progress in the field. Notably because people will then optimise for better benchmarks.

11/n

💬 ❤️

🔗 jack morris · @jxmnop · 06:39 PM · Dec 29, 2023 UTC

people keep saying AI is moving so fast. some days I agree, but some days I'm not sure – so many papers published, but I don't feel like we're making that many fundamental breakthroughs.

to cap off 2023, here's a list of things we still don't know about language models:

- how…
Show more

💬 ❤️

🔗 Besmira Nushi 💙💛 · @besanushi · 11:16 PM · Dec 06, 2023 UTC

@HerrDoktorFunk @MimansaJ @ChristophMolnar Error Analysis and other tools can be found here github.com/microsoft/resp… In fact, we did a similar investigation to what @ChristophMolnar is describing on housing data here: github.com/microsoft/resp… pic.x.com/R3NCNm4Oi8

💬 ❤️

🔗 Paul Calcraft · @paul_cal · 07:06 PM · Dec 29, 2023 UTC

@jxmnop - how can we build long-term memory across interactions?
- how can we continually integrate recent information?
- how can we remove/modify specific knowledge?
- how can we make LLMs self-consistent? (e.g. avoid the reverse curse)

Some have been attempted, all v far from solved

💬 ❤️

November 2023

🔗 Manaal Faruqui · @manaalfar · 05:31 PM · Nov 10, 2023 UTC

PhD students: can you please solve the problem of long text evaluation? It is one of the biggest bottlenecks in the quality iteration of LLMs. Which response is more creative? safer? more factual?

💬 ❤️

🔗 Naomi Saphra 🧈🪰 · @nsaphra · 03:02 PM · Nov 10, 2023 UTC

It's not the first time! A dream team of @enfleisig (human eval expert), Adam Lopez (remembers the Stat MT era), @kchonyc (helped end it), and me (pun in title) are here to teach you the history of scale crises and what lessons we can take from them. 🧵arxiv.org/abs/2311.05020 pic.x.com/hyxDqMn71D

💬 ❤️