Tags → #NLP
-
A Personal Test Suite for LLMs
Most LLM benchmarks are either academic or do not capture what I actually use LLMs for. So, inspired by some other people, this is my own test suite.
-
Random Research Ideas On Social Media That I Liked
Sometimes I come across random research ideas on Twitter and the wider social media universe that really resonate with me in the moment. Often they get lost in doom scrolling, so I am considering compiling them into a running log.
-
Controlled Evaluation of Explanations: What Might Have Influenced Your Model Explanation Efficacy Evaluation?
End users affect explanation efficacy. NLP papers overlook other factors. This paper examines how the utility of saliency-based explanations changes with controlled variables. We aim to provide a standardized list of variables to evaluate and show how SoTA algorithms rank differently when controlling for evaluation criteria.