Have you ever been in a group chat where someone says “lol” and an AI bot immediately responds with a full paragraph? Or sent a quick “ok” to a friend and watched the assistant generate a thoughtful essay in reply?
This is the problem that inspired Shrug. AI chatbots always respond. It doesn’t matter if you sent a meme, a one-word acknowledgement, or a message clearly not directed at them. They reply anyway, because that’s what they’re trained to do: generate text. No one taught them the more fundamental skill of deciding whether to speak at all.
A Decision Gate, Not a Generator
Shrug is a reply-decision model. It doesn’t decide what to say — it decides whether to reply at all.
The idea is simple: instead of feeding every message directly into a language model and asking it to generate a response, you first run it through Shrug. Shrug looks at the conversation context and outputs a single number: the probability of replying.
This sounds obvious, but most chatbots don’t do this. They treat every incoming message as a prompt that demands an answer. Shrug introduces a filter, a moment of hesitation between receiving a message and deciding to speak.
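To make the "gate before generator" idea concrete, here is a minimal sketch. Every name in it (`gated_reply`, `reply_probability`, `should_reply`, `generate`) is hypothetical and not taken from the Shrug repository. The decision policy is passed in as a callable, so the sketch makes no claim about how the probability is turned into a yes or no.

```python
from typing import Callable, Optional, Sequence

# A message is modeled as a plain dict, e.g. {"sender": "others", "text": "lol"}.
# This shape is an assumption made for illustration only.
History = Sequence[dict]


def gated_reply(
    history: History,
    reply_probability: Callable[[History], float],  # stand-in for the decision model
    should_reply: Callable[[float], bool],          # policy that turns p into yes/no
    generate: Callable[[History], str],             # stand-in for the text generator
) -> Optional[str]:
    """Return a reply, or None when the gate chooses silence."""
    p = reply_probability(history)
    if not should_reply(p):
        return None  # the generator is never invoked
    return generate(history)
```

The point of the structure is that silence is a first-class outcome: when the gate says no, the expensive generation step never runs.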
Humans Are Not Threshold Machines
The most interesting design decision in Shrug is how it uses that probability. You might expect a simple rule: reply if the probability clears some fixed threshold.
Think about your own behavior in conversations. Sometimes you see a message at
Shrug embraces this. The model outputs only the probability.
- At a high probability, there’s still a chance of silence
- At a low probability, there’s still a chance of an interjection
This naturally introduces the imperfections that make human conversation feel real: hesitation, omission, and impulse.
Interactive demo: each dot represents one decision. Green = replied, gray = silent.
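One way to read the behavior described above is as a weighted coin flip: reply with probability p, stay silent otherwise. The sketch below assumes exactly that (a Bernoulli draw); the function names and the seeded random generator are illustrative choices, not Shrug's actual code.

```python
import random


def sample_decision(p: float, rng: random.Random) -> bool:
    """Reply with probability p, stay silent otherwise."""
    # rng.random() is uniform on [0.0, 1.0), so this is True with probability p.
    return rng.random() < p


def reply_rate(p: float, trials: int, seed: int = 0) -> float:
    """Fraction of sampled decisions that came out as 'reply'."""
    rng = random.Random(seed)
    replies = sum(sample_decision(p, rng) for _ in range(trials))
    return replies / trials
```

Over many trials the reply rate approaches p, but any single decision can go either way, which is the difference from a hard threshold.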
How It Works
Under the hood, Shrug is a binary classifier built on top of a language model. I use Qwen3-8B with LoRA adapters for efficient fine-tuning, which means only a small fraction of the model’s parameters are updated during training.
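To get a feel for why LoRA touches so few parameters, here is the arithmetic for a single weight matrix. LoRA learns two low-rank factors in place of a full update, so the trainable count is rank × (d_in + d_out) instead of d_in × d_out. The dimensions and rank below are illustrative assumptions, not Qwen3-8B's real layer sizes or Shrug's actual settings.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense d_out x d_in weight matrix."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in the two LoRA factors: (d_out x rank) and (rank x d_in)."""
    return rank * (d_in + d_out)


# Illustrative numbers only: a square 4096 x 4096 projection with rank 16.
d_in, d_out, rank = 4096, 4096, 16
fraction = lora_params(d_in, d_out, rank) / full_params(d_in, d_out)
print(f"trainable fraction for this matrix: {fraction:.4%}")
```

The fraction for the whole model depends on which layers get adapters and on the chosen rank, so treat this as a per-matrix intuition rather than a figure for Shrug.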
The training data is remarkably simple: a chronological stream of messages from my own conversations, tagged as either <me> or <others>. A sliding window moves over this stream and asks one question: given these last N messages, did I send the next one? If yes, the label is 1. If no, the label is 0.
What’s interesting is that consecutive messages from me are valid positives. The task isn’t turn-taking detection (“is it my turn?”); it’s reply-decision modeling (“would I have replied to this?”). Even if I just sent three messages in a row, the model learns that I might send a fourth.
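The sliding-window labeling can be sketched in a few lines. This is a minimal illustration, not the repository's pipeline: the `(tag, text)` tuple format and the `build_samples` name are assumptions, and so is the choice to allow shorter contexts at the very start of the stream.

```python
from typing import List, Tuple

# (speaker tag, text); tags follow the post: "<me>" or "<others>".
Message = Tuple[str, str]
Sample = Tuple[List[Message], int]


def build_samples(stream: List[Message], window: int) -> List[Sample]:
    """For each position, pair the preceding messages with a 0/1 label."""
    samples: List[Sample] = []
    for i in range(1, len(stream)):
        context = stream[max(0, i - window):i]     # up to `window` prior messages
        label = 1 if stream[i][0] == "<me>" else 0  # did I send the next one?
        samples.append((context, label))
    return samples


stream = [
    ("<others>", "hey"),
    ("<me>", "hi"),
    ("<me>", "what's up"),  # a second message in a row still counts as a positive
    ("<others>", "lol"),
    ("<others>", "ok"),
]
samples = build_samples(stream, window=2)
```

In this toy stream the second `<me>` message yields a positive sample whose context already ends with one of my own messages, which is the consecutive-message case described above.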
All messages are considered read. Silence is treated as a deliberate decision, not as “didn’t see it.” This is an important philosophical choice: in human conversation, not replying is itself a form of communication.
If you want to see the exact architecture, data pipeline, and training code, it’s all in the Shrug repository.
From 2 Hours to 38 Minutes
My first training run took 1 hour and 59 minutes. For a model that just outputs a single number, this felt unexpectedly long. But after iterating on the training configuration, I got it down to 38 minutes.
The biggest gains came from reducing the context window: from 36 messages down to 16. Looking at my own message history, most reply decisions don’t require that much context. Smaller windows mean fewer tokens per sample, which directly translates to faster training.
I also cut the training from 3 epochs to 2. The model converged quickly, so that third epoch was mostly redundant. Other changes — a lower learning rate with a constant-with-warmup scheduler, reduced warmup ratio, and larger LoRA rank and alpha — helped the model learn more efficiently per epoch rather than simply training longer.
A subtler but important addition was dynamic class weighting. In conversation data, “not replying” is far more common than “replying.” By weighting the loss function inversely to class frequency, the model learned faster from the minority class instead of defaulting to “always silent.”
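Inverse-frequency weighting can be sketched as follows. The post does not spell out what "dynamic" means in Shrug's training code; this sketch assumes it means recomputing the weights from whatever labels are in the data, and uses the common n_samples / (n_classes × class_count) formula. Both function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def inverse_frequency_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


def weighted_bce(p: float, label: int, weights: Dict[int, float], eps: float = 1e-7) -> float:
    """Binary cross-entropy for one sample, scaled by its class weight."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    loss = -math.log(p) if label == 1 else -math.log(1.0 - p)
    return weights[label] * loss


# Toy data: 8 "silent" samples for every 2 "reply" samples.
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels)
```

With an 80/20 split like this, a mistake on a "reply" sample costs four times as much as the same mistake on a "silent" one, which pushes the model away from the always-silent shortcut.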
Here’s what the training looked like before and after:

Where This Could Go
Shrug is currently a personal experiment, but the concept has broader implications. Any chatbot living in a group chat or handling DMs could benefit from a reply-decision layer. Imagine:
- A Discord bot that knows when a conversation doesn’t involve it
- A customer support AI that stays quiet when users are clearly talking to each other
- A personal assistant that doesn’t interrupt your flow with unnecessary confirmations
The ultimate goal is making AI feel less robotic by giving it the ability to choose silence. Real conversation isn’t just about what you say — it’s about when you choose not to say anything at all.
If this idea resonates with you, the code is open source at MrWillCom/Shrug.
Why “Shrug”?
The name comes from that moment of hesitation before deciding whether to reply. You read a message, shrug to yourself, and make a call. Sometimes you respond. Sometimes you don’t. The shrug is the decision.
¯\_(ツ)_/¯