When I first started working seriously with large language models, something felt familiar – and it took me a while to figure out what it was. The way I was writing prompts reminded me of something I had done before: designing domain-specific languages.
I think prompting is a kind of DSL. Let me explain why.
What Makes a DSL a DSL?
A domain-specific language is not just a simplified programming language. It’s a language designed to express ideas within a specific problem space as naturally and precisely as possible. SQL is a DSL. Regular expressions are a DSL. So is the YAML configuration syntax of your CI pipeline.
What they have in common:
- A limited vocabulary that is highly expressive within its domain
- Structure and conventions that experienced users internalize over time
- A clear target interpreter – something that reads the language and acts on it
- The consequence that small wording choices have large effects on the output
- A formal grammar – a defined, unambiguous set of rules that determines what is valid syntax and what is not
If you’ve written prompts for more than a few weeks, you’ll recognize the first four. The fifth one is where prompting diverges from classical DSLs – and I think that divergence is exactly what makes it interesting.
Prompting Has the Characteristics of a DSL – Just without the Grammar
A well-structured prompt is not just a question in natural language. It has recognizable patterns: a role definition, a task description, constraints, output format, examples. Experienced prompt engineers internalize these patterns the same way an experienced SQL developer stops thinking about JOIN syntax – it just becomes fluent.
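To make those recurring parts concrete, here is a minimal Python sketch of a prompt as a small data structure. The `Prompt` class, its field names, and the section labels it renders are my own illustrative assumptions, not an established format; the only thing taken from the paragraph above is the list of parts (role, task, constraints, output format, examples).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    """Hypothetical container for the recurring parts of a structured prompt."""

    role: str
    task: str
    constraints: tuple[str, ...] = ()
    output_format: str = ""
    examples: tuple[str, ...] = ()

    def render(self) -> str:
        # Emit sections in a fixed order and skip the ones left empty.
        sections = [f"Role: {self.role}", f"Task: {self.task}"]
        if self.constraints:
            bullets = "\n".join(f"- {c}" for c in self.constraints)
            sections.append(f"Constraints:\n{bullets}")
        if self.output_format:
            sections.append(f"Output format: {self.output_format}")
        for i, example in enumerate(self.examples, start=1):
            sections.append(f"Example {i}:\n{example}")
        return "\n\n".join(sections)


if __name__ == "__main__":
    review = Prompt(
        role="senior architect",
        task="Review the attached design document",
        constraints=("Be concise", "Flag open questions"),
        output_format="a bulleted list",
    )
    print(review.render())
```

Nothing here validates the prompt the way a parser would. The structure is a convention the author chooses to follow, which is the point of the comparison.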
The vocabulary matters. “Summarize” produces different output than “distill”. “Act as a senior architect” changes the register of the response. “Think step by step” tends to elicit more explicit intermediate reasoning. These are not casual word choices – they are instructions to an interpreter that responds to specific constructs in recognizable, if not fully deterministic, ways.
But here’s the key difference: no formal grammar enforces any of this. There is no compiler that rejects a malformed prompt with a syntax error. The interpreter – the LLM – is probabilistic. It tolerates ambiguity, infers intent, and sometimes surprises you. The same prompt can produce slightly different results on repeated runs.
And yet, the structural patterns hold. Experienced prompt engineers converge on the same constructs independently, because they work. That’s not a formal grammar – but it might be something equally powerful: an emergent one.
Where the Analogy Gets Interesting
Here’s what I find most compelling: as prompting matures, it develops the same lifecycle concerns as any other language.
Reusability. Good prompts get reused. System prompts, persona definitions, output format instructions – these get copied, adapted, and shared across projects. That’s library behavior.
Versioning. If your team uses shared prompts in production systems, those prompts are code. They belong in version control, with change history and review processes.
Testing. A prompt can be tested against expected outputs. Regression testing for LLM behavior is already an active research and tooling area.
Documentation. If a colleague needs to understand what a prompt does and why it’s structured the way it is, that’s a documentation problem – the same kind we’ve solved (and often neglected) in every other codebase.
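The testing point can be sketched in a few lines of Python. Because the interpreter is probabilistic, this sketch checks properties of the output over several runs rather than comparing against one exact string. The `check_prompt` helper and the `fake_model` stand-in are hypothetical; a real setup would replace `fake_model` with a call to whatever model client you use.

```python
from typing import Callable


def check_prompt(
    model: Callable[[str], str],
    prompt: str,
    must_contain: list[str],
    runs: int = 3,
) -> list[str]:
    """Run the prompt several times and collect property violations."""
    failures = []
    for run in range(runs):
        output = model(prompt)
        for needle in must_contain:
            # Case-insensitive substring check; deliberately loose,
            # because exact wording varies between runs.
            if needle.lower() not in output.lower():
                failures.append(f"run {run}: missing {needle!r}")
    return failures


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call (assumption: the real client
    # takes a prompt string and returns the response text).
    return "Summary: the service retries failed requests."


if __name__ == "__main__":
    problems = check_prompt(
        fake_model,
        "Summarize the incident report.",
        must_contain=["summary", "retries"],
    )
    print(problems or "all checks passed")
```

Substring checks are the crudest possible assertion. The same shape works with stricter ones, such as parsing the output as JSON or validating it against a schema.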
The Part We Haven’t Solved Yet
I’ll be honest: we’re only just starting to formalize this. We use AI tools daily – Cursor, Claude, Copilot – and for a long time our prompts lived in individual heads, in scattered files, in copy-pasted system configurations.
Recently, we’ve taken a first step: we started maintaining skill files in a shared repository. These are structured prompt templates for recurring tasks – things like code review instructions, architecture documentation patterns, or analysis workflows. They live in version control, they can be reviewed, and they can be improved over time.
It’s a small step, but it feels like the right direction. Prompts today are roughly where code was before version control and coding standards became routine. And we know how that ended.
The logical next step is to treat these skill files with the same discipline we apply to any other shared codebase: pull requests, changelogs, and eventually some kind of internal prompt style guide.
We’re not there yet. But maintaining a shared repository of prompts – and noticing that it immediately raises questions about structure, ownership, and versioning – tells me the analogy is more than academic.
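As an illustration of what a skill file could look like once it is treated as a parameterized template, here is a small sketch using Python's standard `string.Template`. The skill text, the placeholder names (`language`, `focus`), and the `render_skill` helper are invented for this example and do not describe our actual files. In a repository, the template text would live in its own file rather than in a constant.

```python
from string import Template

# Hypothetical skill text, kept inline so the example is self-contained.
CODE_REVIEW_SKILL = Template(
    "Act as a reviewer for $language code.\n"
    "Focus on: $focus.\n"
    "Report findings as a bulleted list."
)


def render_skill(skill: Template, **params: str) -> str:
    # substitute() raises KeyError when a placeholder has no value,
    # so an incomplete call fails loudly instead of sending a broken prompt.
    return skill.substitute(**params)


if __name__ == "__main__":
    print(render_skill(CODE_REVIEW_SKILL, language="Go", focus="error handling"))
```

The failure on a missing placeholder is the closest this gets to a syntax error: a small, deliberate piece of strictness added on top of an otherwise forgiving interpreter.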
Conclusion
Prompting is not just a DSL – it’s a probabilistic one, without a formal grammar, but with something that might be equally powerful: an emergent grammar that practitioners develop through experience and converge on independently. It’s more flexible than a traditional DSL, and harder to reason about. But the structural parallels are real.
If you’re building systems that rely on LLMs, I’d argue it’s worth starting to think about your prompts the way you think about your code. Not because it makes everything easier – but because the consequences of not doing so are familiar, and we’ve already learned that lesson once.
What’s your experience? Have you started treating prompts as a first-class artifact in your development process?