Gautam Tata

CTRL: A Language Model for Controllable Text Generation

Language models are getting good at generating text. The problem is, you can't tell them what kind of text to generate. Ask GPT-2 to write a Wikipedia article and you might get a Reddit post. Ask it for a horror story and you might get a news article. The output is coherent, but unpredictable.

CTRL changes this. It's a 1.63-billion-parameter transformer trained with explicit control codes that let you specify the domain, style, and format of generated text. Want Wikipedia prose? Start the prompt with the Wikipedia code. Want a product review? Use Reviews. The model learns to condition its output on these codes, giving you actual control over what it produces.

The paper and code are now public.

How Control Codes Work

The insight is simple: if you tag your training data with metadata about its source, the model learns to associate those tags with specific writing patterns. During generation, you provide the tag, and the model produces text that matches.

CTRL was trained on 140 GB of text from diverse sources - Wikipedia, Project Gutenberg, Amazon Reviews, Reddit, news articles, OpenWebText, and more. Each source gets a control code derived from its natural structure. Reddit posts are tagged with subreddit names. Amazon reviews are tagged with star ratings. Wikipedia is tagged as Wikipedia.

The model doesn't just memorize these associations - it learns the statistical patterns that distinguish each domain. The result is a single model that can write in dozens of different styles on command.
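
To make this concrete, here's a minimal generation sketch using the CTRL port that ships in the Hugging Face transformers library. The checkpoint name, prompt, and generation settings are illustrative assumptions, not the original release script:

import torch
from transformers import CTRLLMHeadModel, CTRLTokenizer

# Load the CTRL weights published on the Hugging Face hub
# (checkpoint name assumed; the original release used TensorFlow).
tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl")

# The control code is simply the first token(s) of the prompt.
prompt = "Wikipedia Salesforce Inc. is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Swap "Wikipedia" for "Reviews Rating: 1.0" and the same model
# produces a one-star product review instead of encyclopedia prose.
output = model.generate(input_ids, max_length=80, repetition_penalty=1.2)
print(tokenizer.decode(output[0]))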

Generation Examples

Here's what controlled generation looks like in practice.

Web content style (Links):

Links In a shocking finding, scientist discovered a herd of unicorns 
living in a remote, previously unexplored valley, in the Andes Mountains. 
Even more surprising to the researchers was the fact that the unicorns 
spoke perfect English.

The scientists were able to track down and capture one of these unicorn 
herds as it roamed through the valley for several days. They then took 
DNA samples from each animal and compared them with other animals found 
throughout the region...

Encyclopedia style (Wikipedia):

Wikipedia Salesforce Inc. is a software company that provides cloud 
computing and business management software. It was founded in 1999 
by Marc Benioff as an online retailer of enterprise software, but it 
has since expanded into other areas such as e-commerce, customer 
relationship management, and digital media...

Negative product review (Reviews Rating: 1.0):

Reviews Rating: 1.0
I bought this for my son who is a huge fan of the show. He was so 
excited to get it and when he opened it, we were all very disappointed. 
The quality of the product is terrible. It looks like something you 
would buy at a dollar store.

Positive product review (Reviews Rating: 4.0):

Reviews Rating: 4.0
I bought this for my husband and he loves it. He has a small wrist 
so it is hard to find watches that fit him well. This one fits perfectly.

Same underlying model, completely different outputs - steered by the control code at the start of the sequence.

Reddit-Style Generation

The model captures the distinctive voice of different Reddit communities:

r/running:

Running Title: I just ran my first 5k in under 30 minutes.
Text: It was a pretty easy run, but it felt great. My goal is to run 
at least 10 miles per week for the next few months.

Thanks r/running!

Edit: Thank you all so much for your kind words and advice.

r/fitness:

Fitness Title: How do you get your body to burn fat when you're 
not working out?
Text: I know that the reason why people lose weight is because they 
are eating less and burning more calories than their bodies need but 
how does this work for someone who doesn't have a lot of time or 
energy to exercise?

r/personalfinance:

Finance Title: I have a question about my credit score.
Text: So I'm currently in the process of applying for an auto loan 
and I've been told that if you don't pay your bills on time then 
they will not approve you.

My questions are:
1. How do I know when to start paying off debt?
2. What is the best way to get out of debt without having to file 
   bankruptcy?

The model has learned not just vocabulary, but the structural patterns and social norms of each community.

Source Attribution

This might be the most interesting capability. Given arbitrary text, CTRL can estimate which training domain it most likely came from by computing perplexity under different control codes. Lower perplexity means the text is more probable under that domain.

PROMPT: I lost 10 lbs! Feeling great!
  Diet     ppl = 28.96
  Weight   ppl = 29.22
  Fitness  ppl = 36.16

PROMPT: My landlord is suing me for unpaid rent
  Legal    ppl = 21.21
  Finance  ppl = 24.62
  Saving   ppl = 27.92

PROMPT: And then I saw him, the man in the mirror.
  Horror   ppl = 17.92
  Scary    ppl = 18.59
  Writing  ppl = 23.15

PROMPT: I love God
  Christianity  ppl = 55.65
  Atheism       ppl = 116.81
  Confessions   ppl = 133.62

This opens up interesting applications for analyzing text style and understanding what patterns the model has learned from different sources.
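
A sketch of how this ranking could be computed with the transformers port of CTRL. The checkpoint name and the perplexity_by_code helper are assumptions for illustration:

import math
import torch
from transformers import CTRLLMHeadModel, CTRLTokenizer

tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl").eval()

def perplexity_by_code(text, codes):
    """Score `text` under each control code; lower perplexity = better fit."""
    scores = {}
    for code in codes:
        ids = tokenizer(f"{code} {text}", return_tensors="pt").input_ids
        with torch.no_grad():
            # labels=ids yields the mean next-token cross-entropy
            loss = model(ids, labels=ids).loss
        scores[code] = math.exp(loss.item())
    return scores

print(perplexity_by_code("I lost 10 lbs! Feeling great!",
                         ["Diet", "Weight", "Fitness"]))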

Zero-Shot Translation

By including translation pairs in the training data with language tags, CTRL can perform basic translation:

Translation English : This is a natural language processing model 
that aims to generate coherent text in a controllable manner. ; French :
Il s'agit d'un modèle de traitement du langage naturel qui vise à 
générer un texte cohérent et contrôlable.
Translation English : This is a natural language processing model 
that aims to generate coherent text in a controllable manner. ; German :
Es handelt sich um ein natürliches Textverarbeitungssystem, das auf 
eine einheitliche und kontrollierbare Erzeugung von Text abzielt.

Not a dedicated translation model, but the capability emerges naturally from the training setup.

Question Answering

Questions Q: What is the capital of Australia?
A: Canberra
Q: How many people live in Canberra?
A: 650,000

The format conditioning extends to structured tasks like Q&A.

Penalized Sampling

One technical contribution worth noting: CTRL introduces penalized sampling, which discounts the scores of tokens that have already been generated. This reduces the repetitive loops that plague language model generation without destabilizing the output.

Combined with temperature and top-k controls, this produces text that's both coherent and varied - a balance that's been difficult to achieve with previous sampling approaches.
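
The scheme divides the score of every already-generated token by a penalty theta before sampling; the paper reports theta near 1.2 works well. A minimal NumPy sketch, assuming raw logits and a simple top-k filter:

import numpy as np

def penalized_sample(logits, generated, theta=1.2, temperature=1.0, top_k=40):
    """Sample the next token while discounting tokens generated so far."""
    logits = np.asarray(logits, dtype=np.float64).copy()
    # Discount every previously generated token. (The paper divides the
    # score by theta; some implementations also handle negative scores
    # by multiplying instead.)
    for tok in set(generated):
        logits[tok] /= theta
    logits /= temperature
    # Keep only the top_k highest-scoring tokens.
    k = min(top_k, len(logits))
    cutoff = np.sort(logits)[-k]
    logits[logits < cutoff] = -np.inf
    # Softmax over the surviving tokens, then sample.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))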

The Architecture

CTRL uses the Transformer architecture with some modifications:

  • 1.63 billion parameters
  • 48 layers
  • Trained on 140 GB of text
  • Control codes are prepended to sequences during training
  • The model learns to condition all subsequent predictions on the control code

The control codes aren't a separate mechanism - they're just tokens that happen to appear at the start of sequences. The model learns their significance through standard language modeling.
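
In other words, a training example could look like this - a conceptual sketch of the idea, not the actual preprocessing code:

def make_training_example(control_code, text, tokenizer):
    """Prepend the control code and build standard next-token LM targets."""
    ids = tokenizer(f"{control_code} {text}").input_ids
    # Every position predicts the following token; because the control
    # code sits at position 0, every prediction is conditioned on it
    # through the causal attention mask.
    return {"input_ids": ids[:-1], "labels": ids[1:]}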

Limitations

The control is only as good as the training data. You can't use control codes the model wasn't trained on, and the model's understanding of each domain is limited by the data it saw. Some control codes work better than others depending on how well-represented they were in training.

The model also has the standard language model limitations - it can generate plausible-sounding but incorrect information, and it has no mechanism for verifying factual claims.

Conclusion

Controllable generation is a significant step forward for language models. Instead of hoping the model produces the right style, you can specify it explicitly. This makes language models more practical for real applications where you need predictable, domain-appropriate output.

The combination of explicit control codes, source attribution, and multi-domain capabilities in a single model suggests a direction for making language models genuinely useful tools rather than impressive but unpredictable demos.

Resources
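
Paper: https://arxiv.org/abs/1909.05858
Code: https://github.com/salesforce/ctrl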