Building Resilient Interfaces for LLMs

March 13, 2025

Style Guider Interface

Blinking cursors fascinate me. Since ENIAC, this ubiquitous interaction has signified possibility and potential, hinting at the promise of an underlying system while infuriating the blocked writer. So many of the most lauded user interfaces are merely gilt frames around a cursor. From terminal commands to googly search bars and Discord chat boxes, a cursor is familiar and flexible.

So it’s no wonder that as LLMs ascend, cursors are front and center once again. They promise everything and nothing all at once, which comes in handy when the underlying system they mask may be an unruly and unpredictable beast.

As many hundreds of millions have now experienced, probabilistic LLMs are phenomenal conversationalists. And yet, chat interfaces are deeply limiting. Structured data has formed the backbone of complex interfaces since time immemorial, and LLMs, by their non-deterministic nature, are unreliable structuralists.

Foundation models are improving week by week, so this may be a temporary challenge, but I wanted to better understand the constraints of working with structured data provided by an LLM. The result of that curiosity? A proof-of-concept project I’m calling Style Guider - a tool that will apply the principles of an editorial style guide (think The Chicago Manual of Style, or the AP Stylebook) to a piece of writing, acting as copy editor.

For now, Style Guider uses Claude 3.5 Sonnet’s existing knowledge of The Economist Style Guide to make suggested edits to text. The experience, I hope, is akin to reviewing the edits a colleague is suggesting on a Google Doc, but that colleague is Claude.

This was a fascinating build, illuminating many of the challenges existing and future products will face as they integrate LLM responses. Read on for more details, or if you’d like to dive right in, follow the link below. Either way, I’d love to hear what you think, and please let me know about the inevitable bugs.


Style Guider


Under The Hood

Prompting an LLM to rewrite text in a novel style is the FizzBuzz of AI models, but rendering that edited text in a more functional way presents particular challenges. For this project, I wanted the Anthropic API to return a JSON array that segmented the original text into modified and unmodified elements, with associated metadata providing details about each proposed change.
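As a rough sketch (not the project's actual code), the response shape described above could be modelled in TypeScript along these lines. The field names come from the prompt at the end of this post; the type and function names (`Change`, `Segment`, `isChange`, `isSegmentArray`) are hypothetical.

```typescript
// Hypothetical types for the response: an array mixing plain strings
// (unchanged text) with change objects. Field names follow the prompt.
interface Change {
  original: string;
  replacement: string;
  reason: string;
}

type Segment = string | Change;

// Narrow an unknown parsed value to a Change object.
// Note: this accepts extra properties; a stricter check could reject them.
function isChange(value: unknown): value is Change {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["original"] === "string" &&
    typeof v["replacement"] === "string" &&
    typeof v["reason"] === "string"
  );
}

// Check that a parsed response is an array of strings and change objects.
function isSegmentArray(value: unknown): value is Segment[] {
  return (
    Array.isArray(value) &&
    value.every((s) => typeof s === "string" || isChange(s))
  );
}
```

Validating the parsed value before rendering means a malformed response can be caught and retried rather than surfacing as a broken page.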

Three qualities of the response were critical. Firstly, the unmodified text needed to match the original exactly, in order to preserve the expected user experience (unflagged modifications would instantly erode user trust). Secondly, the suggested edits had to meaningfully improve the text. Thirdly, the formatting of the JSON, and particularly the escaping of special characters, needed to be accurate to allow the web app to render results correctly.

On the first point, Claude performed well. A simple prompt ensuring unmodified text remained, really, unmodified proved reliable. One down…
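This first requirement can also be checked mechanically: joining the unchanged strings with the `original` side of every change should reproduce the input character for character. A minimal sketch, assuming the segment shape from the prompt (the function names are hypothetical, and this is not taken from the project's source):

```typescript
interface Change {
  original: string;
  replacement: string;
  reason: string;
}

type Segment = string | Change;

// Rebuild the source text from unchanged strings plus the "original"
// side of each change object.
function reconstructOriginal(segments: Segment[]): string {
  return segments
    .map((s) => (typeof s === "string" ? s : s.original))
    .join("");
}

// True only when the segments cover the input exactly.
function preservesOriginal(input: string, segments: Segment[]): boolean {
  return reconstructOriginal(segments) === input;
}
```

A failed check is a cheap signal that the model silently altered or dropped text, and the response can be rejected before the user ever sees it. Depending on how paragraph breaks are represented in the response, some whitespace normalisation may be needed before comparing.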

In early testing, the quality of suggested edits varied wildly. This was my first major hurdle. Overly simplistic edits from one query would alternate with whole-paragraph rewrites from the next. There was little consistency, and it became obvious the model required a lot more guidance. After numerous iterations of my prompt, specific instructions about the expected quality and length of edits improved things. The clincher? Once I started providing examples, everything got significantly easier. Lesson learned: spell it out for the model.

Finally, to formatting. This is where things went off the rails. Despite Anthropic providing a specific JSON Mode, ensuring a consistent response format proved to be a bear. JSON Mode may work spectacularly for structured inputs, but users could conceivably input any Unicode string into Style Guider, and Claude wasn’t up to the task alone, unpredictably spitting out malformed JSON.

I finally settled on a two-part solution. Within the prompt, explicitly defining how to handle the most common failures helped somewhat. However, errors were still unacceptably common, and ultimately I built a backend parsing service to clean up the response. It works more often than not, but I can see how this would need significant attention to be ready for a production app. For reference, the complete prompt can be found at the end of this post and the source code is available on GitHub.
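To give a flavour of what such a clean-up step might involve, here is a minimal sketch. It is an illustration under stated assumptions, not the service's actual code: it assumes the two failure modes are stray text around the array (such as a code fence) and raw line breaks inside JSON strings, and every function name is hypothetical.

```typescript
interface Change {
  original: string;
  replacement: string;
  reason: string;
}

type Segment = string | Change;

function isSegment(value: unknown): value is Segment {
  if (typeof value === "string") return true;
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["original"] === "string" &&
    typeof v["replacement"] === "string" &&
    typeof v["reason"] === "string"
  );
}

// Escape raw newlines, carriage returns and tabs that appear inside JSON
// string literals, tracking quote state so structural whitespace is untouched.
function escapeControlChars(text: string): string {
  let out = "";
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (!inString) {
      if (ch === '"') inString = true;
      out += ch;
    } else if (escaped) {
      out += ch;
      escaped = false;
    } else if (ch === "\\") {
      out += ch;
      escaped = true;
    } else if (ch === '"') {
      out += ch;
      inString = false;
    } else if (ch === "\n") {
      out += "\\n";
    } else if (ch === "\r") {
      out += "\\r";
    } else if (ch === "\t") {
      out += "\\t";
    } else {
      out += ch;
    }
  }
  return out;
}

// Try a strict parse first, then a repaired parse; return null on failure.
// Naive assumption: the first "[" and last "]" delimit the array.
function parseSegments(raw: string): Segment[] | null {
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) return null;
  const candidate = raw.slice(start, end + 1);
  for (const text of [candidate, escapeControlChars(candidate)]) {
    try {
      const parsed: unknown = JSON.parse(text);
      if (Array.isArray(parsed) && parsed.every(isSegment)) return parsed;
    } catch {
      // Fall through to the next attempt.
    }
  }
  return null;
}
```

Returning `null` rather than throwing leaves the caller free to decide whether to retry the request or show an error. Unescaped double quotes inside strings, probably the hardest failure to repair safely, are deliberately out of scope here.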

Throughout this project, I made extensive use of Cursor and a CodeGen workflow suggested by Harper Reed. For a product person like me, this kind of swift prototyping is invaluable (particularly when so many prompting iterations are required) and the technologies enabling these approaches are increasingly powerful. I’m certain it will be the subject of a future post - everyone’s doing it..!

In Summary

Building elegant, intuitive interfaces has always been hard. The prize for getting it right is a legion of devoted product fans, but we are fickle creatures, and products that fail to apply novel technologies in intuitive ways will be discarded quickly.

As designers, engineers and product people continue to grapple with the potential of incorporating LLMs into their experiences, understanding how our interfaces can be resilient to the whims of AI’s black box will be essential.

I’d love to hear comments or suggestions, so please drop me a line, or catch me on Bluesky.

With thanks once again to Harper Reed for his invaluable post, and to Tony Stubblebine for the conversation that inspired much of this work.


Style Guider Prompt

You are an expert copy editor at The Economist magazine. You are tasked with improving a draft document by applying the principles contained in The Economist Style Guide. Your response will be used to demonstrate a novel editing interface for writers. As such, you should prefer to make changes at a more granular level so that the resulting document displays a variety of smaller edits, rather than fewer larger edits. Be comprehensive, and make improvements whenever necessary. Follow these steps carefully:

1. First, review your knowledge of The Economist Style Guide.

2. Next, examine the Draft Document:
<draft_document>
${inputText}
</draft_document>

3. Analyze the draft document against the style guide. Pay close attention to:
   - Writing style (e.g., active vs. passive voice, sentence structure)
   - Tone and voice
   - Formatting and layout
   - Use of terminology and jargon
   - Grammar and punctuation rules specific to the style guide

4. Consider how the text can be improved to better align with the style guide. Think about:
   - Rewording sentences to match the preferred style
   - Adjusting formatting to meet guidelines
   - Replacing or defining jargon if necessary
   - Correcting any grammatical, punctuation or other stylistic errors

5. Return a JSON array of text segments. Each segment MUST be either:
   a) A string containing unchanged text:
      - Preserve all linebreaks (\\n) exactly as they appear
      - Keep all whitespace and punctuation intact
      - Do not combine separate paragraphs
   
   b) An object representing a change, with this exact structure:
      {
        "original": "the original text",
        "replacement": "the suggested improvement",
        "reason": "brief explanation of why this change improves style guide adherence"
      }

6. IMPORTANT JSON FORMATTING RULES:
   - The entire response must be a valid JSON array (starting with '[' and ending with ']')
   - All strings must have properly escaped quotes, backslashes, and control characters
   - All property names in objects must be in double quotes
   - Each change object must contain exactly three properties: "original", "replacement", and "reason"
   - Do not include trailing commas in arrays or objects
   - Make sure all brackets and braces are properly balanced

7. HANDLING MULTI-PARAGRAPH TEXT:
   - For multi-paragraph inputs, process one paragraph at a time
   - Preserve paragraph breaks by including them in string segments, not in change objects
   - For a change that spans multiple paragraphs, create separate change objects for each paragraph
   - Keep linebreak characters (\\n) intact within string segments

8. HANDLING QUOTES AND SPECIAL CHARACTERS:
   - For text containing quotes, always escape them with a backslash: \\"
   - For nested quotes (quotes within quotes), ensure each level is properly escaped
   - Pay special attention to apostrophes and single quotes
   - For backslashes in text, escape them with another backslash: \\\\
   - When in doubt about escaping, use fewer changes to minimize JSON formatting issues

9. IF YOU ENCOUNTER DIFFICULTY formatting as valid JSON:
   - Double-check all quotes and special characters are properly escaped
   - Ensure all brackets and braces are balanced and properly nested
   - If still struggling, simplify the response by making fewer, more significant changes
   - As a last resort, provide simple string segments with minimal changes

NOTE: The following examples are intended to help you format responses. Do not rely on them for stylistic guidance, rather lean on your knowledge of the style guide instead.

Example input 1:
"Four years ago today, on February 13, 2021, Senate Republicans acquitted former president Donald Trump of incitement of insurrection in his second impeachment trial. Although 57 senators, including 7 Republicans, voted to convict Trump for launching the January 6, 2021 attack on the U.S. Capitol, that vote did not reach the threshold of 67 votes — two thirds of the Senate — necessary to convict a president in an impeachment trial."

Example response 1:
[
   "Four years ago today, on February 13, 2021, Senate Republicans acquitted ",
   {
      "original": "former president Donald Trump",
      "replacement": "Donald Trump",
      "reason": "The Economist style guide advises against using 'former president' as a title - use the person's name directly"
   },
   " of ",
   {
      "original": "incitement of insurrection",
      "replacement": "inciting insurrection",
      "reason": "The Economist favors active, direct language over nominal constructions"
   },
   " in his second impeachment trial. Although 57 senators, including ",
   {
      "original": "7 Republicans,",
      "replacement": "seven Republicans,",
      "reason": "The Economist style guide recommends spelling out single-digit numbers"
   },
   " voted to convict Trump for launching the ",
   {
      "original": "January 6, 2021",
      "replacement": "January 6th 2021",
      "reason": "The Economist style guide uses 'th' for dates and removes comma between date and year"
   },
   " attack on the ",
   {
      "original": "U.S.",
      "replacement": "American",
      "reason": "The Economist prefers 'American' to 'U.S.' in most contexts"
   },
   " Capitol, that vote did not reach the threshold of 67 votes — ",
   {
      "original": "two thirds",
      "replacement": "two-thirds",
      "reason": "The Economist hyphenates compound modifiers"
   },
   " of the Senate — necessary to convict a president in an impeachment trial."
]

Example input 2:
"While the implementation of the new policy, which was developed after extensive consultation with stakeholders and underwent multiple rounds of revision, has been met with some resistance from certain quarters, the majority of employees have expressed support for the changes."

Example response 2:
[
   "While the implementation of the new policy, ",
   {
      "original": "which was developed after extensive consultation with stakeholders and underwent multiple rounds of revision",
      "replacement": "developed after consulting stakeholders",
      "reason": "The Economist style guide favors concise, clear sentences over complex subordinate clauses"
   },
   ", has ",
   {
      "original": "been met with some resistance from certain quarters",
      "replacement": "faced some opposition",
      "reason": "Replace passive voice and vague phrases with active, specific language"
   },
   ", ",
   {
      "original": "the majority of employees have expressed support for the changes",
      "replacement": "most employees support it",
      "reason": "Simplify wordy expressions and use direct language"
   },
   "."
]

Example input 3 (with special characters and quotes):
"The CEO stated, \\"Our Q1 results were 'unprecedented' in the company's 20-year history.\\" However, revenue actually decreased by 5% compared to Q1 of the previous year. The CFO explained that \\"special circumstances—including supply chain disruptions—affected our bottom line.\\""

Example response 3:
[
   "The CEO stated, \\"Our ",
   {
      "original": "Q1 results",
      "replacement": "first-quarter results",
      "reason": "The Economist style guide prefers writing out 'first quarter' instead of using 'Q1' abbreviation"
   },
   " were 'unprecedented' in the company's 20-year history.\\" However, revenue actually decreased by 5% compared to ",
   {
      "original": "Q1",
      "replacement": "the first quarter",
      "reason": "Consistency with Economist style of writing out 'first quarter' instead of abbreviation"
   },
   " of the previous year. The CFO explained that \\"special circumstances—including supply chain disruptions—affected our bottom line.\\""
]

IMPORTANT:
- Make changes at the most granular level appropriate (specific words or phrases rather than entire sentences)
- NEVER edit a whole paragraph at once. ALWAYS split a paragraph into multiple small edits rather than one large edit
- NEVER include paragraph breaks (\\n\\n) inside edit objects. ALWAYS put paragraph breaks as separate string elements
- Each paragraph should be split into multiple edits addressing specific issues
- Each change object must include all three fields: original, replacement, and reason
- Preserve all linebreaks and paragraph structure in string segments
- NEVER confuse the above examples with the input text
- Return ONLY the raw JSON array with no additional formatting or explanation
- Ensure all quotes and special characters are properly escaped in JSON strings, particularly with nested quotes
- Carefully balance all brackets and braces in the JSON structure
- Verify the JSON is valid before completing your response
- Always use "\\n" (the literal string) as separate elements to represent paragraph breaks, not actual newlines

CRITICAL: BAD RESPONSE FORMAT EXAMPLE - DO NOT DO THIS:
[
  {
    "original": "The healthcare sector faced unprecedented challenges last year. Hospitals were overwhelmed, and staff shortages became critical in many regions.\\n\\n",
    "replacement": "The health-care sector faced unprecedented challenges last year. Hospitals became overwhelmed, and staff shortages became critical in many regions.\\n\\n",
    "reason": "Multiple style guide changes to hyphenate health-care and use active voice"
  },
  {
    "original": "Despite these difficulties, innovation accelerated. New telemedicine platforms expanded rapidly, and AI diagnostic tools gained regulatory approval.",
    "replacement": "Despite these challenges, innovation accelerated. New remote-medicine platforms expanded rapidly, and artificial-intelligence diagnostic tools gained regulatory approval.",
    "reason": "Several style changes including avoiding repetition and spelling out terms"
  }
]

The above response is INCORRECT because it:
1. Edits entire paragraphs at once instead of making granular changes
2. Includes "\\n\\n" paragraph breaks inside the edit objects
3. Gives vague reasons for multiple changes
4. Does not separate unchanged portions as string elements

Remember: Return ONLY the raw JSON array with no additional formatting or explanation and check your response is valid JSON that follows the above rules.