Beyond the Simulacra of Human Behavior

The redesign came with custom visuals -- no more stock images 😎

In college, I read a paper that changed my life: "Generative Agents: Interactive Simulacra of Human Behavior" (Park et al., 2023). Written near the beginning of the Agentic AI era, it introduced the idea of placing generative agents in a sandbox environment to mimic human behavior through their reactions. Each agent used a decision-making loop with five key parts: Perceive, Retrieve, Plan, Reflect, Act.

Park et al., 2023
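
To make the five steps concrete, here is a minimal Python sketch of one pass through such a loop, in the order listed above. Everything in it is an assumption for illustration: the ToyAgent class, its method bodies, and the word-overlap retrieval are hypothetical stand-ins, not the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical agent; names and logic are illustrative only."""
    memory: list[str] = field(default_factory=list)
    reflections: list[str] = field(default_factory=list)

    def perceive(self, observation: str) -> str:
        # Store the new observation in the memory stream.
        self.memory.append(observation)
        return observation

    def retrieve(self, observation: str, k: int = 3) -> list[str]:
        # Toy relevance: keep memories sharing at least one word with the
        # observation, then return the k most recent of those.
        words = set(observation.lower().split())
        hits = [m for m in self.memory if words & set(m.lower().split())]
        return hits[-k:]

    def plan(self, context: list[str]) -> str:
        return f"respond using {len(context)} relevant memories"

    def reflect(self, context: list[str]) -> None:
        # Record a higher-level note about what was just recalled.
        self.reflections.append(f"recalled {len(context)} memories")

    def act(self, plan: str) -> str:
        return f"ACTION: {plan}"

    def step(self, observation: str) -> str:
        seen = self.perceive(observation)
        context = self.retrieve(seen)
        plan = self.plan(context)
        self.reflect(context)
        return self.act(plan)


agent = ToyAgent()
first = agent.step("customer asks about a refund")
second = agent.step("customer asks about billing")
```

A real agent would replace each method body with model calls and a proper memory store; the point here is only the shape of the loop.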

Most conversational AI agents today follow a derivative of this decision-making loop. It helps agents use current and previous context to make informed decisions and generate real-time responses, but it doesn't address a huge part of agentic development: Improvement. Improvement is so important that it deserves its own loop: Simulate, Monitor, Synthesize, Act.

Self-Improvement loops are how Bluejay goes beyond the simulacra of human behavior and begins functioning as a vital organ in the body of agentic development.

Bluejay's proposed model for Self-Improvement loops in Conversational AI Agents

Today, users improve production agents manually, based on production and test data. Bluejay has already automated the context generation required for a self-improvement loop: fine-grained ML and LLM-as-a-judge (LLMAAJ) evals on production and synthetic conversations with customer agents. There is value in this alone, but it still requires a human in the loop (HITL) to make changes based on our insights. Now, with customer-specific accrued context, Bluejay is tackling the final step in the Self-Improvement process: Act.
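
As a rough illustration of how Simulate, Monitor, Synthesize, and Act could fit together, here is a minimal Python sketch with a human approval gate on the final step. It is a toy under stated assumptions, not Bluejay's implementation: the function names, the "agent" that handles a topic only if its system prompt mentions it, and the approve callback are all hypothetical.

```python
from typing import Callable


def simulate(system_prompt: str, scenarios: list[str]) -> list[dict]:
    """Stand-in for running synthetic conversations against an agent."""
    transcripts = []
    for scenario in scenarios:
        # Toy agent: it handles a topic only if the prompt mentions it.
        handled = scenario in system_prompt
        transcripts.append({"scenario": scenario, "handled": handled})
    return transcripts


def monitor(transcripts: list[dict]) -> list[str]:
    """Stand-in for evals: return the scenarios the agent failed."""
    return [t["scenario"] for t in transcripts if not t["handled"]]


def synthesize(failures: list[str]) -> list[str]:
    """Turn failures into proposed system prompt additions."""
    return [f"Handle requests about {topic}." for topic in failures]


def act(system_prompt: str, proposals: list[str],
        approve: Callable[[str], bool]) -> str:
    """Apply only the proposals that the approver accepts."""
    for proposal in proposals:
        if approve(proposal):
            system_prompt += "\n" + proposal
    return system_prompt


def improvement_loop(system_prompt: str, scenarios: list[str],
                     approve: Callable[[str], bool],
                     max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        failures = monitor(simulate(system_prompt, scenarios))
        if not failures:
            break  # nothing left to fix
        system_prompt = act(system_prompt, synthesize(failures), approve)
    return system_prompt


scenarios = ["billing", "refunds"]
improved = improvement_loop("You help with billing.", scenarios,
                            approve=lambda proposal: True)
unchanged = improvement_loop("You help with billing.", scenarios,
                             approve=lambda proposal: False)
```

Passing an approve callback that always returns True models a fully automated Act step, while one that asks a person models the HITL workflow described above.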

This will become a standard for agent development. Customer behavior changes, and features added to production agents cause micro-deviations from intended behavior. If you don't have agents specifically focused on keeping your conversational AI agent "up-to-speed" with changing customer behavior, it will fall out of touch with your ever-shifting customer landscape.

As a recap:

  1. Bluejay is a testing and monitoring platform for Conversational AI agents. Companies ranging from Fortune 10 enterprises to fast-growth startups in Silicon Valley use Bluejay to make sure their voice and text agents work in production (monitoring) and development (testing) environments.
  2. Our team, now seven strong, works around the clock to make sure your agent behaves when talking to customers.
  3. This newsletter is 100% human written. It always has been, and it always will be. Ask yourself about what you are consuming. If the writer hasn't read it, why should you?

The Voice AI community is growing, and we're so happy to facilitate its growth here at Bluejay. After all, it plays directly into our mission:

Give Humanity the tools to trust Artificial Intelligence.

Announcements

Here's what happened at Bluejay last week:

  • We hired an engineering intern and a marketing person!
  • Rohan went to Chattanooga for a Voice AI conference and spoke on an AI governance panel.
  • The engineering team spent Friday in the South Bay, visiting customer offices and helping customers architect better voice agents.
  • Last week, the team pushed 28,346 lines of code to make Conversational AI reliable.

Feature Spotlight: Improving System Prompts from Simulation Failures

Bluejay can run simulations and tell you how to fix your agent's issues.

Via BluejayAI, users can now learn exactly why their agents failed in a simulation and receive recommendations on system prompt changes that can fix the issues encountered. This speeds up the improvement loop while recommending fixes that are grounded in real issues that Bluejay finds in conversations with your agent.

Coming Soon

  • Improvement loops as a self-serve feature!
  • Huge updates regarding The Nest...

That's all for now. I'll see you next time!

Faraz Siddiqi
Co-Founder & CTO @ Bluejay
