Add analytics and AB tests to your LLM app in JavaScript using phospho

Building an AI agent in JavaScript is now easier than ever. There are plenty of libraries, like Langchain JS or Vercel AI, to build cool products.

But like any cutting-edge technology, it comes with its own set of challenges. Hallucinations, incorrect responses, and misunderstandings can plague your LLM-based product.

How do you build an AI people want? You need to measure the quality and user satisfaction of your AI-powered product. It’s important to set clear goals and KPIs for your AI agent, focusing not just on its technical capabilities but also on how well it meets the needs of its users.

You can’t build great products without monitoring how they are used.

So the first step of the journey is to start collecting analytics to build relevant KPIs.

Wait, what are analytics?

In software, analytics spans from performance metrics to user satisfaction insights:

  • Performance Analytics: Dive deep into code execution details to catch bugs and optimize performance. Is everything working as intended?
  • Product Analytics: Assess whether your agent meets the expectations of end-users. Am I making my users happy?

To create exceptional software, it’s crucial to balance both types of analytics. And it’s the same for an agent or an AI.

What analytics for your LLM app?

In order of priority, you want to make sure that:

  1. Your code runs well and without bugs or crashes.
  2. The agent answers correctly in the main use case.
  3. The agent doesn’t do something crazy in edge cases.
  4. Users like the agent.

Now, the JavaScript code!

Assume you have a basic JavaScript agent like this:

const myAgent = (query) => {
  // Here, you'd do complex stuff or calls to LLM APIs like OpenAI.
  // But for this example we'll just return the same answer every time.
  return "Hello World!";
};
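Before looking at product analytics, the first priority in the list above (code that runs without crashes) can be checked with a small wrapper. The sketch below is only an illustration: `withMetrics` is a hypothetical helper, not part of any library, and it assumes a synchronous agent like the one above. An agent that calls an LLM API would be asynchronous and would need `await`.

```javascript
const myAgent = (query) => {
  return "Hello World!";
};

// Hypothetical helper (not part of phospho): wraps an agent function and
// records whether each call succeeded and how long it took.
const withMetrics = (agent, metrics = []) => {
  const wrapped = (query) => {
    const start = Date.now();
    try {
      const output = agent(query);
      metrics.push({ query, ok: true, durationMs: Date.now() - start });
      return output;
    } catch (error) {
      metrics.push({
        query,
        ok: false,
        durationMs: Date.now() - start,
        error: String(error),
      });
      throw error; // Re-throw so callers still see the failure.
    }
  };
  return { wrapped, metrics };
};

const { wrapped: monitoredAgent, metrics: callMetrics } = withMetrics(myAgent);
monitoredAgent("Hi");
console.log(callMetrics);
```

Each entry in `callMetrics` tells you whether a call crashed and how long it took, which is enough to start tracking an error rate and latency.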

You want to collect user inputs and associated outputs of your app. For this example, let’s use phospho, a product analytics platform for LLM apps.

Step 1: Install the phospho module

npm i phospho

Step 2: Create an account on the phospho platform. Add your API key and project id as environment variables.

export PHOSPHO_API_KEY="YOUR_API_KEY"
export PHOSPHO_PROJECT_ID="YOUR_PROJECT_ID"

Step 3: Start capturing interactions of users with your app

import { phospho } from "phospho";

phospho.init();

const myAgent = (query) => {
  // Here, you'd do complex stuff.
  // But for this example we'll just return the same answer every time.
  return "It's Paris of course.";
};

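// Example user input
const question = "What's the capital of France?";

// Log the interaction to phospho by passing the input and output strings directly
phospho.log({
  input: question,
  output: myAgent(question),
});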
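In a real app you would rather not repeat the logging call at every call site. One possible pattern, sketched here under assumptions, is to wrap the agent once. `withLogging` is a hypothetical helper, and `log` stands for any function that accepts an `{ input, output }` object, such as `(entry) => phospho.log(entry)` from the snippet above. The sketch uses an in-memory stand-in logger so it runs without an API key.

```javascript
// Hypothetical helper: returns a version of `agent` that logs every call.
// `log` is any function accepting { input, output }.
const withLogging = (agent, log) => (query) => {
  const output = agent(query);
  log({ input: query, output });
  return output;
};

// Stand-in logger that stores entries in memory instead of sending them.
const loggedEntries = [];
const loggedAgent = withLogging(
  (query) => "It's Paris of course.",
  (entry) => loggedEntries.push(entry)
);

loggedAgent("What's the capital of France?");
console.log(loggedEntries);
```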

Now you’re done! Every interaction with your app is collected, so you can build relevant KPIs to improve your product.

How to improve the product?

In the end, there is only one thing that matters: are you building something people want?

1. Understand how your product is currently used

phospho also detects events. For example, it can detect that an interaction was a “question answering” event. You can set up custom events that trigger webhooks when detected, for example to receive a Slack message when a user discusses a certain topic.

2. Measure the success of the product in doing what the users want

phospho automatically labels each task as a success or a failure using default criteria. This labeling can be improved with feedback given by you or by your users. Therefore, when a new task is logged to phospho, it can be classified as a successful or unsuccessful interaction based on your users’ (or your team’s) criteria.
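Once tasks carry a success or failure label, the simplest KPI is the share of successful tasks. Here is a minimal sketch of that calculation. The record shape (`input`, `output`, `success`) is an assumption made for illustration and is not phospho’s actual data format.

```javascript
// Hypothetical record shape: { input, output, success }, where `success`
// is a boolean label. This is not phospho's actual data format.
const successRate = (tasks) => {
  if (tasks.length === 0) return null; // No data yet, so no rate.
  const successes = tasks.filter((task) => task.success).length;
  return successes / tasks.length;
};

const sampleTasks = [
  { input: "What's the capital of France?", output: "It's Paris of course.", success: true },
  { input: "What's the capital of Spain?", output: "It's Paris of course.", success: false },
];

console.log(successRate(sampleTasks)); // 0.5
```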

3. Improve the product: AB tests

Now it’s time to improve this success score! Release several new versions and use phospho’s AB testing tools to see which ones perform better.
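To make the idea concrete, here is a rough sketch of the two building blocks of an AB test: assigning each user to a version, and comparing success rates per version. Both helpers and the `version` and `success` fields are hypothetical and only illustrate the principle. They are not phospho’s AB testing API.

```javascript
// Hypothetical helper: deterministically maps a user ID to a variant, so the
// same user always sees the same version.
const assignVariant = (userId, variants = ["A", "B"]) => {
  let hash = 0;
  for (const char of userId) {
    hash = (hash * 31 + char.codePointAt(0)) >>> 0; // Keep as unsigned 32-bit.
  }
  return variants[hash % variants.length];
};

// Groups labeled tasks by version and computes each version's success rate.
const successRateByVersion = (tasks) => {
  const totals = {};
  for (const { version, success } of tasks) {
    if (!totals[version]) totals[version] = { count: 0, successes: 0 };
    totals[version].count += 1;
    if (success) totals[version].successes += 1;
  }
  return Object.fromEntries(
    Object.entries(totals).map(([version, t]) => [version, t.successes / t.count])
  );
};

const abTasks = [
  { version: "A", success: true },
  { version: "A", success: false },
  { version: "B", success: true },
];

console.log(assignVariant("user-42"));
console.log(successRateByVersion(abTasks)); // { A: 0.5, B: 1 }
```

With only a handful of tasks per version, a difference like this one is not meaningful. Collect enough interactions before deciding which version wins.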

Conclusion

Enhancing the performance and user satisfaction of your JavaScript agent involves adding analytics to your development process as early as possible.

By strategically addressing both performance and product analytics, you can ensure that your agent is actually doing what your users are expecting.

Explore phospho to learn more about how to improve your LLM app.