LLmHub API Multi-Round Conversation

Official API documentation for LLmHub API (api.llmhub.dev)

This guide explains how to use the LLmHub /chat/completions API for multi-turn conversations. The API is stateless: the server retains no context between requests. To maintain the conversation flow, you must include every previous message in the conversation each time you send a new request.
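Because the server keeps no state, the client owns the transcript. A minimal sketch of that bookkeeping, independent of any SDK (each message is a dict in the standard chat-completions shape):

```python
# Each turn is a dict with a "role" ("user" or "assistant") and "content".
history = []

# The client appends the user's prompt before each request...
history.append({"role": "user", "content": "What's the highest mountain in the world?"})

# ...sends the FULL history to the API, then appends the reply it gets back.
history.append({"role": "assistant", "content": "Mount Everest."})

# The next request must carry all three messages, not just the newest one.
history.append({"role": "user", "content": "What is the second highest?"})

print(len(history))  # 3 messages travel with the second request
```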

Below is a Python example using the OpenAI library that demonstrates multi-turn conversation with model="automatic".

Python Example

from openai import OpenAI
 
llmhub_client = OpenAI(
    base_url="https://api.llmhub.dev/v1",
    api_key="API KEY",
)
 
 
# Round 1: Ask the first question.
messages = [{"role": "user", "content": "What's the highest mountain in the world?"}]
response = llmhub_client.chat.completions.create(
    model="automatic",
    messages=messages
)
 
# Append the assistant's reply to the conversation context.
messages.append(response.choices[0].message)
print(f"Messages Round 1: {messages}")
 
# Round 2: Ask a follow-up question.
messages.append({"role": "user", "content": "What is the second highest?"})
response = llmhub_client.chat.completions.create(
    model="automatic",
    messages=messages
)
 
# Append the new reply to the conversation context.
messages.append(response.choices[0].message)
print(f"Messages Round 2: {messages}")
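
One practical note: `response.choices[0].message` is an SDK object, not a plain dict. The OpenAI client accepts the object when it is sent back in `messages`, but converting it to a dict keeps the history easy to serialize (for example, to JSON for logging). A sketch, with a stand-in object in place of a live response:

```python
import json

# Stand-in for response.choices[0].message (illustration only; in real code
# this object comes back from the chat.completions.create call).
class _AssistantMessage:
    role = "assistant"
    content = "The highest mountain in the world is Mount Everest."

reply = _AssistantMessage()

messages = [{"role": "user", "content": "What's the highest mountain in the world?"}]

# Store a plain dict instead of the SDK object; the history is now JSON-serializable.
messages.append({"role": reply.role, "content": reply.content})

print(json.dumps(messages, indent=2))
```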
 

How It Works

  1. Stateless API:
    Each call to the LLMHUB /chat/completions endpoint is independent. The server does not remember previous requests.

  2. Maintaining Context:
    To simulate a conversation, you must send all previous messages along with the new prompt. This way, the API can generate a response that takes the entire conversation into account.

  3. Round 1:

    • Initial Message:
      [
          {"role": "user", "content": "What's the highest mountain in the world?"}
      ]
    • The assistant's response is received and appended to the messages list.
  4. Round 2:

    • Updated Message History:
      [
          {"role": "user", "content": "What's the highest mountain in the world?"},
          {"role": "assistant", "content": "The highest mountain in the world is Mount Everest."},
          {"role": "user", "content": "What is the second highest?"}
      ]
    • The entire conversation history is sent to the API, which then generates a context-aware response.

Using this approach, you can build applications that support dynamic, context-rich conversations by continuously appending responses to the conversation history.


This method ensures that every API call considers the entire conversation, allowing for more natural and coherent multi-turn interactions with LLmHub.
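
The append-and-resend pattern above can be wrapped in a small helper so application code never handles the history directly. This is a sketch, not part of the LLmHub API: `send` here is any callable that takes the message list and returns the assistant's reply text, e.g. a thin wrapper you write around `llmhub_client.chat.completions.create` (hypothetical wiring; adapt it to your client).

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

class ChatSession:
    """Accumulates conversation history and sends the full transcript each turn."""

    def __init__(self, send: Callable[[List[Message]], str]):
        self._send = send
        self.messages: List[Message] = []

    def ask(self, prompt: str) -> str:
        # Append the user's turn, send the ENTIRE history, then record the reply.
        self.messages.append({"role": "user", "content": prompt})
        reply = self._send(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# Usage with a stub in place of the real API call:
session = ChatSession(send=lambda msgs: f"(reply to {len(msgs)} messages)")
session.ask("What's the highest mountain in the world?")
session.ask("What is the second highest?")
print(len(session.messages))  # 4: two user turns + two assistant replies
```

With a real client, `send` would return something like `client.chat.completions.create(model="automatic", messages=msgs).choices[0].message.content`.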