Chat Completions API

The /v1/chat/completions endpoint allows you to interact with advanced large language models (LLMs) through a unified API, supporting both OpenAI-compatible and Anthropic-compatible models. This endpoint supports conversational AI, function calling, streaming, and multimodal (text+image) input, depending on model capabilities.

Endpoint

POST https://api.llm.vin/v1/chat/completions

Authentication
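
The document does not spell out the authentication scheme, but the 401 error below refers to an API key, and OpenAI-compatible APIs conventionally send it as a Bearer token in the Authorization header. The sketch below assumes that convention; the key value and variable names are placeholders, not official identifiers.

```python
import json
import urllib.request

# Endpoint from this document; the Bearer scheme is an assumption based on
# the API's stated OpenAI compatibility.
API_URL = "https://api.llm.vin/v1/chat/completions"

def build_request(api_key, payload):
    """Build an authenticated POST request (constructed here, not sent)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )

req = build_request("sk-example", {"model": "grok-3-mini", "messages": []})
```

Sending the request is then a matter of `urllib.request.urlopen(req)` (or any HTTP client of your choice).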

Request Format

{
  "model": "grok-3-mini",
  "messages": [
    {
      "role": "user",
      "content": "Write a one-sentence bedtime story about a unicorn."
    }
  ],
  "temperature": 1.0,
  "max_tokens": 256,
  "stream": false,
  "tools": [],
  "tool_choice": null,
  "stop": null
}

Required Parameters

model (string): The ID of the model to use for chat. See Available Models.
messages (array): List of message objects representing the conversation history. Each message must have a role (user, assistant, or system) and content.

Optional Parameters

temperature (number, default 1.0): Sampling temperature; higher values produce more random output.
max_tokens (integer, default 256): Maximum number of tokens to generate in the response.
stream (boolean, default false): If true, the response is sent as a stream of data chunks (see Streaming).
tools (array, default []): List of tool definitions for function calling (if supported by the model).
tool_choice (string/object/null, default null): Controls which tool, if any, the model may call when tools are provided.
stop (string/array/null, default null): Sequences at which the API will stop generating further tokens.

Multimodal Input
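
For models with vision support, a user message's content may be an array of typed parts (text and image_url) instead of a single string, following the OpenAI-compatible format shown in the multimodal example under Example Requests. A minimal sketch of building such a message:

```python
# Build a user message whose content mixes text and an image, using the
# content-parts shape from the multimodal curl example in this document.
def make_multimodal_message(text, image_url):
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = make_multimodal_message(
    "What is in this image?",
    "data:image/png;base64,...",  # a data URL or a regular image URL
)
```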

Response Format

{
  "id": "chatcmpl-1716151540",
  "object": "chat.completion",
  "created": 1716151540,
  "model": "grok-3-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Once upon a time, a unicorn danced across the stars and wished you sweet dreams."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 22,
    "total_tokens": 40
  }
}
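
The assistant's reply lives at choices[0].message.content, with token accounting under usage. A minimal sketch of pulling both out of the response body above:

```python
import json

# The example chat.completion response from this document, verbatim.
response_json = '''{
  "id": "chatcmpl-1716151540",
  "object": "chat.completion",
  "created": 1716151540,
  "model": "grok-3-mini",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "Once upon a time, a unicorn danced across the stars and wished you sweet dreams."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 18, "completion_tokens": 22, "total_tokens": 40}
}'''

response = json.loads(response_json)
reply = response["choices"][0]["message"]["content"]  # the assistant text
total = response["usage"]["total_tokens"]             # billing/accounting
```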

Tool Calls (Function Calling)

If the model supports function calling and a tool is invoked, the response may include a tool_calls array in the message:

"tool_calls": [
  {
    "id": "call-abc123",
    "type": "function",
    "function": {
      "name": "get_weather",
      "arguments": "{\"city\": \"Paris\"}"
    }
  }
]
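
Note that arguments is a JSON-encoded string, not an object, so it must be parsed before use. A sketch of dispatching a tool_calls entry to a local function (get_weather here is a stand-in for your own implementation):

```python
import json

# Placeholder implementation of the tool; replace with real logic.
def get_weather(city):
    return f"Sunny in {city}"

# Map tool names from the API response to local callables.
TOOLS = {"get_weather": get_weather}

# The tool_calls entry from the example above.
tool_call = {
    "id": "call-abc123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
}

fn = TOOLS[tool_call["function"]["name"]]
args = json.loads(tool_call["function"]["arguments"])  # string -> dict
result = fn(**args)
```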

Streaming

If you set "stream": true, the response will be sent as a series of Server-Sent Events (SSE), with each chunk containing a partial completion:

Example stream chunk:

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":...,"model":"grok-3-mini","choices":[{"index":0,"delta":{"content":"Once upon a time,"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":...,"model":"grok-3-mini","choices":[{"index":0,"delta":{"content":" a unicorn"},"finish_reason":null}]}

data: [DONE]
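
Each chunk carries an incremental delta rather than the full message, and the stream ends with the data: [DONE] sentinel. A sketch of accumulating the content (the lines are hard-coded here from the example chunks above; in practice they come off the streaming HTTP response):

```python
import json

lines = [
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","created":1716151540,"model":"grok-3-mini","choices":[{"index":0,"delta":{"content":"Once upon a time,"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk","created":1716151540,"model":"grok-3-mini","choices":[{"index":0,"delta":{"content":" a unicorn"},"finish_reason":null}]}',
    "data: [DONE]",
]

text = ""
for line in lines:
    if not line.startswith("data: "):
        continue  # SSE streams may also contain blank keep-alive lines
    payload = line[len("data: "):]
    if payload == "[DONE]":
        break  # end-of-stream sentinel, not JSON
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    text += delta.get("content", "")  # deltas may omit content
```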

Available Models

Use the /v1/models endpoint to list available models and their capabilities.

grok-3-mini: Advanced conversational LLM with function calling and tool support. Capabilities: chat_completions, function_calling.

(Other models may be available depending on configuration.)

Example Requests

Basic Chat Completion

curl "https://api.llm.vin/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "grok-3-mini",
    "messages": [
      {
        "role": "user",
        "content": "Write a one-sentence bedtime story about a unicorn."
      }
    ]
  }'

Streaming Chat Completion

curl -N "https://api.llm.vin/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "grok-3-mini",
    "messages": [
      {
        "role": "user",
        "content": "Tell me a joke."
      }
    ],
    "stream": true
  }'

Chat with Function Calling

curl "https://api.llm.vin/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "grok-3-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the weather in Paris?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a city.",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ]
  }'
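
After executing the tool locally, the result is returned to the model in a follow-up request. This document does not spell out that round-trip; the sketch below assumes the OpenAI-compatible convention of a message with role "tool" keyed to the call by tool_call_id, and all values are illustrative:

```python
# Hypothetical messages array for the follow-up request, assuming the
# OpenAI-compatible tool-result convention (role "tool" + tool_call_id).
messages = [
    {"role": "user", "content": "What is the weather in Paris?"},
    # The assistant turn that requested the tool call:
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {"id": "call-abc123", "type": "function",
             "function": {"name": "get_weather",
                          "arguments": "{\"city\": \"Paris\"}"}},
        ],
    },
    # The locally computed tool result, keyed back to the call by its id:
    {"role": "tool", "tool_call_id": "call-abc123",
     "content": "{\"temperature_c\": 21, \"conditions\": \"sunny\"}"},
]
```

Sending this messages array back to the endpoint lets the model compose its final natural-language answer from the tool output.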

Multimodal (Text + Image) Chat

curl "https://api.llm.vin/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "grok-3-mini",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }
        ]
      }
    ]
  }'

Error Handling

The API returns standard HTTP status codes:

200: Success
400: Bad request (missing or invalid parameters)
401: Unauthorized (invalid or missing API key)
403: Forbidden (insufficient permissions for model)
404: Not found (invalid model)
429: Too many requests (rate limit exceeded)
500: Server error

Error responses include a JSON object:

{
  "error": {
    "message": "Model 'nonexistent-model' not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
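
A sketch of handling such a response on the client side, distinguishing the machine-readable code from the human-readable message:

```python
import json

# An error body in the shape documented above; the model name is illustrative.
error_body = '''{
  "error": {
    "message": "Model 'nonexistent-model' not found",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}'''

err = json.loads(error_body)["error"]
# Branch on the stable machine-readable code, not the message text.
retryable = err["code"] not in {"model_not_found"}
```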

Rate Limits

Notes