Endpoints
Create first Endpoint
Endpoints are fully managed and scale to handle any AI workload. They are designed for a range of applications and support environments such as Automatic1111, vLLM and Whisper. Each endpoint is powered by a number of GPU workers that depends on the current load.
To configure your initial endpoint, you will require:
- an active modelserve AI account
- a generated Access Token
- sufficient funds (in USD) in your account
Discover more in the 🚀 Quickstart section.
To create your first AI Endpoint, send the request below:
curl -s -X POST \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer X' \
-d '{"name": "string", "gpu_segment": 0, "package_type": "automatic", "model_repo_name": "string", "model_url": "string", "huggingface_auth_token": "string", "startup_script_url": "string"}' \
'https://api.modelserve.ai/api/v1/clusters/'
import requests

r = requests.post(
    "https://api.modelserve.ai/api/v1/clusters/",
    headers={
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": "Bearer X",
    },
    # json= serializes the dict to a JSON body; data= would send it form-encoded
    json={
        "name": "string",
        "gpu_segment": 0,
        "package_type": "automatic",
        "model_repo_name": "string",
        "model_url": "string",
        "huggingface_auth_token": "string",
        "startup_script_url": "string",
    },
)
fetch('https://api.modelserve.ai/api/v1/clusters/', {
"method": "POST",
"headers": {
"Accept": "application/json",
"Content-Type": "application/json",
"Authorization": "Bearer X"
},
"body": JSON.stringify({"name": "string", "gpu_segment": 0, "package_type": "automatic", "model_repo_name": "string", "model_url": "string", "huggingface_auth_token": "string", "startup_script_url": "string"})
});
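The same request can be assembled with only the Python standard library, which makes it easy to inspect the method, headers, and JSON body before anything is sent. This is a minimal sketch rather than an official client: the helper name build_create_request and the MODELSERVE_TOKEN environment variable are assumptions introduced for illustration, while the URL, headers, and payload keys are taken from the samples above.

```python
import json
import os
import urllib.request

API_URL = "https://api.modelserve.ai/api/v1/clusters/"


def build_create_request(token, payload):
    """Build (but do not send) the POST request that creates an endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),  # body is sent as JSON
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# MODELSERVE_TOKEN is a hypothetical variable name; "X" is the placeholder
# used in the samples and must be replaced with a real Access Token.
token = os.environ.get("MODELSERVE_TOKEN", "X")
request = build_create_request(
    token,
    {
        "name": "My own Mistral",
        "gpu_segment": 9,
        "package_type": "vllm",
        "model_repo_name": "mistralai/Mistral-7B-v0.1",
    },
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the token out of the source code and lets you check the body before any funds are spent.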
Values:
"name": "string"
- custom name for endpoint
"gpu_segment": 0
- segment id (learn more in the Segments section)
"package_type": "string"
- type of environment, you can choose from:
  - automatic: Automatic1111
  - vllm: vLLM
  - speech2text: audio (e.g. Whisper)
"model_repo_name": "string"
- address of the model repository from HuggingFace
"model_url": "string"
- address of the model URL from HuggingFace
"huggingface_auth_token": "string" (optional)
- additional security token
"startup_script_url": "string" (optional)
- additional script to modify the environment
Example payloads:
json={
    "name": "My own Stable Diffusion",
    "gpu_segment": 8,
    "package_type": "automatic",
    "model_url": "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt",
},
json={
    "name": "My own Mistral",
    "gpu_segment": 9,
    "package_type": "vllm",
    "model_repo_name": "mistralai/Mistral-7B-v0.1",
},
json={
    "name": "My own Whisper",
    "gpu_segment": 9,
    "package_type": "speech2text",
    "model_repo_name": "openai/whisper-tiny",
},
⚠️ Use model_repo_name for models running on vLLM and Whisper, and model_url for models running on Automatic1111.
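The rule above can be encoded in a small helper so the wrong model field is never sent. This is only a sketch under stated assumptions: build_payload is a hypothetical name, not part of any modelserve library, and the mapping is derived solely from the three package types and the warning in this section.

```python
def build_payload(name, gpu_segment, package_type, model,
                  huggingface_auth_token=None, startup_script_url=None):
    """Return a create-endpoint payload with the model under the right key."""
    if package_type == "automatic":
        model_key = "model_url"        # Automatic1111 takes a direct model URL
    elif package_type in ("vllm", "speech2text"):
        model_key = "model_repo_name"  # vLLM and Whisper take a repository name
    else:
        raise ValueError(f"unknown package_type: {package_type!r}")

    payload = {
        "name": name,
        "gpu_segment": gpu_segment,
        "package_type": package_type,
        model_key: model,
    }
    # Optional fields are included only when provided.
    if huggingface_auth_token is not None:
        payload["huggingface_auth_token"] = huggingface_auth_token
    if startup_script_url is not None:
        payload["startup_script_url"] = startup_script_url
    return payload


whisper = build_payload("My own Whisper", 9, "speech2text", "openai/whisper-tiny")
```

Here whisper matches the Whisper example payload shown earlier, with the model stored under model_repo_name.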
Display list of Endpoints
After creating the first Endpoint, you can view the list to obtain its ID. This way, you will be able to check the current list of Endpoints, their settings, and status.
To display the list of Endpoints, call the list endpoint. The response has the following form:
"results": [
{
  "id": "eca2785b-d094-432b-8722-6c7d2f957eb2",
  "name": "SD 1.5",
  "gpu_segment": 3,
  "package_type": "automatic",
  "status": "terminated",
  "model_repo_name": null,
  "model_url": "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt",
  "startup_script_url": null,
  "address": "https://6c7d2f957eb2.app.modelserve.ai",
  "running_workers": [],
  "created_at": "2024-01-22T09:24:19.947509Z",
  "last_update": "2024-01-22T12:38:42.907606Z"
}]
"id": "eca2785b-d094-432b-8722-6c7d2f957eb2"
- unique ID of endpoint
"name": "SD 1.5"
- name for endpoint
"gpu_segment": 3
- segment ID
"package_type": "automatic"
- type of environment
"status": "terminated"
- current status
"model_repo_name": null
- address of the model repository from HuggingFace
"model_url": "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.ckpt"
- address of the model URL from HuggingFace
"startup_script_url": null
- additional script to modify the environment
"address": "https://6c7d2f957eb2.app.modelserve.ai"
- address URL of your endpoint
"running_workers": []
- more information about workers
"created_at": "2024-01-22T09:24:19.947509Z"
- creation date of endpoint
"last_update": "2024-01-22T12:38:42.907606Z"
- date of last update of endpoint
The most important values are:
"id": "eca2785b-d094-432b-8722-6c7d2f957eb2"
- unique ID of endpoint
"address": "https://6c7d2f957eb2.app.modelserve.ai"
- address URL of your endpoint
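Once the list response is parsed, these two values can be pulled out with a few lines of Python. The sketch below works on a trimmed copy of the sample response rather than a live call. It assumes the response is a JSON object whose "results" key holds the list, which the fragment above suggests but does not show in full. The function name summarize_endpoints is hypothetical.

```python
import json

# Trimmed copy of the sample response; the enclosing object is an assumption.
SAMPLE_RESPONSE = """
{
  "results": [
    {
      "id": "eca2785b-d094-432b-8722-6c7d2f957eb2",
      "name": "SD 1.5",
      "status": "terminated",
      "address": "https://6c7d2f957eb2.app.modelserve.ai"
    }
  ]
}
"""


def summarize_endpoints(response):
    """Return (id, address, status) for every endpoint in a parsed list response."""
    return [
        (item["id"], item["address"], item["status"])
        for item in response["results"]
    ]


endpoints = summarize_endpoints(json.loads(SAMPLE_RESPONSE))
for endpoint_id, address, status in endpoints:
    print(f"{endpoint_id}  {status}  {address}")
```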
Remember to replace the X in "Bearer X" with your real Access Token. To find out where to get your Access Token (Bearer), see the 🚀 Quickstart section.