[Serverless] How to obtain your Inference Model Endpoint URL and Access Token

To use the Hugging Face Inference API, you’ll need two key values:

Access token
Model Endpoint URL

This guide will help you collect both.

Prerequisites

An active Hugging Face account
An active Access Token with the correct configuration settings enabled
An active payment method in your Hugging Face account
A valid text-generation endpoint URL that corresponds to the model you wish to use in Promptitude

📌 Remember, Serverless functionality is available with Enterprise plans in Promptitude.

Step-by-Step Guide

1️⃣ Create your Access Token

Log in to https://huggingface.co/
Open your profile dropdown.
Select Access Tokens from the menu.
Click the “Create new token” button.
Select the Fine-grained option and enter a name for the token. In the Inference section, be sure to check the following options to configure the access token properly:
- Make calls to the serverless Inference API
- Make calls to Inference Endpoints
Click the Create Token button at the bottom of the page to generate the token.
A popup will appear displaying your newly created token. Copy the token and save it in a secure location, as you’ll need to add it to Promptitude later.
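Before moving on, it can help to confirm that the copied token is accepted by Hugging Face. The following is a minimal sketch, not part of the Promptitude setup itself. It assumes the Hub's `whoami-v2` route and a `name` field in its response; the function names and the `HF_TOKEN` environment variable are illustrative choices, not something this guide prescribes.

```python
import json
import urllib.request

# Hub route that reports which account a token belongs to (assumed here).
WHOAMI_URL = "https://huggingface.co/api/whoami-v2"


def build_whoami_request(token: str) -> urllib.request.Request:
    """Build a GET request that sends the token as a Bearer credential."""
    return urllib.request.Request(
        WHOAMI_URL,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def check_token(token: str) -> str:
    """Return the account name for the token. Performs a network call and
    raises urllib.error.HTTPError if the token is rejected."""
    with urllib.request.urlopen(build_whoami_request(token), timeout=10) as resp:
        return json.load(resp)["name"]


# Example usage (requires network access and a real token):
#   import os
#   print(check_token(os.environ["HF_TOKEN"]))
```

Reading the token from an environment variable rather than pasting it into the script keeps it out of source control.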

2️⃣ Deploy your model using Inference Endpoints and obtain your Endpoint URL

Currently, only text-generation model endpoint URLs are supported in Promptitude. You can explore the available models on the Hugging Face Models page: https://huggingface.co/models

From the list, select the text-generation model you'd like to deploy.
In the model card, open the Deploy dropdown and select Inference Endpoints (Dedicated).
Customize your endpoint settings as desired and provide a unique name for it.
In the Security section, select the Protected option to require an Access Token for endpoint access. Then, click Create Endpoint to finalize the setup.
Afterward, you will be redirected to the endpoint page, where the status will show as Initializing.
Once initialization is complete, the status will change to Running. At this point, you can copy your Endpoint URL for use in Promptitude.
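Once the endpoint is Running, a quick test call can confirm that the Endpoint URL and the Access Token work together before you enter them in Promptitude. This is a minimal sketch under stated assumptions: it uses the `inputs` / `parameters` request body commonly accepted by Hugging Face text-generation endpoints, and the function names, placeholder URL, and `max_new_tokens` default are illustrative. The exact response shape depends on the deployed model and container.

```python
import json
import urllib.request


def build_generation_request(endpoint_url: str, token: str, prompt: str,
                             max_new_tokens: int = 50) -> urllib.request.Request:
    """Build a POST request for a Protected text-generation endpoint."""
    # Assumed payload shape: prompt under "inputs", options under "parameters".
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # required by the Protected setting
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(endpoint_url: str, token: str, prompt: str):
    """Send the prompt and return the parsed JSON reply (network call)."""
    request = build_generation_request(endpoint_url, token, prompt)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)


# Example usage (placeholder URL; replace with the one copied from the endpoint page):
#   import os
#   print(generate("https://YOUR-ENDPOINT-URL", os.environ["HF_TOKEN"], "Hello"))
```

A 401 or 403 response usually points to the token or its Inference permissions, while a 5xx response shortly after deployment may simply mean the endpoint has not finished starting.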

3️⃣ Configure your Serverless Inference API connection in Promptitude.io

Go to the Serverless Inference API Hub page and click the New Inference API Provider button.
In the slide-over panel, give a name to your new Inference API provider and click the Save button.
Once the provider is created, you'll be redirected to the provider's models. From here, click Add Endpoint Model.
Enter the Access Token and Endpoint URL you obtained earlier to add your model. Check that the values are correct, then click Save.
If everything went well, you will see your model in the table.
You can now use your model in your prompts.

✏️ Summary

By following these steps, you will have gathered the following information:

Access Token
Model Endpoint URL

With these two values, you can add your models in your Promptitude.io account and use them in your prompts.
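Most setup problems come from a mistyped or partially copied value, so a small local check before pasting can save a round trip. The sketch below is only an illustration: the `hf_` token prefix and the HTTPS requirement reflect how Hugging Face tokens and endpoint URLs normally look, and the function name `find_problems` is hypothetical. It does not contact Hugging Face or Promptitude.

```python
from urllib.parse import urlparse


def find_problems(token: str, endpoint_url: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means
    both values look plausible (not that they are valid)."""
    problems = []
    if token != token.strip():
        problems.append("Token has leading or trailing whitespace.")
    if not token.strip().startswith("hf_"):
        # Assumption: Hugging Face user access tokens begin with "hf_".
        problems.append("Token does not start with 'hf_'.")
    parsed = urlparse(endpoint_url.strip())
    if parsed.scheme != "https":
        problems.append("Endpoint URL should use https.")
    if not parsed.netloc:
        problems.append("Endpoint URL has no host name.")
    return problems


# Example usage with placeholder values:
#   for problem in find_problems("hf_xxx", "https://YOUR-ENDPOINT-URL"):
#       print(problem)
```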