
We are currently developing an application that performs actions based on user input using 'gpt-3.5-turbo-0613'.

For example, if a user enters the prompt "create a user with the name Potter and the role wizard," it should call the add_admin method available in our SDK.
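Roughly, the setup looks like the following sketch using the OpenAI function-calling feature (the add_admin schema and parameter names are illustrative of our SDK, not part of the OpenAI API):

```python
import openai

# Describe the SDK method so the model can decide to call it.
functions = [
    {
        "name": "add_admin",
        "description": "Create a user with a given name and role",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "role": {"type": "string"},
            },
            "required": ["name", "role"],
        },
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[
        {"role": "user", "content": "create a user with the name Potter and the role wizard"}
    ],
    functions=functions,
    function_call="auto",
)
# The model returns the function name and JSON arguments,
# which we then pass to the SDK's add_admin method.
```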

To enhance the capabilities of the application, we would like to train the underlying GPT model using our Swagger document (of the API that the SDK calls internally). The Swagger document contains valuable information about the API endpoints and request/response structures. However, we are facing difficulties adapting the current GPT model to incorporate this specific training data.

2 Answers


You cannot train gpt-3.5-turbo because it is not among the models that support fine-tuning in the OpenAI service.

Open-source alternatives are, in general, vastly inferior to ChatGPT, but there may be cases where they are good enough. You may check the LangChain library, which makes it easier to integrate external functionality the way you described, and which works with many different models (see the sketch below). You could start with one of the more powerful ones, such as falcon-40b, to see whether it is sufficient, and then try something smaller if you have hardware constraints.
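As a rough sketch of that approach (assuming a Hugging Face Hub model and a tool name taken from your question; the exact setup will depend on your LangChain version and hardware):

```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.llms import HuggingFaceHub  # or a local pipeline

def add_admin(query: str) -> str:
    # In the real application this would call your SDK / REST API.
    return f"Created user from request: {query}"

tools = [
    Tool(
        name="add_admin",
        func=add_admin,
        description="Create a user; input is the name and role to assign.",
    )
]

# Placeholder model choice; any sufficiently capable open-source model can be tried here.
llm = HuggingFaceHub(repo_id="tiiuae/falcon-40b-instruct", model_kwargs={"temperature": 0.1})

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("create a user with the name Potter and the role wizard")
```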

0

I suggest you take a look at RestGPT: Connecting Large Language Models with Real-World RESTful APIs

Note that with this approach the limiting factor in passing the OpenAPI spec files becomes the context window, so I suggest leveraging models that allow for larger context windows, such as Claude 2 (100k tokens).

It is essentially a combination of chain-of-thought prompting and the LangChain APIChain. Their paper includes a worked example against the TMDB API.
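A minimal sketch of the APIChain idea (not taken from the paper): the API documentation is pasted into the prompt, which is why the spec size is bounded by the context window. The doc string and model choice here are placeholders.

```python
from langchain.chains import APIChain
from langchain.chat_models import ChatAnthropic

# In practice this would be a (trimmed) rendering of your OpenAPI/Swagger spec.
api_docs = """
BASE URL: https://example.com/api
Endpoint: POST /admins -- creates a user; body: {"name": str, "role": str}
"""

llm = ChatAnthropic(model="claude-2")  # chosen for its large (~100k token) context window
chain = APIChain.from_llm_and_api_docs(llm, api_docs, verbose=True)
chain.run("create a user with the name Potter and the role wizard")
```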