Currently, the "/v1/model/list" and "/v1/models" endpoints return an array of objects that contain the model ID, the timestamp of when it was created, the "owner" of the model, etc.
It would be useful for more information to be returned, such as some of the default settings in the model's config.json: the default context length, the number of experts per token, the default prompt template, etc.
I feel this would be useful because one use case for this endpoint is a UI application that fetches the list of all models and displays it, so the user can select and load one. Such an application would ideally provide inputs that let the user specify the context length, number of experts per token, rope alpha, cache mode, etc. To offer reasonable defaults for these values based on the selected model, the application has to get this information from the server, since TabbyAPI may not be hosted on the local machine (and all models are stored wherever TabbyAPI is hosted).
At minimum, I feel these endpoints should return:
The default context length of each model
The default number of experts per token for each model (due to the popularity of MoE models)
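To make the request concrete, here is a rough sketch of how the extra fields could be gathered without loading any model. It is only an illustration, not how TabbyAPI is implemented: the function names and the `default_context_length` / `default_num_experts_per_token` field names are made up for this example, and it assumes each model lives in its own directory with a Hugging Face style config.json that uses the `max_position_embeddings` and `num_experts_per_tok` keys (as Mixtral configs do).

```python
import json
from pathlib import Path


def model_defaults(model_dir: Path) -> dict:
    """Read default settings from a model's config.json without loading the model."""
    config_path = model_dir / "config.json"
    # A missing config.json just yields empty defaults instead of an error.
    config = {}
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    return {
        "id": model_dir.name,
        # Hypothetical response fields; the config keys are assumptions too.
        "default_context_length": config.get("max_position_embeddings"),
        "default_num_experts_per_token": config.get("num_experts_per_tok"),
    }


def list_models_with_defaults(models_root: Path) -> dict:
    """Build a list-style response with one entry per model directory."""
    entries = [model_defaults(p) for p in sorted(models_root.iterdir()) if p.is_dir()]
    return {"object": "list", "data": entries}
```

Models that are not MoE would simply report `None` for the experts field, which a UI could use to hide that input.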
Alternatively, there could be an endpoint that lets the user request information about a specific model without loading it. This is also a perfectly viable solution, since an application would likely only need to see the default values for one model at a time. However, if this information is part of the "/v1/model/list" response, then it can be retrieved and cached once for all models. I feel that returning the defaults for all models as part of "/v1/model/list" would be the more efficient solution in the long term.
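On the client side, the "retrieve once and cache" idea could look something like the sketch below. It assumes the list response is shaped like `{"data": [...]}` with one object per model, and it reuses hypothetical field names (`default_context_length`, `default_num_experts_per_token`) that do not exist in the current API.

```python
def index_defaults(models_response: dict) -> dict:
    """Turn a model list response into a lookup table keyed by model ID."""
    return {
        entry["id"]: {
            # Both field names are hypothetical; missing values become None.
            "context_length": entry.get("default_context_length"),
            "num_experts_per_token": entry.get("default_num_experts_per_token"),
        }
        for entry in models_response.get("data", [])
    }
```

A UI could build this table once after fetching the list and then pre-fill its inputs whenever the user picks a different model, with no further requests to the server.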