Add token cost editing/adding via the UI #719
Conversation
Add the possibility to add/edit the input/output token cost per model via the UI.
Ah, interesting, I wasn't aware. But is it really true that pricing per token differs by tier/region? I only looked at the docs for Claude and OpenAI, and I only find one price per input/output per million tokens. But I only looked at it with that assumption and never challenged it, so it could very well differ between users. Even then, shouldn't we keep it? If users have different rates per million tokens, those can be entered too…
What if we kept it but defaulted it to blank, allowing someone to use it if they want? My concern is: what if people get billed differently because of the tier stuff, or the rate changes? Maybe @krschacht has an opinion?
I wouldn't require this to be exact to the cent. But I had a use case where Claude billed me pretty quickly in a mega conversation I had. I checked the token usage and it was… a lot. But I couldn't relate that to money, because I never bothered to add how much tokens cost for the Claude 4 model. I was just curious how the usage relates to cost. I knew models.yml has the information for some models, but not for the "self added" ones. That's where I came from: just laziness and curiosity 😆
Cool. I'd like to leave the cost stuff out, but if we do put it back in, I think we shouldn't load defaults from models.yml (i.e. they should be blank initially).
I don't have a strong feeling about this, but my instinct was that this was pretty complex and getting harder over time. It used to be the case that a really long conversation could add up in cost quickly, because every new question re-submitted all the previous tokens. But then OpenAI and Anthropic both rolled out caching. I don't think all previous tokens are 100% free, but I believe they're much, much cheaper than the new tokens you add to each conversation. And then I signed up for unlimited billing in Claude because I use it so much. And then this guy in that issue chimed in with yet more considerations that affect the estimates.

All of this led me to believe: we can't really estimate cost accurately by counting tokens and multiplying by price. And as model costs come down (and caching rolls out), keeping an eye on how much you're incurring becomes less important anyway.

I guess I'd say: if we think we really do have a scheme that's pretty accurate (not to the cent, but close enough to be useful), then I'm okay keeping it in place, although maybe we disable it by default. But my hunch is that the current scheme isn't taking caching into account and may already be way off in its estimate. If that's the case, I'd lean towards just cutting the feature.
Yeah, you're both right, especially about the cached tokens. I'll close the PR. Thanks for the good discussion!