Conversation

@ceicke (Collaborator) commented Sep 30, 2025

Add the ability to add or edit the input/output token cost per model via the UI.

@ceicke force-pushed the feat/ui_for_token_cost_input branch 2 times, most recently from fbd3257 to e5922d7 on September 30, 2025 at 13:37
@ceicke marked this pull request as draft on September 30, 2025 at 13:38
@ceicke force-pushed the feat/ui_for_token_cost_input branch from e5922d7 to 963127f on September 30, 2025 at 13:40
@ceicke marked this pull request as ready for review on September 30, 2025 at 13:45
@mattlindsey (Collaborator)

Uh oh! Hi @ceicke. I just removed all cost calculations based on a Discussion attached to #718 and merged it into main. Do you think we should have it?

@mattlindsey (Collaborator) commented Sep 30, 2025

@ceicke See #696

@ceicke (Collaborator, Author) commented Sep 30, 2025

Ah, interesting. I wasn't aware. But is it really true that pricing per token differs by tier/region? I only looked at the docs for Claude and OpenAI, and I only found one price per input/output per million tokens. But I looked at it with that assumption and never challenged it, so it could very well be different for different users. But even then, shouldn't we keep it? If users have different rates per million tokens, those can be entered too…

@mattlindsey (Collaborator)

What if we kept it but defaulted it to blank, allowing someone to use it if they want? My concern is: what if people get billed differently because of the tier stuff, or the rate changes? Maybe @krschacht has an opinion?

@ceicke (Collaborator, Author) commented Sep 30, 2025

I wouldn't require this to be exact to the cent. But I had the use case where I got billed pretty quickly by Claude in a mega conversation, checked the token usage, and it was… a lot. But I couldn't relate this to money, because for the Claude 4 model I never bothered to add how much the tokens cost. I was just curious how the two relate. I knew models.yml has the information for some models, but not for the "self-added" ones. That's where I came from. Just laziness and curiosity 😆
That said, I'd be happy to keep it in. But I can also live without it if it's overly complex.

@mattlindsey (Collaborator)

Cool. I'd like to leave the cost stuff out, but if we put it back in I think we shouldn't load defaults from models.yml (i.e. make them blank initially).
Let's see if we hear from @krschacht before doing anything.

@keithschacht (Contributor)

I don't have a strong feeling about this, but my instinct was that this was pretty complex and getting harder over time.

I think if you had a really long conversation, it used to be the case that the cost could add up quickly, because every new question re-submitted all the previous tokens. But then OpenAI and Anthropic both rolled out caching. I don't think all previous tokens are 100% free, but I believe they're much, much cheaper than the new tokens you add to each conversation.

And then I signed up for unlimited billing in Claude because I use it so much. And then this guy in that issue chimed in with yet more considerations that impact estimates. All of this was leading me to believe: we can't really accurately estimate cost by counting tokens and multiplying by price. And as model costs come down (and caching rolls out), keeping an eye on how much you're incurring becomes less important anyway.

I guess I'd say: if we think we really do have a scheme that's pretty accurate (not to the cent, but close enough to be useful), then I'm okay keeping it in place, although maybe we disable it by default. But my hunch is that the current scheme isn't taking caching into account and may already be way off in its estimate. If that's the case, then I'd lean towards just cutting the feature.
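To make the caching concern concrete, here is a minimal sketch of what a cache-aware estimate would need to look like. Everything in it is a hypothetical illustration: the `Rates` struct, the method name, the rate values, and the cached-input discount are assumptions for the sake of the example, not actual provider pricing or code from this repo.

```ruby
# Hypothetical sketch of a cache-aware cost estimate.
# The rate names and the cached-input discount are illustrative only,
# not actual Anthropic/OpenAI pricing.
Rates = Struct.new(:input_per_mtok, :cached_input_per_mtok, :output_per_mtok, keyword_init: true)

def estimated_cost(rates, fresh_input_tokens:, cached_input_tokens:, output_tokens:)
  # Prices are quoted per million tokens, so scale the token counts down.
  (fresh_input_tokens  * rates.input_per_mtok +
   cached_input_tokens * rates.cached_input_per_mtok +
   output_tokens       * rates.output_per_mtok) / 1_000_000.0
end

# In a long conversation, most input tokens are re-submitted history.
# A naive (tokens * input price) estimate would charge all 152,000 input
# tokens at the full rate; the cache-aware estimate comes out far lower.
rates = Rates.new(input_per_mtok: 3.0, cached_input_per_mtok: 0.3, output_per_mtok: 15.0)
puts estimated_cost(rates, fresh_input_tokens: 2_000, cached_input_tokens: 150_000, output_tokens: 1_000)
# => ~$0.066, versus ~$0.471 if every input token were billed at the full rate
```

The gap between those two figures is the point of the concern above: a scheme that charges all input tokens at the full rate can overstate the cost of a long conversation several-fold once caching is in play.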

@ceicke (Collaborator, Author) commented Oct 2, 2025

Yeah, you are both right. Especially with these cached tokens. I will close the PR. Thanks for the good discussion!

@ceicke closed this on Oct 2, 2025