Together.ai is interesting; I think it might be a relatively new business model in tech -- since they sell inference and training, you might be tempted to think of them as an engineering / infrastructure company.
But because inference is largely bought on quality (customers seem to be selecting "cheapest generation at the quality I require"), they have a strong incentive to optimize inference speed at different quality points, so this paper is coming at the market from a very different place than the "quality first, sell second" approach of OpenAI or Anthropic. On those terms, the ideas and concepts in Based are pretty interesting. Faster inference is awesome, faster sequential token generation is awesome, cheaper long-range memory is awesome.
As revenues at these places grow, they should have access to more compute, which should mean they'll be able to start training at a scale that will get to 'minimum acceptable quality', and then they'll be off to the races.
I'm looking forward to the next year, when companies like Together can start putting out models optimized toward specific workflows that compete on quality!
I think it makes total sense: infra will commoditize rapidly, so you have to make research bets on future differentiators. Together is basically the only GPU infra company with a successful research dept (am I missing someone? I probably am), one that is likely to pay off by turning it into a frontier model lab at some point in the future.