Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Maybe. Not enough data to say. Say it does a days worth of work in a query. It is sensible to use if it takes less than a day to review ~5 days worth of work. I don't know if we're near that threshold yet but conceptually this would work well for actual research where the amount of preparation is large compared to the amount of output written.

And eyeballing the benchmarks, it'll probably reach a >50% rate per query by the end of the year. Seems to double every model or two.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: