Hacker News

" SolidGoldMagikarp"

Characters: 18

Tokens: 1

heh. all i know is this is a fun magic token, but 1) i don't really know how they found it and 2) i don't know what its implications are. i heard that you can use it to detect whether you are talking to an AI.



I think it's related to Reddit users who posted (very frequently!) on a counting-focused subreddit (people literally post "1", "2", "3" in sequence, so some usernames appear 50k+ times). Some screenshots and links in this Twitter thread: https://twitter.com/SoC_trilogy/status/1623118034960322560

Plus additional commentary here: https://twitter.com/nickmvincent/status/1623409493584519168 (in short: I think this situation is comparable to a "Trap Street" https://en.wikipedia.org/wiki/Trap_street that reveals when a map seller copies another cartographer)

I hadn't seen the Twitch plays pokemon hypothesis though (from another comment here), I wonder if it could be both!


"They" as in OpenAI, when they trained the tokenizer, just dumped a big set of text data into a BPE (byte pair encoding) tokenizer training script, and it saw that string in the data so many times that it ended up making a token for it.
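To make that concrete, here's a rough toy sketch of BPE training. It is not OpenAI's actual script: the real GPT-2 tokenizer works on bytes and pre-splits text with a regex, while this one works on characters, and the corpus and the names train_bpe / encode are made up for illustration. The point is only that a string repeated often enough gets merged all the way up into a single token.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Fuse every adjacent occurrence of `pair` in `symbols` into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges from a list of strings (toy, character-level)."""
    # Each distinct string starts as a tuple of single characters, weighted by its count.
    words = Counter(tuple(w) for w in corpus)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the whole corpus.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # The most frequent pair becomes a new token.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[tuple(merge_pair(list(symbols), best))] += freq
        words = new_words
    return merges


def encode(text, merges):
    """Tokenize `text` by replaying the learned merges in order."""
    symbols = list(text)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Hypothetical corpus: one username repeated a lot, plus a little ordinary text.
corpus = [" SolidGoldMagikarp"] * 1000 + ["the", "and", "of"] * 5
merges = train_bpe(corpus, num_merges=20)
print(encode(" SolidGoldMagikarp", merges))  # one token for the whole string
print(encode("xyz", merges))                 # unseen characters stay separate
```

Nothing in the training loop knows or cares what the string means. It only counts pairs, which is how a username can end up with its own slot in the vocabulary.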

"They" as in the rest of us afterward... probably just looked at the token list. It's a little over fifty thousand items, mostly short words and fragments of words, and can be fun to explore.

The GPT-2 and GPT-3 models proper were trained on different data than the tokenizer they use, one of the major differences being that some strings (like " SolidGoldMagikarp") showed up very rarely in the data that the model saw. As a result, the models can respond to the tokens for those strings a bit strangely, which is why they're called "glitch tokens". From what I've seen, the base models tend to just act as if the glitch token wasn't there, but instruction-tuned models can act in weirdly deranged ways upon seeing them.

The overall lesson, AIUI, is that you should train your tokenizer and model on the same data. But (also AIUI, since we don't know what OpenAI actually did) you can also just remove the glitch tokens from your tokenizer, and it'll encode the string into a few more tokens afterward. The model won't ever have seen that specific sequence, but it'll at least be familiar with all the tokens in it, and unlike never-before-seen single tokens, it's quite used to dealing with never-before-seen sentences.
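A quick sketch of what "remove the glitch token" means in practice. This uses greedy longest-match lookup as a stand-in for real BPE encoding, and the vocabulary pieces below are invented for the example (I'm not claiming this is how GPT-2/3 would actually split the string). It just shows the same text falling back to several smaller, familiar tokens once the single token is gone.

```python
def encode(text, vocab):
    """Greedy longest-match tokenization; falls back to single characters."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first and shrink until a vocab entry matches.
        # A single character is always accepted so the loop always advances.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens


# Hypothetical vocabulary: the glitch token plus some ordinary sub-word pieces.
vocab = {" SolidGoldMagikarp", " Solid", "Gold", "Mag", "ik", "arp"}

with_glitch = encode(" SolidGoldMagikarp", vocab)
without_glitch = encode(" SolidGoldMagikarp", vocab - {" SolidGoldMagikarp"})

print(with_glitch)     # a single token
print(without_glitch)  # several shorter tokens
```

Either way the tokens join back into the original string, so nothing is lost by dropping the entry. The model just sees pieces it has actually been trained on.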


Some of the magic tokens are related to Twitch Plays Pokemon. https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldm...


hmm. so this is evidence that openai scraped twitch chat of all places? (notoriously ephemeral)

also opens a question as to how tokenizers are trained. should you discard or break up super niche words like this?


It doesn't necessarily mean OpenAI scraped Twitch chat. That string is the username of a moderator. They also moderate the subreddit and probably some other places. And as a moderator for such a popular event, they probably had their name mentioned in other places as well. Every time they comment on Reddit, their username also appears.

https://www.reddit.com/r/twitchplayspokemon/comments/2cxkpp/...




