Here's a step-by-step explanation of the SynthID process:
- Tokenization: The language model works with tokens (whole words or pieces of words). At each generation step, it produces a probability distribution over the possible next tokens.
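- Tournament sampling: Instead of picking the next token directly, the algorithm draws several candidate tokens from the model's next-token probability distribution and pairs them up in a tournament-style bracket.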
- Winner selection: The winner of each pair is the token with the higher score from a watermarking function, a pseudorandom function of the token, the preceding context, and a secret key. Ties are broken at random.
- Multi-layered approach: The process is repeated over several rounds, each using its own watermarking function, with the winners advancing through successive rounds until just one remains.
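- Signature creation: The final winner becomes the next token in the generated text. Repeated over many tokens, this slight preference for high-scoring tokens forms a statistical signature that a detector holding the secret key can measure.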
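
The steps above can be sketched in a few lines of Python. This is a toy illustration, not SynthID's actual implementation: the names `g_value`, `tournament_sample`, `generate`, and `mean_g` are hypothetical, the watermarking function is assumed to be one bit of a SHA-256 hash, and a uniform distribution over a tiny vocabulary stands in for a real language model.

```python
import hashlib
import random


def g_value(token, context, layer, key):
    """Hypothetical watermarking function: one keyed pseudorandom bit (0 or 1)."""
    payload = repr((key, context, layer, token)).encode("utf-8")
    return hashlib.sha256(payload).digest()[0] & 1


def tournament_sample(probs, context, key, layers=3, rng=None):
    """Pick the next token by running a bracket over 2**layers candidates."""
    if rng is None:
        rng = random.Random()
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    # Candidates are drawn from the model's distribution, with replacement.
    candidates = rng.choices(tokens, weights=weights, k=2 ** layers)
    for layer in range(layers):
        winners = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            ga = g_value(a, context, layer, key)
            gb = g_value(b, context, layer, key)
            if ga == gb:
                winners.append(rng.choice([a, b]))  # tie: pick at random
            else:
                winners.append(a if ga > gb else b)
        candidates = winners
    return candidates[0]  # the single remaining token


def generate(vocab, length, key=None, window=2, layers=3, seed=0):
    """Toy generator: uniform 'model'; watermarks only if a key is given."""
    rng = random.Random(seed)
    probs = {t: 1.0 / len(vocab) for t in vocab}
    text = [rng.choice(vocab) for _ in range(window)]  # stand-in prompt
    for _ in range(length):
        context = tuple(text[-window:])
        if key is None:
            text.append(rng.choice(vocab))
        else:
            text.append(tournament_sample(probs, context, key, layers, rng))
    return text


def mean_g(tokens, key, window=2, layers=3):
    """Detection score: average watermark bit over all tokens and layers."""
    scores = []
    for i in range(window, len(tokens)):
        context = tuple(tokens[i - window:i])
        for layer in range(layers):
            scores.append(g_value(tokens[i], context, layer, key))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    vocab = [f"tok{i}" for i in range(20)]
    marked = generate(vocab, 400, key="secret")
    plain = generate(vocab, 400, key=None)
    print("watermarked:", mean_g(marked, "secret"))
    print("unwatermarked:", mean_g(plain, "secret"))
```

In this sketch, text generated with the key should score noticeably above 0.5, while unwatermarked text (or text scored with the wrong key) should stay close to 0.5. That gap is the statistical signature: no single token carries it, but it accumulates over the length of the text.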