Key Aspects of Test-Time Training for LLMs
Test-time training involves temporarily updating the model’s parameters during inference using a loss function derived from the input data. The process typically follows these steps:
- Start with the initial model parameters.
- Generate training data from the test input.
- Optimize the model parameters to minimize a loss function on this generated data.
- Use the updated parameters to make predictions on the test input.
- Restore the original parameters for the next test instance [1].