Spirit LM Base and Spirit LM Expressive
Meta has released two versions of Spirit LM: Spirit LM Base and Spirit LM Expressive. The Base version uses phonetic tokens to process and generate speech, while the Expressive version includes additional tokens for pitch and tone, allowing the model to capture more nuanced emotional states and reflect them in its output.
Spirit LM Base is the simpler of the two. Because it works with phonetic tokens alone, it is designed to be more efficient and scalable than the Expressive version, while still producing natural-sounding speech with a good degree of expressiveness.
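To make the difference between the two token schemes concrete, here is a minimal toy sketch in Python. It is not Meta's implementation: the `Token` class, the function names, the integer IDs, and in particular the interleaving order (one tone token up front, then a pitch token before each phonetic token) are all assumptions chosen for illustration. The only thing taken from the description above is that Base carries phonetic tokens alone, while Expressive adds pitch and tone tokens to them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    kind: str   # "phonetic", "pitch", or "tone" (labels taken from the text)
    value: int  # hypothetical unit ID, not a real Spirit LM vocabulary entry


def base_stream(phonetic_ids: List[int]) -> List[Token]:
    """Base-style stream: phonetic tokens only."""
    return [Token("phonetic", i) for i in phonetic_ids]


def expressive_stream(
    phonetic_ids: List[int], pitch_ids: List[int], tone_id: int
) -> List[Token]:
    """Expressive-style stream: the same phonetic tokens plus pitch and tone.

    Assumed layout (illustrative only): one tone token at the start,
    then a pitch token immediately before each phonetic token.
    """
    if len(phonetic_ids) != len(pitch_ids):
        raise ValueError("need exactly one pitch ID per phonetic ID")
    stream = [Token("tone", tone_id)]
    for phonetic, pitch in zip(phonetic_ids, pitch_ids):
        stream.append(Token("pitch", pitch))
        stream.append(Token("phonetic", phonetic))
    return stream


def strip_expressive(stream: List[Token]) -> List[Token]:
    """Drop pitch and tone tokens, leaving a Base-style stream."""
    return [t for t in stream if t.kind == "phonetic"]


if __name__ == "__main__":
    base = base_stream([12, 7, 33])
    expressive = expressive_stream([12, 7, 33], [2, 2, 5], tone_id=1)
    print(len(base), len(expressive))
    print(strip_expressive(expressive) == base)
```

Under these assumptions the expressive stream is longer than the base stream for the same utterance, which gives one intuition for why a phonetic-only model would be the lighter of the two.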