Yeah, this can indeed be done manually: extract the auto-generated transcript from a YouTube video once it has been processed. There is a delay before Google generates the transcript, though.
We could set up some kind of Mechanical Turk-style system where people watch chunks of the video and confirm the extracted subtitles.
We could simply run Google Translate on the extracted transcripts, but the results would be quite messy.
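For the chunked review idea, here is a rough sketch of how an already-extracted transcript might be split into fixed-length review tasks. It assumes the captions have been parsed into timed cues beforehand; the `Cue` class, `chunk_cues` function, and the 30-second window are all made up for illustration and are not part of any YouTube or Mechanical Turk API.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption line with its timing (hypothetical structure)."""
    start: float  # seconds from the start of the video
    end: float
    text: str


def chunk_cues(cues, chunk_seconds=30.0):
    """Group cues into fixed-length windows, keyed by each cue's start time.

    Returns one dict per non-empty window, in time order, that a reviewer
    could be shown alongside the matching slice of video.
    """
    windows = {}
    for cue in cues:
        index = int(cue.start // chunk_seconds)
        windows.setdefault(index, []).append(cue)
    return [
        {
            "start": index * chunk_seconds,
            "end": (index + 1) * chunk_seconds,
            "text": " ".join(c.text for c in group),
        }
        for index, group in sorted(windows.items())
    ]


if __name__ == "__main__":
    sample = [Cue(0, 4, "hello"), Cue(5, 9, "world"), Cue(31, 35, "next")]
    for task in chunk_cues(sample):
        print(task)
```

A cue that straddles a window boundary is assigned to the window it starts in, which keeps the sketch simple but means a reviewer may see a line that runs slightly past the end of their chunk.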