@alexgr Yeah, thank you for your comments on next week's blog post :) I actually have a much better solution in mind, based on contextual fingerprinting algorithms.
Basically, a machine translates like a machine. Ergo, you can spot machine translations when they occur, and each translator has a way of screwing things up that is absolutely unique to it. For example, try putting "I would like a hotdog please." into Google Translate, then translate it to Spanish and watch the hilarity ensue (especially if you show it to someone who actually speaks Spanish).
Now, machine translation itself isn't a bad thing. I couldn't survive one day in a foreign country (some parts of the USA too) without tools such as Google Translate.
It does mean, though, that you need to look at the conceptual flow through the document and see whether anyone has said anything that is conceptually and structurally the same. Identify those documents, then run them through various machine translators to see if you get a strong match.
This is called strong attribution via contextual analysis and knowledge extraction.
There isn't a way to cheat this without actually rewriting the entire document yourself first, and at that point it's pretty much the same as writing a term paper.
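To make the matching step a bit more concrete, here is a rough sketch of how I'd picture it, not the real thing from the blog post. It assumes you already have a candidate source document and a set of translator functions; the names `shingles`, `jaccard`, and `best_translator_match` are made up for illustration, the "translators" are toy stand-ins rather than real translation services, and word n-gram overlap (Jaccard similarity) is only a crude substitute for proper contextual analysis.

```python
from typing import Callable, Dict, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 3) -> Set[Shingle]:
    """Return the set of lowercase word n-grams in text, punctuation stripped."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    if len(words) < n:
        # Too short for a full n-gram: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity of two shingle sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_translator_match(
    suspect: str,
    candidate_source: str,
    translators: Dict[str, Callable[[str], str]],
    n: int = 3,
) -> Tuple[str, float]:
    """Run the candidate source through each translator and report which
    output overlaps most with the suspect text, along with its score."""
    suspect_shingles = shingles(suspect, n)
    scores = {
        name: jaccard(suspect_shingles, shingles(translate(candidate_source), n))
        for name, translate in translators.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy stand-ins for real translators (hypothetical, for illustration only).
toy_translators = {
    "literal": lambda s: s.replace("hotdog", "warm canine"),
    "identity": lambda s: s,
}

name, score = best_translator_match(
    "i would like a warm canine please",
    "I would like a hotdog please.",
    toy_translators,
)
print(name, score)
```

In this toy run the "literal" stand-in reproduces the suspect text exactly, so it comes out as the strong match. With real translators you would want a threshold on the score and a lot more text than one sentence before calling anything an attribution.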
But umm that's next week's blog, so I hope you'll stop back by and comment on this then.
Looking forward to it :)