Yeah, I basically took the YouTube auto-captions, which I had already cleaned up for difficult words like "grid coin" and BOINC, as a VTT subtitle file.
I got rid of the VTT timecodes with GNU tools like sed.
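For anyone who wants to repeat the timecode stripping without sed, here is a rough Python sketch of the same idea. It is not the exact command I ran. It assumes the file looks like a typical YouTube auto-caption VTT (a `WEBVTT` header, `Kind:`/`Language:` lines, timing lines with `-->`, inline tags such as `<c>`, and repeated caption lines); the function name `strip_vtt` is made up for this example.

```python
import re

# Cue timing lines, e.g. "00:00:01.000 --> 00:00:04.000 align:start"
TIMING = re.compile(
    r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)
# Inline tags such as <c>, </c> or word timestamps like <00:00:01.500>
INLINE_TAG = re.compile(r"<[^>]+>")


def strip_vtt(vtt_text):
    """Return the spoken text of a VTT file as one long string (hypothetical helper)."""
    kept = []
    for raw in vtt_text.splitlines():
        line = INLINE_TAG.sub("", raw).strip()
        if not line:
            continue  # blank separator between cues
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue  # header and comment lines
        if TIMING.match(line):
            continue  # the timecode line itself
        if kept and kept[-1] == line:
            continue  # auto-captions often repeat the previous line
        kept.append(line)
    return " ".join(kept)
```

You would call it as `strip_vtt(open("captions.vtt", encoding="utf-8").read())`, where `captions.vtt` is a placeholder filename.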
I then loaded the VTT file in TextEdit and used cmd-F to highlight words. I noticed that @CM-Steem, aka customminer, uses filler words like "So" a lot, so I put periods before those.
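The "put a period before So" step could also be scripted instead of done by hand in TextEdit. This is only a sketch: the function name `break_before` and the default word list are my own assumptions, and it will also split in places where "so" is not really starting a sentence ("I think so"), so the result still needs a read-through.

```python
import re


def break_before(text, words=("So",)):
    """Insert a full stop before each listed word and capitalize it (hypothetical helper)."""
    alternatives = "|".join(re.escape(w) for w in words)
    # Only match where the previous character is neither whitespace nor
    # sentence punctuation, so existing sentence breaks are left alone.
    pattern = re.compile(r"(?<![.!?\s])\s+(" + alternatives + r")\b", re.IGNORECASE)

    def repl(match):
        word = match.group(1)
        return ". " + word[:1].upper() + word[1:]

    return pattern.sub(repl, text)
```

For example, `break_before("we did the update so now the wallet works")` gives `"we did the update. So now the wallet works"`.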
https://steemit.com/gridcoin/@nutela/gridcoin-whaletank-rough-transcript-friday-8th-aug-2017
Here's the video:
I edited up to 15 minutes or so.
You wouldn't believe how much text one can fill by simply talking for 15 minutes. Way too much work to do by hand.
You could try to make use of the natural pauses in speech to add the full stops as well.
Hey, that's a great idea! I wonder how to get that, though. I was wondering if YouTube would offer any insight, but their tool is closed off. IBM Watson looks much cooler and even has a GitHub link, but I'm not so sure about the quality. It couldn't keep up when I tested it in real time (with Loopback), but then again, real time is maybe too much to ask.
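One possible way to try the pause idea, as a rough sketch only: if you can get a start time for each word (this is an assumption; the input format here is invented for illustration), you can add a full stop wherever the gap to the next word is long. The names `to_seconds` and `punctuate_by_pause` and the 0.7 second threshold are guesses, not something I have tuned. Note that the gap between two start times also includes the length of the first word, so long words can look like pauses.

```python
def to_seconds(stamp):
    """Convert 'HH:MM:SS.mmm' or 'MM:SS.mmm' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def punctuate_by_pause(words, min_gap=0.7):
    """words: list of (start_stamp, word) pairs in spoken order (assumed format)."""
    out = []
    new_sentence = True
    prev_start = None
    for stamp, word in words:
        start = to_seconds(stamp)
        # A long gap since the previous word is treated as a sentence break.
        if prev_start is not None and start - prev_start >= min_gap:
            out[-1] += "."
            new_sentence = True
        if new_sentence:
            word = word[:1].upper() + word[1:]
        out.append(word)
        new_sentence = False
        prev_start = start
    if out:
        out[-1] += "."  # close the final sentence
    return " ".join(out)
```

The threshold would need adjusting per speaker, since some people pause mid-sentence and barely pause between sentences.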
Full post with plenty of images