Sounds like a good idea to me: the tool can take a video and create captions. Your comment about humans being more accurate is also a good one, as surely once those captions have been created, a human can go through them. I would assume captions are stored in an external file; if this can be edited, then the human's job would simply be to edit the file and correct any minor errors.
Any tool that can make life a little easier is surely welcome. Perhaps the important point, though, is also transparency: if you have used a tool to transcribe, this should be clearly stated, so people know how the captions have been generated.