@james It’s an approach called “supervised learning”:
https://en.wikipedia.org/wiki/Supervised_learning
It can be totally valid. The trick (well, the first one, and after that I’m out of my depth) is that you can’t fairly evaluate the results against the same data you trained on, so you train the system on only X% of your tagged data, then check how well its output matches the desired output for the remaining Y% it hasn’t “seen” before.
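For what it's worth, here's a minimal sketch of that hold-out idea in Python. Everything in it is made up for illustration: the toy tagged data, the 80/20 split, the threshold "model", and the helper names (`split_tagged_data`, `train_threshold`, `accuracy`). Real projects would normally lean on a library for this.

```python
import random

def split_tagged_data(examples, train_fraction=0.8, seed=0):
    """Shuffle the tagged examples, then split into a training part and a held-out part."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def train_threshold(train):
    """Toy 'model': the midpoint between the average 'low' value and the average 'high' value."""
    lows = [x for x, tag in train if tag == "low"]
    highs = [x for x, tag in train if tag == "high"]
    return (sum(lows) / len(lows) + sum(highs) / len(highs)) / 2

def accuracy(threshold, examples):
    """Fraction of examples whose predicted tag matches the desired tag."""
    correct = sum(("high" if x > threshold else "low") == tag for x, tag in examples)
    return correct / len(examples)

# Hypothetical tagged data: numbers 0..99, tagged "low" below 50 and "high" otherwise.
tagged = [(x, "high" if x >= 50 else "low") for x in range(100)]

train, held_out = split_tagged_data(tagged)   # X = 80%, Y = 20%
model = train_threshold(train)                # the model only ever sees `train`
print("held-out accuracy:", accuracy(model, held_out))
```

The number that matters is the score on `held_out`, because those examples played no part in picking the threshold.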
@inthehands thank you very much! :)