- A Google research paper describes an algorithm that can identify low-quality webpages, similar to what the helpful content signal does.
- The algorithm detects low-quality pages, spam, and machine-generated content.
- It features low resource usage and can handle web-scale analysis.
- It does not have to be trained to find specific kinds of low-quality content; it can learn on its own.

Google has provided a number of clues about the helpful content signal, but there is still a lot of speculation about what it really is.
Lastly, Google's blog announcement seems to indicate that the Helpful Content Update isn't just one thing, such as a single algorithm.

Danny Sullivan writes that it's a "series of improvements," which, if I'm not reading too much into it, means that it's not just one algorithm or system but several that together accomplish the task of weeding out unhelpful content.
The researchers used classifiers that were trained to identify machine-generated text and discovered that those same classifiers were also able to identify low-quality text, even though they had not been trained to do that.
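To make that zero-shot reuse concrete, here is a minimal sketch of scoring pages with an off-the-shelf machine-generated-text detector and treating its output as a rough quality proxy. The checkpoint name (openai-community/roberta-base-openai-detector) and its "Real"/"Fake" labels are assumptions for illustration; the paper's authors used their own classifiers, so this is not their implementation.

```python
# Minimal sketch: reuse a detector trained to spot machine-generated text
# as a rough low-quality-content scorer, mirroring the paper's observation
# that the two tasks overlap. Model name and labels are assumptions.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="openai-community/roberta-base-openai-detector",
)

pages = [
    "A thoughtful, well-researched explanation written for readers.",
    "best cheap widgets buy widgets cheap best widgets online buy now",
]

for text in pages:
    result = detector(text)[0]
    # Treat a high "Fake" (machine-generated) probability as a proxy
    # for low quality; "Real" scores are inverted to the same scale.
    fake_score = result["score"] if result["label"] == "Fake" else 1 - result["score"]
    print(f"{fake_score:.2f}  {text[:50]}")
```

In a setup like this, pages with the highest "Fake"-style scores would be flagged for review rather than penalized outright, since the detector was never trained on quality labels.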