A fast, efficient way to classify and categorize your stories.
Cogent (Code Generation Technology), developed for major newspaper and newswire publishers, constructs “topic channels” in real time by assigning one or more category codes to news stories and photo captions on the fly. The sources are diverse, but the category code taxonomy (developed by human domain experts) is unified.
Underlying the system is a Bayes classifier that uses modern discriminant algorithms.
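To give a rough sense of how a Bayes classifier can assign one or more category codes to a story, here is a minimal sketch. It is a plain multinomial naive Bayes with add-one smoothing, not Cogent's actual algorithm, which is not described here. The class name, the `margin` rule for returning several codes, and the category codes `BUSINESS` and `SPORTS` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.doc_counts = Counter()               # training stories per category
        self.token_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()                        # every token seen in training

    def train(self, tokens, category):
        self.doc_counts[category] += 1
        self.token_counts[category].update(tokens)
        self.vocab.update(tokens)

    def log_scores(self, tokens):
        """Return log P(category) + sum of log P(token | category) per category."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for cat, n_docs in self.doc_counts.items():
            counts = self.token_counts[cat]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                if tok in self.vocab:  # skip tokens never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            scores[cat] = score
        return scores

    def classify(self, tokens, margin=1.0):
        """Return every category whose log score is within `margin` of the best.

        The margin rule is an assumption made for this sketch; it is one
        simple way to let a story receive more than one category code.
        """
        scores = self.log_scores(tokens)
        best = max(scores.values())
        return sorted(c for c, s in scores.items() if best - s <= margin)


nb = NaiveBayes()
nb.train("shares rose after strong quarterly earnings".split(), "BUSINESS")
nb.train("the striker scored twice in the final".split(), "SPORTS")
labels = nb.classify("quarterly earnings beat forecasts".split())
```

Widening `margin` lets a borderline story fall into several topic channels at once; narrowing it keeps only the strongest match.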
A semantic entity-extraction scan is run first, and identifies:
The extracted entities can be indexed and used as features in the classification (for example, ticker symbols are automatically assigned to public companies).
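As an illustration of how extracted entities might be turned into classification features, the sketch below adds a `TICKER:<symbol>` feature whenever a known company name appears in a story. The lookup table, the company names and symbols, the feature naming, and the function `extract_features` are all hypothetical; the real extraction scan is presumably far richer than a dictionary match.

```python
import re

# Hypothetical lookup table with fictional companies; a real system would
# draw on a maintained database of public companies.
TICKERS = {"acme corp": "ACME", "globex": "GBX"}


def extract_features(text):
    """Return word tokens plus one TICKER:<symbol> feature per company found."""
    lowered = text.lower()
    features = re.findall(r"[a-z]+", lowered)  # plain word tokens
    for name, symbol in sorted(TICKERS.items()):
        # Whole-word match so that a name inside a longer word does not fire.
        if re.search(r"\b" + re.escape(name) + r"\b", lowered):
            features.append("TICKER:" + symbol)
    return features


features = extract_features("Acme Corp shares rose after Globex withdrew its bid.")
```

The resulting list mixes ordinary words with entity features, so a classifier can weight a ticker symbol like any other token while the same symbols remain available for indexing.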
Comparison with published results indicates that Cogent performs as well as the best available text categorizers for newswires, but uses substantially fewer features and computational resources during classification.