1. Statement of the Problem
A company has a fixed, predefined classification system. Documents should be categorized according to this classification system, overriding the self-organizing categorization performed by the constrained Kohonen algorithm used in InfoCodex.
Example of a given classification system:
The categories at level 1 correspond to main topics, whereas the sub-categories (level 2, 3 etc.) should be represented by individual neurons in the information map of InfoCodex.
The given classification system may also contain descriptors that characterize each category (column D “Category description” in the example shown above).
In addition to the category description, a set of documents should be available whose target categories are known in advance and that could be used for the training of the map (“Learn documents”).
After the map has been created and trained according to the given classification system and the “Learn documents”, InfoCodex should be able to classify new documents into the existing classification system, i.e. to automatically assign target categories for the new documents with high accuracy.
2. Solution with InfoCodex
When setting up a new collection, the given classification system can be assigned to the collection. In this case, InfoCodex constructs the information map exactly according to the given classification system, i.e.
● the categories on level 1 become main topics (“Container/tank”, “Water heaters” etc.)
● the sub-categories (level 2, 3 etc.) are each represented by a neuron.
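The mapping described above can be pictured with a minimal sketch. It is not InfoCodex code: the row layout, the function name `build_map_layout`, the category codes, and the sub-category names and descriptors are all assumptions made for illustration. Only the two main topic names come from the example.

```python
from collections import defaultdict

# Hypothetical classification rows: (category code, level, name, descriptors).
# Codes, sub-category names and descriptor words are invented for illustration.
ROWS = [
    ("1", 1, "Container/tank", []),
    ("1.1", 2, "Pressure tanks", ["pressure", "vessel"]),
    ("1.2", 2, "Storage tanks", ["storage"]),
    ("2", 1, "Water heaters", []),
    ("2.1", 2, "Electric heaters", ["electric", "boiler"]),
]

def build_map_layout(rows):
    """Group sub-categories (one neuron each) under their level-1 main topic."""
    topics = {}                   # root code -> main topic name
    neurons = defaultdict(list)   # root code -> neuron labels
    for code, level, name, descriptors in rows:
        root = code.split(".")[0]
        if level == 1:
            topics[root] = name
        else:
            # Neuron label: category code, category name, first few descriptors.
            neurons[root].append(" ".join([code, name] + descriptors[:3]))
    return {name: neurons.get(root, []) for root, name in topics.items()}

layout = build_map_layout(ROWS)
```

Each key of `layout` stands for a main topic, and each label in its list stands for one neuron, labeled in the same way as the neuron labels on the map (code, name, first descriptors).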
The neuron labels (lower right corner of the map) display the given category code, the category name and the first few category descriptors.
For the “Learn documents” used for the training of the map, the target categories must be supplied in an Excel table that contains at least one column with the file name and a second column with the target category. This training information is specified in the “Metadata instructions” field of the form for setting up the collection.
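Before such a table is handed over, it can be useful to check that every row really carries both values. The following sketch assumes the Excel table has been exported to CSV and that its columns are headed "File name" and "Target category". The headers, the file names, and the helper `load_training_targets` are hypothetical and not part of InfoCodex.

```python
import csv
import io

# Stand-in for the Excel table, exported as CSV. Headers and rows are assumptions.
TABLE = """File name,Target category
tank_manual.pdf,1.1
heater_spec.docx,2.1
"""

def load_training_targets(text, file_col="File name", cat_col="Target category"):
    """Return {file name: target category}; reject rows with a missing value."""
    targets = {}
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 holds the column headers.
    for line_no, row in enumerate(reader, start=2):
        name = (row.get(file_col) or "").strip()
        category = (row.get(cat_col) or "").strip()
        if not name or not category:
            raise ValueError(f"row {line_no}: file name and target category are required")
        targets[name] = category
    return targets

targets = load_training_targets(TABLE)
```

A row with an empty file name or an empty target category raises a `ValueError` that names the offending row, so gaps in the training information surface before the map is trained.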
3. Matching of New Documents into the Given Classification System
With the new function, the following objectives can be achieved for any new set of documents:
● The automatic classification of the new documents according to the given classification scheme
● The automatic generation of keywords and abstracts for the new documents
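To make the classification objective concrete, here is a deliberately simple sketch that ranks categories by the cosine similarity between a document's words and each category's descriptors. This is a stand-in technique chosen for illustration, not the method InfoCodex uses. The category codes, descriptors, thresholds, and the function `match_categories` are assumptions.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def match_categories(text, categories, min_score=0.2, max_matches=3):
    """Return up to max_matches (code, score) pairs with score >= min_score, best first."""
    doc = Counter(text.lower().split())
    scored = [(code, cosine(doc, Counter(words))) for code, words in categories.items()]
    scored = [(code, score) for code, score in scored if score >= min_score]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_matches]

# Hypothetical category descriptors keyed by category code.
CATEGORIES = {
    "1.1": ["pressure", "tank", "vessel"],
    "2.1": ["electric", "water", "heater"],
    "3.1": ["pump", "valve"],
}

matches = match_categories("electric heater with pressure tank", CATEGORIES)
```

The first entry of `matches` plays the role of the target category. Keeping further entries above the threshold corresponds to assigning one document to more than one category.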
Multiple matching is also supported, i.e. the assignment of a document to more than one category. The results of the matching process are presented in a list as shown below.