Classification Example - Meeting Importance

1. Introduction

This example uses the Classification Framework to classify meeting importance based on a set of meeting features.

Meeting importance categories: "very important", "important", "somewhat important", and "not important".

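Meetings are represented by the following set of features: type, numParticipants, hostUser, hostProjectLeader, hostSeniorManagement, projectLeaderAttending, seniorManagementAttending, project, and location.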

The classifier requires supervised training, so we provide it with a set of sample meetings with known importance categories. The result of the training is a model that can be used to predict the importance of new meetings.

2. Use the Entity Class to Create the Object for Classification

We start by using the Entity class to represent the meeting objects that we want to classify. For each meeting object, we create an Entity object with a unique meeting Id and a list of attributes (type, numParticipants, hostUser, hostProjectLeader, hostSeniorManagement, projectLeaderAttending, seniorManagementAttending, project, and location). The code fragment below shows the construction.

    // Build one attribute per meeting feature. Each attribute has the default
    // weight of 1 unless a weight is passed as the second constructor argument.
    List<IAttribute> attributes = new ArrayList<IAttribute>();
    attributes.add(new Attribute(type));
    attributes.add(new Attribute(numParticipants));
    attributes.add(new Attribute(hostUser));
    attributes.add(new Attribute(hostProjectLeader));
    attributes.add(new Attribute(hostSeniorManagement));
    attributes.add(new Attribute(projectLeaderAttending));
    attributes.add(new Attribute(seniorManagementAttending, 3)); // weight 3
    attributes.add(new Attribute(project));
    attributes.add(new Attribute(location));
    // Pair the attributes with the unique meeting Id.
    meetingObj = new Entity(meetingId, attributes);

Because the classifier processes all features as String terms within the same namespace, each feature value must be unique across attributes. For the attribute seniorManagementAttending, we represent its Boolean value (whether or not senior management is attending the meeting) with the strings "senior-management-attending-yes" and "senior-management-attending-no". For the attribute numParticipants, we use "0-10", "10-25", and "25-or-more" to represent the number of participants in the meeting.
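
The encoding described above can be sketched as a small helper. This is a minimal illustration, not part of the Classification Framework: the class name MeetingFeatureTerms and its methods are hypothetical, and because the labels "0-10" and "10-25" share the value 10, the sketch assumes that exactly 10 participants falls into "10-25" and exactly 25 into "25-or-more".

```java
// Hypothetical helper (not part of the Classification Framework) that turns
// raw feature values into String terms that are unique across attributes.
final class MeetingFeatureTerms {

    private MeetingFeatureTerms() {
    }

    // Encodes a Boolean feature as "<prefix>-yes" or "<prefix>-no", so the
    // term cannot collide with the yes/no term of another attribute.
    static String booleanTerm(String prefix, boolean value) {
        return prefix + (value ? "-yes" : "-no");
    }

    // Maps a participant count to one of the three range labels.
    // Assumption: 10 belongs to "10-25" and 25 belongs to "25-or-more".
    static String participantsTerm(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative: " + count);
        }
        if (count < 10) {
            return "0-10";
        }
        if (count < 25) {
            return "10-25";
        }
        return "25-or-more";
    }
}
```

For example, MeetingFeatureTerms.booleanTerm("senior-management-attending", true) yields the term "senior-management-attending-yes", which could then be passed to the Attribute constructor.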

Use the Classification Options Management GUI to set the tokenizer to Simple so that the framework will not filter or modify any of the attribute values.

By default, the classification framework treats all attributes equally (weight 1), but an attribute can be given a higher weight than the others. In this example, we give the attribute seniorManagementAttending a weight of 3.

3. Training and Classification

The classifier must be trained before it can classify, so that it has a classification model to work from. One way to obtain such a model is to pass the classifier meeting objects with known importance categories. To do this, we create a HashMap with four entries, one for each of the meeting importance categories listed above. The keys are the names of the categories and the values are the lists of meeting objects for each category. In the code fragment below, trainingData is such a HashMap.

    IClassifierProcessorFactory factory = ClassifierProcessorFactoryLocator.getFactory();
    IClassifierProcessor processor = factory.getDefaultProcessor();
    processor.train(trainingData);
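
One way the trainingData map might be assembled is sketched below. The class TrainingDataBuilder and its methods are hypothetical and not part of the framework API. The sketch is generic in the element type T so that it runs on its own; in the real example T would be the meeting object type, and the exact parameter type expected by train() is not shown in this document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper (not part of the Classification Framework) that builds
// a map from importance category name to the list of sample meetings.
final class TrainingDataBuilder {

    // The four importance categories used in this example.
    static final List<String> CATEGORIES =
            List.of("very important", "important", "somewhat important", "not important");

    private TrainingDataBuilder() {
    }

    // Creates a HashMap with one empty list per importance category.
    static <T> HashMap<String, List<T>> newTrainingData() {
        HashMap<String, List<T>> data = new HashMap<>();
        for (String category : CATEGORIES) {
            data.put(category, new ArrayList<>());
        }
        return data;
    }

    // Adds a sample meeting under its known category and rejects labels
    // that are not one of the four categories.
    static <T> void addExample(HashMap<String, List<T>> data, String category, T meeting) {
        List<T> samples = data.get(category);
        if (samples == null) {
            throw new IllegalArgumentException("Unknown category: " + category);
        }
        samples.add(meeting);
    }
}
```

Rejecting unknown labels early keeps a misspelled category from silently becoming a fifth class in the trained model.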

For classification, we build a list of meeting objects with unknown importance categories. In the code fragment below, dataToClassify is such a list.

    IClassifierResults results = processor.classify(dataToClassify);

The returned IClassifierResults object contains the importance categories and, for each category, a list of meetingIds.

To process the results, we use the API utility class ResultsUtil. The following code fragment gets the most probable importance category for the meeting with a given meetingId.

    String importance = ResultsUtil.getClasses(results, meetingId, 1).get(0).getClassLabel();

The importance variable will contain one of "very important", "important", "somewhat important", or "not important".