Classification Example - Meeting Location

1. Introduction

This example uses the Classification Framework to classify the location of a meeting based on a set of meeting features. The classes are room labels.

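Meetings are represented by the following set of features: type, numParticipants, hostUser, hostProjectLeader, hostSeniorManagement, projectLeaderAttending, seniorManagementAttending, and project.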

The classifier requires supervised training, so we provide it with a set of sample meetings with known locations. The result of the training is a model that can be used to predict the locations of new meetings.

2. Use the Entity Class to Create the Object for Classification

We use the Entity class to represent the meeting objects that we want to classify. For each meeting, we create an Entity object with a unique meeting id and a list of attributes (type, numParticipants, hostUser, hostProjectLeader, hostSeniorManagement, projectLeaderAttending, seniorManagementAttending, and project). The code fragment below shows this construction.

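    // Each variable below holds the String value of the corresponding feature.
    List<Attribute> attributes = new ArrayList<Attribute>();
    attributes.add(new Attribute(type));
    // The second argument is the attribute weight (the default weight is 1).
    attributes.add(new Attribute(numParticipants, 3));
    attributes.add(new Attribute(hostUser));
    attributes.add(new Attribute(hostProjectLeader));
    attributes.add(new Attribute(hostSeniorManagement));
    attributes.add(new Attribute(projectLeaderAttending));
    attributes.add(new Attribute(seniorManagementAttending));
    attributes.add(new Attribute(project));
    // The meeting id uniquely identifies this meeting in the results.
    Entity meetingObj = new Entity(meetingId, attributes);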

Because the classifier processes all features as String terms within the same namespace, each feature value must be unique across all attributes. For the attribute seniorManagementAttending, we represent its Boolean value (whether or not senior management is attending the meeting) with the strings "senior-management-attending-yes" and "senior-management-attending-no". For the attribute numParticipants, we use "0-10", "10-25", and "25-or-more" to represent the number of participants in the meeting.
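
The encoding described above can be sketched as a small helper. This is a minimal illustration and not part of the framework: the class MeetingFeatureEncoder and its methods are hypothetical names, and the handling of boundary values (10 is placed in "10-25", 25 in "25-or-more") is an assumption, since the range labels do not say which range a boundary value belongs to.

```java
// Hypothetical helper for illustration; not part of the Classification Framework API.
final class MeetingFeatureEncoder {

    private MeetingFeatureEncoder() {}

    // Maps a participant count to one of the three range labels used in this example.
    // Assumption: 10 falls in "10-25" and 25 falls in "25-or-more".
    static String participantsBucket(int numParticipants) {
        if (numParticipants < 0) {
            throw new IllegalArgumentException("numParticipants must be non-negative");
        }
        if (numParticipants < 10) {
            return "0-10";
        }
        if (numParticipants < 25) {
            return "10-25";
        }
        return "25-or-more";
    }

    // Appends "-yes" or "-no" to an attribute-specific prefix, so the term
    // cannot collide with the Boolean terms of other attributes.
    static String booleanTerm(String prefix, boolean value) {
        return prefix + (value ? "-yes" : "-no");
    }

    public static void main(String[] args) {
        System.out.println(participantsBucket(12));                           // prints "10-25"
        System.out.println(booleanTerm("senior-management-attending", true)); // prints "senior-management-attending-yes"
    }
}
```

The resulting strings would then be passed to the Attribute constructor in place of the raw numeric or Boolean values.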

Use the Classification Options Management GUI to set the tokenizer to Simple so that the framework will not filter or modify any of the attribute values.

By default, the classification framework treats all attributes equally (weight 1), but individual attributes can be given higher weights. In this example, we give the attribute numParticipants a weight of 3.

3. Training and Classification

The framework classifier must be trained before use so that it can build a meeting location classification model. One way to obtain such a model is to pass the classifier a list of meeting objects with known meeting locations. To do this, we create a HashMap with n entries, one for each room in our set of meeting locations. The key of each entry is a room label, and the value is a list of meetings that are known to have been held in that room. In the code fragment below, trainingData is such a HashMap.

    IClassifierProcessorFactory factory = ClassifierProcessorFactoryLocator.getFactory();
    IClassifierProcessor processor = factory.getDefaultProcessor();
    processor.train(trainingData);
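
The trainingData map described above could be assembled along the following lines. This is only a sketch under stated assumptions: TrainingDataBuilder is a hypothetical helper, not a framework class, and the type parameter T stands in for the framework's Entity class so that the example compiles on its own. The exact parameter type expected by train is not shown in this example, so the sketch simply returns a HashMap from room label to list of meetings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper for illustration; T stands in for the framework's Entity class.
final class TrainingDataBuilder<T> {

    // Key: room label. Value: meetings known to have been held in that room.
    private final HashMap<String, List<T>> trainingData = new HashMap<>();

    // Records that the given meeting was held in the given room,
    // creating the list for that room on first use.
    void add(String roomLabel, T meeting) {
        trainingData.computeIfAbsent(roomLabel, label -> new ArrayList<>()).add(meeting);
    }

    // Returns the map with one entry per room label seen so far.
    HashMap<String, List<T>> build() {
        return trainingData;
    }
}
```

With Entity in place of T, each sample meeting would be added together with its known room label, and the map returned by build() would play the role of trainingData.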

For classification, we build a list of the meeting objects whose locations are unknown. In the code fragment below, dataToClassify is such a list.

    IClassifierResults results = processor.classify(dataToClassify);

The value returned is an entity classifier results object which contains the meeting locations and a list of meetingIds for each location.

To process the results, we make use of the API utility class ResultsUtil. The following code fragment gets the most probable location for a meeting with a given meetingId.

    String meetingLocation = ResultsUtil.getClasses(results, meetingId, 1).get(0).getClassLabel();