This example uses the Classification Framework to classify meeting importance based on a set of meeting features.
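Meeting importance categories: very important, important, somewhat important, and not important.
Meetings are represented by the following set of features: type, numParticipants, hostUser, hostProjectLeader, projectLeaderAttending, seniorManagementAttending, project, and location.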
The classifier requires supervised training, so we provide it with a set of sample meetings with known importance categories. The result of the training is a model that can be used to predict the importance of new meetings.
We start by using the Entity class to represent the meeting objects
that we want to classify. For each meeting object, we create an Entity
object with a unique meeting ID and a list of attributes (type,
numParticipants, hostUser, hostProjectLeader, projectLeaderAttending,
seniorManagementAttending, project, and location). The code fragment
below shows the construction.
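// Collect the meeting's attributes; seniorManagementAttending is added with a weight of 3.
List<IAttribute> attributes = new ArrayList<IAttribute>();
attributes.add(new Attribute(seniorManagementAttending, 3));
// Bind the attribute list to the meeting's unique ID.
meetingObj = new Entity(meetingId, attributes);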
Because the classifier processes all features as String terms within the same namespace, each feature value must be unique across all attributes. For the attribute seniorManagementAttending, we represent its Boolean value (whether or not senior management is attending the meeting) with the strings "senior-management-attending-yes" and "senior-management-attending-no". For the attribute numParticipants, we use "0-10", "10-25", and "25-or-more" to represent the number of participants in the meeting.
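The encoding described above can be sketched as a small helper. This is a minimal illustration, not part of the Classification Framework: the class MeetingFeatureEncoder and its method names are hypothetical, and because the labels "0-10" and "10-25" share the endpoint 10, the sketch assumes that boundary values fall into the upper bucket.

```java
// Illustrative sketch only: MeetingFeatureEncoder and its methods are
// hypothetical helpers, not part of the Classification Framework.
final class MeetingFeatureEncoder {

    private MeetingFeatureEncoder() {}

    /** Encodes the Boolean seniorManagementAttending feature as a unique term. */
    static String encodeSeniorManagementAttending(boolean attending) {
        return attending
                ? "senior-management-attending-yes"
                : "senior-management-attending-no";
    }

    /** Maps a participant count to one of the three range labels. */
    static String encodeNumParticipants(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative: " + count);
        }
        // Assumption: a count on a shared boundary (10 or 25) goes to the upper bucket.
        if (count < 10) {
            return "0-10";
        }
        if (count < 25) {
            return "10-25";
        }
        return "25-or-more";
    }

    public static void main(String[] args) {
        System.out.println(encodeSeniorManagementAttending(true));
        System.out.println(encodeNumParticipants(12));
    }
}
```

The returned strings would then be passed as attribute values when the Entity objects are built. Prefixing each value with its attribute name, as the Boolean case does, is one simple way to keep terms from different attributes from colliding.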
Use the Classification Options Management GUI to set the tokenizer to Simple so that the framework will not filter or modify any of the attribute values.
By default, the classification framework treats all attributes equally (weight 1). However, individual attributes can be assigned a higher weight than others. In this example, we give the attribute seniorManagementAttending a weight of 3.
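The default-versus-explicit weighting can be pictured with a stand-in class. This is a sketch under stated assumptions: SketchAttribute and WeightSketch are hypothetical and model only a value and a weight. In particular, the one-argument constructor is an assumption made for illustration and may not exist on the framework's own Attribute class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the framework's Attribute class.
// It models only the two things discussed here: a value and a weight.
final class SketchAttribute {
    static final int DEFAULT_WEIGHT = 1;

    final String value;
    final int weight;

    /** Assumed convenience constructor: applies the default weight of 1. */
    SketchAttribute(String value) {
        this(value, DEFAULT_WEIGHT);
    }

    /** Mirrors the (value, weight) form used in the example above. */
    SketchAttribute(String value, int weight) {
        if (weight < 1) {
            throw new IllegalArgumentException("weight must be at least 1");
        }
        this.value = value;
        this.weight = weight;
    }
}

final class WeightSketch {

    private WeightSketch() {}

    /** Builds a two-attribute list in which only seniorManagementAttending is up-weighted. */
    static List<SketchAttribute> buildAttributes(String numParticipants,
                                                 String seniorManagementAttending) {
        List<SketchAttribute> attributes = new ArrayList<>();
        attributes.add(new SketchAttribute(numParticipants));              // default weight 1
        attributes.add(new SketchAttribute(seniorManagementAttending, 3)); // explicit weight 3
        return attributes;
    }

    public static void main(String[] args) {
        for (SketchAttribute a : buildAttributes("10-25", "senior-management-attending-yes")) {
            System.out.println(a.value + " (weight " + a.weight + ")");
        }
    }
}
```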
The classifier needs to be trained beforehand to develop a
classification model. One way to obtain such a model is to pass into
the classifier meeting objects with known importance categories. To do
this, we create a HashMap with four entries, one for each of the meeting
importance categories listed above. The keys will be the names of the
categories and the values will be the lists of meeting objects for
each category. In the code fragment below, trainingData is such a
HashMap.
IClassifierProcessorFactory factory = ClassifierProcessorFactoryLocator.getFactory();
For classification, we build a list of meeting objects with unknown importance categories. In the code fragment below, dataToClassify is such a list.
IClassifierProcessor processor = factory.getDefaultProcessor();
IClassifierResults results = processor.classify(dataToClassify);
The value returned is an entity classifier results object that contains the importance categories and a list of meetingIds for each category.
String importance = ResultsUtil.getClasses(results, meetingId, 1).get(0).getClassLabel();
The importance variable will contain one of "very important", "important", "somewhat important", or "not important".
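The data shapes at both ends of this walkthrough can be sketched with plain collections: a map from category name to a list of meetings for training, and a per-category list of meeting IDs that can be searched for a given meeting. This is an illustration only. ImportanceDataShapes is a hypothetical class, String meeting IDs stand in for the Entity objects, a plain map stands in for IClassifierResults, and the sample IDs are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: plain collections stand in for the framework's
// Entity and IClassifierResults types; meeting IDs are made-up sample values.
final class ImportanceDataShapes {

    static final List<String> CATEGORIES = Arrays.asList(
            "very important", "important", "somewhat important", "not important");

    private ImportanceDataShapes() {}

    /** Creates a training map with one (initially empty) list per category. */
    static Map<String, List<String>> newTrainingData() {
        Map<String, List<String>> trainingData = new HashMap<>();
        for (String category : CATEGORIES) {
            trainingData.put(category, new ArrayList<>());
        }
        return trainingData;
    }

    /** Returns the category whose ID list contains meetingId, or null if none does. */
    static String findCategory(Map<String, List<String>> idsByCategory, String meetingId) {
        for (Map.Entry<String, List<String>> entry : idsByCategory.entrySet()) {
            if (entry.getValue().contains(meetingId)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> trainingData = newTrainingData();
        trainingData.get("very important").add("meeting-001");
        trainingData.get("not important").add("meeting-002");
        System.out.println(findCategory(trainingData, "meeting-002"));
    }
}
```

In the real framework the map values would be lists of Entity objects rather than ID strings, and the category of a classified meeting would be read through ResultsUtil as shown above rather than by scanning a map.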