Contact Management (DEX) API

Architecture


Dex Server Architecture

Contact Management is deployed as a Web service using the Hessian protocol and can be accessed remotely from either a Web browser or programmatically from a Hessian client program.

A client sends Contact Management a query request that includes a mandatory user name and an optional email address. The client can accept results from previous queries stored in the Contact Management cache, or obtain real-time data by forcing a cache update. If the cache cannot satisfy the query, or the client requests a cache update, the query is sent to the Web Search API, which provides a pluggable interface to Web-search services.
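
The cache-or-search decision described above can be sketched roughly as follows. This is an illustration only, not the Contact Management server code: the class CachedLookup, its in-memory map (standing in for the disk cache), and the Function that stands in for the Web Search API path are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the cache-or-search decision; not the actual DEX server code.
class CachedLookup {
    // Stand-in for the disk cache, keyed by "userName|emailAccount".
    private final Map<String, List<String>> cache = new HashMap<>();
    // Stand-in for the path through the Web Search API and the harvester.
    private final Function<String, List<String>> webSearch;

    CachedLookup(Function<String, List<String>> webSearch) {
        this.webSearch = webSearch;
    }

    List<String> find(String userName, String emailAccount, boolean forceCacheUpdate) {
        // The email address is optional, so a missing one maps to an empty string.
        String key = userName + "|" + (emailAccount == null ? "" : emailAccount);
        List<String> cached = cache.get(key);
        if (cached != null && !forceCacheUpdate) {
            return cached; // the query is satisfied by the cache
        }
        // Cache miss or forced update: run the search and refresh the cache.
        List<String> fresh = webSearch.apply(key);
        cache.put(key, fresh);
        return fresh;
    }
}
```

With this shape, a repeated query is served from the cache until the caller passes true for forceCacheUpdate.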

Web-search services typically require a license key issued by the search service. A client that has its own license key can send it with the request; otherwise it can send a DEX-specific access code, in which case DEX looks up the license key that corresponds to the access code. An access code received from the client that matches the one in licenseKeyFileAccessCodeFile is mapped to the search service's license key in licenseKeyFile.
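
One way to picture the mapping from an access code to a license key is the sketch below. It rests on assumptions: the class LicenseKeyResolver is hypothetical, and it assumes that licenseKeyFileAccessCodeFile holds a single access code and licenseKeyFile holds a single license key. The actual lookup performed by DEX may differ.

```java
// Hypothetical sketch: resolving the value sent by a client to a search-service license key.
class LicenseKeyResolver {
    private final String accessCode;        // assumed content of licenseKeyFileAccessCodeFile
    private final String serviceLicenseKey; // assumed content of licenseKeyFile

    LicenseKeyResolver(String accessCode, String serviceLicenseKey) {
        this.accessCode = accessCode;
        this.serviceLicenseKey = serviceLicenseKey;
    }

    String resolve(String submitted) {
        if (accessCode.equals(submitted)) {
            // The client sent the DEX access code, so use the server-side license key.
            return serviceLicenseKey;
        }
        // Otherwise treat the value as the client's own license key.
        return submitted;
    }
}
```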

On receiving the URLs from the search service, Contact Management retrieves Web pages associated with each URL and harvests contact data and expertise-describing keywords. The harvested data is written to a disk cache and the results are returned to the client.

API

Client API

The following steps are required to make a programmatic call from a Java remote client to a Contact Management Web service:

  1. Create a remote proxy by passing IDexMiner to the Hessian API.
  2. Create a map containing the key/value pairs described in DexRequestKeys that define the query request.
  3. Call the findData method on the remote proxy.
  4. Iterate through the list of maps returned from findData using the key constants defined in either MinedItemFieldConstants and MinedItemTypeNeutralConstants or MessageFaultConstants, as specified in the documentation for IDexMiner.

The steps are essentially the same for any other programming language. Refer to the appropriate Hessian implementation for details.
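
Step 4 can be sketched as follows. The class ResultReader and its describe method are hypothetical; the three key values are copied from MinedItemFieldConstants and MessageFaultConstants so that the example compiles on its own. A real client would reference the constants directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper that walks the list of maps returned by findData.
class ResultReader {
    // Key values copied from MinedItemFieldConstants and MessageFaultConstants.
    static final String TYPE = "type";
    static final String VALUE = "value";
    static final String FAULT_MESSAGE = "_fault_message_";

    static List<String> describe(List<Map<String, String>> results) {
        List<String> lines = new ArrayList<>();
        for (Map<String, String> item : results) {
            if (item.containsKey(FAULT_MESSAGE)) {
                // A map that contains FAULT_MESSAGE is a fault map, not a mined item.
                lines.add("FAULT: " + item.get(FAULT_MESSAGE));
            } else {
                // TYPE holds one of the MinedItemTypeNeutralConstants values.
                lines.add(item.get(TYPE) + " = " + item.get(VALUE));
            }
        }
        return lines;
    }
}
```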

Server API

Developers can use a different Web search service by creating their own implementation of IWebSearchEngineFactory and replacing the reference to edu.umass.cs.dex.web.example.ExampleWebSearchEngineFactory in the web.xml file with the name of the implementing class.
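
A minimal sketch of such an implementation is shown below. The names MyWebSearchEngineFactory and MyWebSearchEngine are hypothetical, and the two interfaces are repeated locally only so that the example compiles on its own; a real implementation would use the interfaces shipped with Contact Management and would call the chosen search service inside findUrlsForQuery.

```java
import java.util.Collections;
import java.util.List;

// Local copies of the server interfaces so that this sketch is self-contained.
interface IWebSearchEngine {
    void setLicenseKey(String licenseKey);
    void setUserIp(String userIp);
    List<String> findUrlsForQuery(String query);
}

interface IWebSearchEngineFactory {
    IWebSearchEngine newWebSearchEngine();
}

// Hypothetical factory that the web.xml file would name in place of the example factory.
class MyWebSearchEngineFactory implements IWebSearchEngineFactory {
    @Override
    public IWebSearchEngine newWebSearchEngine() {
        return new MyWebSearchEngine();
    }
}

// Hypothetical engine; the search itself is left as a placeholder.
class MyWebSearchEngine implements IWebSearchEngine {
    private String licenseKey;
    private String userIp;

    @Override
    public void setLicenseKey(String licenseKey) {
        this.licenseKey = licenseKey;
    }

    @Override
    public void setUserIp(String userIp) {
        this.userIp = userIp;
    }

    @Override
    public List<String> findUrlsForQuery(String query) {
        // Placeholder: a real engine would send the query, licenseKey, and userIp
        // to the search service and return the matching URLs.
        return Collections.emptyList();
    }
}
```

The webSearchEngineClassName parameter in the web.xml file would then carry the fully qualified name of the factory class.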

Java Classes


/**
 * DexRequestKeys
 */
public class DexRequestKeys {
    /**
     * The name of the user for which contact information is requested.
     *
     * Example: John Doe
     *
     * Key value: String
     */
    public static final String USER_NAME = "userName";

    /**
     * Email account for which contact information is requested.
     *
     * Example: jdoe@aol.com
     *
     * Key value: String
     */
    public static final String EMAIL_ACCOUNT = "emailAccount";

    /**
     * The license key to use for the Web search.
     *
     * Key value: String
     */
    public static final String LICENSE_KEY = "licenseKey";

    /**
     * Set to true to force a Web search and cache update.
     *
     * Key value: Boolean
     */
    public static final String FORCE_CACHE_UPDATE = "forceCacheUpdate";

    /**
     * The search level determines how wide a search to perform.
     *
     * The queries that are passed to the Web search API are derived from
     * a person's name and email address. The exact form and number of queries
     * depend on whether the client has requested a SEARCH_LEVEL_NARROW or
     * SEARCH_LEVEL_WIDE search. If a client requested information for
     * "Firstname Lastname username@division.company.com" using a
     * SEARCH_LEVEL_WIDE search, Contact Management would issue each of the
     * following queries until it found a page that contained contact data.
     *
     * "Firstname Lastname" site:division.company.com
     * "Firstname Lastname" site:company.com
     * "Firstname Lastname"
     *
     * If a SEARCH_LEVEL_NARROW search was requested, Contact Management would
     * only try the first two queries.
     *
     * Key value: Integer
     */
    public static final String SEARCH_LEVEL = "searchLevel";

    public static final Integer SEARCH_LEVEL_NARROW = 0;
    public static final Integer SEARCH_LEVEL_WIDE = 1;
}
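
The query derivation described in the SEARCH_LEVEL documentation can be sketched as follows. The class QueryBuilder is hypothetical, and the rule used to obtain the parent domain (drop the leftmost label once when the domain has more than two labels) is an assumption chosen to reproduce the documented example. Contact Management may derive its queries differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of deriving search queries from a name and an email address.
class QueryBuilder {
    // Values copied from DexRequestKeys.
    static final int SEARCH_LEVEL_NARROW = 0;
    static final int SEARCH_LEVEL_WIDE = 1;

    static List<String> buildQueries(String name, String email, int searchLevel) {
        String quoted = "\"" + name + "\"";
        List<String> queries = new ArrayList<>();
        int at = (email == null) ? -1 : email.indexOf('@');
        if (at >= 0) {
            String domain = email.substring(at + 1);
            queries.add(quoted + " site:" + domain);
            // Assumption: add the parent domain only if the domain has more than two labels.
            int firstDot = domain.indexOf('.');
            if (firstDot >= 0 && domain.indexOf('.', firstDot + 1) >= 0) {
                queries.add(quoted + " site:" + domain.substring(firstDot + 1));
            }
        }
        // Only a wide search falls back to the unrestricted query.
        if (searchLevel == SEARCH_LEVEL_WIDE) {
            queries.add(quoted);
        }
        return queries;
    }
}
```

Calling buildQueries("Firstname Lastname", "username@division.company.com", SEARCH_LEVEL_WIDE) yields the three queries listed in the documentation above, in that order; with SEARCH_LEVEL_NARROW only the first two are produced.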


import java.util.List;
import java.util.Map;

/**
 * This interface provides a callback to a Web access service that
 * is capable of finding contact information.
 */
public interface IDexMiner {
    /**
     * Attempt to find contact information for a given email user name and account.
     *
     * If processing completes without an exception condition being raised,
     * all Map entries will have keys that are defined in MinedItemFieldConstants.
     * If an exception condition occurs during processing, the returned list will
     * contain a Map with MessageFaultConstants#FAULT_MESSAGE as one of its keys
     * and all Map entries will have keys that are defined in MessageFaultConstants.
     *
     * @param request information that defines the data being requested.
     *                See DexRequestKeys for supported keys.
     * @return List of Maps. A map will either contain values for a key defined in
     *         MinedItemFieldConstants or values for a key defined in MessageFaultConstants.
     */
    List<Map<String, String>> findData(Map<String, Object> request);
}



/**
 * Defines constants used as map keys for a data item found by a data miner.
 *
 * Values for all keys must be a String type.
 */
public class MinedItemFieldConstants {
    /**
     * Key for the type of item that will be returned when accessing the VALUE key.
     *
     * The value will be one of the constants defined in MinedItemTypeNeutralConstants.
     */
    public static final String TYPE = "type";

    /**
     * Key points to the URI where the data item was found, such as a Web page URL.
     */
    public static final String SOURCE_URI = "sourceURI";

    /**
     * Key points to the URI for the service which mined the data item.
     */
    public static final String MINER_URI = "minerURI";

    /**
     * Key points to the level of confidence in the accuracy of the data item
     * as viewed by the service which mined the data item.
     *
     * The value should be a double (as represented by a string) between
     * 0 and 1, where 1 is the highest level of confidence.
     */
    public static final String CONFIDENCE_RATING = "confidenceRating";

    /**
     * Key points to the data item that was found by the miner.
     *
     * The value should always be a string representation.
     */
    public static final String VALUE = "value";
}


/**
 * Defines the possible data that can be found by Contact Management.
 */
public class MinedItemTypeNeutralConstants {
    public static final String PERSON_NAME_PREFIX = "person_name_prefix";
    public static final String PERSON_FIRST_NAME = "person_first_name";
    public static final String PERSON_MIDDLE_NAME = "person_middle_name";
    public static final String PERSON_LAST_NAME = "person_last_name";
    public static final String PERSON_NAME_SUFFIX = "person_name_suffix";

    public static final String PERSON_NICKNAME = "person_nickname";

    public static final String PERSON_HOME_PAGE = "person_home_page";

    public static final String PERSON_WORK_TITLE = "person_full_occupational_title";

    public static final String PERSON_WORK_COMPANY_NAME = "person_work_company_name";

    public static final String PERSON_KEYWORDS = "person_keywords";

    public static final String PERSON_HOME_PHONE = "person_home_phone";

    public static final String PERSON_WORK_ADDRESS = "person_work_street_address";
    public static final String PERSON_WORK_CITY = "person_work_city";
    public static final String PERSON_WORK_STATE = "person_work_state";
    public static final String PERSON_WORK_ZIP = "person_work_zip";
    public static final String PERSON_WORK_EMAIL = "person_work_email";
    public static final String PERSON_WORK_PHONE = "person_work_phone";
    public static final String PERSON_WORK_CELL = "person_work_cell";
    public static final String PERSON_WORK_FAX = "person_work_fax";
}



/**
 * MessageFaultConstants
 *
 * Defines key constants for message fault maps and some corresponding key values.
 *
 * MessageFaultConstants#FAULT_MESSAGE is the only required key for a message fault
 * map.
 */
public class MessageFaultConstants {
    /**
     * All fault message maps must contain the FAULT_MESSAGE key.
     * This differentiates the map from other types of maps.
     */
    public static final String FAULT_MESSAGE = "_fault_message_";

    /**
     * The URL for the location where the fault occurred.
     */
    public static final String FAULT_URL = "URL";

    /**
     * The stack trace for the fault (as a string).
     */
    public static final String STACK_TRACE = "stackTrace";

    /**
     * Values for the following key must be one
     * of the <code>FAULT_TYPE_</code> constants.
     */
    public static final String FAULT_TYPE = "faultType";

    public static final String FAULT_TYPE_WARNING = "faultTypeWarning";
    public static final String FAULT_TYPE_FATAL = "faultTypeFatal";
}
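
As an illustration of how these keys fit together, the sketch below builds a fault map from an exception. The class FaultMaps and its fromException method are hypothetical and are not part of the Contact Management distribution; the key values are copied from MessageFaultConstants so that the example compiles on its own.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that turns an exception into a message fault map.
class FaultMaps {
    // Key values copied from MessageFaultConstants.
    static final String FAULT_MESSAGE = "_fault_message_";
    static final String FAULT_URL = "URL";
    static final String STACK_TRACE = "stackTrace";
    static final String FAULT_TYPE = "faultType";
    static final String FAULT_TYPE_WARNING = "faultTypeWarning";
    static final String FAULT_TYPE_FATAL = "faultTypeFatal";

    static Map<String, String> fromException(Throwable t, String url, boolean fatal) {
        Map<String, String> fault = new HashMap<>();
        // FAULT_MESSAGE is the only required key, so it is always set.
        fault.put(FAULT_MESSAGE, String.valueOf(t.getMessage()));
        if (url != null) {
            fault.put(FAULT_URL, url);
        }
        // Render the stack trace as a string, as the STACK_TRACE key requires.
        StringWriter trace = new StringWriter();
        t.printStackTrace(new PrintWriter(trace, true));
        fault.put(STACK_TRACE, trace.toString());
        fault.put(FAULT_TYPE, fatal ? FAULT_TYPE_FATAL : FAULT_TYPE_WARNING);
        return fault;
    }
}
```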


/**
 * A factory that creates instances of IWebSearchEngine.
 */
public interface IWebSearchEngineFactory {
    IWebSearchEngine newWebSearchEngine();
}


import java.util.List;

/**
 * A Web search engine.
 */
public interface IWebSearchEngine {

    /**
     * Set the license key to use for the search.
     *
     * @param licenseKey the license key
     */
    void setLicenseKey(String licenseKey);

    /**
     * Set the IP of the remote user that initiated the search request.
     *
     * @param userIp the IP of the user making the search request.
     */
    void setUserIp(String userIp);

    /**
     * Perform a Web query.
     *
     * @param query Format: "persons-name [site:domain]"
     * @return list of URLs to Web pages that match the search query
     * @throws WebSearchRuntimeException if an exception is thrown from the search API
     */
    List<String> findUrlsForQuery(String query);
}


Coding Example (Java Remote Client)

The Contact Management distribution ships with a sample Java remote client program. Please refer to the README.txt file located in the root of the code distribution for information about running the sample program.

The following code snippets are from DexRemoteClientExample:

// Create a factory that will be used to create a remote proxy to access Contact Management
private RemoteProxyFactory factory = new RemoteProxyFactory();

public void run() {
    // The Contact Management remote proxy
    IDexMiner dexMiner;

    try {
        // Set the URL for the Contact Management server that will be accessed from the client
        factory.setServerUrl(url);

        // Create a Contact Management remote proxy instance
        dexMiner = factory.createProxy();
    } catch (RuntimeException e) {
        e.printStackTrace();
        return;
    }

    // Create a map to define the Contact Management request parameters
    Map<String, Object> request = new HashMap<String, Object>();
    request.put(DexRequestKeys.USER_NAME, user);
    request.put(DexRequestKeys.EMAIL_ACCOUNT, email);
    request.put(DexRequestKeys.FORCE_CACHE_UPDATE, Boolean.FALSE);
    request.put(DexRequestKeys.SEARCH_LEVEL, searchLevel);
    request.put(DexRequestKeys.LICENSE_KEY, LICENSE);

    // Make a remote call to Contact Management
    List<Map<String, String>> list = dexMiner.findData(request);

    // Format the results and print them to the standard output
    System.out.println(DexResultsFormatter.format(list));
}

The following code snippets are from RemoteProxyFactory:

// Create a factory to create remote proxies
private HessianProxyFactory factory = new HessianProxyFactory();

// Return a remote proxy to access Contact Management
return (IDexMiner) factory.create(IDexMiner.class, serverUrl);
...


Example Program Output

If Contact Management finds multiple possible values for the same tag type, the example program lists them in descending order of confidence, as shown below in the person_work_street_address entry for the query ("John Doe" doe@sri.com).

Query Results:
============================================
Type: person_home_page
-------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: http://www.ai.sri.com/people/doe/
Confidence Rating: 0.75
============================================
Type: person_keywords
------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: privacy polic, home people, team members, aic home, calo software, cognitive assistant, organizes calo, information objects, home aic, corporation privacy, nonprofit corporation, pal program, related information, enabling users, projects cognitive, s perceptive, generation cognitive, learns pal, current projects, perceptive agent, cognitive agent, sri international, application framework, sri home, iris iris, darpa s, office related, software iris, program sri, objects iris, contact sri
Confidence Rating: 0.75
============================================
Type: person_work_city
-------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: Menlo Park
Confidence Rating: 0.75
============================================
Type: person_work_fax
------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: (555) 555-5555
Confidence Rating: 0.75
============================================
Type: person_work_state
--------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: CA
Confidence Rating: 0.75
============================================
Type: person_work_street_address
-----------------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: 333 Ravenswood Avenue
Confidence Rating: 0.75
----------
SourceURI: http://www.ai.sri.com/people/doe/
Value: Integrate. Relate. Infer. Share. 2006 SRI International   333 RavenswoodAvenue
Confidence Rating: 0.71
============================================
Type: person_work_zip
------------------------
SourceURI: http://www.ai.sri.com/people/doe/
Value: 94025-3493
Confidence Rating: 0.75

Types with multiple values are displayed in descending order of confidence.
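
The ordering by confidence can be reproduced on the client side with a sketch such as the following. The class ConfidenceSorter is hypothetical and is not the formatter shipped with the example program; the key value is copied from MinedItemFieldConstants, and the sketch assumes that every map carries a parseable confidence rating.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper that orders mined items by descending confidence rating.
class ConfidenceSorter {
    // Key value copied from MinedItemFieldConstants.
    static final String CONFIDENCE_RATING = "confidenceRating";

    static List<Map<String, String>> sortByConfidence(List<Map<String, String>> items) {
        // Copy the list so that the caller's list is left untouched.
        List<Map<String, String>> sorted = new ArrayList<>(items);
        // The rating is a double represented as a string, so parse it before comparing.
        sorted.sort(Comparator.comparingDouble(
                (Map<String, String> item) -> Double.parseDouble(item.get(CONFIDENCE_RATING)))
                .reversed());
        return sorted;
    }
}
```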

Web Server Configuration File


<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE web-app
PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
"http://java.sun.com/dtd/web-app_2_3.dtd">

<web-app>
<display-name>dex</display-name>
<description>Dex application service</description>

<!-- Define servlets -->
<!-- Note that Tomcat insists on the following order: filter, servlet, servlet-mapping -->

<!-- DexEngine -->
<!--
Depending upon the search service, you may need to obtain a license key.

This servlet requires over 300MB of memory.
-->

<servlet>
<servlet-name>log4j-init</servlet-name>
<servlet-class>com.sri.dex.server.Log4jInitServlet</servlet-class>

<init-param>
<param-name>log4j-init-file</param-name>
<param-value>WEB-INF/web_log_config.xml</param-value>
</init-param>

<load-on-startup>1</load-on-startup>
</servlet>

<servlet>
<servlet-name>DexEngine</servlet-name>
<servlet-class>com.sri.dex.server.DexServlet</servlet-class>

<init-param>
<param-name>webSearchEngineClassName</param-name>
<param-value>edu.umass.cs.dex.web.example.ExampleWebSearchEngineFactory</param-value>
<description>The API to use for the Web search.</description>
</init-param>

<init-param>
<param-name>cacheDir</param-name>
<param-value>WEB-INF/dexCache</param-value>
<description>Directory where DEX caches its results.
The directory will be created if it does not already exist.</description>
</init-param>

<init-param>
<param-name>licenseKeyFileAccessCodeFile</param-name>
<param-value>WEB-INF/resources/keyFileAccessCode</param-value>
<description>File must contain a security code used to access the licenseKeyFile</description>
</init-param>

<init-param>
<param-name>licenseKeyFile</param-name>
<param-value>WEB-INF/resources/keys</param-value>
<description>File must contain a valid license key for the API used for the Web search</description>
</init-param>

<init-param>
<param-name>recordExtractorFile</param-name>
<param-value>WEB-INF/resources/record-extractor-crf.obj</param-value>
</init-param>

<init-param>
<param-name>stopWordFile</param-name>
<param-value>WEB-INF/resources/stop</param-value>
</init-param>

<init-param>
<param-name>modelFile</param-name>
<param-value>WEB-INF/resources/maxent_list.model</param-value>
</init-param>

<init-param>
<param-name>webSearchTimeout</param-name>
<param-value>60</param-value>
<description>The maximum number of seconds to wait for a Web query to return.</description>
</init-param>

<init-param>
<param-name>webPagesPerPerson</param-name>
<param-value>10</param-value>
<description>The larger this value, the greater the possibility of finding information
on the person and the longer the time to process. This value must be greater
than zero.</description>
</init-param>

<init-param>
<param-name>tmpDirRoot</param-name>
<param-value>WEB-INF/tmp</param-value>
<description>Allows one to specify the parent directory within which the service will
create its temporary directory. If not specified, this will default to the OS's
temporary directory.</description>
</init-param>

<load-on-startup>2</load-on-startup>
</servlet>

<servlet>
<servlet-name>DexForm</servlet-name>
<servlet-class>com.sri.dex.server.DexServletForm</servlet-class>
<load-on-startup>3</load-on-startup>
</servlet>

<servlet-mapping>
<servlet-name>DexEngine</servlet-name>
<url-pattern>/dex</url-pattern>
</servlet-mapping>

<servlet-mapping>
<servlet-name>DexForm</servlet-name>
<url-pattern>/form</url-pattern>
</servlet-mapping>
</web-app>
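
Two of the constraints stated in the parameter descriptions above (webPagesPerPerson must be greater than zero, and tmpDirRoot defaults to the operating system's temporary directory) can be sketched as follows. The class DexInitParams is hypothetical, and the parameters are read from a plain Map rather than from the servlet configuration so that the example is self-contained. How DexServlet actually validates its parameters is not documented here.

```java
import java.util.Map;

// Hypothetical sketch of checking two DexEngine init parameters.
class DexInitParams {
    static int webPagesPerPerson(Map<String, String> params) {
        int pages = Integer.parseInt(params.get("webPagesPerPerson"));
        if (pages <= 0) {
            throw new IllegalArgumentException("webPagesPerPerson must be greater than zero");
        }
        return pages;
    }

    static String tmpDirRoot(Map<String, String> params) {
        String dir = params.get("tmpDirRoot");
        // Fall back to the OS temporary directory when the parameter is not specified.
        return (dir != null) ? dir : System.getProperty("java.io.tmpdir");
    }
}
```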