User-centric, Adaptive and Collaborative Information Filtering
An NSF funded Collaborative Project # III-COR 0704628 & 0704689
Quarterly Report: 2nd Year, 4th Quarter
Title: Optimal Ordering of Prediction Tasks
Introduction

Multiple prediction tasks with user interaction.
A motivating example that involves heterogeneous (classification, regression, and retrieval) predictions is interactive trouble report management -- each trouble report from the customer triggers a sequence of interdependent decisions, e.g., determining the priority, category, sub-categories, hardware type, and software type, assigning an expert, and finally formulating a resolution. For making each of these decisions, the user would be assisted by the system, which makes helpful predictions for each task and receives corresponding user feedback. It might be possible to come up with an ordering of these decisions that improves the overall accuracy of the predictions through more efficient use of the user's feedback. For example, it might be beneficial to pin down the hardware and software type of the problem before assigning an expert, because historical data suggests that the latter decision depends heavily on the former. By doing so, the system would be able to make more accurate predictions for the expert assignment. How to optimize the task order by simultaneously taking into account all such relationships and dependencies leads to a hard combinatorial optimization problem.
We have developed new methods for solving the optimal task ordering problem using an approximate formulation in terms of pairwise task order preferences, reducing the problem to the well-known Linear Ordering Problem. We proposed and experimented with two approaches for finding pairwise task order preferences:
- Entropy-based approach: A classifier-agnostic approach based on conditional entropy that chooses prediction tasks whose correct labels lead to the least uncertainty for the remaining predictions.
- Classifier-based approach: A classifier-dependent approach that empirically determines which task orders are favored by the classifier for better predictive performance. This is a flexible approach that allows task order optimization for different target performance metrics. We experiment with five different performance measures (see Tables 1 and 2).
Experiments
We applied the proposed solutions to two practical applications:- The Accenture dataset, that contains 60,000 documents related to client projects. The documents need to be categorized into multiple inter-related taxonomies, which correspond to the multiple prediction tasks.
- The F/A-18 dataset, that contains 6,000 aircraft trouble reports. The prediction tasks comprise of multiple categorizations, engineer assignments, priority assignment for each trouble report.
Results
Figure 1 shows a summary of the performance obtained by various approaches on the Accenture dataset. Tables 1 and 2 report detailed results obtained by each method for different combinations of optimization costs and target evaluation metrics.Figure 1. Summary of the results on the Accenture dataset.
Table 1. Accenture dataset -- Performance (columns) of various task ordering approaches (rows). In four of the five applicable cases, the best performance (bold entry in each column) is obtained when the performance and task order optimization criterion are matched.

Table 2. F/A-18 dataset -- Performance (columns) of various task ordering approaches (rows). In all of the five applicable cases, the best performance (bold entry in each column) is obtained when the performance and task order optimization criterion are matched.