# Metric Violation and Similarity

## Abstract

Metric data plays an important role in various settings, for example, in metric-based indexing, clustering, classification, and approximation algorithms in general. Often such tasks require the data to be metric, though for various reasons this basic property may not be satisfied. The first topic we consider is the metric violation distance problem, which seeks to minimally modify the data to make it metric, and thereby finding the nearest metric data set. Three variants are considered, one admitting a polynomial time algorithm. The other variants are shown to be APX-hard, and an approximation is given. We also introduce and give results for the generalized metric violation distance problem, where there are no longer constraints on all pairs. The second topic concerns the fundamental computational task of assessing the similarity of ordered data sets, i.e. the similarity of two trajectories. Here we introduce the Fréchet gap distance, which in some ways better captures the intuitive notions of curve similarity when comparing to the previous Standard Fréchet distance. In addition, a polynomial time exact algorithm and a much faster 1 + approximation to compute this measure will be given.