A research participant sees some sequence of dots. Later they are asked to say what order they think the dots appeared in. We want a meaningful way to say how accurate they are.

- The dots are numbered 1, 2, ..., n
- The participant does not see the numbers
- The dots always appear in numerical order, cycling from dot n back to dot 1
- The participant sees several cycles
- The participant's response can be encoded as a reordering of the numbers 1, ..., n

- Quantify the accuracy of responses
- Say what fraction of the n! possible responses are at least as accurate as a given response

- Participant responses are scored by a real number
- Cyclic rearrangements of a response should receive the same score
- Responses which have more dots in the correct sequence should score better than those which have fewer.

A sequence of numbers which has no repeats and includes every number in the range from 1 to n is called a *permutation*. It can be thought of as a rearrangement of the standard sequence (1, 2, 3, ..., n).

A *metric* is a way of assigning distance. It is a function of two arguments which returns a nonnegative real number. A few other axioms (symmetry, the triangle inequality, and d(x, y) = 0 exactly when x = y) ensure that the function gives a meaningful notion of "distance" between things.

- Mainly interested in (left-)invariant metrics, i.e., metrics d such that d(zx, zy) = d(x, y) for all permutations x, y, z
- Implies that metric is independent of how the dots are numbered
- Might imply that metric is independent of cyclic rearrangement (below). I need to think about that.

- For an invariant metric d, d(x,y) = d(e,x^{-1}y), where e is the identity permutation.
- So just need w(x)= d(e,x); this is called a weight function
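The reduction from a two-argument metric to a one-argument weight function can be sketched in code. This is a minimal illustration (helper names are my own, not from the notes), using the Hamming distance as the example of a left-invariant metric:

```python
# Permutations as tuples: p[i] is the value in position i+1 (1-based values).

def compose(p, q):
    """Return the composition p∘q, i.e. the permutation i -> p[q[i]]."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    """Return p^{-1}."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v - 1] = i + 1
    return tuple(inv)

def hamming(x, y):
    """Hamming distance: number of positions where x and y disagree.
    This metric is left-invariant, since z is a bijection."""
    return sum(a != b for a, b in zip(x, y))

x = (2, 4, 1, 3)
y = (1, 3, 2, 4)
e = (1, 2, 3, 4)
# For an invariant metric, d(x, y) = d(e, x^{-1} y) = w(x^{-1} y).
assert hamming(x, y) == hamming(e, compose(inverse(x), y))
```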

We have a number of different ideas for what metrics make sense to use. In fact, we could use *all* of them, and give some kind of composite score.

We want to score permutations the same if they differ by a cyclic rearrangement. One easy way to achieve this is, given any permutation, "rotate" it until the first term in the sequence is 1. (Start with 1, and go in the order specified by the sequence. When you get to the end, start back at the beginning until you get to the term before 1.) This can be done uniquely for every permutation.

For example, (6,2,3,1,4,5,7) would rotate to (1,4,5,7,6,2,3).
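The rotation step is easy to sketch (function name is my own):

```python
def rotate_to_one(p):
    """Rotate the sequence p so that it starts with 1."""
    i = p.index(1)
    return p[i:] + p[:i]

# Reproduces the example above.
assert rotate_to_one((6, 2, 3, 1, 4, 5, 7)) == (1, 4, 5, 7, 6, 2, 3)
```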

Note: Suppose n = 4. Consider the permutation given by (2,4,1,3). This would be rotated to (1,3,2,4). In both cases, there is exactly 1 correct transition, from 4 back to 1. We should be sure to count those transitions in our ranking.

Another Note: If we want to do some counting over the set of all permutations, we can speed up by only considering those which are already rotated to start with 1, and multiplying by the number of rotations (n).
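That counting trick can be sketched as follows (names are illustrative): generate only the (n-1)! permutations that start with 1, and multiply any counts by n.

```python
from itertools import permutations

def rotation_class_reps(n):
    """Yield one representative per rotation class, namely the rotation
    that starts with 1. Each representative stands for n rotations."""
    for rest in permutations(range(2, n + 1)):
        yield (1,) + rest

n = 4
reps = list(rotation_class_reps(n))
assert len(reps) == 6           # (n-1)! representatives
assert len(reps) * n == 24      # recovers n! = 24 total permutations
```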

Another: Some weight functions are already independent of rotation in this sense. Permutation weight and transition weight, for example.

Another: If a weight function is *not* independent of rotation, it could happen that different rotation has smaller weight. For example, the Hamming weight of (1,7,2,3,4,5,6) is 6, but (7,2,3,4,5,6,1) has Hamming weight 2.
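Those two Hamming weights can be checked quickly (a sketch; `hamming_weight` is my own name):

```python
def hamming_weight(p):
    """Number of positions i where p[i] != i+1, i.e. the Hamming
    distance from the identity permutation."""
    return sum(v != i + 1 for i, v in enumerate(p))

assert hamming_weight((1, 7, 2, 3, 4, 5, 6)) == 6
assert hamming_weight((7, 2, 3, 4, 5, 6, 1)) == 2
```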

Here is my understanding of the metric that Chris described. I call it the "transition metric" because it counts how many correct transitions are in the sequence.

*Given a permutation a = (a[1], a[2], ..., a[n]), count the number of i such that a[i+1] is not equal to a[i] + 1, where addition wraps around mod n (so n + 1 is identified with 1, and a transition from n to 1 counts as correct, as in the note above). For i = n, take i + 1 = 1.*

This is similar in spirit to the Kendall tau below, but not quite the same.
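A sketch of the transition weight (function name is mine), with the wraparound handled mod n so that a transition from n back to 1 counts as correct, per the earlier note:

```python
def transition_weight(p):
    """Count cyclically incorrect transitions: positions i where the next
    entry (cyclically) is not p[i] + 1 mod n."""
    n = len(p)
    return sum((p[(i + 1) % n] - p[i]) % n != 1 for i in range(n))

assert transition_weight((1, 2, 3, 4)) == 0   # identity: all transitions correct
assert transition_weight((2, 4, 1, 3)) == 3   # only 4 -> 1 is correct
```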

There are (at least) 3 interesting problem domains where this kind of question comes up. Each one has a couple of standard metrics, and a whole host of modified versions for variants of the basic problems. Since none of these exactly match our setup, I propose that we just start with these.

- Comparing different rankings (e.g. my top 10 vs. yours). Each ranking is a permutation of some given list, and metrics quantify the difference between them. Standard metrics here are:
  - Spearman footrule, a.k.a. taxicab, a.k.a. ell_1
  - Kendall tau

- Signal processing (filtering noise, detecting typos)
  - Hamming distance
  - Transposition distance, a.k.a. Cayley distance

- Text changes (linguistics, biology (DNA mutations))
  - Ulam distance
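The first two of these are straightforward to sketch (standard definitions; function names are mine):

```python
from itertools import combinations

def spearman_footrule(x, y):
    """ell_1 distance: sum of |x[i] - y[i]| over all positions."""
    return sum(abs(a - b) for a, b in zip(x, y))

def kendall_tau(x, y):
    """Number of pairs (i, j) ordered one way in x and the other way in y."""
    return sum(((x[i] - x[j]) > 0) != ((y[i] - y[j]) > 0)
               for i, j in combinations(range(len(x)), 2))

e = (1, 2, 3, 4)
assert spearman_footrule(e, (2, 1, 4, 3)) == 4
assert kendall_tau(e, (2, 1, 4, 3)) == 2
```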
