Friday, July 25, 2008

Optimal path algorithm?

Given two travelers T1 and T2 and m nodes N, where m is a natural number, each traveler can be at a particular node, and each node has a distance from every other node in the space. However, it is not assumed that each node is reachable from every other node, but that at least one path exists between each node and one other node.

If no distance information is given then the travelers are assumed to be in a m-1 dimensional space with each node the same unit distance from every other node.

There is also a weight associated with the path from each node to another, which is given in general as cost. But cost can be any weight between nodes including distance.

For instance, if it takes $200 U.S. to go from N1 to N2, then that cost is what's used for a traveler at N1 considering going to N2.

Assume T1 is at N1 and T2 is at Nm. For the first iteration T1 considers moving to N2 which is reachable, and then T2 considers moving from each reachable node in turn EXCEPT N1 or N2 as it excludes the node that T1 is already at, and also excludes the one being considered.

For each potential move T2 calculates the cost from that node to Nm, as T2 is moving backwards in time, which I'll call cost2, and then it takes, if the node is reachable, the cost to T1 at N2--not at N1, as it is checking where T1 will be--and it multiplies the cost for that move times the cost of the move T1 is considering, which I'll call cost1, times the cost to T1, and stores that data.

After T2 has gone through every possible move, it simply takes the one that is smallest cost1*cost2*cost, and stores that info. If there are more weights, like time, it'd multiply time1*time2*time*cost1*cost2*cost, and so forth with as many multiplications as there are weights.

Now T1 considers moving to another reachable node, like N3 if there is a path from its current node to that one, and T2 does the same calculation again, and stores the smallest cost1*cost2*cost.

This process continues until all possible moves by T1 are done, and now T1 has a set of cost1*cost2*cost values from the calculations by T2, and selects the smallest one. If there is more than one path with the least value then it doesn't matter which is taken, so the first can be taken.

Now both T1 and T2 actually move to their respective nodes given by that smallest value.

Now the paths from nodes N1 and Nm to the chosen nodes in the direction moved is removed from consideration. However, the reverse path for each is still available.

And you have a complete iteration.

Note that you have a maximum (m-2)2 checks, if the travelers start at different nodes and every node is reachable from every other node, but you still have nearly the same number after the first iteration as now only paths are being removed and not nodes, but the algorithm is still polynomial time.

T1 and T2 continue until there are no more paths available or there is only one node available.

Note that in the latter case then EITHER can move to complete, so you can arbitrarily say that T1 moves forward in that case.

And that is how every node is reached, even if some nodes are not reachable from certain other nodes.

To have a complete circle back to a starting point, you have both T1 and T2 start at the same node, like N1.

The cost for a round trip path with a starting point from each node in turn is stored with the optimum path being the one with the minimum total cost.

Notice with this algorithm you may go to a node multiple times, and a well-visited node is called a hub.

If the optimum path from each node is the same then the graph is said to be perfectly correlated, and corresponds with a case where the costs correlate very well with the distance between nodes.

If only one path is optimal then the graph is said to be dis-correlated, and that corresponds to a case where costs have little or no relation to distances between nodes.

The percentage correlation can be found then by dividing the number of starting nodes that give an optimal path by the total number of nodes, where a graph for which there is a high percentage correlation is said to be a rational graph, where it seems to me that a 67% correlation or higher would qualify.

That could have real world relevance for determining, say, whether or not prices along paths correlate well between measures along the path, or it might reveal problem spots, where, for instance, there is a high cost associated with an otherwise desirable path, which, for instance, might help a city plan places for new roads.

James Harris


Penny Hassett said...

Can you explain/prove why this algorithm finds the optimal path please?
I confess that I don't understand why the straight line distance in and then it calculates the straight line distance isn't totally irrelevant because two nodes could be on opposite sides of an impassable canyon and the travelling salesperson might have to travel hundreds of kilometers along the side before a bridge across was found.

James said...

Ok, forget about my algorithm and imagine two travelers where one is the traveler going backwards in time from the starting point towards himself going forwards in time from the starting point to meet somewhere out there. Is there not a distance between the two of them at each decision point? Yes, there is.

Now imagine that distance for various paths from optimal ones to less optimal ones and consider how IT behaves.

Or just graph it with known solutions to the TSP and see what happens.

I haven't done that but I figure that in phase space the grouping of distances associated with each decision point will be minimum for the optimal path, as consider, if at each point the travelers are making decisions that choose a greater distance from each other then are they not traveling farther afield?

One way of looking at what I did is that using two travelers I collapsed all the conditions into one key variable--cost amplified distance.

So in that sense it's just a trick to shrink the variable space by cleverly introducing another variable--the second traveler going backwards in time.

Think phase space and then hopefully it will make sense.