Quantcast
Channel: T. C. Mits 108 » java
Viewing all articles
Browse latest Browse all 36

List sorting using topological ordering of a digraph

$
0
0

By creating a pseudo weighted directed graph on a list of elements, it is possible to apply graph algorithms to the sorting problem. This offers an alternative to approaches such as QuickSort or Merge sort. As an experiment a Java program implementation is shown.

Terms: topological sorting, directed acyclic graph, Java, JUnit, algorithm

This subject is continued in Linear list sorting using weighted digraph possible?

Pseudo weighted DAG
Given a set of numbers, for example, [9, 2, 12, 40, 12, 3, 1], we can create a directed acyclic graph (DAG) where each number is a vertex and an outgoing edge points to all other elements greater then it. “9″ will have an edge to 12 and 40. “12″ will have an edge to 40 and also to the other “12″ element. This latter edge will be a special ‘equal’ edge and indicates duplicates. If this were an ordinary edge, the graph would have cycles.

Transforming this list into a DAG yields:

list converted to digraph

list transform

Note that once this transformation is complete we essentially know the order of vertices, we just don’t have a linear arrangement yet. We don’t have to revisit any element and compare it with another as in some other sorting algorithms. Of course, this initial step is not efficient, O(n^2).

Topological sorting
A topological sorting (toposort) of a directed graph is a linear ordering of its vertices. An advantage of a toposort is that it runs in linear time. To this time must be added the graph construction phase, and in this implementation, a final step is inserting duplicate elements into the result.

A toposort of a complex graph can have many solutions. As applied here, there is only one solution. But, this remains to be proven or is already known to be true in Graph Theory. See uniqueness.

Implementation

Listing one below contains the Java implementation that sorts the list of elements given above. The program contains an embedded JUnit test. In the test the method loadDag(map, data) creates the DAG. While creating the edges, if a source and sink have values that are equal the ‘equal’ edge is created. The sink vertex is then marked as a duplicate.

The modified toposort in the program will skip vertices that are marked as duplicate. The resulting list is sorted, but is missing the duplicate vertices. In the final step each node in the result set is visited and if it has any outgoing equal edges these destination nodes are inserted into the final result list.

The resultant outputs of the program is:

Sort with dupes: [one:1, two:2, three:3, nine:9, twelve:12, twelve2:12, forty:40]
Sort with unique: [one:1, two:2, three:3, nine:9, twelve:12, forty:40]

Summary
Presented was a possible approach to list sorting by use of graph algorithms. A Java programming language based example implementation was given. Though not part of the topic, the programming example used a nested JUnit test class, an interesting approach.

Listing 1, source available here.

package com.octodecillion.sort;

import static org.hamcrest.core.Is.is;
import static org.hamcrest.core.IsEqual.equalTo;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertThat;
import static org.junit.Assert.assertTrue;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;

/**
 * 
 * Sorting using directed graph embedding based topological ordering.
 * 
 * @author Josef Betancourt
 * @since 20120101T1212
 * 
 * 
 *        Example code based on code created from M. Jessup's *
 *        {@link "http://stackoverflow.com/questions/2739392/sample-directed-graph-and-topological-sort-code"}
 *        Which itself is an implementation of one presented in Wikipedia
 *        article: {@link "http://en.wikipedia.org/wiki/Topological_sort"}
 * 
 * 
 */
public class GraphSort {

	/**
	 * Modification of algorthim presented in Wikipedia article.
	 * 
	 * {@link "http://en.wikipedia.org/wiki/Topological_sort"}
	 * 
	 * <pre>
	 * 		L ? Empty list that will contain the sorted elements
	 * 		S ? Set of all nodes with no incoming edges
	 * 		while S is non-empty do
	 * 		    remove a node n from S
	 * 		    insert n into L
	 * 		    for each node m with an edge e from n to m do
	 * 		        remove edge e from the graph
	 * 		        if m has no other incoming edges then
	 * 		            insert m into S
	 * 		if graph has edges then
	 * 		    return error (graph has at least one cycle)
	 * 		else 
	 * 		    return L (a topologically sorted order)
	 * </pre>
	 * 
	 * See also: Algorithm presented in "Algorithms", 4th Edition by R.
	 * Sedgewick and K. Wayne.
	 * {@link "http://algs4.cs.princeton.edu/42directed/Topological.java.html"}
	 * *
	 * 
	 * @param <T>
	 * @param allNodes
	 * @return sorted list or empty list if not possible
	 * @throws IllegalArgumentException
	 *             if graph has cycles
	 */
	public static <T> ArrayList<Node<T>> topoSort(final Node<T>[] allNodes) {

		ArrayList<Node<T>> workList = new ArrayList<Node<T>>();

		HashSet<Node<T>> sinks = new HashSet<Node<T>>();
		for (Node<T> n : allNodes) {
			if (n.inEdges.size() == 0) { // indegree zero? Or source?
				sinks.add(n);
			}
		}

		while (!sinks.isEmpty()) {
			Node<T> nextNode = sinks.iterator().next();
			sinks.remove(nextNode);

			if (nextNode.duplicate) {
				continue;
			}

			workList.add(nextNode);

			for (Iterator<Edge<T>> it = nextNode.outEdges.iterator(); it
					.hasNext();) {

				Edge<T> outNode = it.next();
				Node<T> dest = outNode.to;
				it.remove();
				dest.inEdges.remove(outNode);

				if (dest.inEdges.isEmpty()) {
					sinks.add(dest);
				}
			}
		}

		// All edges are removed? If not, we have a cycle.
		for (Node<T> n : allNodes) {
			if (!n.inEdges.isEmpty()) {
				throw new IllegalArgumentException("Cycle present. Node, "
						+ n.name
						+ " has inEdges. Topological sort not possible");
			}
		}

		ArrayList<Node<T>> resultList = new ArrayList<Node<T>>();
		for (Node<T> node : workList) {
			resultList.add(node);

			if (!node.equiEdges.isEmpty()) {
				for (Iterator<Edge<T>> iterator = node.equiEdges.iterator(); iterator
						.hasNext();) {
					Edge<T> edge = iterator.next();
					resultList.add(edge.to);
				}
			}

		}

		return resultList;
	} // topoSort

	/**
	 * A node.
	 * 
	 * @param <T>
	 */
	static class Node<T> {
		public final String name;
		public final T value;
		public final HashSet<Edge<T>> inEdges;
		public final HashSet<Edge<T>> outEdges;
		public final HashSet<Edge<T>> equiEdges;
		public boolean duplicate;

		public Node(final String name, final T value) {
			this.name = name;
			this.value = value;
			this.duplicate = false;
			inEdges = new HashSet<Edge<T>>();
			outEdges = new HashSet<Edge<T>>();
			equiEdges = new HashSet<Edge<T>>();
		}

		public Node<T> addEdge(final Node<T> node) {
			Edge<T> e = new Edge<T>(this, node);
			outEdges.add(e);
			node.inEdges.add(e);
			return this;
		}

		@SuppressWarnings("unchecked")
		public Node<T> addEdges(final Node<T>... nodes) {
			for (int i = 0; i < nodes.length; i++) {
				Node<T> node = nodes[i];
				Edge<T> e = new Edge<T>(this, node);
				outEdges.add(e);
				node.inEdges.add(e);
			}

			return this;
		}

		public Node<T> addEquiEdge(final Node<T> node) {
			Edge<T> e = new Edge<T>(this, node);
			equiEdges.add(e);
			// node.equiEdges.add(e);
			node.duplicate = true;
			return this;
		}

		@Override
		public String toString() {
			return String.format("%s:%s", name, value);
		}
	} // Node<T>

	/**
	 * A directed edge.
	 */
	static class Edge<T> {
		public final Node<T> from;
		public final Node<T> to;

		public Edge(final Node<T> from, final Node<T> to) {
			this.from = from;
			this.to = to;
		}

		@Override
		public boolean equals(final Object obj) {
			@SuppressWarnings("unchecked")
			Edge<T> e = (Edge<T>) obj;
			return (e.from == from) && (e.to == to);
		}

		@Override
		public int hashCode() {
			return super.hashCode();
		}

		@Override
		public String toString() {
			// TODO Auto-generated method stub
			return String.format("%s->%s", from, to);
		}
	} // Edge<T>

	// ====================================================================
	// Nested JUnit test.
	// ====================================================================

	/**
	 * 
	 * Tests {@link GraphSort}.
	 * 
	 * Uses nested JUnit testing approach advocated by Ben J. Christensen in <a
	 * href=
	 * "http://benjchristensen.com/2011/10/23/junit-tests-as-inner-classes/"
	 * >"JUnit Tests as Inner Classes"</a>
	 * 
	 * @author jbetancourt
	 * 
	 */
	@RunWith(JUnit4.class)
	public static class GraphSortTest {
		@SuppressWarnings("unchecked")
		private Node<Integer>[] allNodes = new Node[0];

		/**
		 * Sort list with no duplicate nodes.
		 */
		@SuppressWarnings("boxing")
		@Test
		public void should_sort_without_duplicates() {
			uniqueList();
			ArrayList<Node<Integer>> result = GraphSort.topoSort(allNodes);

			assertFalse("empty result", result.isEmpty());
			assertThat(Integer.valueOf(result.size()), is(equalTo(6)));
			assertTrue("result not sorted", inOrder(result));

			System.out.println("Sort with unique: "
					+ Arrays.toString(result.toArray()));
		}

		/**
		 * Sort list with no duplicate nodes.
		 */
		@SuppressWarnings("boxing")
		@Test
		public void should_sort_with_duplicates() {
			dupeList();
			ArrayList<Node<Integer>> result = GraphSort.topoSort(allNodes);

			assertFalse("empty result", result.isEmpty());

			System.out.println("Sort with dupes: "
					+ Arrays.toString(result.toArray()));

			assertThat(Integer.valueOf(result.size()), is(equalTo(7)));
			assertTrue("result not sorted", inOrder(result));

		}

		@SuppressWarnings("unchecked")
		private void uniqueList() {
			String nodeData = "nine,9;two,2;forty,40;twelve,12;three,3;one,1;";
			String dagData = "one,three;one,twelve;one,forty;one,two;one,nine;two,twelve;two,forty;two,three;two,nine;three,nine;three,forty;three,twelve;nine,twelve;nine,forty;twelve,forty";

			Map<String, Node<Integer>> nodes = loadNodes(nodeData);
			loadDag(nodes, dagData);

			List<Node<Integer>> list = new ArrayList<Node<Integer>>();

			for (Entry<String, Node<Integer>> entry : nodes.entrySet()) {
				list.add(entry.getValue());
			}

			allNodes = list.toArray(new Node[list.size()]);

		}

		@SuppressWarnings("unchecked")
		private void dupeList() {
			String nodeData = "nine,9;two,2;forty,40;twelve,12;three,3;one,1;twelve2,12";
			String dagData = "one,three;one,twelve;twelve,twelve2,one,forty;one,two;one,nine;two,twelve;two,forty;two,three;two,nine;three,nine;three,forty;three,twelve;nine,twelve;nine,forty;twelve,forty;twelve2,forty";

			Map<String, Node<Integer>> nodes = loadNodes(nodeData);
			loadDag(nodes, dagData);

			List<Node<Integer>> list = new ArrayList<Node<Integer>>();

			for (Entry<String, Node<Integer>> entry : nodes.entrySet()) {
				list.add(entry.getValue());
			}

			allNodes = list.toArray(new Node[list.size()]);

		}

		/**
		 * 
		 * @param nodes
		 * @param dagData
		 */
		private void loadDag(final Map<?, ?> nodes, final String dagData) {
			String[] dagStrings = dagData.split(";");

			for (int i = 0; i < dagStrings.length; i++) {
				String[] spec = dagStrings[i].split(",");

				@SuppressWarnings("unchecked")
				Node<Integer> source = (Node<Integer>) nodes.get(spec[0]);
				@SuppressWarnings("unchecked")
				Node<Integer> sink = (Node<Integer>) nodes.get(spec[1]);

				if (!sink.duplicate || (source.value != sink.value)) {
					source.addEdge(sink);
				} else {
					source.addEquiEdge(sink);
				}

			}

		}

		/**
		 * 
		 * @param nodeData
		 * @return
		 */
		private Map<String, Node<Integer>> loadNodes(final String nodeData) {
			Map<String, Node<Integer>> nodeMap = new HashMap<String, Node<Integer>>();

			String[] nodeSpec = nodeData.split(";");

			for (int i = 0; i < nodeSpec.length; i++) {
				String[] string = nodeSpec[i].split(",");
				String name = string[0];
				String value = string[1];

				@SuppressWarnings("boxing")
				Node<Integer> node = new Node<Integer>(name,
						Integer.parseInt(value.trim()));

				nodeMap.put(name, node);
			}

			return nodeMap;

		}

		/**
		 * Check that list is sorted.
		 * 
		 * @param nodes
		 * @return
		 */
		@SuppressWarnings("boxing")
		private boolean inOrder(final ArrayList<Node<Integer>> nodes) {
			boolean result = true;
			Iterator<Node<Integer>> iterator = nodes.iterator();
			Node<Integer> prev = iterator.next();

			while (iterator.hasNext()) {
				Node<Integer> current = iterator.next();

				// in order?
				if (!(current.value >= prev.value)) {
					result = false;
					break;
				}
			}

			return result;
		}
	}

} // End class GraphSort.java

The graph was created with GraphViz using the input:

## File: listDAG.gv
## Command to generate: dot -Tpng -o listDAG.png listDAG.gv 
digraph {
	
	rankdir=LR
	
	
	A [label="9"];      
	B [label="2"];     
	C [label="12"];      
	D [label="40"];      
	E [label="12"];      
	F [label="3"];     
	G [label="1"];      
	G->B
	G->F
	G->A
	G->C
	G->D
	G->E
	B->F
	B->A
	B->C
	B->E
	B->D
	F->A
	F->C
	F->E
	F->D
	A->C
	A->D
	A->E
	C->D
	C->E [label="equi", fontcolor=darkgreen]
	
	label="\nlist: [9, 2, 12, 40, 12, 3, 1] transformed to DAG by Josef Betancourt";
}

Genesis of this idea
One day I was wondering if randomness could be used to sort. The idea I had was, as it turns out, a “better” Bozosort. Instead of swapping indiscriminately, I would swap elements if they were out of order. I had other ideas. But, then I started drawing some diagrams, and it led to other ideas based on graph construction.

I gave up the use of randomization. Is the technique presented using graphs unique? Does it work? Can it scale? Or Does it add to the list of bozo sorts?

Funny xkcd comic: “Ineffective Sorts

Updates
20130104:

  • Recreated the image using LR ranking in GraphViz.
  • Fixed algorithm which broke when I added missing arc from second 12 to 40. TG for Unit tests.

Further reading

  1. Weighted graphs and networks
  2. M. Jessup’s sample code on StackOverflow
  3. Topological.java
  4. topological_sort
  5. 4.2 Directed Graphs
  6. Directed graph
  7. “JUnit Tests as Inner Classes”
  8. Bogosort
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

Viewing all articles
Browse latest Browse all 36

Trending Articles