CS201: Data Structures and Algorithms

Assignment 3

Version 1b

Printable Version

## Introduction

Your task is to implement Kruskal's algorithm for undirected graphs, using a disjoint-set data structure.

## Input

Your program will process a file containing a description of a graph. A graph description contains an arbitrary number of edge descriptions. An edge description consists of two vertices (optionally followed by a weight) followed by a semicolon. A vertex is simply a non-negative integer. If a weight is omitted, a weight of 1 should be assumed. A weight is a non-negative integer.

The file should be free format; whitespace may appear anywhere. Here a sample graph description:
1 5 ;
2
10
23 ;

214 33 1
;
which is equivalent to:
1 5 ;
2 10 23 ;
214 33 1 ;
In this example there are six vertices, named 1, 2, 5, 10, 33, and 214, and three edges, 1 to 5, 2 to 10 and 214 to 33, with weights 1, 23, and 1, respectively.

The name of your executable must be kruskal and the name of the file describing the graph will be passed to your program as a command line argument, as in:
\$ cat g1
1 2 1 ;
2 3 2 ;
3 1 3 ;
\$ kruskal g1
[output appears here]
Your program should interpret the graph description as an undirected graph and should report the minimum spanning tree with the smallest vertex mentioned in the graph description as the source vertex. Any unconnected vertices are ignored.

## Output

The output of your program should be a spanning tree, displayed as a breadth-first traversal followed by the tree's total weight. Here is an example display:
\$ cat g2
10 0 9 ;
6 7 11 ;
5 9 1 ;
4 8 7 ; 10 8 6 ;
11 12 2 ;
1 6 5 ;
0 6 10 ; 0 5 3 ; 0 4 8 ; 0 3 2 ; 0 1 4 ;
\$ kruskal g2
0: 0
1: 3(0)2 5(0)3 1(0)4 4(0)8
2: 9(5)1 6(1)5 8(4)7
3: 10(8)6 7(6)11
weight: 47
\$
Each level of the traversal starts with the level number (level 0 is the first level), followed by a colon, a space, and a space separated list of vertex descriptions. A vertex description is the vertex followed by its breadth-first-search predecessor/parent (in parentheses) followed by the weight of the edge from the predecessor to the vertex in question. Vertex descriptions in a level are to be ordered by increasing edge weight (the last number in the vertex description), with ties being broken in favor of vertex numbers (the first number in the vertex descriptions). You must follow the format exactly as diff will be used to assess your output.

## Options

Your executable must handle a -v option, as in previous assignments.

## Program organization

You must implement the your disjoint set as a separate module named set.c and set.h (http://beastie.cs.ua.edu/cs201/assign-disjoint.html). Likewise, you will need a vertex class and an edge class. Those implementations are left up to you.

## Constraints

In order for your program to run on a randomly created graph, if an edge is given more than once, replace previous occurrences with subsequent occurrences. Since the input graphs are undirected, a $u,v$ edge is the same as a $v,u$ edge.

You must be able to identify a duplicate edge in $\theta \left(\mathrm{log}E\right)$ time.

If more than one edge is eligible to be added to the minimum spanning tree at any given point, chose the edge with the smaller vertex number(s).

You cannot use any fixed-sized data structures for storing information about the graph.

All operations must run as efficiently as commonly expected.

When no spanning tree exists, the executable should display the word EMPTY followed by a newline.

## Other details

You must implement your program in C. Only the most foolish student would not recompile and thoroughly test the implementation on a Linux system.

You must provide a makefile which responds properly to the commands make, make test, make valgrind, and make clean. The make command must compile the kruskal executable no errors or warnings and it must compile with the highest level of error checking (the -Wall and -Wextra options plus -std=c99). The make test command should run your program through some test files of your choosing. The make valgrind command should run your tests through valgrind. The make clean command should remove all intermediate files, including the executable kruskal.

## Documentation

All code you hand in should be attributed to its author. Comment sparingly but well. Do explain the purpose of your program. Do not explain obvious code. If code is not obvious, consider rewriting the code rather than explaining what is going on through comments.

You must pass all tests for your program to be graded.

## Handing in the application

For preliminary testing, run a make clean and then send me all the files in your directory by running the command:
submit cs201 lusth test3
For your final submission, run a make clean and use the command:
submit cs201 lusth assign3
Again, your implementation may be developed on other hardware and operating systems, but it must also compile and run cleanly and correctly on a Linux system.

## Change log

1b clarified sorting order of levels in spanning tree
Mon Nov 12 16:27:56 CST 2018
1a modified the criteria for printing empty (e.g. when no tree exists)
Thu Nov 1 15:45:17 CDT 2018