Classifying by computer (1966)
By Steve Gartner, July 30th, 1966
[Image changes to show a four wheel drive vehicle driving through a field and text appears: Classifying by Computer. Produced by CSIRO Film Unit in collaboration with Computer Research Section 1966]
[Image changes to show a group of people exiting the four wheel drive vehicle]
[Image changes to show a group of people carrying various items and walking into the field]
Narrator: Scientists have always wanted to classify things, to divide them up into meaningful groups. We all do this every day, when we speak about human races, or give names to the plants we grow in our gardens. But to do this efficiently needs years of experience, years of working with the things we want to classify.
[Image changes to show a man from the group hammering a pole into the ground, and the remainder of the group walking further away from him]
And it’s now just 40 years since scientists first began to try to use mathematical methods to help them.
[Image changes to show a woman sitting on the ground observing plants]
The first workers in this field were interested in land survey problems. The things they classified were sample plots of land, and the things they knew about them were the names of the plants growing in their plots.
[Image changes to show a group of people discussing the plants on the ground]
Later, much later, botanists began classifying the plants themselves by mathematical methods.
[Image changes to show plants in pots on a table]
They might get their data from living plants like these or from pressed specimens in a herbarium.
[Image changes to show a person observing pressed plant specimens through a microscope]
[Image changes to show an aerial view of a suburb]
Sociologists have wanted to classify the suburbs of Melbourne.
[Image changes to show newspaper headlines]
Criminologists trying to understand the causes of crime have wanted to classify delinquents.
[Image changes to show a report titled Institute for the Study of Crime and Delinquency]
[Image changes to show a man working on a table]
Whatever we’re classifying we always finish up with a table, a table with a list of things to be classified, the individuals down one side, in this case they’re chemical firms, and a list of the things we know about them, the attributes, down the other.
[Image changes to show a list of names, and then turns the paperwork to show the attributes]
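The table described above can be sketched in modern terms as a small matrix of individuals against attributes. This is only an illustration: the firm names, the attribute names, and the 1/0 coding are all invented here, and the film does not show the actual values.

```python
# Hypothetical data: the firms and attributes below are invented for illustration.
individuals = ["Firm A", "Firm B", "Firm C"]
attributes = ["makes_fertiliser", "makes_plastics", "exports"]

# One row per individual, one column per attribute (1 = present, 0 = absent).
table = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]


def attributes_of(name):
    """Return the attributes recorded as present for one individual."""
    row = table[individuals.index(name)]
    return [attr for attr, value in zip(attributes, row) if value]


print(attributes_of("Firm B"))
```

Every later step works from a table of this shape, whatever is being classified.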
[Image changes to show a building]
This is the form in which the data is usually brought into the computer building.
[Image changes to show a man working at a desk]
But the data must now be transcribed into a form in which the computer can interpret it. There’s no universal standard form for this. Sometimes it’s simple, as it is if it’s just a list of the plants growing in each plot.
[Image changes to show various tables of data]
Or more complicated as it has to be if you have information about soils at different levels.
[Image changes to show a woman entering data]
Now this table has to be transferred to punched cards.
[Image changes to show punched data cards]
The cards must be checked.
[Image changes to show a woman picking up the data cards and handing them to another woman]
This is done not by laboriously proofreading them, but by a different worker punching them again on a machine called a verifier.
[Image changes to show a woman loading the data cards into a machine and then entering data]
This will detect any discrepancy between the two punchings.
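The idea behind verification, keying the data twice and comparing the results, can be sketched in a few lines. This is a loose software analogy, not a description of how the verifier machine worked internally; the function name and the sample card contents are assumptions made for the example.

```python
from itertools import zip_longest


def find_discrepancies(first_punching, second_punching):
    """Compare two independently keyed decks.

    Each deck is a list of strings, one string per card.
    Returns (card number, column number) for every position that differs.
    """
    problems = []
    decks = zip_longest(first_punching, second_punching, fillvalue="")
    for card_no, (card_a, card_b) in enumerate(decks, start=1):
        columns = zip_longest(card_a, card_b, fillvalue=" ")
        for column_no, (char_a, char_b) in enumerate(columns, start=1):
            if char_a != char_b:
                problems.append((card_no, column_no))
    return problems


# Hypothetical card contents: a plot label followed by presence/absence digits.
first = ["PLOT01 1011", "PLOT02 0110"]
second = ["PLOT01 1011", "PLOT02 0100"]
print(find_discrepancies(first, second))
```

Two workers are unlikely to make the same slip in the same column, which is why a second punching catches errors that proofreading can miss.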
[Image changes to show a man being handed the data cards]
No existing computer has a program for classification automatically built in, so the computer must be provided with a set of cards, the program, which will tell it what mathematical operations are to be carried out.
[Image changes to show the man sorting the data cards]
The data cards are added. Finally, a job card is added for record purposes, and the deck of cards is ready to be taken to the computer itself.
[Image changes to show people working in a computer room]
There will already be other jobs being processed on the computer. The operators can, if they need, find out what the computer is doing at any moment, by watching the control console.
[Image changes to show the computer control console]
[Image changes to show a person unwrapping the data cards and loading them into the card reader]
Eventually it’s our program’s turn. This card reader will read in up to 1,200 cards a minute.
[Image changes to show a person pressing buttons on the card reader, and the data cards are being fed into the machine]
[Image changes to show the core store of the computer]
The information they contain is passed to the core store, the heart of the computer. Some of the information may need to be stored on magnetic tape, for use later in the calculations.
[Image changes to show tape reels on the computer]
[Image changes to show a man working at a computer]
The arithmetic unit now begins to calculate a measure of similarity or likeness between every pair of individuals, and it will then sort them into groups.
[Image changes to show various images of the computer]
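The narration does not say which measure of similarity was used. As a minimal sketch, assuming presence/absence data and the Jaccard coefficient (shared attributes divided by attributes present in either individual), the calculation for every pair might look like this. The plots and species lists are invented for illustration.

```python
from itertools import combinations

# Hypothetical plots and species lists, invented for illustration.
plots = {
    "plot1": {"grass", "wattle", "gum"},
    "plot2": {"grass", "wattle"},
    "plot3": {"fern", "moss"},
}


def jaccard(a, b):
    """Shared attributes divided by attributes present in either individual."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# One similarity value for every pair of individuals.
similarity = {
    (p, q): jaccard(plots[p], plots[q])
    for p, q in combinations(sorted(plots), 2)
}

for pair, value in similarity.items():
    print(pair, round(value, 3))
```

With n individuals there are n(n-1)/2 pairs, so the amount of arithmetic grows quickly as the table gets longer.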
[Image changes to show a line printer]
When the calculations are finished the answer is passed to a line printer, which operates at a thousand lines a minute. This printed output is taken away for examination.
[Image changes to show a man collecting the report]
[Image changes to show the report being viewed]
It normally gives first the program. Then it records where the various parts of the program were located in the store. Then it prints out the data, so that the user can check once more that no mistakes have been made in the data fed in. Lastly the results in a printed form. These brief results really describe a hierarchy, a family tree which shows the most alike individuals, fusing at the bottom into groups, and then these groups fusing together in their turn, and so on.
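The family tree described here can be built by repeatedly fusing the two closest groups. The sketch below assumes a group-average rule for the distance between groups, which is one common choice and not necessarily the one used in the film; the individuals A to D and their dissimilarities are invented.

```python
def build_hierarchy(names, dist):
    """Fuse the closest pair of groups repeatedly.

    Returns the fusions in order as (members_a, members_b, level).
    """
    clusters = [(name,) for name in names]
    fusions = []

    def between(a, b):
        # Group-average (an assumed choice): mean dissimilarity over all cross-pairs.
        pairs = [dist[frozenset((x, y))] for x in a for y in b]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        candidates = [
            (between(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        ]
        level, i, j = min(candidates)
        a, b = clusters[i], clusters[j]
        fusions.append((a, b, level))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [a + b]
    return fusions


# Hypothetical dissimilarities between four individuals.
raw = {
    ("A", "B"): 0.1, ("C", "D"): 0.2,
    ("A", "C"): 0.8, ("A", "D"): 0.9,
    ("B", "C"): 0.7, ("B", "D"): 0.9,
}
dist = {frozenset(pair): value for pair, value in raw.items()}

for a, b, level in build_hierarchy(["A", "B", "C", "D"], dist):
    print(a, "+", b, "at level", round(level, 3))
```

Read from first fusion to last, the list is the hierarchy: the most alike individuals join at the bottom, and the groups they form join at higher levels.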
[Image changes to show a graph plotter]
We needn’t draw this out by hand, there’s a graph plotter which will do it for us.
[Image changes to show a computer monitor]
Or we can ask for the hierarchy to be built up on a visual display.
[Image changes to show a computer image of a graph]
Here is our hierarchy. The computation is finished, and we have our answer.
[Image changes to show the data cards being fed into the machine]
But it isn’t the only answer we could get. We could read in the same data cards and by changing only one mathematical quantity in the machine get a whole set of different answers.
[Image changes to show a computer image of a graph]
Let us look at another hierarchy from a set of field data. The plots obviously fall into two sharp groups. We’ll ask the computer to repeat the calculation several times, changing only one simple instruction.
[Image changes to show computer images of various graphs]
Note as we go through these different answers how the scale changes, and how first the grouping becomes less intense.
[Image changes to show computer images of various graphs]
Grouping becomes weaker and weaker, and finally the plots appear not to be grouped at all.
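The narration does not name the quantity being changed. One family of methods that behaves this way is the flexible Lance-Williams update, in which a single parameter, called beta here, controls how sharply groups stand apart. The sketch below assumes that update, with both alpha coefficients set to (1 - beta) / 2, and reuses invented dissimilarities; it should be read as an illustration of the effect, not as the program shown in the film.

```python
def flexible_fusion_levels(names, dist, beta):
    """Agglomerate with the flexible update and return the fusion levels.

    Assumed update rule for a new group i+j and any other group k:
        d(k, i+j) = alpha * d(k, i) + alpha * d(k, j) + beta * d(i, j)
    with alpha = (1 - beta) / 2.
    """
    alpha = (1 - beta) / 2
    d = {frozenset(pair): value for pair, value in dist.items()}
    active = list(names)
    levels = []
    while len(active) > 1:
        # Find the closest pair among the groups still active.
        level, i, j = min(
            (d[frozenset((a, b))], a, b)
            for idx, a in enumerate(active)
            for b in active[idx + 1:]
        )
        merged = i + "+" + j
        for k in active:
            if k not in (i, j):
                d[frozenset((k, merged))] = (
                    alpha * d[frozenset((k, i))]
                    + alpha * d[frozenset((k, j))]
                    + beta * level
                )
        active = [k for k in active if k not in (i, j)] + [merged]
        levels.append(level)
    return levels


# Hypothetical dissimilarities between four individuals.
raw = {
    ("A", "B"): 0.1, ("C", "D"): 0.2,
    ("A", "C"): 0.8, ("A", "D"): 0.9,
    ("B", "C"): 0.7, ("B", "D"): 0.9,
}

for beta in (-0.25, 0.0, 0.5):
    levels = flexible_fusion_levels(["A", "B", "C", "D"], raw, beta)
    print(beta, [round(x, 4) for x in levels])
```

On this data the two lowest fusions stay at the same levels, while the level at which the two groups finally join drops as beta rises, so the same plots look less and less sharply grouped.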
[Image changes to show computer images of various graphs]
Now which of all these possible answers are we to choose? We normally provide a classification like this one, with a modest degree of grouping and the scale put on the side.
[Image changes to show a computer image of a graph]
But since it’s only one of an infinite set of possible answers the classification must be tested for usefulness or meaning in the context from which it came.
[Image changes to show a group of people working in a field]
Wherever it started there it must end, and if the process began by the collection of data in the field, then the worker must return to the field to find out what its application means. A computer classification must never in any circumstances finish at the computer.
[Text appears: Direction – Peter Bruce; Photography – Nick Alexander; Production – Stan Evans]