Hardware: ICL (possibly an ICL 2900) until c 1993, when the mainframe was decommissioned; thereafter networked PCs. ASCII files exported from the SIR database were stored on magnetic tapes until c 1992, when they were migrated to optical disks; these were inherited by Analytical Services Directorate in 1999.
Operating System: GEORGE III until 1988, when it was replaced by VME. After the decommissioning of the mainframe, SIR ran on UNIX. Also MS-DOS and Microsoft Windows (various versions).
Application Software: The production and subsequent use of Schools' Census datasets at the DfEE and its predecessors involved two separate operations, for which different packages were used. These operations were:
- The initial inputting and validation of the data.
- The subsequent analysis and tabulation of the data to produce published statistics.
The SIR relational database management system manufactured by SIR Pty Ltd is thought to have been used for the first purpose from the 1970s. The resulting datasets were then imported into the analysis packages used by Analytical Services' statisticians. During the 1970s and 1980s they used two applications, thought to have been developed in-house, about which little information is available: 'Stages' (used in the 1970s) is thought to have been a batch processing system, while its successor 'Fasolt' (standing for 'facility for on-line tabulation') was used in the 1980s. It is believed that by the early 1990s these utilities had been replaced by QuickTab. By 2000 the DfEE were analysing Schools' Census data using QuickTab's successor, QStat. The Department for Education and Skills intended to replace QStat with SPSS for data analysis purposes once PLASC was introduced to all LEA-maintained primary, middle, secondary and special schools. By October 2001 the Data Collection Unit was also in transition from using SIR to using Microsoft SQL Server for building 'collecting' databases for the surveys with which it dealt.
Logical structure and schema: The datasets normally consist of a single table per dataset. The exceptions to this are the 1976, 1992, 1993 and 1994 datasets for primary, middle and secondary schools. For 1976, tables were created by NDAD to facilitate access to the data. The 1992 and 1993 datasets have three tables each; the 1994 dataset has four tables. The datasets for 1995-1997 have six tables each, while those for 1998-2001 have seven.
How data was originally captured and validated: Schools' Census data was gathered through forms which were completed by schools in January each year. Most questions asked schools to supply data reflecting the situation on an enumeration date, traditionally the third Thursday in January. In later years schools were increasingly asked to supply data for periods other than the enumeration date. The forms for LEA-maintained schools were distributed and collected by LEAs, which would then send the forms to the DfEE and its predecessors, sometimes after checking the forms and extracting data. Independent schools, grant maintained schools, and direct grant schools received their forms from the DfEE and its predecessors, to which they returned their completed forms.
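As an illustration of the 'third Thursday in January' rule, the date can be worked out as in the following minimal sketch. The function name is hypothetical, and because the rule is described only as traditional, the result should not be taken as the confirmed enumeration date for any particular census year.

```python
from datetime import date, timedelta


def third_thursday_of_january(year: int) -> date:
    """Return the third Thursday in January of the given year.

    Illustrative only: the enumeration date was 'traditionally' the
    third Thursday in January, so individual years may have differed.
    """
    first_of_january = date(year, 1, 1)
    # date.weekday() counts Monday as 0, so Thursday is 3.
    days_to_first_thursday = (3 - first_of_january.weekday()) % 7
    # The third Thursday falls two weeks after the first one.
    return first_of_january + timedelta(days=days_to_first_thursday + 14)


if __name__ == "__main__":
    for census_year in (1998, 2000, 2001):
        print(census_year, third_thursday_of_january(census_year).isoformat())
```

For 2001, for example, the function returns 18 January.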
The following Schools' Census forms were used:
- Form 7: Different versions were used for LEA-maintained primary, middle and secondary schools, and later city technology colleges.
- Form 7M: For special schools and hospital schools.
- Form 7R: For pupil referral units, a type of LEA-maintained school established by the Education Act 1993 to provide temporary education for certain types of pupils who were not able to attend a mainstream school.
- Form 9/26: Apparently used during the 1970s for direct grant technical schools.
- Form 11: For LEA-maintained nursery schools and direct grant nursery schools.
- Form 16: For direct grant grammar schools up to 1980.
- Form 30: Up to and including the 1978 census for independent schools 'recognised as efficient', i.e. independent schools which had 'sought and obtained recognition as efficient after inspection by Her Majesty's Inspectors of Schools'. In 1978 the arrangements for recognition as efficient were brought to an end. From 1979 onwards independent schools which had previously fallen into this category completed Form 101a.
- Form 101a: For 'other' independent schools up to 1978, and for all independent schools from 1979 onwards.
During the 1990s it became increasingly common for schools to generate their returns in digital form. This enabled validation checks to pick up discrepancies in the data. By 2000 about 70 percent of Schools' Census returns were being submitted digitally. Schools could send their digital returns on floppy disk. During the late 1990s data from paper returns was input for the DfEE by a data preparation bureau and transmitted to the DfEE electronically. Data from both the paper and digital forms was entered into a collecting database by the Data Collection Unit using SIR. The data was checked and validated, and datasets produced which were both archived (as ASCII files) and imported into the analysis packages used by Analytical Services' statisticians.
As the statisticians also kept the datasets in the analysis formats which they were using, this meant that there were effectively two 'archives' of Schools' Census data in the DfEE:
- ASCII files exported from the SIR collecting database.
- Files in QuickTab/QStat format which were 'owned' by the statisticians in Analytical Services Directorate and used by them for analysis.
Constraints on the reliability of the data: The metadata transferred with the 1975-1991 datasets has a number of defects. Many of the field descriptions in the data dictionary files transferred with the datasets show signs of having been truncated, often at around 60 or 70 characters. This truncation is thought to have occurred in 1991-1992 when the datasets held by the Department's statisticians were migrated from Fasolt to QuickTab format.
However, even where field descriptions have not been affected by truncation, the descriptions do not always provide a complete picture of the functions of fields. It is not unusual for more than one field to have the same description even though the fields contain different data.
For the 1975-1991 datasets, the National Digital Archive of Datasets (NDAD) supplemented the original field descriptions by comparing the data with examples of completed Schools' Census forms. Although this work has clarified the functions of a number of fields, some ambiguities remain which will have implications for future use of the data.
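Users of the data dictionary files may wish to flag fields whose descriptions show the two defects described above. The following is a minimal sketch, not the method used by NDAD. It assumes the data dictionary has been loaded into a mapping of field name to description; the function names, the field names and the 60-70 character band used as a threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def likely_truncated(description: str, low: int = 60, high: int = 70) -> bool:
    """Heuristic: the text ends within the suspect length band and has
    no closing punctuation. The band reflects the 'around 60 or 70
    characters' noted above and is an assumption, not a known limit."""
    text = description.rstrip()
    return low <= len(text) <= high and not text.endswith((".", ")", "?"))


def shared_descriptions(data_dictionary: dict[str, str]) -> dict[str, list[str]]:
    """Group field names by description (ignoring case and surrounding
    spaces) and keep only descriptions used by more than one field."""
    groups: dict[str, list[str]] = defaultdict(list)
    for field_name, description in data_dictionary.items():
        groups[description.strip().lower()].append(field_name)
    return {text: names for text, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical entries; real field names and descriptions will differ.
    sample = {
        "FIELD_A": "Total pupils",
        "FIELD_B": "total pupils ",
        "FIELD_C": "Number of full-time pupils aged 5 at 31 August of the previous yea",
    }
    for name, text in sample.items():
        if likely_truncated(text):
            print("possibly truncated:", name)
    print("shared descriptions:", shared_descriptions(sample))
```

Both checks can only point to candidates for inspection. Deciding what a field actually contains still requires comparison with the completed forms.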
In addition to the problem of truncated or inadequate field descriptions, some of the datasets for the nursery schools and independent schools also exhibit a problem with certain fields where the data should be a real number (i.e. a decimal), but the decimal point appears to be either missing or in the 'wrong' location. This problem does not affect the datasets for 1986 onwards.
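Where a user has established from the original form how many decimal places a field was meant to hold, values lacking a decimal point could be reinterpreted along the following lines. This is a sketch under that assumption only: the function names are hypothetical, the number of places must be supplied by the user, and the approach does not address values where the point is present but in the wrong position.

```python
from decimal import Decimal


def lacks_decimal_point(raw: str) -> bool:
    """True if the raw field value contains no decimal point."""
    return "." not in raw.strip()


def reinterpret(raw: str, places: int) -> Decimal:
    """Read a raw value as a decimal number.

    If the point is missing, treat the last `places` digits as the
    fractional part. `places` is an assumption supplied by the user,
    not something recorded in the dataset itself.
    """
    text = raw.strip()
    if not lacks_decimal_point(text):
        return Decimal(text)
    return Decimal(text) / (Decimal(10) ** places)


if __name__ == "__main__":
    # Hypothetical raw values for a field meant to hold one decimal place.
    for value in ("125", "12.5", " 40 "):
        print(repr(value), "->", reinterpret(value, places=1))
```

Using `Decimal` rather than binary floating point keeps the rescaled values exact, which matters if the results are compared with published tables.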
Validation performed after transfer: Details of the content and transformation validation checks performed by NDAD on the Schools' Census datasets are contained in the catalogues of individual datasets.