Agricultural Census and Survey System (ACS) (1968-1993):
- Hardware: ICL mainframes.
- Operating System: GEORGE (II & III).
- Application Software: In-house system written in COBOL, enhanced and refined until its replacement by the Farm Survey System (FSS) in 1993. The Tabulation Utility (TAU) was used in connection with the EC Structure Survey. In addition, there were 470 job control macros.
Farm Survey System (FSS) (1993-?):
- Hardware: Client/Server architecture operated via a Local Area Network (LAN). The FSS LAN was connected to the MAFF data communications network (MAFFnet). The FSS ran on two servers: FSS1, the main server, which managed the Ingres relational database in which data from censuses and surveys was stored; and FSS2, which was used by statisticians in Stats(C&S) to carry out complex statistical analyses of the data held on FSS1.
- Operating System: UNIX.
- Application Software: Ingres database management.
User Interface: Not known for ACS. FSS allowed up to 80 users to connect to the main server (FSS1) concurrently. Users moved through the system via a hierarchy of screens and menus.
Logical Structure and Schema: The County Summaries datasets contain two tables: one for data used to produce the sections of the Published Statistical Material Statement dealing with crops, farmers and workers, and livestock; the other for data used in the section of the PSM Statement on horticulture. This reflects the fact that each County Summaries dataset was transferred to NDAD as two Excel files: one file for data used in the horticulture section of the PSM Statement, the other file for data used to produce the other sections.
In MAFF's view, the system can be divided into three main sub-systems:
- A fixed application, the same for all censuses and surveys run on the system.
- A survey application specific to each survey.
- Software tools to manipulate and analyse the data.
How data was originally captured and validated: During the 1989-1996 censuses, the basic method of data capture was paper forms. Farmers were asked to return the forms within about a week of the census date in mid-June. Non-respondents were sent reminders. MAFF had the power to prosecute non-respondents.
Before 1997, farmers were sent either a 'short' or a 'long' form. The short form omitted the sections on horticulture and glasshouses. Farmers were sent a long form unless previous census returns indicated that no horticulture or glasshouse data was likely to be present.
Before 1995, forms were sent to all holdings classed as 'main' holdings. In 1995 an element of sampling was introduced. Thereafter, sampling was applied to main holdings with a Standard Gross Margin (SGM) of less than 9600 (SGM was 'an economic measure of a farm's profitability . . . derived from the activity being recorded on the Agricultural Census return for the farm'). 'Small main holdings' were sent a census form every three years on a rolling basis; 'very small main holdings' (those with an SGM of less than 3600) were sampled randomly each year at a rate of 10%.
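The sampling rules described above can be sketched as follows. This is a minimal illustration, not MAFF's actual procedure: it assumes that 'small main holdings' are those with an SGM of at least 3600 but less than 9600, and that the three-year rolling cycle was assigned by a simple modulo rule on a numeric holding identifier. The function name, the `holding_id` parameter and the cohort rule are all hypothetical.

```python
import random

# Thresholds quoted in the text (the units are not stated there).
SMALL_SGM_LIMIT = 9600
VERY_SMALL_SGM_LIMIT = 3600
VERY_SMALL_SAMPLING_RATE = 0.10


def receives_form(sgm, holding_id, year, rng):
    """Decide whether a main holding is sent a census form in a given year.

    Assumption: the rolling three-year cycle is modelled by comparing
    holding_id modulo 3 with year modulo 3. The source does not say how
    holdings were actually allocated to years.
    """
    if sgm < VERY_SMALL_SGM_LIMIT:
        # Very small main holdings: random 10% sample each year.
        return rng.random() < VERY_SMALL_SAMPLING_RATE
    if sgm < SMALL_SGM_LIMIT:
        # Small main holdings: one form every three years, on rotation.
        return holding_id % 3 == year % 3
    # All other main holdings continue to receive a form every year.
    return True
```

Under this sketch a small main holding is selected exactly once in any run of three consecutive years, while the number of very small holdings selected varies from year to year around 10%.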
MAFF also attempted to integrate data collection for the census with data collection for its Integrated Administration and Control System (IACS). By 1996 Stats(C&S) were investigating the possibility of establishing closer links between IACS and the census. By 1999 this had progressed to the point that census data was derived from IACS for about 6,000 farms whose production was covered by an IACS claim.
In the 1989-1996 censuses, the forms were returned by farmers to the Parish List Section of Stats(C&S). The rate of return of forms could be as high as 10,000 in a single day. Under the ACS, receipting was done by batching the forms and sending them to ITD's PCK section for the recording of the holding numbers ('CPHing'). With the introduction of the FSS in 1993, bar-coded forms were introduced. After forms were receipted, they were scrutinised by the Parish List clerks. Data inputting was done by batching the forms and sending them to PCK section for speed keying. This continued to be the normal method of data entry after the introduction of the FSS. However, the Parish List clerks acquired the ability to input the data directly to the FSS via input screens.
Once the forms were loaded into the system, a parameter sheet was generated listing the batches loaded and the number of forms in each. Two further reports were produced: a data load report (recording forms that failed to load) and a validation error report (listing validation queries by county in the order of CPH numbers). The data was validated against previous returns, against other values on the form, and for credibility via validation rules written by statisticians. Validation queries were cleared by Parish List staff by reference to the original survey forms, by following standard instructions, and by contacting the farmer directly.
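The three kinds of validation check and the ordering of the validation error report can be illustrated with a short sketch. The actual rules written by the statisticians are not recorded here, so the field names (`total_area`, `cereals_area`, `other_crops_area`), the 50% change threshold and the representation of a CPH number as a (county, parish, holding) tuple are all assumptions made for illustration.

```python
def validate_return(current, previous=None, max_change=0.5):
    """Return a list of validation queries for one form.

    current and previous are dicts of hypothetical numeric form items.
    """
    queries = []
    # Credibility check: no item should be negative.
    if any(value < 0 for value in current.values()):
        queries.append("negative value")
    # Check against other values on the form: parts must not exceed the total.
    if current["cereals_area"] + current["other_crops_area"] > current["total_area"]:
        queries.append("crop areas exceed total area")
    # Check against the previous return: flag large year-on-year changes.
    if previous and previous["total_area"] > 0:
        change = abs(current["total_area"] - previous["total_area"])
        if change / previous["total_area"] > max_change:
            queries.append("large change in total area since previous return")
    return queries


def error_report(forms, previous_by_cph):
    """List queries by county in CPH order.

    forms is a list of (cph, values) pairs, where cph is assumed to be a
    (county, parish, holding) tuple, so tuple ordering groups by county.
    """
    report = []
    for cph, values in sorted(forms, key=lambda item: item[0]):
        for query in validate_return(values, previous_by_cph.get(cph)):
            report.append((cph, query))
    return report
```

Each entry in the resulting report corresponds to a query that, in the process described above, Parish List staff would have cleared against the original form or with the farmer.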
Provisional results for the census were published after about 60% of the forms had been received. Final results were calculated when about 80% of the despatched forms were deemed sufficiently 'clean' to be used. To account for non-response, statistical techniques were employed to generate a dataset covering 100% of main holdings. Two techniques were used:
- Ratio estimation: 'using previous years data as a base . . . the returned sample is poststratified by farm size (SGM) and farm type to minimise non-response bias. Each item being measured is raised independently of others and so different raising factors apply. Coverage and standard errors for the estimates vary between items but relative standard errors are typically under 2 per cent for the main crops and under 5 per cent for the livestock'.
- Imputation: 'this involves applying an appropriate trend (obtained from responding units) to the previous years data for a missing farm to give a best estimate of the current year's information'.
Production of final results involved the use of estimation to compensate for holdings which had not returned a form, or for returns which were being queried.
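The two techniques can be sketched in simplified form. This is an illustration under stated assumptions, not the estimator MAFF used: it treats a single item within a single post-stratum (one combination of farm size and farm type), and takes the 'trend' to be the ratio of current-year to previous-year totals among responding holdings. The function names are hypothetical.

```python
def trend(responders):
    """Ratio of current-year to previous-year totals among responders.

    responders is a list of (previous_value, current_value) pairs for one
    item in one stratum.
    """
    previous_sum = sum(previous for previous, _ in responders)
    current_sum = sum(current for _, current in responders)
    if previous_sum == 0:
        raise ValueError("no previous-year base for this item and stratum")
    return current_sum / previous_sum


def ratio_estimate(responders, previous_total_all_holdings):
    """Raise the previous year's stratum total by the responders' trend."""
    return previous_total_all_holdings * trend(responders)


def impute(previous_value, responders):
    """Best estimate for one missing farm from its previous-year value."""
    return previous_value * trend(responders)
```

For example, if responders in a stratum reported 400 units last year and 480 this year, the trend is 1.2; a previous stratum total of 1,000 is raised to 1,200, and a missing farm that reported 50 last year is imputed as 60. Because each item is raised independently, a separate trend would be computed per item, consistent with the different raising factors mentioned in the quoted description.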
Validation performed after transfer: Details of the content and transformation validation checks performed by NDAD staff on each Agricultural and Horticultural Census dataset are contained in the catalogues of individual datasets.
Constraints on the reliability of the data: The County Summaries datasets have been aggregated to a high level, exclude some 'disclosive' data, and do not necessarily have data items which relate directly to the survey questions.