The Historic Mortality Data Files database was originally created as a basic tool for researchers studying mortality in England and Wales. The two datasets described here reflect different versions of the database. The first dataset covers 1901-1992, and reflects the Historic Mortality Data Files database before it was redesigned in 1997. The second dataset covers 1901-1995, and is the version which was re-issued by ONS in 1997 as Twentieth Century Mortality Files. The two datasets overlap substantially; their similarities and differences are detailed below.
The 1901-1992 and 1901-1995 datasets contain the following types of tables:
- Historic Deaths tables (both datasets);
- Population tables (both datasets);
- ICD Dictionary tables (1901-1995 only).
The Historic Deaths tables record the number of deaths in England and Wales in each year, broken down by age group, sex and underlying cause of death. From 1911 onwards, the cause of death is coded according to the contemporary version of the International Classification of Diseases (ICD). For the period 1901-1910, causes of death follow a classification scheme which was used in England and Wales before the ICD was adopted. Each dataset thus contains an Historic Deaths table for 1901-1910, and a table for each period in which a different revision of the ICD was in force. Up to and including 1992, the data relate to deaths which were registered in the year in question; from 1993 onwards (1901-1995 dataset only), the figures represent deaths which occurred during the year.
Each dataset also contains a single Population table, which gives estimates of the population of England and Wales (the 'population at risk of dying') by year, sex and age group. The age groups correspond to those used in the Historic Deaths tables.
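Because the Population table uses the same age groups as the Historic Deaths tables, death counts can be divided by the matching population estimate to give an age-specific death rate. The following Python sketch illustrates the idea. The field names (`year`, `sex`, `age_group`, `cause`, `deaths`, `population`) and the figures are hypothetical and chosen for illustration only; they are not the column names or values used in the datasets.

```python
from collections import defaultdict


def age_specific_rates(deaths_rows, population_rows, per=100_000):
    """Return deaths per `per` population, keyed by (year, sex, age_group)."""
    # Sum deaths over all causes for each (year, sex, age_group) cell.
    totals = defaultdict(int)
    for row in deaths_rows:
        totals[(row["year"], row["sex"], row["age_group"])] += row["deaths"]

    # Divide each total by the population estimate for the same cell.
    rates = {}
    for row in population_rows:
        key = (row["year"], row["sex"], row["age_group"])
        if key in totals and row["population"] > 0:
            rates[key] = totals[key] * per / row["population"]
    return rates


# Illustrative figures only, not taken from the datasets.
deaths = [
    {"year": 1950, "sex": "M", "age_group": "45-49", "cause": "4100", "deaths": 30},
    {"year": 1950, "sex": "M", "age_group": "45-49", "cause": "4200", "deaths": 20},
]
population = [
    {"year": 1950, "sex": "M", "age_group": "45-49", "population": 10_000},
]

rates = age_specific_rates(deaths, population)
```

Cells with no matching population estimate are simply left out of the result rather than guessed at.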
Finally, the second dataset includes ICD Dictionary tables which explain the codes used for causes of death in the Historic Deaths tables. There is one ICD Dictionary table for each Historic Deaths table. These tables are not present in the 1901-1992 dataset, in which codes for causes of death were not explained except in the accompanying documentation.
ICD Codes: The ICD originated in a draft nomenclature of causes of death which was presented to a session of the International Statistical Institute by Dr Jacques Bertillon in 1893. The first revision of the ICD was adopted at an international conference in 1900. New versions have been issued at roughly 10-year intervals. Maintenance of the standard rests with the WHO. In England and Wales the ICD was first adopted in 1911, in the form of an amended version of the second revision. Nine revisions of the ICD are accounted for in these datasets and the adoption of each roughly corresponds with the commencement of each decade of the 20th Century.
During the period 1901-1910, causes of death in England and Wales were classified by the General Register Office using a list of causes which was a variant of the first revision of the ICD, but did not employ ICD codes. When the Historic Mortality Data Files database was developed by the Office of Population Censuses and Surveys (OPCS), codes were assigned to the causes in this unnumbered list. This is the basis for the codes for causes of death in the Historic Deaths table for 1901-1910, in both datasets. The other Historic Deaths tables in the datasets cover the periods of the second to the ninth revisions of the ICD.
In the 1901-1992 dataset, ICD codes are represented by 'computer codes', which can differ substantially from the ICD codes. This is particularly true in the case of ICD revisions 2-5, for which the alphanumeric ICD codes were converted into purely numeric codes in the dataset. Explanations of the computer codes and the ICD codes were not included in the dataset. However, the documentation accompanying the dataset allowed computer codes to be matched to ICD codes, and explained the meaning of the ICD codes for ICD revisions 2-5 and of the codes used in the period 1901-1910. By contrast, the 1901-1995 dataset includes actual ICD codes in the Historic Deaths tables from 1911 onwards, with explanations of the codes provided in the ICD Dictionary tables. The most significant difference between the codes in this dataset and the ICD codes themselves is that where the ICD employed a 3-digit code, a '0' was added at the end in the dataset to ensure that all codes had 4 digits.
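The padding rule described for the 1901-1995 dataset can be sketched in Python as follows. The function name `to_four_digit` is hypothetical and the code strings are illustrative only. The sketch assumes that codes are held as plain digit strings with no decimal point, which the description above does not confirm.

```python
def to_four_digit(icd_code: str) -> str:
    """Pad a 3-digit ICD code with a trailing '0'; leave other codes unchanged."""
    code = icd_code.strip()
    if len(code) == 3:
        # A 3-digit ICD code gains a final '0' so that every code has 4 digits.
        return code + "0"
    return code


# Illustrative code strings only.
padded = to_four_digit("410")     # a 3-digit code is padded
unchanged = to_four_digit("4109") # a 4-digit code is left as it is
```

One consequence of this rule is that a padded 3-digit code cannot be told apart, from the string alone, from a genuine 4-digit code ending in '0'. The ICD Dictionary tables would be the place to check what a given code means.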
The causes of death recorded in the Historic Deaths tables are underlying causes of death, and the data is ultimately derived from the system for registering deaths for civil purposes. 'Underlying cause of death' was defined in the 9th revision of the ICD as (1) the disease or injury that initiated the train of events leading to death, or (2) the circumstances of the accident or violence (e.g. suicide) that produced the fatal injury. Where death was not due to natural causes, ICD revisions 6-9 allowed two codes to be assigned to each death: one for the external cause of injury and the other for the nature of the injury. To avoid double counting of deaths, only counts for external causes of injury are included in the Historic Deaths tables for these revisions, in both datasets.
Age Groups: In both datasets, data in the Historic Deaths tables and the Population table is divided into standard age groups. These age groups vary according to the period covered by the data. In most cases, five-year age groups from age 5 up to age 85+ are used. However, there are variations from this for some of the periods corresponding to the earlier ICD revisions.
From 1986 onwards, data in the Historic Deaths tables for deaths under the age of 1 excludes deaths in the first 28 days of life. This resulted from the introduction of a new form of death certificate for stillbirths and neonatal deaths in that year, which abandoned the concept of an underlying cause of death. Instead, physicians were required to supply details of maternal and foetal contributions to mortality.
In both datasets, up to and including 1992, the year assigned to data in the Historic Deaths tables is the year in which the death was registered. In 1993 OPCS began publishing mortality statistics by the year in which the death occurred, rather than by the year in which it was registered. This affects only the 1901-1995 dataset, in which the year represents the year of registration up to 1992, and the year of occurrence of death for 1993-1995.
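The two breaks in comparability described above (1986 for deaths under the age of 1, and 1993 for the basis of the year) can be summarised in a small Python helper for anyone building a time series from the tables. This is a sketch only: the names `YearNotes` and `notes_for_year` are hypothetical and do not come from the datasets or their documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YearNotes:
    year: int
    year_basis: str                    # "registration" or "occurrence"
    under_one_excludes_neonatal: bool  # deaths in the first 28 days left out


def notes_for_year(year: int) -> YearNotes:
    """Return the comparability notes that apply to a given data year."""
    if not 1901 <= year <= 1995:
        raise ValueError(f"year {year} is outside the 1901-1995 coverage")
    return YearNotes(
        year=year,
        # Year of registration up to 1992; year of occurrence for 1993-1995.
        year_basis="registration" if year <= 1992 else "occurrence",
        # From 1986, deaths under age 1 exclude the first 28 days of life.
        under_one_excludes_neonatal=year >= 1986,
    )


notes_1985 = notes_for_year(1985)
notes_1993 = notes_for_year(1993)
```

A series that crosses either boundary compares figures compiled on different bases, so the flags are best carried alongside the counts rather than applied silently.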
The datasets in this series are available to download. Links to individual datasets can be found at piece level.