CDMS Governance

3 Governance of climate data management systems

3.1 Data policy
3.1.1 Commitments What commitments has the implementing organization agreed to either explicitly through a contractual arrangement or implicitly through national policy agreements, regional and international agreements, WMO membership or other means?
3.1.1.1 WMO resolutions	This component refers to the policy framework within WMO, governed at the highest level through Congress and the Executive Council. The list of current resolutions can be found in Resolutions of Congress and the Executive Council (WMO-No. 508). Some resolutions have major implications for data policy, such as Resolution 40 (Cg-XII), which provides guidelines for the policy and practice of exchanging meteorological data, and Resolution 25 (Cg-XIII), which provides guidelines on the exchange of hydrological data. The major innovation in these resolutions is the definition of essential data. Through these resolutions, WMO Members commit to making essential data available on a free and unrestricted basis at no more than the reasonable cost of extraction and formatting for delivery to the user. Members may not impose additional charges for essential data.	Required
3.1.1.2 WMO Technical Regulations	This component covers standard practices and procedures, as well as recommended practices and procedures, for WMO Members to follow and implement. The main references are the four volumes of the WMO Technical Regulations (WMO-No. 49): 1. Volume I – General Meteorological Standards and Recommended Practices 2. Volume II – Meteorological Service for International Air Navigation 3. Volume III – Hydrology 4. Volume IV – Quality Management Also see the publications of the annexes that are part of the Technical Regulations: Annex I – International Cloud Atlas (WMO-No. 407), Volume I Annex II – Manual on Codes (WMO-No. 306), Volumes I.1 and I.2 Annex III – Manual on the Global Telecommunication System (WMO-No. 386), Volume I Annex IV – Manual on the Global Dataprocessing and Forecasting System (WMO-No. 485), Volume I Annex V – Manual on the Global Observing System (WMO-No. 544), Volume I Annex VI – Manual on Marine Meteorological Services (WMO-No. 558), Volume I Annex VII – Manual on the WMO Information System (WMO-No. 1060) Annex VIII – Manual on the Implementation of Education and Training Standards in Meteorology and Hydrology (WMO-No. 1083), Volume I	Required
3.1.1.3 WMO technical commission guides	This component includes the recommendations or best practices that are generally drawn up as guides published by the different technical commissions: 1. Commission for Basic Systems 2. Commission for Instruments and Methods of Observation 3. Commission for Hydrology 4. Commission for Atmospheric Sciences 5. Commission for Aeronautical Meteorology 6. Commission for Agricultural Meteorology 7. Commission for Climatology 8. Joint WMO/IOC Technical Commission for Oceanography and Marine Meteorology (JCOMM) While most of the guides are relevant to CDMSs, it is recommended that particular attention be paid to: Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), from the Commission for Instruments and Methods of Observation Guide to Climatological Practices (WMO-No. 100), from the Commission for Climatology	Required
3.1.1.4 International	This component concerns commitments that organizations are required to adhere to because they qualify as international agreements. Some examples are: 1. The Global Framework for Climate Services, which has an important role in climate data management, climate monitoring and assessment, climate products and services, and climate information for adaptation and risk management. 2. International data policy agreements, such as the Infrastructure for Spatial Information in Europe, which has defined and legislated common principles for European Union countries that enable the sharing of environmental spatial information among public-sector organizations. This generally includes climatological data. 3. The Group on Earth Observations (GEO), which concerns the exchange of observations data. 4. The OGC and ISO, which focus on the exchange of data via open spatial standards. Note that not all standards are relevant.	Required
3.1.1.5 National	This component represents national policies that organizations are required to comply with. Some examples are: 1. National government policies on open data and open-source software. 2. National spatial data infrastructure frameworks that govern the exchange of national spatial data.	Required
3.1.2 Sustainability
3.1.2.1 Disaster recovery	This component refers to a policy that governs how an organization implements disaster recovery and business continuity solutions and covers issues such as how to handle data backups. Some initial issues to consider are: 1. How important is the climate record? 2. How often should climate data be backed up? `While the Wikipedia article on backups can provide some background information, this question must be answered by each NMHS. Their answer will need to take the following into account: 1. How important are their data? 2. How much data can they afford to lose? 3. How much work can they afford to redo? 4. In the interim, while NMHSs are assessing their requirements, it is suggested that daily backups be made with weekly off-site storage.` How many redundant copies of the data should be kept? What secure off-site facility should be used to store disaster recovery versions of the data? What is the acceptable downtime for each of the key CDMS applications? How high is the risk of catastrophic damage to the organization’s infrastructure due to a range of factors, including: Natural disasters, such as flooding, bush fires, tidal waves, cyclones or earthquakes? Man-made disturbances, such as military conflict, terrorism and fires? Where should the secure off-site storage be located? For NMHSs situated in areas where their infrastructure is likely to be damaged, it may be appropriate to place the secure off-site storage in another city or country. Is it appropriate to use cloud-based services to store and manage climate data and offer CDMS services to end-users?	Required
3.1.2.2 Funding	This component concerns a policy that regulates how an organization funds its CDMS to ensure that it is sustainable. This includes sufficient funding for: 1. Climate data management specialists, including staff with knowledge of observations data, data rescue, quality assurance, etc. 2. IT specialists who support the everyday maintenance of CDMS applications and related databases and IT equipment. 3. IT specialists who conduct enhancements to the CDMS and related IT environments. 4. The provision and scheduled upgrade of IT systems to ensure that the appropriate IT environment has been implemented to support the CDMS.	Required
3.1.2.3 Data custodian	This component represents a policy that governs how an organization manages and maintains its climate data. A data custodian is typically a senior-level manager who is accountable for the integrity of the climate record of the NMHS. Some examples of a data custodian’s duties are: 1. Preserving the integrity of the climate record, including quality control and ensuring that observational networks provide data that are suitable for climate purposes. 2. Championing the cause of data management to ensure that sufficient funding is allocated and managed effectively so that the climate record remains viable. 3. Facilitating the development and maintenance of suitable policies governing climate data. 4. Ensuring that climate data are effectively managed and maintained. 5. Ensuring that observations metadata, discovery metadata and data provenance are effectively maintained. 6. Formally delegating authority to appropriate staff members, together with related performance accountabilities. 7. Taking primary responsibility for CDMS applications, CDMS enhancement and IT maintenance projects. 8. Ensuring that IT changes do not corrupt the climate record. 9. Implementing and monitoring relevant key performance indicators to help monitor ongoing performance of the CDMS and related processes.	Required
3.1.2.4 Access to data	This component involves a policy that governs how an organization allows access to data. Some issues to consider are: 1. How is read-and-write access to climate data controlled? 2. What training is needed before staff are granted write access? 3. What types of controls are needed to monitor and approve changes to the climate record to facilitate data maintenance and error rectification? 4. How are any specific access constraints, such as security, contractual or commercial constraints, to be managed? Note: The issue of open data is increasing in relevance for many national governments that have made open-data commitments. Therefore, organizations should have a clear understanding of, and a clear policy for, the provision of NMHS climate data in accordance with open data principles (see Wikipedia article on open data). In summary, open data is about ensuring that data are freely available for reuse with no constraints apart from perhaps the requirement to acknowledge the data source and ensure that the data are made available under a share-alike agreement.	Required
3.1.2.5 Archival policy	This component represents a policy that dictates how organizations archive their climate data, including both digital and hard-copy historical records. This archive should be considered as permanent to ensure that the climate record is available for use by future generations. Therefore, care should be taken to ensure that the data are preserved in a format that users will be able to access and use in years to come.	Required
3.1.3 Intellectual property
3.1.3.1 Data licensing	This component refers to policies that ensure that any data licensing agreements that relate to the use of a dataset are clearly understood. For example: 1. What data licences apply to NMHS data? 2. What data licences will the NMHS release its data under? For example: 1. Is a Creative Commons licence appropriate? (See Creative Commons website) 3. What data licences used by external dataproviders are permitted for use within the NMHS? 4. Can the data be distributed to third parties? 5. Will the NMHS use and/or archive data that are not covered by an appropriate data licence? 6. Will the NMHS comply with the licensing of data passed to it, or will it choose not to use or archive the data?	Required
3.1.3.2 Access constraints	This component concerns policies that clearly define any access constraints relating to the use of climate data. For example: 1. What may the NMHS do with the data? 2. Can the data be used for the organization’s website, web services, publications and so forth? 3. Do any commercial or contractual constraints apply to the use of the data? 4. Are the data subject to any national security constraints?	Recommended
3.1.3.3 Usage constraints	This component covers policies that ensure that any constraints imposed on the end user regarding the use of a dataset are clearly understood. These constraints may apply to the NMHS or to third parties. For example: 1. Can the data be freely reused? 2. Is there a cost that applies to the use of the data? See the Commercial component (3.1.5.3) for a number of related considerations. 3. Can derived products be made using the data? 4. Can the data be used for commercial purposes? 5. Can the data be used for private, study or research purposes? 6. Can the data be shared with others? The Creative Commons website provides examples of data usage constraints and related licences.	Recommended
3.1.3.4 Copyright	This component refers to policies that clearly explain any copyright issues relating to the use of climate data.	Recommended
3.1.3.5 Attribution	This component deals with policies that ensure that any data attribution issues relating to the use of climate data are clearly understood. For example: 1. Should the source of the data be acknowledged? 2. How does the NMHS ensure that any attribution text required by the data provider is applied when the data are used?	Recommended
3.1.4 Data delivery
3.1.4.1 Interoperability standards	This component involves policies that ensure that climate data are delivered using appropriate open interoperability standards, such as open spatial standards. This will ensure that data are available in a form that facilitates data interoperability and are accessible to a wide range of end-users from disparate industries using a wide variety of proprietary and open-source software applications. Note: It is possible to enforce the use of data formats such as BUFR or GRIB within an NMHS in an attempt to facilitate interoperability. In reality though, only NMHSs and closely related organizations will be able to understand the formats and have the software to use the data. For more information, see section 8.1 of this publication.	Recommended
3.1.4.2 Quality of delivered data	This component refers to policies that clearly define the issues relating to the quality of the climate data delivered by NMHSs. For example: 1. What quality of climate data does the organization undertake to deliver and in what circumstances? 2. Will the organization only provide high-quality homogenized data? 3. Will the organization provide raw observations? 4. When is it appropriate to deliver data at each quality level?	Required
3.1.4.3 Cost recovery	This component refers to policies that ensure that issues relating to the recovery of costs for the provision of NMHS climate data services are clearly understood and communicated. These policies should also take into account: 1. NMHS commitments to WMO relating to Resolutions 25 (Cg-XIII) and 40 (Cg-XII), as discussed above. 2. National commitments to open data policies.	Recommended
3.1.5 Third-party data
3.1.5.1 Crowdsourced	This component refers to policies that address and explain issues relating to the use of crowdsourced climate data by the NMHS (see Wikipedia article on crowdsourcing). In summary, crowdsourcing is about obtaining data from a large pool of volunteers who would like to collaborate with the organization. Crowdsourcing has considerable potential for enhancing data generated by NMHSs, provided that the data are used appropriately. There are many examples of crowdsourcing initiatives that have generated very useful data. Some examples are: 1. OpenStreetMap 2. Old Weather – a data rescue project to digitize meteorological observations from old ship logs 3. Weather Observations Website (WOW), Met Office (United Kingdom) 4. Veilleurs du temps, Meteo-France Some issues that crowdsourcing policies should consider are: Only accepting crowdsourced data when the supplier has agreed to either provide the data under a Creative Commons ShareAlike licence (see Creative Commons ShareAlike page) or to have its intellectual property rights assigned to the NMHS. Having intellectual property rights to the data recorded for future use. When the data contributed are observations data: Having the contributor commit to providing and maintaining suitable observations metadata for their station(s) and sensor(s). Having the contributor agree to appropriate site visits by NMHS staff to assist with quality assurance processes. Deciding when it is appropriate to use crowdsourced data and under what conditions. Deciding how the crowdsourced data will be managed.	Recommended
3.1.5.2 Other agency	This component refers to policies that provide clear explanations of issues regarding the use of climate-related data captured and maintained by government agencies external to the NMHS. Examples of issues to consider include: 1. Is there a clear data licence allowing the NMHS to use the data for any relevant purpose, including for internal use, to create derived products, to publish the data on the NMHS website or for data redistribution, if required? 2. Are there any costs associated with the use of the data? 3. Is it a high-quality dataset? Do the discovery metadata clearly describe the intended use, lineage and quality assessment of the data? 4. What constraints apply to the use of the data? 5. If the data are observations data, are there high-quality metadata available to support CDMS activities? 6. If the data are of similar quality to NMHS data, share similar functions and help enhance the NMHS data, is the provider willing to consider joint ownership and maintenance of the data?	Recommended
3.1.5.3 Commercial	This component refers to policies that address and clarify issues on the use of climate-related data captured and maintained by commercial organizations. In addition to the considerations discussed under the Other agency component (3.1.5.2), other issues that commercial policies should consider include: 1. What contractual arrangements apply to the use of the data? 2. What costs apply explicitly to the use of the data? 3. Are there constraints on the number of users who can access the data at the same time? 4. Are there pricing qualifications based on the specific server environment that hosts the data, or more specifically on: 1. The number of central processing units, processor cores, threads and so forth? 2. Whether there are any restrictions based on the virtualization of the server? 5. Is there a time limit for the use of the data? If so, what must be done with the data at the end of this period? 6. Are the data only available via a subscription service or web service? If so, what impact will this have on NMHS operations in the event of a disruption of the service or when an invoice is not paid on time? 7. Are there any explicit commercial constraints on the use of the data? 8. Are the data actively maintained? 9. Do the costs, constraints and associated risks call for an arrangement to be made between the NMHS and the data provider? 10. Are there alternative datasets of similar functionality and quality that could be obtained elsewhere, such as a community-maintained dataset like OpenStreetMap?	Recommended
3.1.6 Climatology policy
3.1.6.1 Climate metadata	This component refers to polices that ensure that appropriate climate metadata are maintained to facilitate a better understanding of climate data. As defined in section 4.3, climate metadata include metadata on observations, discovery and data provenance.	Required
3.1.6.2 Data lineage traceability	This component concerns policies that ensure that the CDMS is able to trace the lineage of climate data from published scientific texts and other papers back to raw observations. This will include the ability to reproduce specific data that were held in the climate database at a particular point in time. Note that it may not be practical to implement this policy retrospectively, i.e. for papers published in the past. This specific policy requirement has become increasingly relevant following the so-called Climategate issue. One of the conclusions of the UK parliamentary enquiry that investigated this issue was that: It is not standard practice in climate science to publish the raw data and the computer code in academic papers. However, climate science is a matter of great importance and the quality of the science should be irreproachable. We therefore consider that climate scientists should take steps to make available all the data that support their work (including raw data) and full methodological workings (including the computer codes). Had both been available, many of the problems at UEA [University of East Anglia] could have been avoided. (UK Parliament Science and Technology Committee, 2011)	Required
3.1.6.3 Data generation	This component covers a range of policies that govern the generation and interpretation of observation variables. As these rules have changed and may change again in the future, the policies will need to cover past, current and future data generation policies. Some considerations are: 1. These policies will need to cover the methods, algorithms, models and software source code used to generate data. 2. They should include a definition of a climatological day. 3. They should also cover the rules relating to the management of missing observations. Some inconsistencies have been found in a number of WMO guidelines. For example, conflicting approaches regarding the handling of missing observations when computing a daily or monthly average, and especially regarding the handling of a number of consecutive missing data, are presented in: a) Guide to Climatological Practices (WMO-No. 100) b) Calculation of Monthly and Annual 30-Year Standard Normals (WMO/TD-No. 341), WCDP-10 c) Handbook on CLIMAT and CLIMAT TEMP Reporting (WMO/TD-No. 1188) Inconsistencies in WMO guides have also been noticed regarding the generation and storage of observations data recorded at minute frequency and the generation of hourly observation data. For example, the definition of climatological days may differ: Between NMHSs Between different stations within a single NMHS Over time within the same NMHS Between a given NMHS policy and the method applied by the software used in automatic weather stations (which often appear as a black box to NMHSs). For example, a climatological day could represent the following time periods: From 0000 LST to 2300 LST From 0100 LST to 0000 LST the next day From 2346 LST the previous day to 2345 LST the current day From 0900 LST the previous day to 0900 LST the current day There are also inconsistencies due to the use of either local standard time or daylight saving time. The same issues apply to the definition of a climatological hour.	Required
3.1.6.4 Climate networks	This component concerns a range of policies that determine the design of climatological networks and establish the station and network operations, including observation times, on-site quality control, observer training, station inspections, etc. For more information, see: a) Guide to Climatological Practices (WMO-No. 100), section 2.5 The design of climatological networks, and section 2.6 Station and network operations b) Technical Regulations (WMO-No. 49), Volume I, Part II, 1.3.1.1.2: The distribution of stations from which monthly surface climatological data are transmitted should be such that every 250 000 km2 is represented by at least one station and up to 10 stations where the density of the regional basic synoptic network permits; the distribution of stations from which monthly upper-air climatological data are transmitted should be such that every 1 000 000 km2 is represented by at least one station.	Required
3.1.6.5 Sensor or station change	This component covers a range of policies that apply to changes affecting a station or sensor (such as a replacement or relocation). These policies are of great importance for time-series analysis. They have implications for climate metadata and for the possible implementation of parallel observations for certain time periods. For more information, see: a) Manual on the Global Observing System (WMO-No. 544), Volume I, Part III, 3.2.4: At reference climatological stations, any change in instrumentation should be such as not to decrease the degree of accuracy of any observations as compared with the earlier observations, and any such change should be preceded by an adequate overlap (at least two years) with the earlier instrumentation. b) Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), Part II, 1.1.4 Climatological requirements: The following general guidelines are suggested for a sufficient operational overlap between existing and new automated systems: (i) Wind speed and direction: 12 months (ii) Temperature, humidity, sunshine, evaporation: 24 months (iii) Precipitation: 60 months. (It will often be advantageous to have an ombrometer operated in parallel with the automatic raingauge.) A useful compromise would be an overlap period of 24 months (i.e. two seasonal cycles). c) Guidelines on Climate Metadata and Homogenization (WMO/TD-No. 1186), WCDMP-53	Required
3.1.6.6 Quality assurance	This component refers to policies that ensure that quality assurance issues within an organization are clearly understood. Some issues to consider are: 1. What level of quality assurance can the organization afford to maintain over the long term? 2. Will the organization conduct quality assurance checks on all observation phenomena or only on a subset? 3. What quality assurance levels (or tiers) are used by the organization for the long term? 4. How is each quality assurance tier defined? 5. What quality assurance tests must data successfully pass before they are promoted to the next tier? 6. How could the ISO 9000 series of quality management standards help improve data management processes? See: a) Technical Regulations (WMO-No. 49), Volume IV	Recommended
3.1.6.7 Future Climate Data Framework	This component refers to policies that may be established by the future Climate Data Framework, as discussed in section 11.1. This may include, for example: 1. Common and consistent definitions of key datasets to be maintained by the NMHS, including common dataset names and service names. 2. A common definition of a climatological day, hour and so forth. 3. A consistent way of determining data uncertainty. 4. Common definitions of data quality. 5. The establishment of a quality assessment classification for derived data. 6. Authoritative taxonomies and code lists. 7. Common policies for handling missing data when creating derived data. 8. Common policies for deriving data when input data have differing levels of quality. For example, policies covering: 1. The calculation of monthly averages when daily data are at different levels of quality. 2. The generation of derived gridded datasets for a region where data for relevant stations are at different levels of quality. 9. Common policies for all components discussed within this data policy section (3.1). 10. The application of best practices for data homogenization techniques.	Recommended

3.2 Governance
3.2.1 Data governance
3.2.1.1 Controlled access to data and systems	This component refers to governance processes that ensure a clear understanding of user access to data and IT systems within an organization. Some data-related issues to consider are: 1. Which staff roles have read or read-and-write access to each type of data? 2. Which roles have read or read-and-write access to each quality tier of data? 3. What process is in place for designating staff to each role? 4. What process applies to ensure that access to data is subject to approval by a delegate of the data custodian? 5. Should users who have write access to data stored within a database only be able to change data under software control? 6. Should each successful change to observations data be audited to ensure that the change, the details of the operator who made the change and the time of the change have all been recorded? Some system-related issues to consider are: What roles should be given to run-time users to access data? Which roles have the ability to change operational software or systems? What process applies to ensure that access to systems and software applications is subject to approval by a delegate of the data custodian?	Required
3.2.1.2 Approval process for new data types	This component refers to governance processes that ensure that issues relating to the approval of new data types within an organization are clearly understood. Some issues to consider are: 1. What needs to be considered prior to accepting a new data type for long-term archival (such as user requirements, scientific requirements or a new statutory requirement)? 2. What are the projected storage requirements for the data? 3. What is the appropriate format for storing the data over the long term? 4. Is there likely to be an application that can read that format in 10 or 20 years? 5. What skills are available to process and analyse the data over the long term? 7. Is suitable funding available to maintain the data over the long term?	Required
3.2.1.3 Approval process to change data	This component refers to governance processes that clearly define the approval process required to modify the data held within an organization’s climate record.	Required
3.2.1.4 IT change approvals - no data corruption	This component refers to governance processes that ensure that any IT change does not result in an unexpected change to, loss of or corruption of the climate record.	Required
3.2.2 IT governance This collection of components refers to the overall governance of information technology to ensure that effective CDMSs are developed, enhanced and maintained. While these components are specifically aimed at larger organizations with a substantial investment in information and communication technology, smaller organizations can also benefit from investment in the types of issues discussed. These components are also relevant to broader initiatives, such as when an aid or development agency sponsors CDMS development and implementation. These issues will only be discussed very briefly, as each component refers to a substantial body of knowledge that will require ongoing investment from specialists to ensure effective management of resources and funds for information and communication technology.
3.2.2.1 IT service management	This component refers to an overarching framework for IT service management used to ensure effective, efficient and cost-effective management of the delivery of business value from IT services. This framework is typically based on global best practices. It comprises a consistent framework within which the remaining IT governance components can operate in a tightly integrated manner. For more information, see: a) Wikipedia overview of IT service management b) Information Technology Infrastructure Library (ITIL) website c) ITIL publications	Recommended
3.2.2.2 Managed change	This component covers governance processes that ensure that any change to the CDMS is carefully managed. Uncontrolled change can result in chaos for users of the CDMS, for example: 1. Systems may break down for no known reason. 2. Data corruptions or even data loss may occur and may not be detected at all. 3. Provenance of data is severely impacted, as data managers do not know what algorithms or processes have been applied to the data. 4. Considerable disruption may be experienced by staff that rely on the CDMS for day-to-day activities and may inconvenience users who rely on the data and derived products. 5. Data corruption, data loss and lack of availability of key systems may have a significant impact on the reputation of the NMHS. Some issues to consider: How is the change process managed? Who needs to be consulted when considering a potential change? How is the proposed change analysed to assess its potential impact on data integrity and on other CDMS components? Who can authorize a change? What testing is required to mitigate the potential impacts of the change and give confidence that the change is desirable? How can the change be made with minimal impact? Are there any dependencies between a series of proposed changes that may dictate that the changes occur in a particular order? How is the change to be made and tested? If the change fails, how can the system be rolled back to its previous state? How can CDMS components be implemented such that they are self-testing to support concepts such as continuous integration? Will processes based on an IT service management framework such as ITIL help improve the management of IT-related changes? (See Wikipedia article on IT service management)	Required
3.2.2.3 Project management	This component concerns governance processes that ensure that any development activity, infrastructure change or other enhancement related to the CDMS is carefully managed. Good project managers and project governance processes can mean the difference between a project delivering desired results or the same project failing at great expense, frustrating users and possibly corrupting the climate record. Uncontrolled enhancement can result in: 1. An undesirable functionality added to the CDMS. 2. Lower-priority work undertaken at the expense of higher-priority tasks. 3. The development of activities that do not have a clearly defined scope, set of deliverables, timeline and budget. Some additional considerations include: How is the business benefit for a proposed project defined and assessed? How are the relative priorities of projects assessed? How are projects managed? What approach is adopted? What process is used to monitor the progress of a project? How are the project deliverables assessed to ensure that they meet requirements? What lessons can be learned from a finished project?	Recommended
3.2.2.4 IT architecture	This component refers to strategic IT governance processes that ensure that CDMSs and related IT systems are carefully designed so that the science and NMHS requirements for CDMSs may be effectively and efficiently implemented. Uncontrolled development of components can have adverse impacts on the CDMS, including on the ability of the NMHS to maintain the CDMS over the long term. For example: 1. Reliance on proprietary solutions provided by a single vendor could result in a situation known as vendor lock-in. As time passes, more and more components are developed based on the single solution. It can be very difficult and costly for a NMHS to move to a new CDMS-related solution if the current vendor’s product becomes unaffordable or if the vendor goes out of business, decides that it is no longer in its interest to offer CDMS components, ceases to maintain its systems or no longer offers a competitive CDMS solution. 2. Similarly, uncontrolled bespoke development of CDMS components by NMHS staff or contractors can also be a problem. This may result in a wide variety of disparate CDMS components that cannot be easily integrated. Many different types of technology could be used, resulting in high maintenance costs. Key-person dependencies could also develop, leaving the NMHS with a significant and costly issue to resolve, particularly if the key person leaves the organization and the CDMS ceases to operate. 3. CDMS component solutions could be developed that do not effectively use NMHS climate data, for example: 1. Data may be replicated from the climate database and made into stand-alone files to support a particular technology or due to the developer’s lack of experience. Whereas the data in the climate database are subject to a lengthy quality assurance lifecycle and may change, the replicated data are fixed at a point in time and may not be updated to reflect later changes to the climate record. This may become a significant issue for the NMHS, particularly if this practice is systematically applied by many developers, across many applications and over many years. 2. The data extraction process may not take into account quality assurance flags and just present raw observations with no indication of data quality or reliability. 3. Developers may not understand the complexities of the climate database data model and extract incorrect data. 4. Components may have any number of data inconsistency or misuse issues. What is the impact on the reputation of the NMHS if the same data are presented with differing values by different CDMS components? There may also be architectural considerations that are better off understood early in the CDMS development so that governance processes can ensure that the considerations are implemented within developed and/or implemented software. One issue to consider is whether the CDMS should be able to work in multiple languages. The answer to this will have implications for the design of software for the user interface of CDMS components, as well as for the generation and presentation of data products. IT architecture is a specialized field of information technology that is typically undertaken by highly experienced professionals with very broad IT experience. Ideally, this task will be conducted by experienced NMHS staff to ensure that consistent CDMS components are implemented in accordance with a strategic vision. See also Wikipedia articles on: a) Enterprise architecture b) Solution architecture	Recommended
3.2.2.5 Documentation	This component covers governance processes that ensure that the CDMS is adequately documented and that this documentation is kept up to date to facilitate efficient day-to-day use by staff and ease the learning curve of new staff, contractors, consultants, etc. This documentation is very broad and includes: 1. An overview of the CDMS 2. An overview of the data being managed both within and outside the climate database 3. The CDMS components, design, business requirements, architecture, test plans and deployment processes 4. CDMS policies and governance processes 5. CDMS backup and disaster recovery processes 6. IT systems management and administration processes 7. Various CDMS-related metrics	Required

4 Time-series climate data

4.1 Observations data
4.1.1 Climate observations This subsection refers to the system’s capacity to handle climate observation variables. The list below is not comprehensive and additional variables may be required or recommended. The list is currently based on the ECVs published in 2010 by GCOS. However, it is worth noting that there are other types of variables that are very relevant to climate observation, such as visibility, evaporation and meteorological phenomena. Observations include measurements made by observers, traditional sensors and remote sensors. For more information, see: a) GCOS website b) GCOS web page on ECVS
4.1.1.1 Atmospheric	Observations relevant to the GCOS ECVs are needed to support the work of the United Nations Framework Convention on Climate Change and the Intergovernmental Panel on Climate Change. Within this component, observations relating to the following atmospheric ECVs (over land, sea and ice) are either required or recommended, as shown below: Required 1. Surface 1. Air temperature 2. Wind speed and direction 3. Water vapour 4. Pressure 5. Precipitation 6. Surface radiation budget 2. Upper air 1. Air temperature 2. Wind speed and direction 3. Water vapour Recommended Upper air Cloud properties Earth radiation budget (including solar irradiance) Composition Carbon dioxide Methane Ozone Aerosol Other For more information, see: a) GCOS web page on ECVs b) Guide to Climatological Practices (WMO-No. 100), Table 2.1	Required
4.1.1.2 Terrestrial	Terrestrial ECVs: 1. River discharge 2. Water use 3. Groundwater 4. Lakes 5. Snow cover 6. Glaciers and ice caps 7. Ice sheets 8. Permafrost 9. Albedo 10. Land cover (including vegetation type) 11. Fraction of absorbed photosynthetically active radiation 12. Leaf area index 13. Above-ground biomass 14. Soil carbon 15. Fire disturbance 16. Soil moisture For more information, see: a) GCOS web page on ECVs b) Guide to Climatological Practices (WMO-No. 100), Table 2.1	Recommended
4.1.1.3 Oceanic	Oceanic ECVs: 1. Surface 1. Sea-surface temperature 2. Sea-surface salinity 3. Sea level 4. Sea state 5. Sea ice 6. Surface current 7. Ocean colour 8. Carbon dioxide partial pressure 9. Ocean acidity 10. Phytoplankton 2. Subsurface 1. Temperature 2. Salinity 3. Current 4. Nutrients 5. Carbon dioxide partial pressure 6. Ocean acidity 7. Oxygen 8. Tracers For more information, see: a) GCOS web page on ECVs b) Guide to Climatological Practices (WMO-No. 100), Table 2.1	Recommended

4.2 Logical data models In order to share climate data between organizations or make meaningful comparisons between information from datasets provided by different publishers, it is highly recommended that climate data conform to a logical data model. A logical data model describes the content and structure of information resources at an abstract level. When implemented in a CDMS, the logical data model underpins the design of the information objects managed by the system and helps to determine the questions that one may ask of the system when querying those information objects and their interrelationships. Furthermore, the logical data model may be used as the basis for developing data formats within which the information objects can be serialized for exchange between systems – thus ensuring interoperability between those systems. WMO is currently developing a logical data model termed the Modèle pour l’ Échange des informations sur le Temps, le Climat et l’Eau2 (METCE), which is intended to provide a basis for application-specific data models, including those used for climate observing and climate data management. Ideally, the underlying database structure will be based on the logical data model.
4.2.1 Climate database This subsection refers to the database(s) used to store and manage a range of time-series data, including: climate observations, climate metadata (observations, discovery and data provenance), spatial information, derived data and related data required for effective data management. More advanced CDMSs may manage the data in a series of related databases rather than in a single database. It is recommended that the climate database provide support for the following functionalities, classified by priority: Required 1. Managing core observations described in the Guide to Climatological Practices (WMO-No. 100). 2. Managing observation metadata (such as station metadata) and integrating them with observations data. 3. Handling observations from multiple sensors per station, per phenomenon, and recording the source of each observation. 4. Managing multiple tiers of data quality, from raw records to homogenized data. 5. Managing spatial and time-series data. Recommended 6. Covering at least the GCOS ECVs. 7. Using a robust data model that takes into account the requirements of open spatial standards, particularly the ISO 19156:2011 Geographic information – Observations and measurements standard, METCE and the WMO climate observations application schema (see component 4.2.3.2). 8. Managing metadata related to data provenance. This entails ensuring that each change to an observation is recorded for future recovery, and recording the details of why a particular change was made, which includes: 1. Tracing the product lineage to the data source. For example, what observations and gridded data were used to underpin the analysis released in peer-reviewed paper X? 2. Ensuring that the reason for each observation change is recorded. 9. Managing third-party and crowdsourced data. 10. Managing intellectual property rights related to data. 11. Enabling point-in-time recovery. For example, what data were present in the database for station X at time T? 12. Storing a range of document formats, such as: 1. Photographs of observation stations and instruments, meteorological phenomena, etc. 2. Scanned paper observation forms 3. Scanned microfiche/microfilm 4. Relevant observations metadata documents, such as instrument calibration reports 5. Technical manuals 6. Site location plans and sections 7. Videos and other multimedia formats Optional 13. Handling data uncertainty (for more information, see Wikipedia articles on uncertain data and uncertainty). 14. Managing multidimensional time-series gridded data and possibly numerical models. 15. Providing support for the information management concepts of semantics and linked data.
4.2.1.1 Data dictionary	This component represents the data dictionary, which describes the database structure, data model and other elements used by the climate database.	Required
4.2.2 Foundation standards
4.2.2.1 Observations and measurements	This component represents technology that provides rules and a standardized approach for modelling observations data, regardless of the domain. In essence, the ISO 19156:2011 Geographic information – Observations and measurements standard treats an observation as an event at a given point in time whose result is an estimate of the value of some property of a feature of interest, obtained using a known observation procedure. This standard is being widely adopted as the framework for a number of logical data models related to observations data, such as WaterML and the Meteorological Information Exchange Model of the International Civil Aviation Organization (ICAO). It also underpins current work on the WMO logical data model called METCE (see below). For more information, see: a) ISO 19156:2011, Geographic information – Observations and measurements b) OGC Abstract Specification: Geographic information – Observations and measurements	Recommended
4.2.3 WMO logical data models
4.2.3.1 METCE	The METCE component represents technology that provides rules and a standardized approach for modelling observations and simulations in the weather, water and climate domains. METCE is an application schema conforming to ISO 19109:2005 Geographic information – Rules for application schema. Furthermore, METCE is a profile of the Observations and measurements standard that provides domain- and application specific semantics for use within the weather, water and climate domains. The initial iteration of METCE and its companion model, the Observable Property Model, were developed by the Task Team on Aviation XML to support the ICAO Meteorological Information Exchange Model. However, METCE will provide a common semantic basis for a growing number of data products relating to observation and simulation within WMO. Not only will this simplify the requirements for software systems working with WMO products, it is also expected to simplify the mappings between WMO data products and counterparts from other communities such as CF-netCDF. As at December 2013, plans have been made to provide mappings/rules to convert from the METCE application schema to BUFR sequences and/or GRIB templates at some point in the future. For more information, please see: a) AvXML-1.0 data model b) ISO 19109:2005, Geographic information – Rules for application schema	Optional
4.2.3.2 Climate observations application schema	This component represents technology that provides rules and a standardized approach for modelling climate observations data. It is anticipated that METCE will be used as the basis for developing an application schema that will provide more detailed semantics and constraints specific to a given domain or application. In this way, METCE will provide the basis for an application schema developed to support the wide array of climate observation applications. The scope of such an application schema is expected to cover both the climate observations themselves and the associated observation metadata (see subsection on observations metadata (4.3.1)).	Optional

4.3 Climate metadata The term climate metadata is defined in this publication as the suite of supporting data required to effectively manage climate data and assess the data’s fitness for purpose. Climate metadata are made up of the following components: Observations metadata Time-series data that describe how, when and where meteorological observations were made and the conditions they were made under. Discovery metadata Information intended to facilitate the discovery and assessment of a dataset to determine if it is fit for reuse for a purpose that may be at odds with the reason for which it was originally created. Data provenance metadata Information relevant to climate data that allows end-users, including data managers, scientists and the general public, to develop trust in the integrity of the climate data. The following section expands on each component of climate metadata.
4.3.1 Observations metadata This subsection covers access to and management of station metadata and platform metadata. Station and platform metadata are time-series data that describe how, when and where meteorological observations were made and the conditions they were made under. They are used to support a range of activities that allow climate professionals to understand the fitness for purpose of specific data and, in many cases, improve the quality of climate observations data. This type of metadata is referred to as observations metadata in this publication. It is anticipated that application schemas (also known as logical data models) will be developed to formally define the structure and content of the information required to describe climate observing stations, sensors and platforms (see the Climate observations application schema component (4.2.3.2)) Note: As a general rule, it will be necessary to record and maintain the details of any change to observations metadata in order to understand the context surrounding specific climate data and to support future data homogenization activities. In addition to details of the change, specific reference must be made to: 1. The date/time of the change - Note: It may not always be possible to define the exact date of the change, for example when a change happens between two site visits. Therefore, it may be more appropriate to include a period of time during which the change occurred. 2. The reason the change was made. 3. The beginning and end dates of the prior value. 4. Any date/time reference will need to be constrained by the appropriate temporal datum to ensure that date handling is consistently applied. For more information, see: a) Guide to Climatological Practices (WMO-No. 100) b) Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8) c) Guide to the Global Observing System (WMO-No. 488), Appendix III.3 Automatic weather station metadata d) Manual on the Global Observing System (WMO-No. 544), Volume I, Attachment III.1 Standard set of metadata elements for automatic weather station installations e) Discussion paper on stations metadata and the WMO Core Profile (Bannerman, 2012) f) Draft paper on the climatological needs for minimum stations metadata in the frame of WMO publication No. 9, Volume A (Stuber, 2012) g) Guidelines on Climate Metadata and Homogenization (WMO-No. 1186), WCDMP-53
4.3.1.1 Station identifier	This component supports the management of identifiers associated with the observation station or platform. Identifiers include: 1. A globally unique WMO identifier. The use of this identifier must become a priority in order to support future global analysis. See recommendation 11.6 of this publication. 2. Other identifiers or aliases used for the station. 3. A history of past used identifiers, including historical WMO identifiers. 4. The beginning and end dates of each historical identifier used for the station.	Required
4.3.1.2 Station overview	This component covers what to provide in an overview of the observation station or platform. This should include: 1. Station owner - If required, the sensor owner 2. Station manager - If required, the sensor manager 3. Maintenance authority - If required, the sensor maintenance authority 4. Station licence agreement - If required, the sensor licence agreement 5. Station data usage constraints - If required, the sensor data usage constraints 6. Purpose of the station 7. Observation practices 8. Observation schedule - this is particularly relevant for stations that use manual observation methods and where observations are not taken on a continuous basis 9. Definition of which datasets provide the actual observations data for a given station, sensor and phenomenon combination, together with the URL of the relevant discovery metadata records 10. Observers and maintenance personnel, including their names, experience and training level 11. Station logistics, including consumables, electricity suppliers, communications suppliers, etc.	Required
4.3.1.3 Station status	This component supports recording the period(s) of activity during which observations were being made at the station or platform. As it is possible for stations to close and then reopen at a later time, the time period of each status is also required. Valid operational status codes are: 1. Operational 2. Not operational	Required
4.3.1.4 Station type	This component supports recording the type of station or observation platform. Ideally, the type will be recorded for each instrument used at that station. At a minimum, the station type is to be recorded in accordance with the following guidelines: 1. Manual on Codes (WMO-No. 306), Volume I.1, code 1860, and Volume I.2, code 0 02 001 In addition, there may be multiple definitions of station types used by the NMHS and other organizations.	Required
4.3.1.5 Location	This component covers the recording of details relating to the location of the station or observation platform. As discussed in the Sensor component below (4.3.1.7), recording the location of each sensor at the station is also mandatory. The following information is required: 1. Latitude 2. Longitude 3. Elevation 4. Spatial reference system (horizontal and vertical) 5. Date/time of the survey observation used to record the location of the station and/or sensor 6. Temporal reference system 7. Method used to determine the location of the station 8. Positional accuracy of location 9. Date/time the station or sensor moved, together with previous locations 10. Administrative boundaries within which the station is located (as required by the NMHS) 11. Time zone In exceptional circumstances, it may be necessary to move a station and keep the same station identifier. This should only be done in accordance with future global climate data policies and data governance processes (see recommendation 11.1 of this publication). Typically, this will involve parallel observations at the old and new station over a period of time (two years, for example). If this is necessary, the time of the move should be recorded, together with current and past location details. The precision required for the latitude, longitude and elevation should be in accordance with guidance provided by the Commission for Instruments and Methods of Observation. While not authoritative, the final report of the first session of the Commission for Instruments and Methods of Observation Expert Team on Standardization suggests that the precision required for latitude and longitude measurements is one second (of arc), which equates to approximately 30 meters at the equator. This degree of precision should be achievable using a survey observation process that uses handheld GPS techniques. Administrative boundaries may refer to different types of boundaries that contain the station, including: i. Political boundaries, such as state, regional or district boundaries ii. Administrative boundaries, such as the forecast district iii. Natural boundaries, such as hydrological, catchment or topographic areas For more information, see: a) ISO 19111-2:2009 Geographic information – Spatial referencing by coordinates, Part 2: Extension for parametric values b) Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), section 1.3.3.2 Coordinates of the station. Specifically note the instructions on determining the elevation of a station relative to the raingauge. c) OGC Abstract Specification: Spatial referencing by coordinates	Required
4.3.1.6 Local environment	This component concerns the recording of information on the local environment surrounding the station or observation platform. The following information is required: 1. Site location diagram 2. Site plans - For an example, see Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), Figure 1.1 3. Site skyline diagram 4. Site photographs and video showing the surroundings and instrument layout 5. Station exposure 6. Site roughness 7. Type of soil 8. Type of vegetation 9. Surrounding land use 10. Date/time of each visit For more information, see: a) Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), section 1.3.3 Siting and exposure, Annex 1.B Siting classifications for surface observing stations on land, and Annex 1.C Station exposure description	Required
4.3.1.7 Sensor	This component covers the recording of details relating to the meteorological sensors and/or instruments used at the station or observation platform. In this publication, the term sensor will be used to cover all instrument types. The following information is required: 1. Sensor description, including: 1. Name 2. Type 3. Serial number 4. Brand and model details 5. Photograph of sensor in situ 6. Supplier 7. Manufacturer 8. Location of manuals 9. Sensor firmware, version and dates during which each version was used 10. Length of time the observation data are stored locally on the sensor, prior to deletion 2. Sensor installation details, including: 1. Technician and organization that installed the sensor 2. Date sensor was installed 3. Sensor status, including: 1. Operational status: - Operational - Not operational - Defective - Testing 2. Date/times applicable for each status 4. Sensor maintenance: 1. Scheduled maintenance 2. Actual maintenance 3. Result 4. Replacement of consumables 5. Sensor uncertainty: 1. System performance statistics claimed by manufacturer 2. Sensor calibration results 3. Observed sensor performance characteristics 6. Sensor siting details: 1. Instrument height above ground 2. Station exposure description 3. As discussed in the Location component (4.3.1.5), recording the location of each sensor is required. 7. Recommended sensor settings for optimal operations on site 8. Details of what meteorological variable is being observed by the sensor (i.e. the observed property), including: 1. Phenomena observed 2. Frequency of measurement 3. Frequency of acquisition 4. Units of measurement 5. Precision of measurement	Required
4.3.1.8 Data processing	This component concerns the recording of details relating to any data processing that has occurred to convert a sensor’s signal into its recorded observation value. The following information is required: 1. Software, including: 1. Version 2. Software language 3. Software name 4. Location of software source code 5. Description of processing applied (for example, whether values were calculated per minute, hour or other) 6. Formula/algorithm implemented 7. Processor details (the version, type of central processing unit and so forth) 8. Date/time covering the period of validity of the method 2. Input source (instrument, element and so forth) 3. Data output, including: 1. Data format and version of format	Required
4.3.1.9 Data transmission	This component refers to the recording of details relating to the transmission of data from stations or observation platforms. The following information is required: 1. Sensor communications, including: 1. Frequency of transmission 2. Time of transmission 3. Primary communication details 4. Redundant communication details 5. Network addresses 6. Method of transmission Note: Some NMHSs with more advanced IT infrastructures may choose to store this type of information within their configuration management system. In these instances, it is important to ensure that at least the frequency and time of the transmission are replicated in the observations metadata.	Recommended
4.3.1.10 Network	This component concerns the recording of details relating to the observation network(s) that stations or observation platforms may belong to. The following information is required: 1. Network name (such as Regional Basic Climatological Network, Regional Basic Synoptic Network, GCOS, GCOS Upper-Air Network or National Climate Network) 2. Network priority: 1. Critical 2. Essential 3. Not applicable 3. Time of observations 4. Reporting frequency 5. Date/time of network membership 6. There is a possibility that a station does not belong to a network. This information is also useful.	Required
4.3.2 Dataset discovery metadata This subsection refers to the processes, software and governance arrangements that ensure that discovery metadata are captured, managed and maintained. Discovery metadata are intended to facilitate the discovery and assessment of a spatial dataset to determine if it is fit for reuse for a purpose that may be at odds with the reason for which it was originally created. Discovery metadata may also be known as WIS metadata. They are not the same as the observations metadata described above. Note that some of the components below may be in addition to the WMO Core Profile of the ISO 19115 Geographic information – Metadata standard. This is not expected to be an issue as the WMO Core Profile does not restrict the use of additional ISO 19115 records. For more information, see: a) Discussion paper on stations metadata and the WMO Core Profile (Bannerman, 2012) b) WMO Core Metadata Profile version 1.3, Part 1 Conformance requirements c) WMO Core Metadata Profile version 1.3, Part 2 Abstract test suite, data dictionary and code lists d) Tandy (2010) also provides a useful introduction to discovery metadata. However, note that the specifications part of this text has been superseded. e) ISO 19115 Geographic information – Metadata
4.3.2.1 Dataset identifier	This component represents a unique identifier used to identify the dataset.	Required
4.3.2.2 Dataset overview	This component gives an overview of a dataset. This may include a description of the dataset (such as an abstract), the intended use of the dataset, its lineage and status.	Required
4.3.2.3 Dataset data quality	This component represents a general assessment of the quality of a dataset.	Required
4.3.2.4 Distribution	This component covers information about the distributor of and options for obtaining a dataset.	Required
4.3.2.5 Access constraints	This component provides information on the restrictions in place for a dataset.	Required
4.3.2.6 Dataset maintenance	This component provides information on the scope and frequency of updates and maintenance conducted on a dataset.	Required
4.3.2.7 Spatial representation	This component covers information on the mechanisms used to represent spatial information within a dataset.	Required
4.3.2.8 Reference systems	This component gives information on the reference systems used by a dataset. These include a horizontal spatial reference system, vertical spatial reference system and temporal reference system.	Required
4.3.3 Data provenance This subsection refers to the processes, data and governance arrangements that record and manage information relevant to climate data and enable end-users, including data managers, scientists and the general public, to develop trust in the integrity of the climate data. Data provenance allows an end-user to understand the history of each piece of data, and thus helps the user to identify what version of the data was available at any given time. The need for this new type of climate metadata has become more evident following a number of attacks on the credibility of climate data. One notable example is the so-called Climategate incident and subsequent inquiries. Therefore, it is important for NMHSs to establish the reliability of their climate data and processes and to ensure that these data are subsequently seen as the authoritative source that can be used for global climate studies. While the concept of data provenance has been relatively nebulous within the information management domain for many years, there has been a significant amount of work on the concept within the World Wide Web Consortium (W3C) for over more than a decade, particularly with regard to the development of the PROV standard. The W3C defines provenance as: > [A] record that describes the people, institutions, entities, and activities involved in producing, > influencing, or delivering a piece of data or a thing. In particular, the provenance of information > is crucial in deciding whether information is to be trusted, how it should be integrated with > other diverse information sources, and how to give credit to its originators when reusing it. In > an open and inclusive environment such as the Web, where users find information that is often > contradictory or questionable, provenance can help those users to make trust judgements. > (W3C, 2013a) While this work is still relatively new, it is showing significant potential for use within the climate domain. The concepts presented here for climate data provenance are quite embryonic and need further work to ensure that they can be implemented effectively. See the related recommendation in Chapter 11. Maintaining a dataset with high levels of data provenance metadata is expected to be quite expensive and, as a result, will be limited to data of high importance such as high-quality climate monitoring datasets. It is anticipated that guidance will be required to suggest what data should be maintained with what level of data provenance metadata. This guidance could perhaps be included as a policy within a future climate data framework. For more information, see: a) Overview of the PROV family of documents (W3C, 2013b) b) PROV data model specification (W3C, 2013a)
4.3.3.1 What was changed?	This component refers to the processes, software and governance arrangements that ensure that any change to climate data is recorded.	Optional
4.3.3.2 When was it changed?	This component covers the processes, software and governance arrangements that ensure that the time of the change is recorded.	Optional
4.3.3.3 What was it derived from?	This component deals with the processes, software and governance arrangements that ensure that a dataset’s lineage is understood. In other words, where did the data come from? This component is also required in the section on data discovery (8.2).	Optional
4.3.3.4 What was done to change it?	This component refers to the processes, software and governance arrangements that ensure a clear explanation of any ad hoc modifications to a climate record. This includes: 1. What was changed 2. When the change was made 3. Details describing what was done	Optional
4.3.3.5 How/why was it changed?	This component refers to the processes, software and governance arrangements that ensure that the rationale behind a modification to a climate record is clearly understood. This includes: 1. How the change was made 2. Why it was made	Optional
4.3.3.6 Who/what changed it?	This component involves the processes, software and governance arrangements that ensure a clear understanding of the agent that affected the change.	Optional
4.3.3.7 Who did they act on behalf of?	This component refers to the processes, software and governance arrangements that ensure that the person or role who requested the change is identified.	Optional
4.3.3.8 Who was responsible?	This component refers to the processes, software and governance arrangements that ensure that the person or role who authorized the change is identified.	Optional

4.4 WMO standard products
4.4.1 Observation data products This subsection outlines the types of data products that NMHSs have committed to generate and provide to WMO.
4.4.1.1 Routine messages	This component represents data computed from observation data for use in WMO products. An example is the daily minimum and maximum temperature, evaporation and evapotranspiration that are typically transmitted via the Global Telecommunication System (GTS) as a SYNOP message or in the corresponding table-driven code form (TDCF) for SYNOP. For more information, see: • Manual on Codes (WMO-No. 306)	Required
4.4.1.2 Climatological standard normals	This component covers monthly and annual standard normals. See Wright (2012a) for a description of a recommended change in the method of calculating climatological standard normals. The approach is understood to comprise the following. (However, as at December 2013 this has yet to be officially endorsed.) 1. A fixed reference period (1961–1990) for longterm climate variability and change assessment. This is to be adopted as a stable WMO reference period, until such time as there is a compelling scientific case for changing it. 2. A varying 30-year period updated every 10 years (suitable for most climate services). The current period is 1981–2010. For more information, see: a) Calculation of Monthly and Annual 30-Year Standard Normals (WMO/TD-No. 341), WCDP-10 b) Technical Regulations (WMO-No. 49), Volume II c) Guide to Climatological Practices (WMO-No. 100), section 4.8 Normals d) Manual on Codes (WMO-No. 306) e) 1961–1990 Global Climate Normals (CLINO) (WMO-No. 847) f) A Note on Climatological Normals: Report of a Working Group of the Commission for Climatology, Technical Note No. 84 g) The Role of Climatological Normals in a Changing Climate (WMO/TD-No. 1377), WCDMP-61 h) Discussion paper on the calculation of the standard climate normals (Wright, 2012a)	Required
4.4.1.3 CLIMAT	This component concerns CLIMAT messages in either traditional alphanumeric codes (TAC) or TDCF formats. These messages are transmitted to WMO via the GTS. Note: The use of TAC is being phased out. For more explanation, see: a) Handbook on CLIMAT and CLIMAT TEMP Reporting (WMO/TD-No. 1188) b) Practical Help for Compiling CLIMAT Reports (WMO/TD-No. 1477), GCOS-127 c) Manual on Codes (WMO-No. 306), Volume I.2, Part C, section d. Regulations for reporting traditional observation data in table-driven code forms (TDCF): BUFR or CREX	Required
4.4.1.4 World Weather Records	This component covers the annual World Weather Records. For more details, see: a) Guidelines on Submission of the World Weather Records Tenth Series (2001–2010), WCDMP-77 b) World Weather Records website	Required
4.4.1.5 Aeronautical climatology	This component refers to the monthly aerodrome climatology summary -tabular forms (models A to E). For more explanation, see: a) Technical Regulations (WMO-No. 49), Volume II, C.3.2 Aeronautical climatology	Required
4.4.1.6 Other	This component covers other WMO standard products, as may be required. Other standard products are to be confirmed, particularly for the hydrological, agricultural and marine domains.	Optional
4.4.2 Climate change indices This subsection represents the recommendations of the Commission for Climatology in the elaboration of climate change indices. The Commission for Climatology/Climate Variability and Predictability (CCl/CLIVAR) Working Group on Climate Change Detection has been coordinating an international effort to develop, calculate and analyse a suite of indices so that individuals, countries and regions can calculate the indices in exactly the same way such that their analyses will fit seamlessly into the global picture. Those indices have been split in two categories: core indices and approved indices. For more information, see the website of the Joint CCl/CLIVAR/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI).
4.4.2.1 Core indices	This component represents the ETCCDI core climate change indices. As at June 2013, 27 core indices have been defined, see: a) ETCCDI website	Required
4.4.2.2 Other indices	This component covers other climate change indices. As at June 2013, different research groups have defined different indices for their particular purposes. One example is the Statistical and Regional dynamical Downscaling of Extremes for European regions (STARDEX) project, mentioned in the ETCCDI website referred to above.	Optional

4.5 Derived climate data
4.5.1 Derived observation data This subsection describes a range of derived observational data products generated from climate observation variables.
4.5.1.1 Homogenized data	This component represents high-quality homogenized time-series datasets. Such datasets aim to ensure that the only variability remaining in the time series is that resulting from actual climate variability. See also the subsection on data homogenization (6.1.3).	Recommended
4.5.1.2 Computed	This component deals with derived data computed from observations for NMHS products. The computation shall be in accordance with the climatology policies in place. Some examples are: 1. Data generation of accumulation, averages or extremes, such as generating ten-day data from daily data or daily data from hourly data. 2. The generation of particular data derived from raw data, such as computing the potential evapotranspiration output for agricultural purposes. 3. The generation of any statistical parameter required for products, such as extreme value analysis, homogenized data and others. 4. Climate indices such as computed teleconnection indices.	Recommended
4.5.1.3 Normals and averages	This component represents any normals and averages used by NMHSs that are in addition to climatological standard normals. For example, the component should be able to compute: 1. Averages over specified time periods (daily, hourly, 5 days, 10 days, monthly and so forth) 2. Period averages for any period (for example 5, 10, 30 or 100 years) For more information, see: a) The Role of Climatological Normals in a Changing Climate (WMO/TD-No. 1377), WCDMP-61	Recommended
4.5.1.4 Other	This component concerns any other derived observation data product not mentioned above that is required for NMHS purposes.	Optional
4.5.2 Gridded spatial distribution of observations This subsection concerns the capacity to generate or manipulate gridded data according to different techniques such as interpolation and extrapolation. Some of these techniques are described in the Guide to Climatological Practices (WMO-No. 100), section 5.9 Estimating data. These gridded data are spatial data. They are included in this section to show their lineage as a type of derived climate data.
4.5.2.1 Analysed data	This component refers to spatially distributed gridded data that have been derived from observational data as the result of an analytical process. Some examples are: 1. Singular variables such as: 1. Normals 2. Observations for a given day or time 3. Averages 4. Percentiles 5. Cumulative data 6. Extremes 7. Homogenized data 2. Multivariables such as: 1. The generation of anomalies (difference between the normals data and a specific monthly variable) 2. More complex data such as potential evapotranspiration	Recommended
4.5.2.2 Other	This component covers any spatially distributed gridded data product not mentioned above that are required for NMHS purposes.	Optional
4.5.3 Numerical models Note: The infrastructure, software and skills required to operate numerical models are undoubtedly beyond the reach of many NMHSs. As the output from numerical models is of interest to most NMHSs, it is expected that such output will be available via a number of sources, including Regional Climate Centres. Therefore, the CDMS should ideally have the ability to work with such data.
4.5.3.1 Numerical models	This component refers to the data output from a variety of climate modelling processes. Such data are generally represented by multidimensional array grids. Some examples are: 1. Climate models (such as global climate models) - numerical representations of the climate system based on physical, biological and chemical rules. They vary on timescales ranging from seasonal to centennial. Climate models are often used to produce climate change projections. 2. Downscaled models – derived from climate models but at a much higher resolution to support regional and local analysis. 3. Reanalysis - designed for climate studies, reanalyses provide gridded data over a long time period. Reanalyses are created via an unchanging (frozen) data assimilation scheme and model(s) which ingest all available observations. This unchanging framework provides a dynamically consistent estimate of the climate state at each time step. Some available reanalysis products include ERA-40 (40 years) and ERA-Interim (1979 to the present) from the European Centre for Medium-Range Weather Forecasts, and the Twentieth Century Reanalysis project (1871–2011) from the National Oceanic and Atmospheric Administration, United States. 4. Numerical weather prediction See also: a) Guide to Climatological Practices (WMO-No. 100), section 6.7 Climate models and climate outlooks	Optional

4.6 Ancillary data This section covers data required to support CDMSs.
4.6.1 Spatial This subsection represents a wide range of spatial information typically used to provide context to climate data or as an input for spatial analysis processes. These data may be presented in vector, image, gridded or multidimensional array data formats. Typically, spatial representations of data contain aspatial attributes that describe various properties of spatial features. The spatial and aspatial attributes of the data can be used to support a variety of spatial analysis processes. The components listed below are indicative of the types of spatial data that could be relevant to climate. This list is not exhaustive. See also: a) Statement of guidance for climate, Attachment 1 Requirements for climate data (Wright, 2012b)
4.6.1.1 Topography	Although this component is labelled topography, it actually refers to a wider set of data. Some examples are: 1. Typical topographic data such as drainage, relief, cultural and nomenclatural features 2. Digital elevation models	Recommended
4.6.1.2 Emergency management	This component concerns datasets that are useful for supporting emergency management and related warning systems.	Optional
4.6.1.3 Agricultural	This component refers to agricultural information datasets. Some examples are: 1. Data from the Food and Agriculture Organization of the United Nations that could relate to agriculture, animal production and health, fisheries, forestry, land and water or plant production and protection. 2. Regional, national or international data from different organizations such as primary industry departments or research centres on agriculture.	Optional
4.6.1.4 Health	The component refers to a wide variety of health datasets. Some examples are: 1. Data from the World Health Organization covering datasets on a very large spectrum. 2. National or international data from different organizations such as health departments or health research centres. 3. Epidemiological studies (see Wikipedia article) and so forth.	Optional
4.6.1.5 Environmental	This component refers to environmental datasets. Some examples are: 1. Data from the United Nations Environment Programme or the United Nations Educational, Scientific and Cultural Organization. 2. National or international data from different organizations such as environment departments or research centres. 3. Data relating to the distribution of particular flora and fauna, etc.	Optional
4.6.1.6 Administrative	This component covers administrative data. Some examples are: 1. Localities and gazetteers 2. Administrative boundaries 3. Transportation networks 4. Cadastres	Recommended
4.6.1.7 Impacts data	This component concerns a range of spatial data that relate to things impacted by climate. This could include: 1. Deaths caused by heatwaves, prolonged droughts, floods, cyclones, etc. 2. Infrastructure damage caused by a range of events such as floods, bush fires or cyclones. 3. Changing land use, such as agricultural adaptations due to a changing climate.	Recommended
4.6.1.8 Other	This component refers to a range of other spatial data that may be relevant to climate.	Optional
4.6.2 Climate documentation
4.6.2.1 Published reports	This component represents the processes and governance arrangements that result in the preparation and release of a wide variety of written reports. Some examples are: 1. Peer-reviewed papers 2. Climate change impact studies 3. Climate statements and studies 4. Assessments from the Intergovernmental Panel on Climate Change 5. Monthly and annual summaries As this is essentially a scientific and intellectual process, this component will not be expanded upon in this publication.	Optional
4.6.2.2 Documentation	This component refers to a range of textual data that describe various climate-related phenomena or serve as documentation for CDMSs. Some examples may be: CDMS technical and user documentation Text on a web page Diagrams representing climate processes, such as the Community Earth System Model from the National Center for Atmospheric Research, United States, shown in section 2.1 of this publication Various climate forecasts and events Climate processes such as El Niño-Southern Oscillation NMHS policies and practices Various documents and reports Training documentation Presentations	Recommended
4.6.2.3 Various media	This component covers a range of media used to support various climate-related services on an NMHS website. Some examples may be: 1. Scanned hard copy climate records 2. Image portrayal of various climate data, such as an extract from a radar image stored in portable network graphics (PNG) format 3. Podcasts and video clips used to communicate various climate-related messages 4. Photographs of various climate-related phenomena	Recommended
4.6.3 Climate software As discussed in the article by the UK Parliament Science and Technology Committee (2011), one of the recommendations of the UK parliamentary review of the Climategate issue was to ensure that climate scientists make available the full methodological workings (including computer codes) used to support their work. An extract is reproduced below. > It is not standard practice in climate science to publish the raw data and the computer code in academic papers. However, climate science is a matter of great importance and the quality of the science should be irreproachable. We therefore consider that climate scientists should take steps to make available all the data that support their work (including raw data) and full methodological workings (including the computer codes). This has implications for the effective management of climate data and software in that software source code will also require careful management. In addition, it will be necessary to keep track of the time period during which each software version is in operation, as this may also have implications for climate data and climate analysis. On a conceptual level, this is similar in a way to the need to have effective observations metadata that describe the maintenance of sensors and stations.
4.6.3.1 Source code management	This component deals with managing the source code of the software used to process climate data. This component should have the following capabilities at a minimum: 1. Maintain a library of a variety of software source code. 2. Manage different versions (or branches) of the software concurrently, with the ability to maintain each version independently and to easily backport newer functionalities to an older version. 3. Easily detect the differences between software versions.	Recommended
4.6.3.2 Package management	This component refers to the functionality that facilitates the packaging of software and its configuration for installation on a computer. In addition, the component should facilitate dependency management to ensure that all required supporting software is also installed and configured appropriately at installation time.	Recommended
4.6.3.3 Environment configuration	This component concerns the functionality that facilitates the recording and management of information relating to any changes to operational software. This includes: 1. What software was deployed on what server? 2. What version of the software was deployed? 3. Details of any configuration changes. 4. Details of any change made to the operational software. 5. Details of the decommission of the software at the end of its period of operation, including decommission date.	Recommended
4.6.3.4 Software testing	This component covers the testing of software that is to be deployed to manipulate climate data. This includes: 1. Details of test plans and individual test cases, including user-acceptance testing. 2. Details of the test data, database, etc. 3. Details of test systems and environment. 4. Details of test results and artefacts, particularly proof that the test data were not affected by the software or a change to the software.	Recommended

5 Climate data management

5.1 Ingest and extract This section covers a very broad set of functionalities relating to the capture and initial processing of observation and related data. In essence, this section involves: 1. Loading data into or extracting data from the climate database. 2. Transforming data as required from one format into another more suitable for data management, analysis and storage.
5.1.1 Data ingest
5.1.1.1 Business rules	This component supports a wide range of user defined business rules that govern how data are ingested into the climate database. Some examples (for observations data) are: 1. Action required when new phenomena are to be ingested but a record already exists in the database for that time period. 1. Should the new record replace the current record in the database or should the new record be rejected? `There is potential for data that have not been quality controlled to overwrite perfectly good quality-controlled data. An example is a message that is reingested and the ingest process does not take into account the possibility that the data already exist in the database and that they have been modified.` Action required when a message arrives for ingest but the message type is not appropriate according to the observations metadata on record for that station. Action required should a message arrive containing an observed value that is outside the accepted bounds for a given phenomenon. For example, a message contains a value of 90°C for temperature, where the maximum accepted temperature is 60°C. Action required should a message arrive that is of a lower order of precedence to one that has already been ingested for the same time period and station. For example: The priority level given to records being ingested may relate to the method of data acquisition. A record that has been keyed in via a quality assurance process may be given a higher priority than a record acquired via a real-time message ingest.	Required
5.1.1.2 WMO messages	This component allows for the import of data from a range of WMO message formats, including TAC and TDCF. As both historical and current data will need to be imported, this component should be able to work with data in a wide variety of past, present (and future) data formats. Some examples are: 1. Binary: 1. BUFR 2. GRIB 2. Alphanumeric: 1. CREX 2. SYNOP 3. TEMP 4. SHIP 5. METAR 6. World Weather Records Note: While TAC formats are being phased out, support for them will still be required by this component to support the ingest of historical data. For more information, see: a) WMO international codes	Required
5.1.1.3 Vector	This component supports the import of a series of vector spatial formats. For example: 1. Shapefile 2. Geography Markup Language (GML) (see OGC GML web page)	Recommended
5.1.1.4 Raster array	This component supports the import of a series of raster array spatial formats. For example: 1. CF-netCDF 2. Hierarchical data format 3. ArcInfo ASCII 4. GeoTIFF	Recommended
5.1.1.5 Other formats	This component covers the import of a range of other formats. For example: 1. Photographs (PNG, JPEG, TIFF, etc.) 2. Scanned documents 3. PDF files 4. ASCII generic formats such as CSV 5. Data managed in spreadsheets 6. Tabular formats, such as the import of data from a relational database management system	Recommended
5.1.1.6 Status log	This component concerns the recording of each ingest activity status in order to: 1. Monitor the ingest job status. 2. Automatically recover failed ingests. 3. Record warning and other error messages to enable manual intervention if required, for example if expected data are not received.	Required
5.1.1.7 Automated with self-recovery	This component supports the automated ingest of a range of ingest types (particularly WMO messages and data from automatic weather stations). The component also allows for the automatic recovery of ingest tasks in the event that a task fails either entirely or part way through an ingest. This could be due to a number of reasons, including: 1. Corrupted messages 2. Network failures 3. Hard disk failures 4. Database failures 5. Upstream data flow disruptions	Recommended
5.1.1.8 Transformation	This component supports the transformation of an ingest record. This may include: 1. Transforming data from one format to another. 2. Transforming codes into formats more suitable for the destination climate database. 3. Correcting records that have been abbreviated in accordance with accepted local observation practice.	Required
5.1.2 Data extraction
5.1.2.1 Data extraction	This component allows data to be extracted from the climate database in accordance with NMHS data policy and governance processes. Data may be transformed into a wide range of formats as described in the subsection on data ingest (5.1.1). Note: This component is only intended for advanced users who have an intimate knowledge of the climate database, its data structures, the relevant data policies and the appropriate use of quality flags and other aspects in order to perform one-off data extraction activities. End-user data extraction is intended to be constrained to defined data types via the climate data delivery services components (Chapter 8), using components under Chapter 7, such as: Tables and charts, Integrated search of climate data and Data download.	Recommended

5.2 Data rescue
5.2.1 Imaging
5.2.1.1 Document imaging	This component supports the functionality required to digitally capture a physical document and store the resultant file and associated discovery metadata, perhaps within the climate database. Some examples of the types of documents to be digitally captured are: 1. Scanned paper observation forms 2. Scanned microfiche/microfilm 3. Relevant observations metadata documents such as instrument calibration reports 4. Technical manuals 5 Site location plans and sections For more information, see: a) Guidelines on Climate Data Rescue (WMO/TDNo. 1210), WCDMP-55	Recommended
5.2.1.2 Optical character recognition	This component provides the functionality required to digitally capture data stored in scanned documents such as hand written and/or typed meteorological observation forms.	Optional
5.2.1.3 Chart digitization	This component refers to the capacity to digitize data from recording cards such as those used with a Campbell-Stokes sunshine recorder, thermograph, barograph or other meteorological instrument. The typical functionality required for this component would be to: 1. Scan a physical recording chart (or card) using the Document imaging component (5.2.1.1). 2. Analyse the image of the chart. 3. Extract numeric points from the chart. 4. Calculate a value for those points. 5. Store the resultant data in the climate database.	Optional
5.2.2 Monitoring
5.2.2.1 Data rescue metrics	This component maintains metrics relating to the capture of historical observations data. These may contain: 1. Name and brief description of data rescue project 2. Countries where activity is taking place 3. Contact person for project 4. Types of data rescued 5. Summary and per cent digitized 6. Summary and per cent scanned 7. Summary and per cent scanned but not digitized 8. Summary and per cent undigitized	Recommended
5.2.3 Data entry This subsection covers the functionality required to enable an appropriately trained and authorized person to manually enter data into the climate database. Typically, this functionality is tightly controlled according to NMHS data governance processes. Some issues to consider are: 1. Data entry staff should only be able to add data to or edit data in the climate database under programme control, with appropriate safeguards in place to protect the integrity of the climate database. 2. Any functionality that provides write access to the database should also include an audit function to allow an independent review of database changes - One example could be the use of database triggers that write the details of a transaction, including the previous values, into a separate set of audit tables. 3. Another approach could be to ensure that the data entry process creates an interim data file that is then entered into the database via data ingest processes, bypassing the need for direct access to the database. 4. NMHS data policy may enforce the need for double entry practices, where two or more operators key in data for the same form, independent of each other, to detect and minimize key-in errors. 5. Careful consideration should be made to ensure that an organization has very effective IT security and monitoring in place prior to allowing key-in access via the Internet. Most organizations will not have suitable controls in place. Therefore, key-in via the Internet should be avoided as a general rule. 6. NMHS data policy should provide guidelines as to appropriate data quality considerations applied to data that are manually entered.
5.2.3.1 Forms	This component covers: 1. The visual design of a form. 2. The software logic that controls the data key-in process. 3. The mapping of fields in the form with appropriate records and tables within the climate database. 4. Ensuring that the integrity of the climate database is protected by validating data before they are added to the database. The component should also support: A custom definition of user input forms that mimic traditional meteorological forms (including the language where appropriate). Efficient and effective data entry that minimizes operator fatigue and automatically calculates appropriate values. The component should provide adequate support for monitoring the validity of data that are entered. Some examples are: 8. Performing data quality consistency checks of the data to be entered. These checks and the appropriate values are to be customizable according to NMHS data policy and governance processes. 9. Ensuring that appropriate data types and context are entered for each field. 10. The component should alert the operator to any doubtful entries detected, providing appropriate advice as per NMHS data policy guidelines.	Required
5.2.3.2 Key entry	This component provides the functionality to support manual key-in of meteorological data.	Required
5.2.3.3 Computation	This component allows for the automatic derivation of parameters at key-in. Such computation should be customizable according to NMHS data policy and governance processes. Some possible scenarios where this functionality may be used are: 1. The computation of a value for relative humidity after the values for dry-bulb temperature and dewpoint have been entered. 2. Decoding shorthand codes and replacing them with appropriate values.	Recommended

5.3 Observations quality control
5.3.1 Quality management For more information, see: a) Guide to Climatological Practices (WMO-No. 100) b) Guide to the Global Observing System (WMO-No. 488), Appendix VI.1 Data quality control, and Appendix VI.2 Guidelines for quality control procedures applying to data from automatic weather stations c) Guidelines on the Quality Control of Surface Climatological Data (WMO/TD-No. 111), WCP-85 d) Guidelines on Climate Data Management (WMO/TD-No. 1376), WCDMP-60 e) Guide on the Global Data-processing System (WMO-No. 305), Chapter 6 Quality control procedures
5.3.1.1 Consistency checks	This component covers a range of tests to ensure that inconsistent, unlikely or impossible records are either rejected or flagged as suspect. A manual investigation may then assess the validity of the suspect values. This component includes the concepts of internal, temporal and summarization consistency checks as discussed in the Guide to Climatological Practices (WMO-No. 100), section 3.4.6 Consistency tests. Some examples are: 1. Is the minimum temperature lower than the maximum temperature? 2. Is the maximum temperature within the historical range for maximum temperatures for a given station?	Required
5.3.1.2 Data comparison	This component covers a series of tests that use and cross-reference data from a number of sources to validate suspect observations. Some examples of datasets that may be cross-referenced are: 1. Observations data showing daily precipitation at a station 2. Radar data covering the station 3. Synoptic forecast charts 4. Satellite imagery	Recommended
5.3.1.3 Heuristic checks	This component refers to a set of tests that rely on experience and knowledge of observation processes, techniques and instrumentation to detect inconsistent, unlikely or impossible records and flag them as suspect. A manual investigation may then assess the validity of the suspect values. Some examples are problems typically caused by: 1. Inexperienced operators. 2. Instruments that are not or are incorrectly calibrated. 3. Operator behaviour or organizational policy, for example not recording rainfall data over a weekend period and aggregating the results on the following Monday. 4. Known deficiencies in observers handling data such as evaporation-related observations. 5. Changes over time caused by changes at an observation site. For example, a shift in the magnitude of wind recorded from a specific direction may be an indicator of a problem at the site location, such as a new building structure or trees obstructing the flow of the wind in that direction.	Required
5.3.1.4 Statistical checks	This component covers a number of tests that statistically analyse historical data to detect inconsistent, unlikely or impossible records and flag them as suspect. A manual investigation may then assess the validity of the suspect values. Some examples are: 1. Climate tests that highlight extreme climatic values, such as a record maximum air temperature. 2. Flatline tests where a constant value exceeds the specified limit in a time series, for example when the station air temperature remains constant for 12 hours. 3. Spike tests conducted in a time series to identify data spikes exceeding a specified limit, for example when a three-hourly air temperature observation is at least 50 degrees colder than all others during the day. 4. Rapid change tests conducted in a time series to identify rapid changes exceeding a specified limit, for example when a 100 cm soil temperature suddenly changes in consecutive 3-hourly observations from a relatively stable 22°C to 38°C for all following observations.	Required
5.3.1.5 Spatial checks	This component covers a range of spatial tests to detect inconsistent, unlikely or impossible records and flag them as suspect. A manual investigation may then assess the validity of the suspect values. Some examples are: 1. Comparing the results of a time series of observations at a given station with those at nearby stations. 2. Using a Barnes or similar analysis to derive spatial patterns against which anomalous and possibly erroneous station values stand out.	Recommended
5.3.1.6 Data recovery	This component refers to the processes, policies, governance arrangements, audit processes, etc., that enable the recovery and insertion of data in the climate database, possibly overwriting existing data. This component involves a number of manual processes undertaken by experienced and well-trained personnel, supported by effective technology, governance and data management processes, to investigate anomalous observations and either accept or reject suspect records. Personnel will typically review and consider a wide range of data in their investigations, such as raw records, synoptic charts, satellite imagery, radar and other types.	Required

5.4 Quality assessment
5.4.1 Observations quality assessment This subsection refers to the processes implemented to help NMHSs assess the quality of observations used by their organization. It covers all stages, from the observation site and expertise level of personnel to the final product distributed to users. The aim of this subsection is to move towards a more objective way of defining the quality of observations data.
5.4.1.1 Siting classification	This component refers to the processes, software, governance mechanisms and analysis that classify sensors according to the rating scale described in the Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), Annex 1.B Siting classifications for surface observing stations on land.	Required
5.4.1.2 Sustained performance classification	This component refers to the processes, software, governance mechanisms and analysis that classify sensors according to their sustained performance over time. The best description found to date on how to determine this classification may be found in Annex III of the final report of the first session of the Commission for Instruments and Methods of Observation Expert Team on Standardization. Note: A more objective approach to developing this classification for the global WMO community is required.	Recommended
5.4.1.3 Multilayer quality flags	This component refers to the processes, software, governance mechanisms and data analysis used to understand and enumerate the quality flags of a specific record of data. This will facilitate: 1. Future analysis that requires data of a specific quality flag value. 2. Communication on the assessed quality of records. The best description to date on how to define this classification may be found in the Guide to Climatological Practices (WMO-No. 100), pp. 3–8 to 3–9. This reference describes a way of flagging quality based on a combination of: Data type (original, corrected, reconstructed or calculated) Validation stage Acquisition method This approach is still quite limited. It does not provide a clear way of determining just what level of quality control a record has been subjected to. While the classifications are relevant and relate to the perceived quality of a record, they do not allow for an explicit comparison of data of similar perceived quality. For example, the subsection on quality management (5.3.1) describes a series of classifications of tests (without providing actual details). If a record has passed all such tests, can it be considered to be better quality than one that has not passed any test? Objective quality classifications are required to support a consistent approach within the global WMO community so that data can be: Objectively compared to ensure that data of similar quality can be compared and analysed as required. Stored and easily retrieved from a climate database. It is becoming increasingly apparent that organizations will need to retain observations at multiple levels of quality from the raw observation through various edit and analysis processes in order to demonstrate the true lineage of a record and explain and justify the changes made to the raw observations. Note: A more objective approach to determining this classification for the global WMO community is required.	Required
5.4.1.4 Climate observation quality classification	This component refers to the processes, software, governance mechanisms and data analysis used to understand and enumerate the quality of a specific record of data relative to an objective index. This index will need to combine a number of criteria relevant to data reliability and quality. Note: This index has yet to be created. For the purposes of this publication, it is called the climate observation quality classification. However, this name may change. It is envisioned that this index will need to take into account a number of factors, including: 1. Siting classification 2. Sustained performance classification 3. Regular maintenance and calibration of sensor 4. Sensor reliability 5. Uncertainty inherent in observations 6. Observation quality control processes 7. Multilayer quality flags 8. Lineage 9. Homogeneity 10. Other appropriate factors See also the summary of findings of the seventh Data Management Workshop of the European Climate Support Network (ECSN) held at the Danish Meteorological Institute, in particular: Noting that “everybody” talks about different levels of Quality Control [QC] and (almost) nobody uses the same wording or nomenclature – it is recommended that an overview of QC nomenclature in ECSN is worked out. It might be considered if such an overview could form the basis for a recommended set of QC wordings. (Kern- Hansen, 2009)	Optional
5.4.2 Derived-data quality assessment
5.4.2.1 Derived-data quality assessment	This component refers to the processes, software, governance and data analysis processes used to understand and enumerate the quality of derived data relative to an objective index. There are many factors that can influence the quality of derived data. Some issues to consider are: 1. What is the quality of the source data? 2. What algorithms have been applied to the source data to arrive at the derived data? 3. What is the impact of these algorithms on the quality of the derived data? 4. If the derived dataset is spatial, how has the positional location of the data been derived? 1. What is the quality of the source spatial data? 2. What is the impact of the algorithms used to spatially distribute the data on the positional accuracy of the derived data? For more information, see also the Derived data component (5.4.4.2). Note: This index has yet to be created. For the purposes of this publication, it is called the derived-data quality assessment. However, this name may change.	Optional
5.4.3 Quality assurance metrics
5.4.3.1 Quality assurance metrics	This component refers to the processes, software, governance mechanisms and analysis used to monitor the performance of quality assurance processes. Such monitoring will allow network managers and climate data specialists to validate the performance of quality assurance software and processes. This can be done, for example, by reviewing automatically generated reports that: 1. Summarize observational errors detected by each quality assurance test. 2. Summarize false positives and valid errors detected. 3. Compare the performance of current quality assurance metrics with historical averages. These types of metrics can also help data and network managers improve quality assurance processes and software.	Recommended
5.4.4 Uncertainty This subsection refers to the processes, software, governance processes and data analysis used to understand and record the uncertainty inherent in the data. As noted in the OGC Abstract Specification: Geographic Information – Observations and measurements (p. 13), all observations have an element of uncertainty: > The observation error typically has a systematic component, which is similar for all estimates made using the same procedure, and a random component, associated with the particular application instance of the observation procedure. If potential errors in a property value are important in the context of a data analysis or processing application, then the details of the act of observation which provided the estimate of the value are required. This functionality will support: 1. Future statistical analysis that takes into account the uncertainty inherent in data. 2. Communication of data uncertainty. For more information, see Wikipedia articles on: a) Uncertain data b) Uncertainty
5.4.4.1 Measurements	This component refers to the processes, software, governance mechanisms and data analysis used to understand and record the uncertainty inherent in observation measurements and processes. The Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8) provides a number of examples per meteorological variable. For more information, see: a) Guide to Meteorological Instruments and Methods of Observation (WMO-No. 8), Annex 1.D Operational measurement uncertainty requirements and instrument performance b) Annex III of the final report of the first session of the Commission for Instruments and Methods of Observation Expert Team on Standardization	Required
5.4.4.2 Derived data	This component refers to the processes, software, governance mechanisms and data analysis used to understand and record the uncertainty inherent in gridded data that have been derived from observation data. Many factors can contribute to the uncertainty inherent in gridded derived data. Some examples are: 1. Uncertainty inherent in the source observations data. 2. Uncertainty inherent in the location of sensors/stations used to generate the grids. 3. The relative accuracy of the algorithms used to generate the derived data. 4. The precision of variable data types used in the software that generates derived data. It is also worth noting that a number of these factors may propagate through the data derivation process.	Optional

5.5 Climate metadata
5.5.1 Manage climate metadata This subsection refers to the processes, software, governance mechanisms and data analysis required to effectively manage climate metadata, which include metadata on observations, discovery and data provenance. Note: This subsection is deliberately kept generic in this version of the CDMS Specifications, with all types of climate metadata bundled together. While the Create and Maintain components (5.5.1.1 and 5.5.1.2, respectively) are classified as recommended, in reality data provenance metadata has not yet been adequately defined. Therefore, a pragmatic approach would rightly not address the creation and maintenance of data provenance metadata until this has been rectified. The creation and maintenance of discovery and observations metadata, however, is required. For more information on climate metadata, see section 4.3 of this publication.
5.5.1.1 Create	This component refers to the processes, software and governance processes needed to effectively and efficiently create climate metadata.	Recommended
5.5.1.2 Maintain	This component covers the processes, software and governance mechanisms required to effectively and efficiently maintain climate metadata.	Recommended
5.5.1.3 Quality control	This component deals with the processes, software and governance processes needed to effectively and efficiently assess and control the quality of climate metadata. More work is required to provide effective guidance on this component.	Recommended
5.5.1.4 Metrics	This component refers to the processes, software and governance processes required to effectively and efficiently maintain metrics relevant to climate metadata. Some examples are: 1. Which stations or sensors do not have observations metadata records? 2. Which datasets do not have discovery metadata records? More work is required to provide effective guidance on this component.	Recommended

6 Climate data analysis

6.1 Analysis This section describes a series of processes used to analyse climate data. More work will be required to expand on this section in future revisions of this Specification.
6.1.1 Climate modelling
6.1.1.1 Numerical models	This component represents the software, processes and governance mechanisms that provide numerical models such as general circulation models (GCMs), also known as global climate models. A Wikipedia article defines general circulation models as: [A] mathematical model of the general circulation of a planetary atmosphere or ocean and based on the Navier–Stokes equations on a rotating sphere with thermodynamic terms for various energy sources (radiation, latent heat). These equations are the basis for complex computer programs commonly used for simulating the atmosphere or ocean of the Earth. Atmospheric and oceanic GCMs (AGCM and OGCM) are key components of global climate models along with sea ice and land-surface components. GCMs and global climate models are widely applied for weather forecasting, understanding the climate, and projecting climate change. Such climate numerical models could be global in scale (like GCMs) but they could also be regional, with generally a higher precision. Downscaling techniques are often used when creating regional models. Models can be based on: 1. The rules of physics, biology and chemistry 2. Statistical rules 3. A mix of dynamic and statistic methods The use of climate models includes: Simulation of the present climate Simulation of the future climate Palaeoclimate reconstruction Seasonal forecasts Decadal prediction Note: Most NMHSs would not have the resources needed to effectively manage the infrastructure and software required to support this component. However, the data output from such components may be useful and should be available for NMHSs via a number of sources, including Regional Climate Centres. For more information, see: a) Wikipedia article on general circulation models	Optional
6.1.1.2 Reanalysis	This component concerns the software, processes and governance mechanisms that establish “a meteorological data assimilation project which aims to assimilate historical observational data spanning an extended period, using a single consistent assimilation (or “analysis”) scheme throughout”. (See Wikipedia article on meteorological reanalysis)	Optional
6.1.1.3 Model ensembles	This component refers to the software, processes and governance mechanisms used to aggregate data from: 1. A number of GCMs to produce products that portray a range of model forecasts. 2. A series of runs of the same model.	Optional
6.1.2 Generate derived data from climate observations
6.1.2.1 Spatial analysis	This component represents the software, processes and governance tools that handle a very wide variety of raster and vector spatial analysis techniques. Some examples are: 1. Generating grids that show the spatial distribution of observations of a phenomenon such as precipitation. 2. Generating grids that represent the distribution of the average maximum temperature for the month of May for climatological standard normals. 3. Generating grids that represent the distribution of the maximum temperature anomalies for May 2010 when compared to the climatological standard normal. 4. Selecting all meteorological stations located within a 10 km radius around a national administrative boundary.	Recommended
6.1.2.2 Image analysis	This component covers the software, processes and governance tools that handle a very wide range of image analysis techniques. Some examples are: 1. Processing remotely sensed satellite imagery to measure the relative solar reflectance of a satellite image, determine the cloud cover within a scene or generate a normalized difference vegetation index to measure vegetation greenness. 2. Processing ground-based radar imagery to detect rain and storm activity.	Recommended
6.1.2.3 Time-series analysis	This component concerns the software, processes and governance mechanisms that analyse timeseries data using a very broad range of analysis techniques. Some examples are the analysis required to produce: 1. WMO standard products such as extremes, standard normals, World Weather Records and climate change indices. 2. A variety of derived observations data. For more information, see: a) Guide to Climatological Practices (WMO-No. 100), Chapter 4 Characterizing climate from datasets, and Chapter 5 Statistical methods for analysing datasets	Recommended
6.1.2.4 Teleconnection indices	This component represents the software, processes and governance processes used to analyse, record and manage data representing teleconnections and major climate indices such as the El Niño-Southern Oscillation and the Southern Oscillation Index. According to an article on Wikipedia, “[t]eleconnection in atmospheric science refers to climate anomalies being related to each other at large distances (typically thousands of kilometers)”.	Optional

7 Climate data presentation

7.1 Graphical user interface – time-series data exploration
7.1.1 Tables and charts This subsection represents the technology, software, processes and governance mechanisms suitable for generating a broad array of tabular and graphical reports to effectively communicate issues relating to climate data.
7.1.1.1 Tables	This component refers to the technology, software, processes and governance processes suitable for generating a wide variety of tabular reports to effectively communicate issues relating to climate data.	Recommended
7.1.1.2 Graphs	This component concerns the technology, software, processes and governance processes suitable for generating a large variety of graphs to effectively convey climate data issues. Graphs could be presented in a wide array of formats, including: 1. Scatter plots 2. Histograms 3. Windroses 4. Time-series graphs using one or more variables For more information, see: a) Guide to Climatological Practices (WMO-No. 100), Chapter 6 Services and products	Recommended
7.1.2 Manage content
7.1.2.1 Manage content	This component covers the technology, software, processes and governance processes suitable for generating a wide variety of content to effectively communicate issues relating to climate data. This includes: 1. Preparing texts, documents and data for effective web presentation. 2. Using technology such as content management systems to simplify web content presentation. 3. Implementing effective governance processes that review, validate and authorize content prior to being published.	Recommended
7.1.3 Visualization
7.1.3.1 Cartography	This component represents the technology, software, processes and governance processes suitable for generating a wide variety of cartographic output to effectively convey climate data issues. It includes: 1. Spatial data preparation 2. Cartography 3. Simple point-and-click web maps	Recommended
7.1.3.2 3D	This component provides the technology, software, processes and governance mechanisms suitable for visualizing and exploring climate data and issues within a 3D environment.	Optional
7.1.3.3 Media viewer	This component refers to the technology, software, processes and governance processes that enable various media to be displayed within the graphical user interface. Some examples are: 1. Photographs 2. Diagrams 3. Scanned documents such as scanned station records 4. Videos 5. Recorded audio media	Recommended
7.1.4 Integrated search of climate data
7.1.4.1 Spatial intelligence	This component represents the technology, software, processes and governance processes that support an effective and dynamic analysis of climate data within a web environment to facilitate understanding of climate matters and communicate issues relating to climate data. This dynamic analysis includes: 1. Geographical Information System (GIS) functionality, including the ability to perform spatial overlay analysis such as selecting points in a polygon. 2. The ability to search features by attribute, for example: 1. Conducting a search of all stations within the catchment of a specific river. 2. Filtering the resultant stations to view only those that observe precipitation. 3. Viewing summary observations data for each of those stations. This component integrates into a map-based interface a wide range of time-series data including climate observations, climate grids, satellite imagery and topography, together with appropriate textual and other attribute data, such as climate metadata. It also facilitates dynamic data exploration and analysis using a broad array of integrated media, including maps, charts, graphs, tables and written reports.	Recommended
7.1.4.2 Integrated search of observations (metadata and data)	This component concerns the functionality that allows an end-user to conduct an integrated search of the climate database and the observations metadata catalogue. The search results will contain observations data and observations metadata. Some examples are: 1. Determining what observations data are available based on a set of parameters and viewing the results in a table. These parameters may include: 1. Station 2. Sensor or procedure 3. Phenomena 4. Data quality (based on quality flags, the climate observation quality classification or other method) 5. Time period 6. A variety of other observation metadata parameters 2. Reviewing observations metadata for selected stations. 3. Determining what datasets provide the actual observations data for a given station, sensor and phenomenon combination, together with the URL of the relevant discovery metadata records. The discovery metadata records will in turn provide the URLs of any services providing dynamic access to the data. An example could involve searching for stations that use both a tipping bucket raingauge and manual methods to observe rainfall. For more information, see: a) Section on climate metadata (4.3) b) Subsection on observations metadata (4.3.1) c) Observations metadata catalogue component (8.2.1.2) d) Linked data component (8.2.2.1)	Required
7.1.4.3 Search discovery metadata	This component refers to the functionality that] allows an end-user to search the CDMS discovery metadata catalogue to: 1. Determine what datasets are managed by the NMHS. This search may be limited to datasets that are available publicly or those that are only available for internal use. 2. Search for datasets in accordance with WIS parameters, categories and keywords. 3. Review discovery metadata records that adequately describe a dataset to enable searchers to determine whether it is suitable for their particular use. 4. Determine the URL that can be used to access online services that host the dataset for dynamic access and data download. This same component could be used to search WIS metadata catalogues. For more information, see: a) Section on climate metadata (4.3) b) Subsection on dataset discovery metadata (4.3.2) c) Discovery metadata catalogue component (8.2.1.1) d) Linked data component (8.2.2.1)	Recommended
7.1.4.4 Search data provenance metadata	This component provides the functionality that allows an end-user to search the CDMS data provenance metadata catalogue to: 1. Broadly determine the lineage of a dataset, including the processes the dataset has been subjected to. 2. Trace the provenance of individual records in detail, taking into account: 1. What was changed? 2. What was it derived from? 3. When was it changed? 4. What was done to change it? 5. How and why was it changed? 6. Who changed it? 7. Who did they act on behalf of (if applicable)? 8. Who authorized the change? For more information, see: a) Section on climate metadata (4.3) b) Subsection on data provenance (4.3.3) c) Data provenance metadata catalogue component (8.2.1.3) d) Linked data component (8.2.2.1)	Optional
7.1.5 Data download
7.1.5.1 Data download	This component represents the functionality enabling end-users to download climate data. This component is related to the climate data delivery components (Chapter 8) and data discovery registers.	Required

8 Climate data delivery services

8.1 Open spatial standards This section has been included because open spatial standards are seen as a mechanism that is being increasingly adopted by many organizations and industry sectors around the world. These types of services underpin global attempts at making data easily accessible through initiatives such as the Global Earth Observation System of Systems (GEOSS) (see GEOSS web page). The WMO Information System can be considered as a component of GEOSS. Open spatial standards are being increasingly supported by a wide range of off-the-shelf software, including traditional desktop GIS software, making it easier for users to access data that are served via such services. This is particularly important for CDMSs, as there is a large potential user base that does not routinely use climate data but could benefit from having a reliable source from which to take the data to ensure consistent use across industry. Some examples of growing interest in climate data can be found in sectors such as agriculture, emergency services, aquaculture, fishing, tourism, transportation, health and environment. Such industries typically do not understand WMO data formats such as BUFR or GRIB, nor do they understand how to exploit data delivered in such formats. While developing countries may not have reliable access to the Internet to take advantage of external open spatial services, it would certainly be possible to use them within their own internal local area networks, particularly as a means to visualize their data. In short, open spatial standards are expected to become an increasingly important mechanism for distributing data in future years. Note: The open spatial components presented below are indicative of the types of standards and services that are available and appropriate for the delivery of climate data. These components are not intended to be exhaustive as there are many more services and standards that are also relevant. It is anticipated that this will be expanded upon in future revisions of this publication. For more information, see: 1. The Climate Challenge Integration Plugfest 2009 executive summary video, which describes the results of a global collaborative project demonstrating how climate data could be used via open spatial standards. 2. The Open-source Geospatial Foundation overview of OGC standards (see OSGeo Live). 3. The Geonovum wiki, which provides an overview of open spatial standards. This wiki is an initiative of the National Spatial Data Infrastructure executive committee in the Netherlands. 4. The OGC Abstract Specifications, which provide a more detailed theoretical overview of the theory and rationale underpinning open spatial standards. 5. The ISO/Technical Committee 211 Advisory Group on Outreach Standards Guide: ISO/TC 211 Geographic Information/Geomatics.
8.1.1 Open Geospatial Consortium services
8.1.1.1 Web Map Services	This component represents technology suitable for the distribution of a wide range of climate data via a Web Map Service (WMS). In essence, a WMS provides a map view of data distributed via a georeferenced image. For more information, see: a) OGC WMS documentation b) The OGC Meteorology and Oceanography Domain Working Group paper on the use of WMS with time-dependent and elevation dependent data	Recommended
8.1.1.2 Web Feature Services	This component represents technology suitable for the distribution of a broad range of vector climate data via a Web Feature Service (WFS). In essence, a WFS could provide vector and tabular climate data, which could be presented in a number of formats such as GML (see OGC GML web page) or Environmental Systems Research Institute (ESRI) shapefile. Some WFS implementations can serve data constrained by a logical data model (also known as an application schema). It may also be possible to enable WFS server software to provide meteorological observation data via WMO formats such as BUFR. There may be issues with serving time series via WFS. There is discussion that a Sensor Web Service may be better for time-series observations. For more information, see: a) OGC WFS documentation	Recommended
8.1.1.3 Web Coverage Services	This component represents technology suitable for the distribution of a wide range of gridded and array climate data via a Web Coverage Service (WCS). In essence, a WCS provides the actual gridded or array data. Future versions of WCSs are intended to support logical data models defined as application schemas. For more information, see: a) OGC WCS documentation	Recommended
8.1.1.4 Sensor Web Enablement Services	This component represents a range of technological tools suitable for the distribution of a wide variety of observational data and related metadata. There are a number of services that will become available when work stabilizes on these standards. Sensor Web Services are typically used with data that conform to the observations and measurements data model. This model is well suited to time-series data. Therefore, Sensor Web Services may be an appropriate mechanism for serving time-series climate data in the future. For more information, see the OGC documentation for: a) Observations and measurements b) Sensor Observation Services	Optional
8.1.1.5 CF-netCDF	This component involves technology suitable for the provision of a wide variety of gridded or array scientific data written as netCDF files that support the conventions for climate and forecast metadata³. For more information, see: a) Wikipedia article on netCDF b) OGC netCDF standards suite c) The OGC CF-netCDF core and extensions primer ³ In this context, the term metadata refers to a set of fields in the header of a netCDF file that describe the context and format of the array data contained in the CF-netCDF file.	Recommended
8.1.1.6 Geosynchronization	This component concerns a series of technological tools that enable a data publisher to distribute a data product in an environment that supports managed change to the source data. In theory, end-users could subscribe to a data source and have changes to the data replicated in their copy of the data. In summary, geosynchronization services are expected to support several processes, including: 1. Allowing interested parties to subscribe to an authoritative data source. 2. Data entry with validation. 3. Notifying interested parties of changes. 4. Allowing replication of the data provider’s features. It is anticipated that geosynchronization services will support crowdsourced data and processes in addition to authoritative data sources. At present, geosynchronization services are being developed with the current version of a standard that only serves data via a WFS. It is expected that geosynchronization will support additional services in the future, including WCS and WMS. For more information, see: a) OGC overview of geosynchronization services b) The OWS 7 engineering report on the test of geosynchronization services	Optional
8.1.1.7 Web Processing Services	This component covers a range of technological instruments that provide a standards-based framework for developing spatial processing services that operate via an internal network or the Internet. This standard is being used by a number of open-source projects and vendors to develop the building blocks that will support a wide range of spatial analytic processes. The latest version of the standard is being developed to support both synchronous and asynchronous Web Processing Services (WPS). This standard has considerable potential for future CDMS use, for example: 1. To enable an NMHS to establish a suite of services to process and analyse climate data within a services-oriented architecture. 2. To enable a future service-provider to offer CDMS-related services via “the cloud”. For more information, see: a) WPS Wikipedia entry b) OGC WPS documentation	Optional
8.1.1.8 Symbology Encoding	This component represents a range of technological tools that provide rules and a standardized approach for defining alternative visual portrayals of spatial data via an internal network or the Internet. Symbology Encoding, together with Styled Layer Descriptors (SLDs), can be used with WMSs, WFSs and WCSs to enable user-defined symbolization of spatial data from within a collection of published styles. As an example, it is possible to publish several colour classification schemes for a gridded dataset, allowing end-users to select one that is appropriate for their use. For more information, see: a) OGC Symbology Encoding documentation	Optional
8.1.1.9 Styled Layer Descriptors	This component represents a range of technological tools that provide rules and a standardized approach for defining alternative visual portrayals of the spatial data via an internal network or the Internet. Styled Layer Descriptors, together with Symbology Encoding, can be used with WMSs, WFSs and WCSs to enable user-defined symbolization of spatial data from within a collection of published styles. For more information, see: a) OGC SLD documentation	Optional
8.1.2 Geography Markup Language application schema Application schemas (as defined in ISO 19109:2005 Geographic information – Rules for application schema) provide an abstract representation of the content and structure of information resources. The Climate observations application schema component (4.2.3.2) outlines the use of application schemas specifically developed for the climate domain. These abstract representations of information provide the basis for deriving concrete encodings or data formats that allow the information to be serialized for exchange between systems and organizations. Work is proceeding within WMO to harmonize existing TDCFs (FM 92 GRIB Edition 2 and FM 94 BUFR Edition 4) with WMO METCE (described in the METCE component (4.2.3.1)), with the intent to bind those existing data formats to explicit, well-understood semantics. In addition, the application schema can be used to develop XML-based data encodings using widely supported open standards for geographic information. The ISO 19136:2007 Geographic information – Geography Markup Language (GML) standard provides rules for serializing the abstract data model expressed as an application schema via XML encoding to create a GML application schema. In summary: 1. An application schema can be thought of as a logical data model. For more information, see the subsection on WMO logical data models (4.2.3). 2. A GML application schema is a physical model that is derived from a logical data model published using a particular technology - which in this case is GML. For an example, see the Combined climate observations and metadata component (8.1.2.1) below. Deriving a GML application schema from an application schema developed specifically for the climate domain (see Climate observations application schema component (4.2.3.2)) is anticipated to make climate data far more readily consumable for a broader community of users such as those interested in determining the impacts of climate change. While work has been conducted for several years, this task is still in its early stages as at December 2013. It is expected to take a number of years to complete. For an overview of how a logical data model (and associated application schema) could be used with climate data, see Bannerman (2012), pp. 20–26. For a more detailed description of what has been done to date, see Tandy (2013a), which provides an overview of the direction that WMO logical data model work is taking with regard to METCE.
8.1.2.1 Combined climate observations and metadata	This component represents the technology, software, processes and governance needed to support the transmission, consumption and processing of combined climate observations and associated metadata via a future climate observations application schema (or similar name) derived from: 1. The schema outlined in the Climate observations application schema component (4.2.3.2) 2. The schema outlined in the METCE component (4.2.3.1) As discussed in the section above (8.1.2), this work is currently embryonic. However, it is anticipated that it will become a key technological mechanism for exchanging climate observations and metadata in future years in support of data interoperability and platform independence. For more information, see: a) Above section (8.1.2) b) METCE component (4.2.3.1) c) Climate observations application schema component (4.2.3.2)	Optional
8.1.2.2 Taxonomies and registers of authoritative terms	This component is related to the previous component. It represents the technology, software, processes and governance needed to develop an authoritative definition of the concepts and terms referenced in a logical data model such as a future climate observations application schema (or similar name), and to enable the publication of such terms. Following the work of the Task Team on Aviation XML, WMO has established the WMO Codes Registry, which provides a web-based publication of terms from the Manual on Codes (WMO-No. 306). The current coverage of terms is sparse - only covering the aviation-related terms required by the sponsoring activity - but there is commitment from WMO to populate the remaining code tables. The WMO Codes Registry provides a well-defined programmatic application programming interface (API) alongside the web application. Where the need arises for publication of locally managed terms, it is recommended that the registry API be supported. An open-source reference implementation of the registry software is available. For more information, see: a) Above section entitled GML application schema (8.1.2) b) METCE component (4.2.3.1) c) Climate observations application schema component (4.2.3.2) d) Overviews of the WMO Codes Registry (Tandy, 2013b, 2013c)	Optional

8.2 Data discovery
8.2.1 Climate metadata catalogues
8.2.1.1 Discovery metadata catalogue	This component refers to the technology and processes that create a discovery metadata catalogue. This catalogue is used to publish an organization’s data holdings as discovery metadata records, with corresponding records describing which online services may be used to access each dataset. A discovery metadata catalogue allows an organization to participate within the WIS environment.	Required
8.2.1.2 Observations metadata catalogue	This component refers to the technology and processes that create the observations metadata catalogue used to publish an organization’s observations metadata records. It is anticipated that the climate database will be used to store and manage observations metadata. This component will serve as an IT catalogue service for observations metadata. More work is required to define this component.	Optional
8.2.1.3 Data provenance metadata catalogue	This component refers to the technology and processes that create the data provenance metadata catalogue used to publish an organization’s data provenance metadata records. It is anticipated that the climate database will be used to store and manage data provenance metadata. This component will serve as an IT catalogue service for data provenance metadata. More work is required to define this component.	Optional
8.2.2 Linked data
8.2.2.1 Linked data	This component supports semantic search requests such as those used with linked data. This is an emerging requirement that is building considerable momentum within information management communities. The Australian Climate Observations Reference Network – Surface Air Temperature (ACORN-SAT) dataset, published by the Australian government at http://lab.environment.data.gov.au/ provides an example of how linked data may be used for publishing climate data. More work is required to define this component, including its relationship to the approaches adopted by WIS. For more information, see: a) A presentation on linked data (Berners-Lee, 2009) b) Article on the ACORN-SAT linked climate dataset (Lefort et al., 2013)	Optional

8.3 Other formats
8.3.1 Open-source Project for a Network Data Access Protocol
8.3.1.1 OPeNDAP	This component represents technology suitable for the distribution of a wide range of scientific data via the Data Access Protocol (DAP). DAP is used in the world’s scientific community to allow Internet access to a range of scientific data. The protocol does not support the concept of spatial reference systems as defined in the OGC Abstract Specification on spatial referencing by coordinates. This makes it very difficult to reliably integrate data hosted via DAP with other spatial datasets, including those hosted via open spatial services. Therefore, it is considered more appropriate for cases in which researchers and scientists only want the data for numerical analysis. For more information, see: a) OPeNDAP Wikipedia page b) DAP 2.0 Specification	Optional
8.3.2 WMO formats
8.3.2.1 WMO formats	This component represents technology suitable for the distribution of a wide range of climate data via traditional WMO formats. For more information, see: a) FM 94 BUFR Edition 4 b) FM 92 GRIB Edition 2 Other formats also exist, but it is anticipated that they will be phased out in time. For more information, see: c) Manual on Codes (WMO-No. 306), Volume I.2 d) Wikipedia articles on BUFR and GRIB	Required

9 Core IT infrastructure

9.1 Application infrastructure
9.1.1 Identity management
9.1.1.1 Directory	This component provides directory services such as the Lightweight Directory Access Protocol or Active Directory to manage user credentials and details.	Recommended
9.1.1.2 Identity and access management	This component supports policies and functionalities that enable granular user access to the organization’s IT resources and data.	Required
9.1.2 Collaboration
9.1.2.1 E-mail	This component provides secure e-mail access and includes functionalities such as filtering for malware and spam.	Required
9.1.2.2 FTP	This component provides secure services to allow exchange of climate data via the use of the File Transfer Protocol (FTP).	Required
9.1.2.3 Wiki	This component supports a collaborative web environment allowing any member of a team to easily edit content.	Recommended
9.1.3 Web platform
9.1.3.1 Web server	This component provides functionalities that deliver web content to web browsers. In addition to the web server platform, it also refers to services required to support web applications.	Recommended
9.1.3.2 Proxy server	This component routes web traffic and acts as a load balancer and a reverse proxy server to contribute to secure connections to the web server.	Recommended
9.1.4 Database
9.1.4.1 Tabular	This component represents database technology suitable for the storage of a wide range of timeseries climate data in tabular format, typically within a relational database.	Required
9.1.4.2 Spatial	This component deals with technology used to spatially enable time-series climate data, typically within a relational database. The component may consist of a functionality that spatially enables the tabular database component, or it could be a dedicated spatial database that is closely aligned to the climate data stored within the tabular database.	Recommended

9.2 Service operations
9.2.1 Service operations management
9.2.1.1 Scheduling	This component represents technology and processes used to ensure that software processes can be scheduled to run at specific times over a 24-hour basis. This functionality supports activities such as regular data ingest, quality assurance operations, data analysis, derivation and backups. There is also a requirement in more advanced environments to ensure that the dependencies between scheduled jobs are managed.	Required
9.2.1.2 Service desk	This component represents the functionalities, processes and software required to provide support for service operations4, including: 1. Incident and event management to ensure that if an unplanned interruption to an IT service occurs, normal service operation is returned as soon as possible. 2. Problem management to ensure that the root causes of problems are found and where possible, rectified.	Recommended
9.2.1.3 Applications management	This component covers the functionalities, processes and software required to provide application administration tasks and second- and third-level support for CDMS services. Any new IT system implementation and any change to existing IT systems must be undertaken in accordance with the section on governance (3.2).	Required
9.2.1.4 Systems management	This component refers to the functionalities, processes and software required to provide systems administration tasks and second- and third-level support for CDMS services. Any new IT system implementation and any change to existing IT systems must be undertaken in accordance with the section on governance (3.2).	Required

9.3 Computing infrastructure
9.3.1 Networks
9.3.1.1 Internet	This component covers the infrastructure required to support access to the Internet. This includes routers, switches, firewalls, internet service providers, etc.	Required
9.3.1.2 WMO WIS/GTS	This component concerns the infrastructure required to access the WMO Global Telecommunication System. This is essentially a private wide-area network.	Required
9.3.1.3 Internal networks	This component refers to the infrastructure required to support local area networks. This includes switches, firewall(s), services such as domain name servers or the Dynamic Host Configuration Protocol.	Recommended
9.3.1.4 VPN	This component concerns a virtual private network (VPN), which allows a private network to be set up across the publicly available Internet making use of tunnelling and security features. This can result in relatively secure communications.	Recommended
9.3.2 Computing platform
9.3.2.1 Hardware	This component covers all computing hardware, including servers and desktop computers. Organizations are increasingly using virtualization to allow several virtual servers to be deployed on a single physical server, as a way of minimizing hardware and operational costs while increasing operational efficiency.	Required
9.3.2.2 Operating system	This component concerns the operating system required to support computing operations.	Required
9.3.2.3 High-performance computing	This component covers the advanced computing functionalities needed to support highperformance computing, including clusters and grids.	Optional
9.3.3 Security
9.3.3.1 Security	Security is actually an aspect of all components, but is included here for clarity. All software and systems should be implemented with security in mind in order to protect the integrity of climate related systems and data. This component does not just refer to IT security but also to physical security, such as preventing the theft of a server and the resulting loss of the climate database.	Required
9.3.4 Data storage
9.3.4.1 Storage media	This component involves the provision of sufficient storage media to cover operational activities, including the storage of climate data, systems, archives, backups and disaster recovery materials.	Required
9.3.4.2 Data archival	This component handles the secure archival of historical data to ensure that it is available for future generations.	Required
9.3.4.3 Backups	This component covers the regular operational backup and restoration of data and systems.	Required
9.3.5 Disaster recovery
9.3.5.1 Disaster recovery	This component refers to the disaster recovery and business continuance policies, processes, plans and systems required to ensure that CDMS systems and climate data can be recovered in the event of an unforeseen incident. This could be as simple as a server malfunction or as complex as an organization’s city being destroyed due to some unexpected event such as an earthquake, tsunami or military action. This is why off-site storage of backups is required.	Required

3 Governance of climate data management systems

3.1 Data policy

3.1.1 Commitments

3.1.2 Sustainability

3.1.3 Intellectual property

3.1.4 Data delivery

3.1.5 Third-party data

3.1.6 Climatology policy

3.2 Governance

3.2.1 Data governance

3.2.2 IT governance

4 Time-series climate data

4.1 Observations data

4.1.1 Climate observations

4.2 Logical data models

4.2.1 Climate database

4.2.2 Foundation standards

4.2.3 WMO logical data models

4.3 Climate metadata

4.3.1 Observations metadata

4.3.2 Dataset discovery metadata

4.3.3 Data provenance

4.4 WMO standard products

4.4.1 Observation data products

4.4.2 Climate change indices

4.5 Derived climate data

4.5.1 Derived observation data

4.5.2 Gridded spatial distribution of observations

4.5.3 Numerical models

4.6 Ancillary data

4.6.1 Spatial

4.6.2 Climate documentation

4.6.3 Climate software

5 Climate data management

5.1 Ingest and extract

5.1.1 Data ingest

5.1.2 Data extraction

5.2 Data rescue

5.2.1 Imaging

5.2.2 Monitoring

5.2.3 Data entry

5.3 Observations quality control

5.3.1 Quality management

5.4 Quality assessment

5.4.1 Observations quality assessment

5.4.2 Derived-data quality assessment

5.4.3 Quality assurance metrics

5.4.4 Uncertainty

5.5 Climate metadata

5.5.1 Manage climate metadata

6 Climate data analysis

6.1 Analysis

6.1.1 Climate modelling

6.1.2 Generate derived data from climate observations

7 Climate data presentation

7.1 Graphical user interface – time-series data exploration

7.1.1 Tables and charts

7.1.2 Manage content

7.1.3 Visualization

7.1.4 Integrated search of climate data

7.1.5 Data download

8 Climate data delivery services

8.1 Open spatial standards

8.1.1 Open Geospatial Consortium services

8.1.2 Geography Markup Language application schema

8.2 Data discovery

8.2.1 Climate metadata catalogues

8.2.2 Linked data

8.3 Other formats

8.3.1 Open-source Project for a Network Data Access Protocol

8.3.2 WMO formats

9 Core IT infrastructure

9.1 Application infrastructure

9.1.1 Identity management

9.1.2 Collaboration

9.1.3 Web platform

9.1.4 Database

9.2 Service operations

9.2.1 Service operations management

9.3 Computing infrastructure