WORKING GROUP 9: OPERATIONAL IMPLEMENTATION

Representatives:

  • Geoff DiMego (lead), NCEP
  • Ed Bensman, AFWA
  • Keith Brewster, CAPS
  • Kevin Brundage, FSL
  • Dave Gill, NCAR

Overview of area of focus, objectives and strategies

The focus of WG9 is to navigate the process and to perform the tasks (adaptation, streamlining, testing, etc) necessary for getting the WRF modeling system implemented into the operational environments at NCEP and AFWA. Reliability, efficiency and accuracy are the most important issues for Operations. NCEP and AFWA have responsibilities for 24 hours-a-day, 7 days-a-week continuous and uninterrupted service. These Operations require extremely reliable hardware and software. Since failure is never an option, exhaustive testing and strict change management are required. At NCEP, EMC's staff performs extensive parallel testing of all major code changes and then coordinates all software changes with the NCEP Central Operations (NCO) Production Management Branch, which acts on the Job Implementation Forms (JIFs) that EMC files with it. These JIFs include the UNIX scripts that run the codes in the Operational environment. Version control for everything running in Operations (source code & scripts) is handled by NCO, but NCO gets everything from EMC.

All modeling systems (and their individual component codes) running in Operations must fit into the fixed size parameters of the Operational run-slot. These parameters include the size of executables, the number of processors or nodes used and the wall time from start (obs data dump) to finish (all products generated and available for distribution). The most critical parameter is wall time because delivery of Operational products is expected at fixed times with little leeway. In this regard, the number of processors or nodes available to a run may become an issue. If a code is taking longer than its allotted time slot, then you can't just throw more processors or nodes at the problem to finish faster because there will be a limit on the maximum number of processors or nodes that can be used for that model run slot. Due to differences in domain size, forecast range, targeted customer and/or frequency of runs, the parameters (including resolution & possibly physics) will be different for the various operational run slots that WRF is planned for: hourly runs (currently RUC based) for the CONUS, 4/day runs for North America (currently Meso Eta based), regional nests, hurricane (currently GFDL model) support runs and eventually ensembles. Each will require the same degree of extensive testing to ensure reliability (and accuracy). Clearly, it is desirable to run the most complete and sophisticated configuration in the computer time/space available. This puts a very high demand on the efficiency and speed of Operational codes. There will be many trade-offs. Most of the desired features in the WRF code that make it suitable for research applications will have to be disabled (or perhaps adapted) for the Operational configuration. Operations won't be changing anything on the fly, even though this capability may exist in the WRF Model and may be invoked frequently by the research community. A single, highly efficient configuration will be settled upon for each Operational application.
To the extent possible, the Operational version will be a subset of the research / community version.
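
The run-slot constraint described above can be illustrated with a minimal sketch. It is not NCEP or AFWA code: the class names, the function names and all numbers are hypothetical, chosen only to show why adding processors beyond the slot's cap does not rescue a configuration that runs too long.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSlot:
    """Fixed limits of one operational run slot (all values hypothetical)."""
    name: str
    max_wall_minutes: float  # from start (obs data dump) to all products available
    max_processors: int      # hard cap on processors or nodes for this slot


@dataclass(frozen=True)
class Configuration:
    """One candidate model configuration with its measured cost."""
    label: str
    processors: int
    wall_minutes: float      # assumed to come from parallel testing


def fits(slot: RunSlot, cfg: Configuration) -> bool:
    """A configuration fits only if it respects both limits at once."""
    return (cfg.processors <= slot.max_processors
            and cfg.wall_minutes <= slot.max_wall_minutes)


def pick_configuration(slot, candidates):
    """Return the first candidate that fits, or None.

    Candidates are assumed to be ordered from most to least complete,
    so the first fit is the most sophisticated affordable choice.
    """
    for cfg in candidates:
        if fits(slot, cfg):
            return cfg
    return None


# Hypothetical numbers for illustration only.
slot = RunSlot(name="example nested slot", max_wall_minutes=60.0, max_processors=64)
candidates = [
    Configuration("full physics", processors=64, wall_minutes=75.0),
    Configuration("full physics, more processors", processors=128, wall_minutes=40.0),
    Configuration("reduced physics", processors=64, wall_minutes=55.0),
]
chosen = pick_configuration(slot, candidates)
print(chosen.label if chosen else "no candidate fits the slot")
```

In this made-up example the second candidate finishes fastest but exceeds the processor cap, so the choice falls to the reduced configuration. That is the kind of trade-off the paragraph above anticipates.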

The Operational configuration must be fixed and tested prior to implementation and can't be changed without prior approval. At NCEP, the Operational configuration must be approved not only by NCEP, but also by the NWS Committee on Analysis and Forecast Technique Implementation (CAFTI). Results from the extensive testing are presented to form the basis of decisions. Accuracy and reliability are important, but user requirements are paramount here and often other issues pertaining to scheduling, distribution and AWIPS display become overriding.

Summary of current status of development efforts

With the aforementioned Operational issues in mind, WG9 has engaged various other working groups to ensure that our goal of implementing the WRF into NCEP & AFWA Operations can be achieved.

For WG1, we view the possibilities of a more efficient dynamic core from the semi-Lagrangian prototype as being especially useful in getting the most from the fixed time windows and constrained run slots of Operations.

WG2 has been encouraged to consider ways to make parallel post-processing and product generation possible. The ability to produce and write out a binary restart file while the forecast proceeds has now been added to the design.

WG4 will use the Operational collection of observations in prepared BUFR format. These will come from NCEP's real time processing or can come from the NCEP/NCAR 50 year reanalysis. WG5 will include WRF compatible versions of all physics modules used in the Operational Meso Eta.

WG6 is receiving a NetCDF-GRIB converter and a serial post code from AFWA. WG6 is also constructing a WRF post-processing code based on NCEP's scalable / parallel code and using the binary restart file from WG1. This post code will produce myriad output fields in gridded form on standard pressure surfaces in GRIB format. These grids form the input for the NCEP product generator, which is necessary to produce all the standard model guidance products for distribution to all customers in NWS and to public and private sector users.

This GRIB output will, in turn, be used by WG7 to apply the standard NCEP & AFWA verification package. This package will be used to verify the other real time WRF runs for comparison purposes.

WG12 is the source of Operational observations being used by WG4.

WG13 will apply the WRF and its many diverse configurations in an ensemble mode which will eventually replace NCEP's Short Range Ensemble Forecast (SREF) system.

WG14 is starting with the NOAH Land-Surface model which is used Operationally at both NCEP and AFWA.

The following online documentation has been identified to help define the NCEP standards and practices for Operational Implementation. Most can be found under the following NCEP Central Operations (NCO) PRODUCTION MANAGEMENT BRANCH site which contains links for:

  • NCEP PRODUCTION SUITE
  • NCEP HANDBOOK and Office Notes
  • IBM Implementation Standards
  • GRIB and BUFR

The NCEP Handbook (circa January 1998) contains the following relevant sections:

  • Section 3.1.1: Internal Documentation of Programs
  • Section 3.2.1: Procedures for Job Implementations and Changes
  • Section 3.2.1.1: Required Contents for Operational Codes
  • Section 3.3.1: NCEP Standard Source for Operational Dates and Cycle Times
  • Section 3.3.2: Standard Formats for Display of Date and Time
  • Section 3.6.4: NCEP Standards for I/O Conventions

NCEP has adopted a Model Pre-implementation Check List which provides for the various steps (notification, management briefings, coordination, documentation etc) that are required during a normal implementation of a change to Operations. These steps will need to be followed when the WRF first becomes Operational and at each stage when the Operational version of the WRF is changed. EMC takes the lead on these activities, and the final decision rests with the NCEP director. This document is being added to the above site.

In the future, the community will be able to provide input to these decisions through the evolving mechanism of "national testbeds for NWP". The details of funding and resources aside, the concept calls for testbeds having the ability to reproduce an Operational retrospective series of runs given the input data (or analyses) and lateral boundary conditions used in NCEP's actual Operations. In this day and age of scalable and portable hardware and software, this is not considered a major difficulty since real time availability is not at issue - just the ability to produce an Operational control run in a pseudo-Operational manner. Changes to this Operational control sequence can then be tested and the results verified in a manner nearly identical to that being performed at NCEP's EMC. Evidence / results produced in this manner will carry nearly the full weight of evidence / results actually produced at NCEP/EMC.

Pre-implementation Strategy for WRF Model Testing at NCEP has as its initial goal to perform clean, unambiguous comparisons between the operational Meso and the same operational Meso recast in WG2's layered parallel WRF design. This will entail converting the existing highly parallel and highly efficient operational Meso Model code into the layered WRF modeling infrastructure, which will immerse NCEP (Tom Black) in that infrastructure. With the runs producing identical answers, we can compare computer performance on the current operational machine. If a significant performance penalty is measured, then redesign of the WRF might be called for. If no penalty is measured, then NCEP can immediately implement the layered WRF model design in operations for both nested & continental Meso runs, making subsequent testing of replacement WRF components both clean and unambiguous. Tom Black has completed the conversion of the current hydrostatic Meso Eta to the WRF and found only a 4.5% increase in runtime (both the 80 km and 22 km versions yielded similar results).
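
The comparison just described has two parts: confirming that the two codes give identical answers, and then measuring the runtime penalty. The sketch below illustrates both under stated assumptions. Reading "identical answers" as bit-for-bit equality is an assumption, the function names are hypothetical, and the timings are invented so that the overhead matches the 4.5% figure quoted above. No threshold for a "significant" penalty is given here, so the one in the code is only a placeholder.

```python
import struct


def bitwise_identical(a, b):
    """Return True if two sequences of floats match bit for bit.

    Stricter than ==: it distinguishes 0.0 from -0.0, which a
    numerical comparison would treat as equal.
    """
    if len(a) != len(b):
        return False
    return all(struct.pack("<d", x) == struct.pack("<d", y)
               for x, y in zip(a, b))


def relative_overhead(baseline_seconds, candidate_seconds):
    """Fractional increase in runtime of the candidate over the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline runtime must be positive")
    return (candidate_seconds - baseline_seconds) / baseline_seconds


def is_significant(overhead, threshold):
    """Placeholder test; the threshold is a hypothetical parameter."""
    return overhead > threshold


# Invented field values and timings, for illustration only.
original_field = [271.15, 272.4, 273.0]
converted_field = [271.15, 272.4, 273.0]

if bitwise_identical(original_field, converted_field):
    overhead = relative_overhead(baseline_seconds=1000.0, candidate_seconds=1045.0)
    print(f"runtime overhead: {overhead:.1%}")
    print("significant" if is_significant(overhead, threshold=0.10) else "acceptable")
else:
    print("answers differ; timing comparison is not clean")
```

Checking the answers before looking at timings keeps the comparison clean: a runtime difference is only meaningful if both codes are doing the same computation.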

AFWA has begun testing the Eulerian prototype with height coordinate. FSL has begun planning with NCEP for the transition of the hourly rapid update cycle from the RUC model to the WRF model under FAA auspices.

Plans for Research and Development

Pre-implementation Strategy for WRF Model Testing at NCEP has as its second goal to perform clean, unambiguous comparisons between operational Meso and WRF prototypes. These tests will focus on accuracy and reliability (varying run times and code robustness). They will compare the forecast performance of WRF prototypes against operations, with both being run within the WRF modeling infrastructure. The comparisons will emphasize REAL-DATA retrospective case studies: the current cold season period is February to early March 2001 and the warm season period is mid-August to mid-September 2000. These are coordinated through WG7 and will include both small- and large-domain capabilities examined for nested and continental requirements of NCEP operations.

AFWA also has ambitious plans for testing the WRF prototypes (Eulerian in height & mass coordinate) in real time at various resolutions. WG7 is coordinating exchange of grids for common verification. FSL and NCEP (& NCAR & CAPS) have laid out an implementation schedule for the WRF at NCEP as part of the 7-year planning for the FAA. NCEP expects to acquire a new computer system in FY 2003. On this machine, the WRF model will be first implemented operationally (probably in FY 2004) as a window inside the geographical area covered by the Eta Model. Of course, implementation is dependent on proof of enhanced skill, and NCEP milestones focus on the need to establish a clean test environment (using WRF model infrastructure) and to then perform extensive parallel testing as a precursor to implementation. Forecast model evaluation will initially emphasize the performance of the dynamical core, but will rapidly expand to include the myriad choices for the various components of physics to make the WRF model complete. These choices will be constrained by the computer time needed to complete the runs.

NCEP implemented the year-round Threats runs in FY2001. The Threats slot is made up of runs of the GFDL hurricane model whenever there are tropical cyclones threatening and of runs of various fixed and selectable nested Meso domains at all other times throughout the year. A fixed set of nests is now run covering each region of the United States (CONUS & Alaska & Hawaii & Puerto Rico) at least once per day. Selectable nests of smaller domain can be run to focus in on local threatening situations as well (within the time & compute limits of the Threats run slot in NCEP's production suite) and will be added in FY2002. NCEP's task covers the work involved in integrating, testing and implementing the WRF model in the nested slots. The work will be phased, with integration of the WRF system into NCEP and testing during FY2003 and actual implementation during FY2004.

By FY 2004, the WRF model will be running operationally as a window inside the Meso Eta in the Threats run. In later years, it is our goal to make it ubiquitous. NCEP will also be testing global versions of the WRF. This is one of the main reasons for its pursuit of the semi-Lagrangian approach. If successful, the type of focus achieved at UK Met where a single unified model is used, could be applied to NCEP's entire model production suite. This would require considerable coordination among the global/climate modeling community and the mesoscale modeling / WRF community.

The FSL experience with the 10-12 km RUC up to this point is preparation for this task. The vehicle for rapid updating with the WRF model will be named the WRF RUC, with the understanding that the WRF dynamical core and the WRF 3DVAR will be at the heart of the rapid updating system in its first operational implementation, probably in FY 2005. The computing horsepower at FSL and NCEP will support one more increase in the resolution of the RUC system, to 10-12 km, before the WRF model becomes the main focus of attention. Once the 10-km limit in resolution is reached (anticipated in FY 2003), further development of rapid updating techniques will be carried out within the context of the WRF model, rather than the current RUC model.

FY 2004 will also mark the beginning of a two- to three-year task (largely FSL's) to build into the WRF the functionality for rapid updating. Development of the then current RUC will cease. The following questions will be addressed:

  • What parts of the WRF model can be used for both rapid updating (as in the RUC) and longer term mesoscale forecast functions (as in the Meso Eta)? For example, can both models share a common data assimilation system, or is this precluded by different data cutoff times and different computational constraints?
  • Will the Rapid Update Cycle and the longer-term mesoscale prediction systems of the future share the same grid? At the same resolution? The same geographical domain?
  • Should the physics packages be the same?
  • What is the future conduit from research to operations? How, for example, can university researchers get their ideas incorporated into operations, even though they develop their ideas within the same modeling context? Does a new infrastructure need to be built?

In the years FY 2005-2008, the WRF model should be proliferating within NCEP's operational production suite of model runs: replacing Meso Eta in the continental domain early guidance runs and its data assimilation system; replacing the Regional Spectral Model (RSM) & Meso Eta in the nested Hawaii runs and in the Short Range Ensemble Forecast (SREF) system; replacing the current RUC model and its data assimilation system in the hourly rapid update runs; replacing the RUC in SREF; and replacing the GFDL hurricane model in the Threats runs. Note that RUC as used in this plan refers to a specific body of software comprising a system for rapid updating. Though the vehicle for rapid updating may change (different data assimilation technique, different model), the need for a rapid updating function will not change. This applies also for EMC's Meso model, which may have adopted a hybrid sigma-pressure vertical coordinate instead of the Eta coordinate or changed to WRF and its other modeling system components.

WRF efforts will also strive to make advanced four-dimensional data assimilation techniques Operational at NCEP. It is important to remember, however, that new observing systems will always challenge the ingenuity of modelers to assimilate the data. Projections indicate that global data volumes will increase by five orders of magnitude by 2010.

Timeline, Milestones and Deliverables

October 2002 (NCEP)
Begin integration of WRF model scripts into NCEP's production scripts for running nested Threats runs.
February 2003 (NCEP)
Complete model script integration and begin replacement of nested Meso post-processor and product generation with WRF equivalents. Ensure products have identical WMO headers etc to allow WRF generated look-alike files to be distributed, evaluated and verified.
February 2003 (NCEP)
EMC begins real-time testing of a WRF-based replacement for the movable nested Eta with an appropriately complete but computationally efficient complement of physics packages.
June 2003 (NCEP)
Complete product integration and begin creation of parallel capability so that side-by-side comparisons and objective verifications can be performed between nested WRF and nested Meso runs.
July 2003 (NCEP)
EMC submits test and verification report regarding performance of the nested WRF model relative to the movable nested Eta model.
July 2003 (FSL)
Initiate the replacement of the RUC 3DVAR with the WRF 3DVAR at FSL.
August 2003 (NCEP)
Complete construction of parallel testing capability with fixed domain nests and begin occasional test runs. Begin integration within selectable nest scripts etc.
October 2003 (NCEP)
Begin real-time parallel testing of the latest improvements in grid-resolved and subgrid-scale cloud and precipitation processes, including sweeping changes in software to meet WRF specifications (hereafter referred to as "EMC physics packages").
October 2003 (NCEP)
Begin real-time parallel testing of the WRF Model in NCEP's fixed domain Threats runs.
November 2003 (FSL)
Choose which options in the WRF system to include in the first operational implementation of the WRF RUC. Choices will include geographical domain, grid resolution (including number of levels), frequency of assimilation, physics packages, observational data types, the corresponding forward models, and the 3DVAR statistics (which determines the type of filtering).
January 2004 (NCEP)
Begin retrospective parallel testing of EMC physics packages in WRF, covering the warm season, since real-time parallels cover only the cool season.
February 2004 (NCEP)
Conclude parallel testing of EMC physics packages, summarize results and present spring change package results to CAFTI.
February 2004 (FSL)
Conclude parallel test of microphysics for separate drizzle category and conversion of cloud water to rain.
February 2004 (NCEP)
Conclude parallel testing of the WRF Model in NCEP's fixed domain Threats runs, summarize results and present to CAFTI.
March 2004 (NCEP)
Subject to CAFTI approval, implement the WRF Model in NCEP's operational fixed domain Threats runs.
March 2004 (FSL)
All code modules assembled for WRF RUC. Code design is worked out. Code compiles.
March 2004 (NCEP)
Subject to CAFTI approval, implement enhancements to EMC physics packages into operations.
April 2004 (NCEP)
Begin real-time parallel testing of EMC physics packages for fall package, which includes two-moment representations of cloud water, rain, and various species of ice.
April 2004 (FSL)
Testing begins on retrospective case(s) for which assimilation of 10-12 km RUC has been performed.
April 2004 (NCEP)
Begin real-time parallel testing of the WRF Model in NCEP's selectable domain Threats runs.
July 2004 (FSL)
At FSL, assess progress with WRF RUC relative to operational 10-12 km RUC.
July 2004 (NCEP)
Begin retrospective parallel testing of EMC physics packages covering the cool season, since real-time parallels cover only the warm season.
August 2004 (NCEP)
Conclude parallel testing of EMC physics packages, summarize results and present fall change package results to CAFTI.
August 2004 (NCEP)
Conclude parallel testing of the WRF Model in NCEP's selectable domain Threats runs, summarize results and present to CAFTI.
September 2004 (NCEP)
Subject to CAFTI approval, implement the WRF Model in NCEP's operational selectable domain Threats runs.
September 2004 (NCEP)
Subject to CAFTI approval, implement enhancements to EMC physics packages into operations.
September 2004 (FSL)
Decision is made whether to begin regular testing in real time or, if results don't measure up to the then operational RUC, to continue refinement of WRF RUC.
FY 2005
Begin testing of advanced four-dimensional data assimilation techniques (like 4DVAR) from WRF and NCEP efforts.
The WRF model should be running operationally as a window model inside the Meso Eta model. The WRF RUC will be running in test mode at FSL and may be shipped to EMC for parallel testing against the operational RUC before the end of the year.
National test beds for experimenting with innovations to WRF will become critical as the user base expands.
FY 2006
Complete conversion / development and begin testing of a Short Range Ensemble Forecasting system based entirely on WRF.
The WRF RUC becomes operational at EMC.
FY 2007
WRF model runs operationally at NCEP, providing regional early guidance out to 84 hours on continental scale.
Set up system for operational dissemination of aviation products from the WRF model.
Consider focused program for fog prediction and dissipation. Very high-resolution model, horizontally and vertically, needed for this application.
FY 2008
WRF model becomes the operational hurricane model at NCEP.
Time to consider code conversions for next generation computer.
With the prospect of having sufficient computing power to run convection resolving forecasts over the whole continental U.S., further refinements to the model physics will be developed and tested.

Resources (NCEP only)

  1. Identified
    Geoff DiMego 0.10 FTE
    Tom Black 0.75 FTE
    Yi Jin 0.35 FTE
    Matt Pyle 0.15 FTE
    Hui-Ya Chuang 0.15 FTE
  2. Needed

    1.00 FTE to expand scope of effort to include running the retrospective tests of physics configurations, running verification codes, compiling and evaluating results and placing summaries of results into reports and onto the web.

  3. Totals:
    • Dedicated 1.50 FTE (NCEP base and FAA funded)
    • NEEDED 1.00 FTE
This doesn't include AFWA or FSL resources specifically dedicated to WG9 activities.
