Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MPAS refactor notes #753

Open
hkershaw-brown opened this issue Oct 11, 2024 · 35 comments
Open

MPAS refactor notes #753

hkershaw-brown opened this issue Oct 11, 2024 · 35 comments
Assignees
Labels
mpas Model for Prediction Across Scales (MPAS)

Comments

@hkershaw-brown
Copy link
Member

Unresolved issues:

Surface elevation check
! Reject obs if the station height is far way from the model terrain.

but fwd operators (rttov) need these surface values
see discussion #332

Code:

branch (be aware I may force push to this)
https://github.com/NCAR/DART/tree/mpas-refactor

Spec (note 7 v 8)
https://docs.google.com/document/d/1wD4MKM5PquOUbxK3YxIjcbIaFThJfa318heDn_bFUbs/edit

@hkershaw-brown hkershaw-brown self-assigned this Oct 11, 2024
@hkershaw-brown
Copy link
Member Author

In compute_u_with_rbf there is a loop around ens_size, for get_reconstruct but uval(1:ens_size) is set to set to the single value ureconstructzonal/ureconstructmeridional so you are getting the whole ensemble set each step of the loop

whole ensemble == last value from the loop.

do e = 1, ens_size
call get_reconstruct(nedges, lat*deg2rad, lon*deg2rad, &
coeffs_reconstruct, on_a_sphere, veldata(:, e), &
ureconstructx, ureconstructy, ureconstructz, &
ureconstructzonal, ureconstructmeridional)
if (zonal) then
uval = ureconstructzonal
else
uval = ureconstructmeridional
endif
enddo

@hkershaw-brown
Copy link
Member Author

todo can be different across the ensemble

!>@todo lower (upper) could be different levels in pressure
!lowval = x(index1 + (edgelist(i)-1) * nvert + lower(i)-1)
lowindx = int(index1,i8) + int((edgelist(i)-1) * nvert,i8) + int(lower(i,1)-1,i8)
lowval = get_state(lowindx, state_handle)
!uppval = x(index1 + (edgelist(i)-1) * nvert + upper(i)-1)
upindx = int(index1,i8) + int((edgelist(i)-1) * nvert,i8) + int(upper(i,1)-1,i8)
uppval = get_state(upindx, state_handle)
veldata(i, :) = lowval*(1.0_r8 - fract(i, :)) + uppval*fract(i, :)

@hkershaw-brown
Copy link
Member Author

on_bound? not used

logical :: at_surf, do_norm, on_bound

! if the entire ensemble has missing vertical values we can return now.
! otherwise we need to continue to convert the members with good vertical values.
! boundary cells will be updated by the assimilation.
! if all the vertical localization coord values are missing,
! we don't call this routine again, and return.
if (all(zin == missing_r8)) then ! .or. on_bound) then
istatus(:) = 0
return
endif

@hkershaw-brown
Copy link
Member Author

find_vert_indices is a wrap around find_vert_level. Remove

subroutine find_vert_indices (state_handle, ens_size, loc, nc, c, lower, upper, fract, ier)

@hkershaw-brown
Copy link
Member Author

convert_vert_distrib_state (state)
convert_vert_distrib (obs)

super repetitive & according to this comment state is calling the interpolation anyway:

!>@todo FIXME - this is the original code from the
!> observation version which does horizontal interpolation.
!> in this code we know we are on a state vector location
!> so no interp is needed. we should be able to make this
!> more computationally efficient.

@hkershaw-brown
Copy link
Member Author

If rbf is depricated 11 years ago, possibly it is time to remove the code?

Screenshot 2024-10-28 at 6 19 09 PM

! RBF (radial basis function) modules, donated by LANL. currently deprecated

note see comment on incorrect rbf calulcaiton.

hkershaw-brown added a commit that referenced this issue Oct 28, 2024
@hkershaw-brown hkershaw-brown added the mpas Model for Prediction Across Scales (MPAS) label Oct 29, 2024
@hkershaw-brown
Copy link
Member Author

Shells scripts are out of date, user complaints about cycling. Lots of csh (poor).

Documentation is poor, also points people to this MPAS page which sends people to none existent documentation (https://svn-dares-dart.cgd.ucar.edu/)

@hkershaw-brown
Copy link
Member Author

hkershaw-brown commented Oct 29, 2024

Nope:

! try adjusting what static_init_model does, before it is called.

! if set by: call force_u_into_state() *BEFORE* calling static_init_model(),

subroutine static_init_model()
!>@todo FIXME - can we add an optional arg here for update_bc use?

Note, beware of whatever the regional stuff is doing. Seems like some bizarre choices here.

! modify what static_init_model does. this *must* be called before
! calling static_init_model().

more nope:

! Note that force_u_into_state should be called before static_init_model, which is unusual.
call force_u_into_state()
call set_lbc_variables(bdy_template_filename)

@hkershaw-brown
Copy link
Member Author

update_bc.f90 is calling use direct_netcdf_mod,only : read_variables directly. 🙃

@hkershaw-brown
Copy link
Member Author

why is this compile time, rather than runtime:

! compile-time control over whether grid information is written to the
! diagnostic files or not. if it is, the files are self-contained (e.g. for
! ease of plotting), but are also much larger than they would be otherwise.
! change this private variable to control whether the grid information is
! written or not.
logical :: add_static_data_to_diags = .false.

@hkershaw-brown
Copy link
Member Author

why is this compile time, rather than runtime:

! compile-time control over whether grid information is written to the
! diagnostic files or not. if it is, the files are self-contained (e.g. for
! ease of plotting), but are also much larger than they would be otherwise.
! change this private variable to control whether the grid information is
! written or not.
logical :: add_static_data_to_diags = .false.

oh add_static_data_to_diags is not used and there is also a compile time (namelist) option write_grid_to_diag_file

@hkershaw-brown
Copy link
Member Author

! FIXME: it may be desirable to read in xCell(:), yCell(:), zCell(:)
! to keep from having to compute them on demand, especially since we
! have converted the radian lat/lon of the cell centers into degrees.
! we have to convert back, then take a few sin and cos to get xyz.
! time/space/accuracy tradeoff here.

https://github.com/NCAR/DART/blame/e2188646b97573f198c66f7ea63482ebc34afa11/models/mpas_atm/model_mod.f90#L416

@hkershaw-brown
Copy link
Member Author

model size needs to be i8:

integer :: model_size ! the state vector length

@hkershaw-brown
Copy link
Member Author

currently unused (11 years). Not needed for regional? Or should be used?

! currently unused; for a regional model it is going to be necessary to know
! if the grid is continuous around longitudes (wraps in east-west) or not,
! and if it covers either of the poles.
character(len= 64) :: ew_boundary_type, ns_boundary_type

@hkershaw-brown
Copy link
Member Author

This data on edge logic seems overcomplicated

! Do we have any state vector items located on the cell edges?
! If not, avoid reading in or using the edge arrays to save space.
! FIXME: this should be set after looking at the fields listed in the
! namelist which are to be read into the state vector - if any of them
! are located on the edges then this flag should be changed to .true.
! however, the way the code is structured these arrays are allocated
! before the details of field list is examined. since right now the
! only possible field array that is on the edges is the 'u' edge normal
! winds, search specifically for that in the state field list and set
! this based on that. if any other data might be on edges, this routine
! will need to be updated: is_edgedata_in_state_vector()
logical :: data_on_edges = .false.

variable dimension tells you edge vs cell, .e.g

float u(Time, nEdges, nVertLevels) ;
                u:units = "m s^{-1}" ;
                u:long_name = "Horizontal normal velocity at edges" ;

float qs(Time, nCells, nVertLevels) ;
                qs:units = "kg kg^{-1}" ;
                qs:long_name = "Snow mixing ratio" ;

@hkershaw-brown
Copy link
Member Author

"causes mpas namelists to be read" - does not appear to be true

! Set the time step ... causes mpas namelists to be read.
! Ensures model_timestep is multiple of 'dynamics_timestep'
call set_calendar_type( calendar ) ! comes from model_mod_nml

@hkershaw-brown
Copy link
Member Author

This comment, not sure what it is getting at:

!>@todo FIXME: if we want to do increments, we could also add a
! third domain which is the original forecast fields before
! the assimilation (so we can compute increments)

@hkershaw-brown
Copy link
Member Author

boundary domain variables do not get bounds for clamping:

lbc_domid = add_domain(bdy_template_filename, lbc_nfields, &
var_names = lbc_variables)
! FIXME clamp_vals = variable_bounds(1:nfields,:) )

hkershaw-brown added a commit that referenced this issue Oct 29, 2024
see comment #753 (comment)

Remove not used variables:
add_static_data_to_diags  - not used
ew_boundary_type, ns_boundary_type  -not used

 Shortened some very verbose comments.
@hkershaw-brown
Copy link
Member Author

query_vert_localization_coord not used & not requried.

!> pass the vertical localization coordinate to assim_tools_mod
function query_vert_localization_coord()
integer :: query_vert_localization_coord
query_vert_localization_coord = vert_localization_coord
end function query_vert_localization_coord

hkershaw-brown added a commit that referenced this issue Oct 29, 2024
@hkershaw-brown
Copy link
Member Author

get_init_template_filename

public :: get_init_template_filename, &

called but value unused in update_bc ?

call get_init_template_filename(static_filename)

@hkershaw-brown
Copy link
Member Author

Hardcoded domain id:

! use get_num_variables() after setting the domains
! separate number for analysis file, boundary file
nanlvars = get_num_variables(1)
nbdyvars = get_num_variables(2)

@hkershaw-brown
Copy link
Member Author

serial loop around files for update bc:

fileloop: do ! until out of files

@hkershaw-brown
Copy link
Member Author

write model time overloaded

interface write_model_time
module procedure write_model_time_file
module procedure write_model_time_restart
end interface

@hkershaw-brown
Copy link
Member Author

is_global_grid

Only used by mpas_dart_obs_preprocess, then global does not appear to be used

global = is_global_grid()

@hkershaw-brown
Copy link
Member Author

hkershaw-brown commented Oct 29, 2024

Hardcoded Namelist dom_id in update_mpas_states:

integer :: dom_id = 1

@hkershaw-brown
Copy link
Member Author

hkershaw-brown commented Oct 29, 2024

serial loop around files for update_mpas_states

!----------------------------------------------------------------------
! Reads lists of input mpas (prior) and filter (analysis) files
!----------------------------------------------------------------------
filenum = 1
fileloop: do ! until out of files

no dependencies in the loop? - run multiple update_mpas_states in parallel.

update_mpas_states is just a copy of netcdf variables from one file to another, except for wind.
Not sure why the loop is inside update_mpas_states, scripting will need to be updated to change this.

Maybe have 2 arguments, then you can run multiple update_mpas_state programs at once.
./update_mpas_state prior_file posterior_file

@hkershaw-brown
Copy link
Member Author

progvar is removed and replaced with state structure. Notes on done/not done routines

I'm concerned about update_bc and update_mpas_states hard coding domains and using read_variables directory.
Also,

  • memory wise - reading the whole state in

    • clamping - this is already done on writing out the state from filter. Is this additional? (or unneeded) Would you not clamp from filter, but clamp here? Do you need the fail if so?
    • updating winds, you only need to read in the winds, not the whole state
  • performance wise, looping around each restart file in update_mpas_states

@hkershaw-brown
Copy link
Member Author

hkershaw-brown commented Oct 29, 2024

todo docs:

  • remove progvar from docs.
  • whatever is going on with clamping.
  • It would be good to explain "The wind options during a DART assimilation are controlled by combinations of 4 different namelist values." what the 4 are and what the combinations are.

@hkershaw-brown
Copy link
Member Author

oncenters is not binary (3 choices)

find_vert_indices -> find_vert_level( ..., oncenters=.true., ...)

the logic for oncenters is a bit weird to be passing all the way down the call tree.

Here it is hardcoded .true. for pressure otherwise .false.

if (verttype == VERTISPRESSURE) then
! get all cells which share any edges with the 3 cells which share
! the closest vertex.
call make_cell_list(vertexid, 3, ncells, celllist)
call find_vert_level(state_handle, ens_size, loc, ncells, celllist, .true., &
lower, upper, fract, ier)
if (all(ier /= 0)) return
! now have pressure at all cell centers - need to interp to get pressure
! at edge centers.
call move_pressure_to_edges(ncells, celllist, lower, upper, fract, &
nedges, edgelist, ier)
if (all(ier /= 0)) return
else
! need vert index for the vertical level
call find_vert_level(state_handle, ens_size, loc, nedges, edgelist, .false., &
lower, upper, fract, ier)
if (all(ier /= 0)) return
endif

but oncenters ignored for pressure:

if(verttype == VERTISPRESSURE ) then
track_ier = 0
vert_array = vert
! Need to get base offsets for the potential temperature, density, and water
! vapor mixing fields in the state vector
call get_index_range(QTY_POTENTIAL_TEMPERATURE, pt_base_offset)
call get_index_range(QTY_DENSITY, density_base_offset)
call get_index_range(QTY_VAPOR_MIXING_RATIO, qv_base_offset)
do i=1, nc
call find_pressure_bounds(state_handle, ens_size, vert_array, ids(i), nVertLevels, &
pt_base_offset, density_base_offset, qv_base_offset, &
lower(i, :), upper(i, :), fract(i, :), ier)
!if(debug > 9) print '(A,5I5,F10.4)', &
! ' after find_pressure_bounds: ier, i, cellid, lower, upper, fract = ', &
! ier, i, ids(i), lower(i, :), upper(i, :), fract(i, :)
!if(debug > 5 .and. ier(e) /= 0) print '(A,4I5,F10.4,I3,F10.4)', &
!'fail in find_pressure_bounds: ier, nc, i, id, vert, lower, fract: ', &
! ier, nc, i, ids(i), vert_array, lower(i, :), fract(i, :)
! we are inside a loop over each corner. consolidate error codes
! so that we return an error for that ensemble member if any
! of the corners fails the pressure bounds test.
where (ier /= 0 .and. track_ier == 0) track_ier = ier
enddo
ier = track_ier
return
endif

oncenters(/faces/edges) only relevant for HEIGHT?
oncenters/faces/edges Can be queried from the obs_kind.

see

@hkershaw-brown
Copy link
Member Author

related to #757 "VERTISSURFACE describes variables on A surface (not necessarily THE surface)"

! VERTISSURFACE describes variables on A surface (not necessarily THE surface)

hkershaw-brown added a commit that referenced this issue Dec 26, 2024
todo: wind, domain_id
see #753 (comment)
for various notes on update_mpas_states
@hkershaw-brown
Copy link
Member Author

why is update_from_u_reconstruct a model_nml option rather than an update_mpas_state option?

@nancycollins
Copy link
Collaborator

why is update_from_u_reconstruct a model_nml option rather than an update_mpas_state option?

the original code didn't include a separate post-processing program. if the namelist options made the assimilation update only the reconstructed winds, this function updated the edge normal winds before finishing.

@hkershaw-brown
Copy link
Member Author

There are a couple of routines that have an optional logical argument that changes the routine, rather than having two routines.

https://github.com/NCAR/DART/blob/bddda57a7df96b4b956db33cec5238b9850fc825/models/mpas_atm/model_mod.f90#L6633C1-L6637

subroutine uv_cell_to_edges(zonal_wind, meridional_wind, uedge, full_u)

comment does not match the code:

! just replace, no increments
! project the diffs (or increments) onto the edges by calling uv_cell_to_edges.
call uv_cell_to_edges(lbc_ucell, lbc_vcell, lbc_u, .true.)

@hkershaw-brown
Copy link
Member Author

Does the lbc domain need to be in the state?

I can't tell from the comments.

!----------------------------------------------------------------------
! Reads input lbc (prior) and filter (analysis) files
!----------------------------------------------------------------------
ncAnlID = nc_open_file_readonly(next_infile, 'update_bc - open readonly')
ncBdyID = nc_open_file_readwrite(next_outfile, 'update_bc - open readwrite')

!----------------------------------------------------------------------
! Reads input lbc (prior) and filter (analysis) files
!----------------------------------------------------------------------

ncAnlID = nc_open_file_readonly(next_infile, 'update_bc - open readonly') analysis
ncBdyID = nc_open_file_readwrite(next_outfile, 'update_bc - open readwrite') prior

"state vector" is a mixture of prior in lbc_domid and analysis in the anl_domid for update_bc?

d1size = get_domain_size(1)
d2size = get_domain_size(2)
call read_variables(ncAnlID, statevector(1:d1size), 1, nanlvars, domain=1)
call read_variables(ncBdyID, statevector(d1size+1:d1size+d2size), 1, nbdyvars, domain=2)

!> regional mpas only:
!> update the boundary fields based on the analysis file values
!> in the boundary region, which were updated by blending the analysis
!> and the original (e.g., prior) boundary values.
!> There are options to update edge winds directly (by blending 'u'),
!> or to update reconstructed winds at cell centers first then project them onto edges.
!> When the edge wind is updated, it can be replaced by the blended value,
!> or modified with the increments (from cell-center winds).
!> state_vector - read from both ncid_a and ncid_b.
!> ncid_b = lbc to be overwritten with blended fields in the boundary zone.
!> ncid_a = analysis - either init or restart.
subroutine statevector_to_boundary_file(state_vector, ncid_b, ncid_a, &
lbc_update_from_reconstructed_winds, lbc_update_winds_from_increments, idebug)

Looks like lbc is not a different domain, it is different data on the same domain:

nCellsA = nc_get_dimension_size(ncAnlID, 'nCells', 'update_bc') ! Ha
nCellsB = nc_get_dimension_size(ncBdyID, 'nCells', 'update_bc') ! Ha
nVertLevelsA = nc_get_dimension_size(ncAnlID, 'nVertLevels', 'update_bc') ! Ha
nVertLevelsB = nc_get_dimension_size(ncBdyID, 'nVertLevels', 'update_bc') ! Ha
print*,'nCells, nVertLevels:', nCellsA, nVertLevelsA,' in ',trim(next_infile) ! Ha
if((nCellsA /= nCellsB) .or. (nVertLevelsA /= nVertLevelsB)) then ! Ha
print*,'nCells, nVertLevels:', nCellsB, nVertLevelsB,' in ',trim(next_outfile)
write(string1,*) 'Domain size mismatches'
call error_handler(E_ERR,'update_bc',string1,source,revision,revdate)
endif

I think filter does not need to read the lbc file. (and does not because set_lbc_variables is not called).
I think update_bc needs the info from the lbc file and the analysis file, but the actual state is not 2 domains
set_lbc_variables in update_bc is being used to as a pre-static_init_model call to add a domain rather than just reading 2 files.

mpas-bounary-sketch

@nancycollins
Copy link
Collaborator

as you deduced, the lbc fields shouldn't be in the state for filter but are required for update_bc. i think the code was originally set up this way because the names of the fields in the state are set by the namelist. update_bc wanted to use the same namelist to read in the same fields as filter, plus the lbc fields. the regional part of mpas has changed between major model versions, so i'm not sure what the lbc files look like now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mpas Model for Prediction Across Scales (MPAS)
Projects
None yet
Development

No branches or pull requests

2 participants