[mesa-users] Show the evolution of the model at each step of the solver iteration

Bill Paxton paxton at kitp.ucsb.edu
Thu Sep 4 13:34:52 EDT 2014


Hi Mathieu,

Here's a sketch of what I do for this.  Needless to say it is an important tool for me in debugging!

It would be easy to extend this to add a hook so you could be called at each iteration to look at the star model as it changes.
Let me know if you think that would be useful.

Meantime, take a look at the following hints.  The output files produced by set_hydro_inspectB_flag are very useful.
btw: inspectB is a routine in star/private/hydro_newton_procs.f that gets called at each iteration to "inspect" the
corrections "B" before they are applied.  That's a natural place to add a hook to call an outside routine if we go that route.

Of course, it is a long way from knowing which variables are jumping around to hurt convergence to understanding
why they are jumping around and how to fix the problem.  But at least the tools outlined below can get you oriented.

cheers,
Bill






1) copy star/test_suite/debugging_stuff_for_inlists to the &controls section of your inlist.

2) uncomment report_hydro_solver_progress = .true.

3) run a few steps to get info about the newton iters.  e.g. this is for model 851, hydro_call_number 677 (the run didn't start at model = 1).
the "coeff", "slope", and "f" columns are for cases where newton is using a "line search" scheme; we're not doing that this time.
the main info is in the columns for average and maximum of residuals and corrections.  the rightmost column indicates which criteria
are preventing us from accepting the current iteration as a solution for the new model.  note that in this case the overall trend is
steadily downward (max resid bounces a bit along the way) -- that's good!  it doesn't always do that however.  sometimes the iterations diverge so a retry or backup must happen.
in some rare cases, the residual or correction values get worse for an iteration or two at the beginning, but then converge.
when the code is trying to deal with tough cases, it will resort to doing a "line search" in which it only applies a fraction of the correction
that comes from doing the matrix solve.  That fraction is given in the "coeff" column.  In the happy cases where it works, it will
get back on track and finish with an iteration using coeff=1.0 to give a final result.  Often however, this scheme also fails, and we have to back up.

     hydro_call_number, s% dt, dt, dt/secyer, log dt/yr         677    3.5260914675391873D+05    3.5260914675391873D+05    1.1173314878492300D-02   -1.9518179620802401D+00

   851    1  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.600E-04  max resid  0.264E+00  avg corr  0.208E-02  max corr  0.116E+00  lg dt/yr -1.95  avg+max corr, max resid
   851    2  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.156E-04  max resid  0.478E-01  avg corr  0.966E-03  max corr  0.583E-01  lg dt/yr -1.95  avg+max corr, max resid
   851    3  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.243E-05  max resid  0.932E-01  avg corr  0.329E-03  max corr  0.560E-01  lg dt/yr -1.95  avg+max corr, max resid
   851    4  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.867E-06  max resid  0.357E-02  avg corr  0.159E-03  max corr  0.152E-01  lg dt/yr -1.95  avg+max corr
   851    5  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.957E-06  max resid  0.198E-01  avg corr  0.170E-03  max corr  0.132E-01  lg dt/yr -1.95  avg+max corr
   851    6  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.683E-06  max resid  0.176E-01  avg corr  0.113E-03  max corr  0.105E-01  lg dt/yr -1.95  avg+max corr
   851    7  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.476E-06  max resid  0.842E-02  avg corr  0.785E-04  max corr  0.807E-02  lg dt/yr -1.95  avg+max corr
   851    8  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.358E-06  max resid  0.550E-02  avg corr  0.570E-04  max corr  0.624E-02  lg dt/yr -1.95  avg+max corr
   851    9  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.294E-06  max resid  0.419E-02  avg corr  0.469E-04  max corr  0.505E-02  lg dt/yr -1.95  avg+max corr
   851   10  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.224E-06  max resid  0.280E-02  avg corr  0.356E-04  max corr  0.399E-02  lg dt/yr -1.95  avg+max corr
   851   11  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.192E-06  max resid  0.188E-02  avg corr  0.303E-04  max corr  0.334E-02  lg dt/yr -1.95  avg+max corr
   851   12  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.155E-06  max resid  0.199E-02  avg corr  0.249E-04  max corr  0.268E-02  lg dt/yr -1.95  avg+max resid
   851   13  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.128E-06  max resid  0.838E-03  avg corr  0.197E-04  max corr  0.225E-02  lg dt/yr -1.95  avg+max resid
   851   14  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.110E-06  max resid  0.163E-02  avg corr  0.165E-04  max corr  0.188E-02  lg dt/yr -1.95  avg+max resid
   851   15  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.847E-07  max resid  0.432E-03  avg corr  0.131E-04  max corr  0.155E-02  lg dt/yr -1.95  avg+max resid
   851   16  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.760E-07  max resid  0.755E-03  avg corr  0.113E-04  max corr  0.132E-02  lg dt/yr -1.95  avg+max resid
   851   17  coeff  1.0000  slope  0.000E+00  f  0.000E+00  avg resid  0.631E-07  max resid  0.609E-03  avg corr  0.942E-05  max corr  0.111E-02  lg dt/yr -1.95  okay!
        851   7.475113   3994.545   3.877887   3.877902   0.512192   0.000324   0.000000   0.000000   0.325763   0.000403  87.098553   2366     38
  -1.951818   6.333624   2.161252 -16.737030   2.713743  -9.000000   0.511868   0.000000   0.704105   0.020103   0.073832  21.703374     17      2
 2.6261E+05  22.926494   3.681855  -0.594533   1.298600 -10.861405   0.450559   0.276627   0.019222  1.000E+00  9.258E-01 -0.617E-03    varcontrol
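
for longer runs it can help to scan this output with a script rather than by eye.  what follows is only a minimal Python sketch, not part of MESA: it assumes the iteration lines have exactly the layout shown above, and the names parse_iter_line and increases are made up for illustration.  it pulls the residual and correction columns out of each line and lists the iterations where a value got worse compared to the previous iteration.

```python
import re

# Layout assumed from the terminal output shown above; other MESA versions may differ.
ITER_RE = re.compile(
    r"^\s*(?P<model>\d+)\s+(?P<iter>\d+)"
    r"\s+coeff\s+(?P<coeff>\S+)\s+slope\s+(?P<slope>\S+)\s+f\s+(?P<f>\S+)"
    r"\s+avg\s+resid\s+(?P<avg_resid>\S+)\s+max\s+resid\s+(?P<max_resid>\S+)"
    r"\s+avg\s+corr\s+(?P<avg_corr>\S+)\s+max\s+corr\s+(?P<max_corr>\S+)"
    r"\s+lg\s+dt/yr\s+(?P<lg_dt_yr>\S+)\s+(?P<status>.*?)\s*$"
)

INT_FIELDS = ("model", "iter")


def parse_iter_line(line):
    """Return a dict for one newton iteration line, or None if the line does not match."""
    m = ITER_RE.match(line)
    if m is None:
        return None
    row = {}
    for key, text in m.groupdict().items():
        if key == "status":
            row[key] = text          # e.g. "avg+max corr" or "okay!"
        elif key in INT_FIELDS:
            row[key] = int(text)
        else:
            row[key] = float(text)   # Python accepts the 0.600E-04 style directly
    return row


def increases(rows, key):
    """Iteration numbers at which rows[i][key] is larger than at the previous iteration."""
    return [cur["iter"] for prev, cur in zip(rows, rows[1:]) if cur[key] > prev[key]]
```

to use it, save the terminal output to a file (any name will do), keep the lines for which parse_iter_line does not return None, and call e.g. increases(rows, "max_resid").  the three summary lines at the end of the step do not match the pattern and are skipped.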


the avg correction and max correction values take 17 iters to get below the tolerances
      tol_correction_norm = 3d-5
      tol_max_correction = 3d-3
in the last iteration, it gives avg corr  0.942E-05  max corr  0.111E-02
in next to last, avg corr  0.113E-04  max corr  0.132E-02

so that's nice, but where in the model are the problems coming from that are hurting the convergence?

a quick way to get some info about this is to set hydro_check_everything = .true.
at each iteration it will print out, for each equation, the location that has the worst residual.
often there will be 1 or 2 cells that dominate these lists, and 1 or 2 equations that have
bad residuals while the rest seem okay.  that can be a useful hint.  but often we will need more.

for that we get the system to output files to let us visualize the size of corrections for each variable at 
each cell at each iteration -- lots of data, so we need to look at it as plots to see patterns.
here's how I do that.

4) set hydro_dump_call_number to the number given in the terminal output (677 in this case)
and set hydro_inspectB_flag = .true.  (B is the vector of corrections -- we're inspecting them, hence the name)
create directory 'plot_data' in the directory where you will run.
create directory 'solve_logs' in plot_data.
rerun.   similar terminal output, followed by
STOP debug: dumping hydro_newton
there should now be lots of files in plot_data/solve_logs with extension "log".
the file "names.data" has a list of the names of data files (e.g., "corr_he4" for corr_he4.log)
the file "size.data" gives number of columns and number of rows for each data file.  columns correspond to cells in the models.  rows are newton iterations.
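
before plotting, a quick numerical look can show which cells carry the largest corrections.  here is a minimal Python sketch under assumptions that are not stated above: that each .log file is plain text with whitespace-separated numbers, one row per newton iteration and one column per cell, and that the columns are in cell order.  the function names are made up for illustration.

```python
from pathlib import Path


def read_solve_log(path):
    """Read one solve log as a list of rows (iterations), each a list of floats (cells).

    Assumes whitespace-separated plain-text numbers. Fortran-style 'D'
    exponents are converted so that float() accepts them.
    """
    rows = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            rows.append([float(x.replace("D", "E").replace("d", "e")) for x in fields])
    return rows


def worst_cell_per_iter(rows):
    """For each iteration, return (iteration, column, value) for the largest |value|.

    Iteration and column numbers are 1-based. Treating the column number
    as the cell number in the model is an assumption.
    """
    result = []
    for i, row in enumerate(rows, start=1):
        col = max(range(len(row)), key=lambda j: abs(row[j]))
        result.append((i, col + 1, row[col]))
    return result
```

e.g. worst_cell_per_iter(read_solve_log("plot_data/solve_logs/corr_he4.log")).  if the same column keeps showing up from one iteration to the next, that is a cell worth a closer look.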

now plot the data to see where the corrections are large.

the attached tioga file (solve_log.rb) is what i use.  it makes use of a couple of tioga files in mesa/utils.  it is written to assume that it lives in a folder in your work directory -- it expects to find the data in ../plot_data/solve_logs


cheers,
b




-------------- next part --------------
A non-text attachment was scrubbed...
Name: solve_log.rb
Type: text/x-ruby-script
Size: 1701 bytes
Desc: not available
URL: <https://lists.mesastar.org/pipermail/mesa-users/attachments/20140904/7c0f887e/attachment.bin>