[mesa-users] Convergence
Bill Paxton
paxton at kitp.ucsb.edu
Fri May 20 14:04:39 EDT 2016
This is a good teaching opportunity. ;) Let's look into the use of the 'report_hydro_solver_progress' control that Rob mentioned.
This example is from the 1M_pre_ms_to_wd test case when it has reached the helium core flash at the tip of the RGB. The timestep has to be cut drastically to get through the flash, but the standard timestep controls haven't done the job. The result is a short burst of retries and backups that reduce the timestep the hard way.
Let's use report_hydro_solver_progress to look at details for the 1st backup. For this example, I've modified the controls to make the newton do more iterations before giving up. Look in controls.defaults if you need a reminder of what these controls do.
newton_iterations_limit = 18 ! this is used for setting timesteps
max_tries = 19
iter_for_resid_tol2 = 18
Here's the terminal output for the newton failure at step 2002: (best viewed with fixed pitch font such as Courier so that the columns line up)
2002 1 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 7.848E-04 max resid 1.680E+01 avg corr 1.147E-02 max excess 1.402E+02 lg dt/yr 2.51 avg+max corr+resid
2002 2 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 4.176E-04 max resid 6.278E+00 avg corr 6.388E-03 max excess 4.192E+01 lg dt/yr 2.51 avg+max corr+resid
2002 3 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 2.049E-01 max resid 6.015E+03 avg corr 5.934E-05 max excess 2.805E+00 lg dt/yr 2.51 avg+max corr+resid
2002 4 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 1.143E-01 max resid 1.753E+03 avg corr 1.628E-03 max excess 9.842E+00 lg dt/yr 2.51 avg+max corr+resid
2002 5 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 4.516E-02 max resid 7.629E+02 avg corr 7.307E-05 max excess 4.015E+00 lg dt/yr 2.51 avg+max corr+resid
2002 6 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 1.417E-02 max resid 2.227E+02 avg corr 2.194E-04 max excess 1.288E+00 lg dt/yr 2.51 avg+max corr+resid
2002 7 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 4.096E-03 max resid 5.859E+01 avg corr 2.998E-04 max excess 1.803E+00 lg dt/yr 2.51 avg+max corr+resid
2002 8 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 1.387E-03 max resid 2.024E+01 avg corr 8.025E-05 max excess 9.389E-01 lg dt/yr 2.51 avg corr, avg+max resid
2002 9 coeff 1.0000 slope -6.254E+02 f 5.238E+01 avg resid 3.802E-04 max resid 1.021E+01 avg corr 5.443E-05 max excess 4.255E-01 lg dt/yr 2.51 avg corr, avg+max resid
2002 10 coeff 0.9621 slope 0.000E+00 f 5.995E+02 avg resid 1.278E-03 max resid 3.448E+01 avg corr 8.712E-05 max excess 1.691E+00 lg dt/yr 2.51 avg+max corr+resid
2002 11 coeff 1.0000 slope -8.572E+01 f 5.978E+00 avg resid 2.268E-04 max resid 2.692E+00 avg corr 1.091E-04 max excess 1.154E+00 lg dt/yr 2.51 avg+max corr+resid
2002 12 coeff 0.2192 slope -2.794E+02 f 5.321E+00 avg resid 2.014E-04 max resid 2.760E+00 avg corr 2.220E-05 max excess 1.007E+00 lg dt/yr 2.51 avg+max corr+resid
2002 13 coeff 0.1000 slope -2.290E+02 f 1.077E+01 avg resid 2.637E-04 max resid 4.175E+00 avg corr 2.230E-04 max excess 2.738E+00 lg dt/yr 2.51 avg+max corr+resid
2002 14 coeff 0.1000 slope -2.644E+02 f 1.092E+01 avg resid 2.782E-04 max resid 4.037E+00 avg corr 1.942E-04 max excess 2.411E+00 lg dt/yr 2.51 avg+max corr+resid
2002 15 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 5.678E-04 max resid 9.860E+00 avg corr 8.524E-05 max excess 9.208E-01 lg dt/yr 2.51 avg corr, avg+max resid
2002 16 coeff 0.2000 slope -4.362E+01 f 4.013E+01 avg resid 4.675E-04 max resid 8.253E+00 avg corr 7.040E-05 max excess 8.586E-01 lg dt/yr 2.51 avg corr, avg+max resid
2002 17 coeff 0.2000 slope -3.361E+01 f 3.153E+01 avg resid 4.028E-04 max resid 7.418E+00 avg corr 7.252E-05 max excess 1.050E+00 lg dt/yr 2.51 avg+max corr+resid
2002 18 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 2.504E+00 max resid 7.689E+04 avg corr 2.254E-04 max excess 4.758E+00 lg dt/yr 2.51 avg+max corr
2002 19 coeff 0.0309 slope 0.000E+00 f 2.774E+09 avg resid 2.427E+00 max resid 7.449E+04 avg corr 4.662E-03 max excess 7.777E+01 lg dt/yr 2.51 avg+max corr -- give up
hydro_newton_step failed to converge
First thing to notice is that I said "failed to converge" in the last line -- please understand the use of "convergence" in the context of the newton solver means finding an acceptable new model.
The step number is in column 1, the iteration number in column 2. We've set max_tries = 19, so it gives up after that many iterations.
The last column text indicates why the iteration was rejected. Since I've set iter_for_resid_tol2 = 18, it is considering residuals up to iteration 17 and finding them too large.
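If you want to look at these lines after a run has finished, you can save the terminal output and pull the columns apart with a short script. Here is a minimal Python sketch. The regular expression is inferred from the sample lines in this post, not from any documented format, and the function and field names (parse_progress_line, lg_dt_yr, status, ...) are made up for this illustration, so adjust them if your version of mesa prints different columns.

```python
import re

# Pattern for one report_hydro_solver_progress line, inferred from the
# sample output in this post (an assumption, not a documented format).
LINE_RE = re.compile(
    r"^\s*(?P<step>\d+)\s+(?P<iter>\d+)"
    r"\s+coeff\s+(?P<coeff>\S+)"
    r"\s+slope\s+(?P<slope>\S+)"
    r"\s+f\s+(?P<f>\S+)"
    r"\s+avg\s+resid\s+(?P<avg_resid>\S+)"
    r"\s+max\s+resid\s+(?P<max_resid>\S+)"
    r"\s+avg\s+corr\s+(?P<avg_corr>\S+)"
    r"\s+max\s+excess\s+(?P<max_excess>\S+)"
    r"\s+lg\s+dt/yr\s+(?P<lg_dt_yr>\S+)"
    r"\s*(?P<status>.*)$"
)

INT_FIELDS = ("step", "iter")


def parse_progress_line(line):
    """Return a dict of the columns in one progress line, or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    record = {}
    for name, text in m.groupdict().items():
        if name == "status":
            record[name] = text.strip()   # trailing text: why it was rejected
        elif name in INT_FIELDS:
            record[name] = int(text)      # step number, iteration number
        else:
            record[name] = float(text)    # all other columns are numeric
    return record


# Last iteration of the failed attempt shown above.
sample = (
    "2002 19 coeff 0.0309 slope 0.000E+00 f 2.774E+09 avg resid 2.427E+00 "
    "max resid 7.449E+04 avg corr 4.662E-03 max excess 7.777E+01 "
    "lg dt/yr 2.51 avg+max corr -- give up"
)
rec = parse_progress_line(sample)
print(rec["step"], rec["iter"], rec["max_excess"], rec["status"])
```

Lines that are not solver progress lines (such as "hydro_newton_step failed to converge") simply return None, so you can feed it the whole log.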
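The avg correction is flagged as too large on every iteration. The max correction dips below its tolerance on a few iterations (max excess < 1), but it doesn't stay there. In fact neither has really improved after about 3 iterations.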
btw: the "max excess" is equal to the max correction divided by the tol_max_correction, so anything > 1 is bad. (the current public version of mesa shows max_corr in this column).
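To make the "max excess" definition concrete, here is a small Python sketch. The max_excess function just restates the ratio described above; the numbers passed to it are hypothetical and are not mesa defaults. The list is the "max excess" column copied from the 19 iterations of the failed attempt shown earlier.

```python
def max_excess(max_corr, tol_max_correction):
    """Max correction divided by its tolerance; anything > 1 is too large."""
    return max_corr / tol_max_correction


# Hypothetical values, only to show the ratio (not mesa defaults).
example = max_excess(max_corr=6e-3, tol_max_correction=3e-3)
print(example)

# "max excess" column for iterations 1..19 of the failed attempt above.
excess = [
    1.402e+02, 4.192e+01, 2.805e+00, 9.842e+00, 4.015e+00,
    1.288e+00, 1.803e+00, 9.389e-01, 4.255e-01, 1.691e+00,
    1.154e+00, 1.007e+00, 2.738e+00, 2.411e+00, 9.208e-01,
    8.586e-01, 1.050e+00, 4.758e+00, 7.777e+01,
]

# Iterations (1-based) where the max correction met its tolerance.
within_tol = [i for i, x in enumerate(excess, start=1) if x <= 1.0]
print(within_tol)
```

The iterations it picks out are the ones where the last column of the log no longer lists the max correction as a reason for rejection.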
Looking at the values for avg and max, for residuals and corrections, you can see that the newton iterations are not leading to improvements -- things are actually getting worse.
This is because we are trying to take too large a timestep, and the assumption of near linear response of residuals to corrections is invalid.
More iterations won't help; the only solution is to retry with a smaller timestep. (drops log dt/yr from 2.5 to 2.2)
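Since the log reports lg dt/yr, the size of that cut is easier to see after converting back from the logarithm. A quick Python check, using the two lg dt/yr values printed in the solver output in this post:

```python
# lg dt/yr as printed by report_hydro_solver_progress in this post.
lg_dt_before = 2.51   # failed attempt
lg_dt_after = 2.20    # retry

# A difference in log10 is a ratio in linear units.
ratio = 10.0 ** (lg_dt_after - lg_dt_before)

print(f"dt_new / dt_old = {ratio:.2f}")
print(f"dt_old = {10.0 ** lg_dt_before:.0f} yr")
print(f"dt_new = {10.0 ** lg_dt_after:.0f} yr")
```

So the retry uses roughly half the original timestep.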
When that timestep reduction happens, we see this output from report_hydro_solver_progress for the same starting model:
2002 1 coeff 1.0000 slope -8.472E+01 f 1.282E+00 avg resid 7.291E-05 max resid 1.561E+00 avg corr 5.959E-03 max excess 8.593E+01 lg dt/yr 2.20 avg+max corr+resid
2002 2 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 2.364E-06 max resid 4.573E-02 avg corr 3.225E-03 max excess 2.139E+01 lg dt/yr 2.20 avg+max corr+resid
2002 3 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 1.969E-06 max resid 5.548E-02 avg corr 8.639E-05 max excess 5.873E-01 lg dt/yr 2.20 avg corr, avg+max resid
2002 4 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 2.094E-09 max resid 4.294E-05 avg corr 1.125E-06 max excess 7.163E-03 lg dt/yr 2.20 avg+max resid
2002 5 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 1.349E-12 max resid 3.595E-09 avg corr 2.432E-08 max excess 1.553E-04 lg dt/yr 2.20 okay!
That's more like it! That's the sort of "convergence" you want to see. Note the rapid drop at each iteration in both residuals and corrections.
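The contrast between the two attempts can be reproduced with a toy problem. The Python sketch below is not mesa's solver. It takes one backward-Euler step of the scalar equation dy/dt = -atan(y), chosen only because Newton's method is known to cycle on atan-like functions when the starting guess is far from the root. The argument names max_tries and iter_for_resid_off loosely mirror the controls discussed above, and the tolerances and the factor-of-two timestep cut are assumptions made for the illustration.

```python
import math


def newton_backward_euler(y_old, dt, max_tries=19, iter_for_resid_off=18,
                          tol_corr=1e-10, tol_resid=1e-10):
    """Attempt one backward-Euler step of dy/dt = -atan(y) by Newton iteration.

    Returns (accepted, y, history); history holds (iter, |resid|, |corr|).
    """
    y = y_old                                    # first trial: previous model
    history = []
    for it in range(1, max_tries + 1):
        resid = y - y_old + dt * math.atan(y)    # equation we want to be zero
        jac = 1.0 + dt / (1.0 + y * y)           # exact d(resid)/dy
        corr = -resid / jac                      # Newton correction
        y += corr
        new_resid = y - y_old + dt * math.atan(y)
        history.append((it, abs(new_resid), abs(corr)))
        corr_ok = abs(corr) < tol_corr
        # After iter_for_resid_off iterations, stop checking the residual.
        resid_ok = it >= iter_for_resid_off or abs(new_resid) < tol_resid
        if corr_ok and resid_ok:
            return True, y, history
    return False, y, history                     # give up


def step_with_retries(y_old, dt, max_retries=20):
    """Halve dt after each failed attempt until a trial solution is accepted."""
    for retry in range(max_retries + 1):
        accepted, y, history = newton_backward_euler(y_old, dt)
        if accepted:
            return y, dt, retry, history
        dt *= 0.5                                # retry with a smaller timestep
    raise RuntimeError("no acceptable model found")


# Too large a timestep: the iterations bounce around and never settle.
accepted_big, _, hist_big = newton_backward_euler(3.0, 100.0)
print("dt=100 accepted:", accepted_big)

# With retries, a smaller timestep gets accepted after a few iterations.
y_new, dt_used, n_retries, hist = step_with_retries(3.0, 100.0)
print("accepted with dt =", dt_used, "after", n_retries, "retries")
for it, resid, corr in hist:
    print(f"{it:3d}  resid {resid:.3e}  corr {corr:.3e}")
```

With the large timestep the corrections stay large no matter how many iterations are allowed. Once the timestep is small enough, both residual and correction drop rapidly from one iteration to the next, which is the same pattern as in the second log above.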
So why not always require the residuals to get small just as we require the corrections to get small?
The sad truth is that we run into cases where small corrections lead to large changes in residuals. The accuracy of the partials going into the Jacobian is not good enough to let us find the exact correction that would make the residual get small. The result can be "stalled" residuals that don't keep dropping. The corrections may have gotten small enough to satisfy the tolerances, but even such small corrections are not able to get the small residuals we would like to see.
The ideal answer to this problem is to improve the partials! Trust me, we tried and are continuing to work on this. But in the meantime, the standard solution (shared with other stellar codes) is to stop considering the residuals and settle for just getting small corrections even if that doesn't give you small residuals.
NOTE: this means that we can and do end up accepting new models that can have substantial residuals, and that means the models have substantial deviations from providing "correct" solutions to the stellar equations. gasp! that's horrible! how can any of this work? good question. what's your answer?
-Bill
On May 20, 2016, at 9:55 AM, Robert Farmer wrote:
> > Also is there a way to figure out after the runs have completed to determine how well the convergence was in reality?
>
> If you haven't already found it, there is this option in controls:
>
> report_hydro_solver_progress = .true.
>
> which will give you information about the newton iterations as the run progresses and how good the acceptance was.
> Rob
>
> On Fri, May 20, 2016 at 9:24 AM, Bill Paxton <paxton at kitp.ucsb.edu> wrote:
>
> On May 20, 2016, at 9:05 AM, Kenny Van wrote:
>
>> One question I had was that does tol_max_correction guarantee that convergence is always no worse than that? Also is there a way to figure out after the runs have completed to determine how well the convergence was in reality?
>
> Hi,
>
> First we need to be careful about the meaning of "convergence"
>
> 1) we speak of the newton iterations converging to a solution for the new model at the end of a timestep
>
> and
>
> 2) we also use convergence to mean the final results of a run converging to (roughly) the same values as the tolerances are tightened forcing more timesteps and more zones.
>
> the 1st kind is done at each timestep, the 2nd is done before publishing (or sooner!) by expert users to check the chance that their results are numerical artifacts of inadequate time or space resolution. there is of course the danger that by reducing timesteps, you'll open up new physics that you would rather stayed hidden (surface pulsations for example). so this process cannot be pushed too far, but it also should not be neglected.
>
>
> for newton iterations, i try to avoid the term "convergence" and talk about "acceptance" instead. the newton generates a series of trial solutions. each is checked for how well it satisfies the equations ("residuals") and how small the difference is from the previous trial solution ("corrections").
>
> acceptance must happen within a specified number of trials, or the effort is stopped and the system must do a retry with a smaller timestep.
>
> both the residuals and the corrections can be considered in deciding whether or not to accept a trial solution.
>
> in many cases, it is desirable to stop checking residuals after a specified number of iterations and just use corrections to decide acceptance.
>
> these options are given in controls.defaults, so that's where you need to go next. search for "solver controls" and read on.
>
> -Bill
>
> _______________________________________________
> mesa-users mailing list
> mesa-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/mesa-users