[mesa-users] Sudden envelope mixing in a low-mass star, and a segmentation fault in debugging info
wball at astro.physik.uni-goettingen.de
Wed Jan 20 05:22:26 EST 2016
> As to what was happening, when the resolution is inadequate I've seen cases where there is a cascade of cells flipping convective state during the newton iterations so that after 20+ iterations the boundary has moved a long distance from where it was at the start of the step --- domino effect for the convective boundary.
In the model that fails, the Newton iterations diverge immediately. The
preceding timesteps converge after 2 or 3 iterations, but at the faulty
timestep the solver output is:
hydro_call_number, s% dt, dt, dt/secyer, log dt/yr 1281 1.5779074992000000D+14 1.5779074992000000D+14 5.0000000000000000D+06 6.6989700043360187D+00
1181 1 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 0.760E-03 max resid 0.347E+00 avg corr 0.127E-01 max corr 0.225E+00 lg dt/yr 6.70 avg+max corr+resid
1181 2 coeff 0.6241 slope 0.000E+00 f 0.996E+68 avg resid 0.146E+31 max resid 0.112E+35 avg corr 0.160E+00 max corr 0.208E+01 lg dt/yr 6.70 avg+max corr+resid
1181 3 coeff 0.6241 slope 0.000E+00 f 0.996E+68 avg resid 0.146E+31 max resid 0.112E+35 avg corr 0.797E+14 max corr 0.766E+17 lg dt/yr 6.70 avg corr too large -- give up
hydro_newton_step failed to converge
For comparison, here's the output for the preceding (successful) timestep:
hydro_call_number, s% dt, dt, dt/secyer, log dt/yr 1280 1.5779074992000000D+14 1.5779074992000000D+14 5.0000000000000000D+06 6.6989700043360187D+00
1180 1 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 0.915E-05 max resid 0.179E-02 avg corr 0.270E-04 max corr 0.447E-02 lg dt/yr 6.70 max corr
1180 2 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 0.405E-05 max resid 0.106E-02 avg corr 0.297E-04 max corr 0.282E-03 lg dt/yr 6.70 okay!
The first iteration of the bad timestep already has surprisingly large
residuals and corrections (roughly 50–500 times larger than in the
preceding timestep), and it all goes through the roof after that.
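To put numbers on that comparison, here is a minimal Python sketch that pulls the residuals and corrections out of the two first-iteration lines quoted above and takes their ratios. The regular expression and the function names (parse_line, ratios) are my own, written against the line format shown here; other MESA revisions may print these lines differently.

```python
import re

# First-iteration lines for the failing (1181) and preceding (1180) steps,
# copied from the solver output quoted above (trailing status text omitted).
BAD = ("1181 1 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 0.760E-03 "
       "max resid 0.347E+00 avg corr 0.127E-01 max corr 0.225E+00")
GOOD = ("1180 1 coeff 1.0000 slope 0.000E+00 f 0.000E+00 avg resid 0.915E-05 "
        "max resid 0.179E-02 avg corr 0.270E-04 max corr 0.447E-02")

# Matches e.g. "avg resid 0.760E-03" and captures the label and the number.
PATTERN = re.compile(
    r"(avg resid|max resid|avg corr|max corr)\s+([0-9.]+E[+-]\d+)"
)


def parse_line(line):
    """Return a dict such as {'avg resid': 7.6e-4, ...} for one line."""
    return {name: float(value) for name, value in PATTERN.findall(line)}


def ratios(bad_line, good_line):
    """Ratio of each quantity in bad_line to the same one in good_line."""
    bad, good = parse_line(bad_line), parse_line(good_line)
    return {name: bad[name] / good[name] for name in good}


if __name__ == "__main__":
    for name, ratio in ratios(BAD, GOOD).items():
        print(f"{name}: {ratio:.0f}x larger")
```

For the two lines above, the ratios come out between roughly 50 and 500.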
I don't know if this is because the envelope is set to convective then
unset in the first iteration. Looking through some debug data, the last
profile given for the last successful timestep looks fine, but the first
profile given for the bad timestep already has the mixed envelope. So it
looks like that first iteration does the mixing.
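One way to check where the composition changed between two saved profiles is sketched below in Python. It assumes the radius and hydrogen abundance have already been read into plain lists ordered from surface to centre, and that both profiles share the same grid (not guaranteed if the model was remeshed in between, in which case one profile would first need interpolating onto the other). The function name and the tolerance are hypothetical.

```python
def changed_outer_fraction(radius, x_before, x_after, tol=1e-3):
    """Depth of the changed region as a fraction of the surface radius.

    radius, x_before and x_after are equal-length lists ordered from the
    surface (index 0) to the centre. Returns 0.0 if no cell changed by
    more than tol, otherwise (R_surface - r_deepest_changed) / R_surface.
    """
    if not radius or not (len(radius) == len(x_before) == len(x_after)):
        raise ValueError("profiles must be non-empty and share one grid")

    surface = radius[0]
    innermost_changed = None
    for r, before, after in zip(radius, x_before, x_after):
        if abs(after - before) > tol:
            innermost_changed = r  # keeps updating as we move inward

    if innermost_changed is None:
        return 0.0
    return (surface - innermost_changed) / surface


if __name__ == "__main__":
    # Synthetic illustration only, not data from the actual run.
    radius = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
    x_before = [0.99, 0.95, 0.85, 0.72, 0.70, 0.70]
    x_after = [0.88, 0.88, 0.88, 0.72, 0.70, 0.70]
    print(changed_outer_fraction(radius, x_before, x_after))
```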
Not sure if that's useful information to someone. I'll try to dig in to
see exactly where such a problem can creep in, and if there are some
controls to prevent such mixing in one Newton step. In particular, I'll
take a close look at anything that happens to the model between the end of
one set of Newton iterations and the start of the next set.
>> On Mon, 18 Jan 2016, Bill Paxton wrote:
>>> 1st, concerning the segfault: check the terminal output just before it happens -- I get lines like this:
>>> failed to open plot_data/solve_logs/names.data
>>> failed in append_data for plot_data/solve_logs/corr_lnPgas.log
>>> failed in append_data for plot_data/solve_logs/corr_lnT.log
>>> It is assuming the existence of some directories and crashes when they aren't there.
>>> You can get the necessary stuff by copying the directories 'plot_data' and 'plotters' from any of the test_suite cases.
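A minimal Python sketch of that copy step, for anyone who prefers to script it. The function name is made up, and the test_suite path in the usage comment is a placeholder that has to be replaced with a real case directory from your own MESA installation.

```python
import shutil
from pathlib import Path


def copy_debug_dirs(test_case_dir, work_dir, names=("plot_data", "plotters")):
    """Copy the named directories from a test_suite case into work_dir.

    Directories that already exist in work_dir are left untouched.
    Returns the list of directory names that were actually copied.
    """
    src_root, dst_root = Path(test_case_dir), Path(work_dir)
    copied = []
    for name in names:
        src, dst = src_root / name, dst_root / name
        if not src.is_dir():
            raise FileNotFoundError(f"{src} is not a directory")
        if dst.exists():
            continue  # do not overwrite existing solver logs
        shutil.copytree(src, dst)
        copied.append(name)
    return copied


# Usage (placeholder path, adjust to your installation):
# copy_debug_dirs("/path/to/a/test_suite/case", ".")
```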
>>> 2nd, concerning the jump in envelope mixing. The general rule when encountering something like this is to crank up the resolution to see what happens. Mesa isn't magic -- it isn't smart enough to notice that it is producing results that are bogus because it is taking timesteps that are too large or has a grid that is too coarse. It is up to the users to check the results and make sure they aren't artifacts of inadequate resolution. I know that you know all of this -- I'm just taking advantage of this chance to preach to mesa-users again! ;)
>>> For this case, the jump happens along with the 1st retry, so might set max_number_retries = 1 to get it to stop when the problem happens. Then do multiple restarts back about 50 steps before the problem, each time with reduced values for max_years_for_timestep and mesh_delta_coeff. The inlist is currently setting max_years_for_timestep = 5d6, so keep reducing that until it is down by at least a factor of 100. Similarly the inlist has mesh_delta_coeff = 1; decrease that until you have at least 2000 grid points (instead of the 800 or so you have now). Then once you have better resolution in time and space, let the run continue beyond the problem long enough to make plots to compare to what you have now. Sometimes the problem just goes away when the resolution is increased. But in other cases, it stubbornly stays around even at high resolution. Then it gets interesting! Let us know what you find.
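The sequence of restarts described above can be planned with a few lines of Python. This sketch only generates the successively smaller values of max_years_for_timestep, starting from the 5d6 in the current inlist and stopping once the reduction reaches a factor of 100. The step factor of 2 and both function names are assumptions for illustration. mesh_delta_coeff would be lowered by hand in the same spirit until the model has at least 2000 grid points.

```python
def reduction_schedule(start, target_factor, step_factor=2.0):
    """Successively reduced values: start/step, start/step**2, ...

    Stops at the first value that is smaller than start by at least
    target_factor.
    """
    if step_factor <= 1.0:
        raise ValueError("step_factor must be greater than 1")
    values = []
    value = start
    while value > start / target_factor:
        value /= step_factor
        values.append(value)
    return values


def fortran_double(x):
    """Format a float in the 5d6 style used for doubles in inlists."""
    mantissa, exponent = f"{x:.6e}".split("e")
    mantissa = mantissa.rstrip("0").rstrip(".")
    return f"{mantissa}d{int(exponent)}"


if __name__ == "__main__":
    # One line per restart; paste the value into &controls before each one.
    for value in reduction_schedule(5e6, 100):
        print(f"max_years_for_timestep = {fortran_double(value)}")
```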
>>> On Jan 18, 2016, at 7:58 AM, Warrick Ball wrote:
>>>> Hi all,
>>>> Earlier, Earl Bellinger asked me about a suspicious HR track he found in his modelling. I've attached a plot showing the main sequence: you can see the jumps around the middle (logT ~= 3.88). I've also attached the relevant inlist. Just watch out: it saves a *lot* of data (several GB) in the form of all the profiles. (This is MESA revision 7624.)
>>>> We've been looking at the output and found that the issue appears at model number 1180. I've attached a plot with the hydrogen abundance at models 1180 and 1181, which are before and after the first jump. As can be seen, the outer 28% (by radius) of model 1181 appears to be mixed, even though all the diffusion coefficients of mixing are effectively nil. This also corresponds to a convergence failure, so the first question we have is why the star is behaving this way.
>>>> Diffusion is on and all the metals and helium have settled out of the envelope, but I don't see why this should be a numerical problem. The back and forth seems to occur as the star mixes part of the envelope, the metals and helium drain out, then the star mixes part of the envelope again (remixes the envelope?), and so on. But the sudden mixing near the surface is a mystery.
>>>> The second issue is related. Following Bill Wolf's excellent tutorial (http://wmwolf.github.io/projects/mesa_debugging/), I tried to get the debug data for the hydro solver. You can activate this by uncommenting the last three lines in &controls:
>>>> ! report_hydro_solver_progress = .true.
>>>> ! hydro_inspectB_flag = .true.
>>>> ! hydro_dump_call_number = 1281
>>>> Much to my surprise, this causes a segfault on my machine at hydro call 1272. I've attached the last ~150 lines from the terminal as "segfault.txt". The backtrace reads:
>>>> Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
>>>> Backtrace for this error:
>>>> #0 0x7FAB239F4557
>>>> #1 0x7FAB239F4B6E
>>>> #2 0x7FAB22AEBD9F
>>>> #3 0x6B9447 in __hydro_newton_procs_MOD_write_solve_logs
>>>> #4 0x6BA03C in __hydro_newton_procs_MOD_inspectb
>>>> #5 0x69EAA8 in __star_newton_MOD_do_newton
>>>> #6 0x6A1779 in __star_newton_MOD_newton
>>>> #7 0x617D29 in newt.10440 at solve_hydro.f90:0
>>>> #8 0x618F61 in __solve_hydro_MOD_hydro_newton_step
>>>> #9 0x61A697 in __solve_hydro_MOD_do_hydro_newton
>>>> #10 0x61BE46 in __solve_hydro_MOD_do_hydro_converge
>>>> #11 0x627567 in __struct_burn_mix_MOD_do_struct_burn_mix
>>>> #12 0x5200DC in __evolve_MOD_do_evolve_step_part2
>>>> #13 0x40B291 in __star_lib_MOD_star_evolve_step
>>>> #14 0x41D364 in __run_star_support_MOD_run1_star
>>>> #15 0x406AD2 in __run_star_MOD_do_run_star
>>>> #16 0x406B6F in MAIN__ at run.f:0
>>>> ./rn: line 9: 8120 Segmentation fault ./star
>>>> So the second question we have is why the code segfaults. I haven't yet dug down into where this is coming from, and I'll try to in the next few days if I have a chance. For now, any help is as always very welcome and hugely appreciated!
>>>>  http://wmwolf.github.io/projects/mesa_debugging/
>>>> Warrick Ball
>>>> Postdoc, Institut für Astrophysik Göttingen
>>>> wball at astro.physik.uni-goettingen.de
>>>> +49 (0) 551 39 5069
>>>> [Attachments: inlist_1.0, earl_HR.png, earls_bug.png, segfault.txt]
Postdoc, Institut für Astrophysik Göttingen
wball at astro.physik.uni-goettingen.de
+49 (0) 551 39 5069