Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documenting CSM and inference steps for debugging #443

Open
dehann opened this issue Nov 20, 2019 · 19 comments
Open

Documenting CSM and inference steps for debugging #443

dehann opened this issue Nov 20, 2019 · 19 comments

Comments

@dehann
Copy link
Member

dehann commented Nov 20, 2019

EDIT, some of these code snippets are out of date.

Console

# save clique subgraphs during inference at critical steps
getSolverParams(fg).dbg = true
# record CSM steps for later analysis
tree, smt, hist = solveTree!(fg, recordcliqs=ls(fg))

# print to screen the CSM steps for clique with frontal :x2
printCliqHistorySummary(tree, :x2)

# where is everything saved???
getLogPath(fg)

clid = getCliq(tree, :x5).index
STEP = 8
fncAS = hist[clid][STEP][3]
dataAS = deepcopy(hist[clid][STEP][4])
fncAS(dataAS)
  • solveTree! will always generate content at getLogPath(fg) location.
  • More or less will be stored there depending getSolverParams(fg).dbg.
  • recordcliqs=[:x1;:x2;...] will save the CSM history to RAM.

Animate CSM Video

Can also make concurrent video of CSM at work:

# new interface
csmAnimate(tree, hist, frames=1000)

run(`ffmpeg -r 10 -i /tmp/caesar/csmCompound/csm_%d.png -c:v libtheora -vf fps=25 -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" -q 10 /tmp/caesar/csmCompound/out.ogv`)

@async run(`totem /tmp/caesar/csmCompound/out.ogv`)

Also see csmAnimate's own documentation.

EDIT

new Caesar.writevideo function directly engages with ffmpeg which is easier:

https://github.com/JuliaRobotics/Caesar.jl/blob/b514d18cbb6cc3b5557f67617ed08918c4260a26/src/images/images.jl#L6-L50


Also see example of debug process in #754

@dehann
Copy link
Member Author

dehann commented Nov 23, 2019

Note that when using getSolverParams(fg).dbg=true each of the major stages of the CSM clique inference subgraphs are stored to the getLogPath(fg) destination -- including:

  • fg_build, as copied from main fg object
  • fg_beforeupsolve
  • fg_afterupsolve
  • fg_beforedownsolve
  • fg_afterdownsolve

which shows the belief messages passed up or down in each case added as MsgPriors to the clique subgraphs.

@Affie
Copy link
Member

Affie commented Aug 20, 2020

Is there a way to debug with a failed solve (where it errors and doesn't return)?

@dehann
Copy link
Member Author

dehann commented Aug 20, 2020

Hi @Affie , yeah so the best I have been able to do so far is as follows:

First thing is to find where the logs are going to land:

@show getLogPath(fg)

Then solve with all debug options:

getSolverParams(fg).dbg = true
getSolverParams(fg).async = true

tree, smt, hist = solveTree!(fg, verbose=true, recordcliqs=ls(fg));

Verbose is very useful, but only prints to the console so that you can get an idea of how far each CSM gets and in which functions the error might occur.

With this you will then get process 1 context back while the solver works in the background -- do not start a second solve while the first is still working in the background.

Once the failures occur, there are three things to do:

  • check the logpath for all details, the error stack will be printed out there in text files.
  • check smt on which CSM failed, and you should be able to see the error stack there as well (but not always),
  • for the CSM objects that have not failed but did complete, you can fetch the recordcliq history with hist4 = fetch(smt[4]).

One more thing you can do is to limit the number of iterations for all CSM:

getSolverParams(fg).limititers = 13

which allows you to force early termination of CSM and thereby get the csmc object nearest the failure. It is actually possible to get even closer, one CSM at a time:

fg = generateCanonicaFG_Hexagonal()
@show getLogPath(fg)

tree = wipeBuildNewTree!(fg)
drawTree(tree, show=true)

getSolverParams(fg).dbg = true
getSolverParams(fg).async = true

getSolverParams(fg).limititers = 13
cliqtask = solveCliq!(fg, tree, :x2, verbose=true, recordcliq=true, async=true)

# wait for that task to finish
sleep(10)

getSolverParams(fg).limititers = 9
cliqtask = solveCliq!(fg, tree, :x0, verbose=true, recordcliq=true, async=true)
...

@dehann
Copy link
Member Author

dehann commented Aug 20, 2020

Actually there is one more you can do with hist::Dict (automatically fetched for all CSM if async=false), use printCliqHistorySequential to see output like this:

using RoME
fg = generateCanonicalFG_Hexagonal()
tree, smt, hist = solveTree!(fg, recordcliqs=ls(fg));
printCliqHistorySequential(hist)

semicolon is recommended :-)

julia> printCliqHistorySequential(hist)
1.1    10:27:13.868 testCliqCanRecycle 0 | null        P ----         C 2null 3null 4null |S| 
1.2    10:27:14.303 testCliqCanIncremt 0 | null        P ----         C 2null 3null 4null |S| 
1.3    10:27:14.31  isCliqUpSolved     0 | null        P ----         C 2null 3null 4null |S| 
1.4    10:27:14.362 buildCliqSubgraph  0 | null        P ----         C 2null 3null 4null |S| 
1.5    10:27:15.617 isCliqNull         0 | null        P ----         C 2null 3null 4null |S| 
1.6    10:27:15.908 doesCliqNeeddownms 0 | null        P ----         C 2null 3null 4null |S| 
1.7    10:27:15.917 determineCliqNeedD 0 | null        P ----         C 2null 3null 4null |S| 
1.8    10:27:15.955 slowCliqIfChildren 0 | null        P ----         C 2null 3null 4null |S| 
2.1    10:27:15.987 testCliqCanRecycle 0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.2    10:27:15.991 testCliqCanIncremt 0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.3    10:27:15.993 isCliqUpSolved     0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.4    10:27:15.995 buildCliqSubgraph  0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.5    10:27:15.999 isCliqNull         0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.6    10:27:16.003 doesCliqNeeddownms 0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.7    10:27:16.007 checkIfCliqNullBlo 0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.8    10:27:16.086 blockCliqSiblingsP 0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.9    10:27:16.151 determineCliqNeedD 0 | null        P 1null        C 5null 6null |S| 3null 4null 
2.10   10:27:16.155 slowCliqIfChildren 0 | null        P 1null        C 5null 6null |S| 3null 4null 
3.1    10:27:16.159 testCliqCanRecycle 0 | null        P 1null        C ---- |S| 2null 4null 
3.2    10:27:16.165 testCliqCanIncremt 0 | null        P 1null        C ---- |S| 2null 4null 
3.3    10:27:16.168 isCliqUpSolved     0 | null        P 1null        C ---- |S| 2null 4null 
3.4    10:27:16.17  buildCliqSubgraph  0 | null        P 1null        C ---- |S| 2null 4null 
3.5    10:27:16.337 isCliqNull         0 | null        P 1null        C ---- |S| 2null 4null 
3.6    10:27:16.342 doesCliqNeeddownms 0 | null        P 1null        C ---- |S| 2null 4null 
3.7    10:27:16.347 checkIfCliqNullBlo 0 | null        P 1null        C ---- |S| 2null 4null 
3.8    10:27:16.352 blockCliqSiblingsP 0 | null        P 1null        C ---- |S| 2null 4null 
3.9    10:27:16.358 determineCliqNeedD 0 | null        P 1null        C ---- |S| 2null 4null 
3.10   10:27:16.363 getBetterName7b    0 | null        P 1null        C ---- |S| 2null 4null 
4.1    10:27:16.39  testCliqCanRecycle 0 | null        P 1null        C ---- |S| 2null 3null 
4.2    10:27:16.392 testCliqCanIncremt 0 | null        P 1null        C ---- |S| 2null 3null 
4.3    10:27:16.394 isCliqUpSolved     0 | null        P 1null        C ---- |S| 2null 3null 
4.4    10:27:16.397 buildCliqSubgraph  0 | null        P 1null        C ---- |S| 2null 3null 
4.5    10:27:16.449 isCliqNull         0 | null        P 1null        C ---- |S| 2null 3null 
4.6    10:27:16.454 doesCliqNeeddownms 0 | null        P 1null        C ---- |S| 2null 3null 
4.7    10:27:16.459 checkIfCliqNullBlo 0 | null        P 1null        C ---- |S| 2null 3null 
4.8    10:27:16.465 blockCliqSiblingsP 0 | null        P 1null        C ---- |S| 2null 3null 
4.9    10:27:16.47  determineCliqNeedD 0 | null        P 1null        C ---- |S| 2null 3null 
4.10   10:27:16.476 getBetterName7b    0 | null        P 1null        C ---- |S| 2null 3null 
5.1    10:27:16.481 testCliqCanRecycle 0 | null        P 2null        C ---- |S| 6null 
5.2    10:27:16.483 testCliqCanIncremt 0 | null        P 2null        C ---- |S| 6null 
5.3    10:27:16.486 isCliqUpSolved     0 | null        P 2null        C ---- |S| 6null 
5.4    10:27:16.488 buildCliqSubgraph  0 | null        P 2null        C ---- |S| 6null 
5.5    10:27:16.493 isCliqNull         0 | null        P 2null        C ---- |S| 6null 
5.6    10:27:16.499 doesCliqNeeddownms 0 | null        P 2null        C ---- |S| 6null 
5.7    10:27:16.504 checkIfCliqNullBlo 0 | null        P 2null        C ---- |S| 6null 
5.8    10:27:16.51  blockCliqSiblingsP 0 | null        P 2null        C ---- |S| 6null 
5.9    10:27:16.515 determineCliqNeedD 0 | null        P 2null        C ---- |S| 6null 
5.10   10:27:16.52  getBetterName7b    0 | null        P 2null        C ---- |S| 6null 
6.1    10:27:16.525 testCliqCanRecycle 0 | null        P 2null        C ---- |S| 5null 
6.2    10:27:16.528 testCliqCanIncremt 0 | null        P 2null        C ---- |S| 5null 
6.3    10:27:16.53  isCliqUpSolved     0 | null        P 2null        C ---- |S| 5null 
6.4    10:27:16.532 buildCliqSubgraph  0 | null        P 2null        C ---- |S| 5null 
6.5    10:27:16.537 isCliqNull         0 | null        P 2null        C ---- |S| 5null 
6.6    10:27:16.543 doesCliqNeeddownms 0 | null        P 2null        C ---- |S| 5null 
6.7    10:27:16.548 checkIfCliqNullBlo 0 | null        P 2null        C ---- |S| 5null 
6.8    10:27:16.553 blockCliqSiblingsP 0 | null        P 2null        C ---- |S| 5null 
6.9    10:27:16.559 determineCliqNeedD 0 | null        P 2null        C ---- |S| 5null 
6.10   10:27:16.564 getBetterName7b    0 | null        P 2null        C ---- |S| 5null 
3.11   10:27:16.57  attemptCliqInitUp  0 | null        P 1null        C ---- |S| 2null 4null 
3.12   10:27:16.598 mustInitUpCliq     0 | null        P 1null        C ---- |S| 2null 4null 
3.13   10:27:19.439 doCliqUpSolveIniti 0 | null        P 1null        C ---- |S| 2null 4null 
4.11   10:27:23.505 attemptCliqInitUp  0 | null        P 1null        C ---- |S| 2null 3null 
4.12   10:27:23.511 mustInitUpCliq     0 | null        P 1null        C ---- |S| 2null 3null 
4.13   10:27:23.516 doCliqUpSolveIniti 0 | null        P 1null        C ---- |S| 2null 3null 
5.11   10:27:29.764 attemptCliqInitUp  0 | null        P 2null        C ---- |S| 6null 
5.12   10:27:29.769 mustInitUpCliq     0 | null        P 2null        C ---- |S| 6null 
5.13   10:27:29.774 doCliqUpSolveIniti 0 | null        P 2null        C ---- |S| 6null 
6.11   10:27:35.021 attemptCliqInitUp  0 | null        P 2null        C ---- |S| 5null 
6.12   10:27:35.027 mustInitUpCliq     0 | null        P 2null        C ---- |S| 5null 
6.13   10:27:35.032 doCliqUpSolveIniti 0 | null        P 2null        C ---- |S| 5null 
3.14   10:27:40.464 rmUpLikeliSaveSubF 0 | upsolved    P 1null        C ---- |S| 2null 4null 
3.15   10:27:40.523 finishCliqSolveChe 0 | upsolved    P 1null        C ---- |S| 2null 4null 
4.14   10:27:40.841 rmUpLikeliSaveSubF 0 | upsolved    P 1null        C ---- |S| 2null 3upsolved 
4.15   10:27:40.846 finishCliqSolveChe 0 | upsolved    P 1null        C ---- |S| 2null 3upsolved 
5.14   10:27:40.887 rmUpLikeliSaveSubF 0 | upsolved    P 2null        C ---- |S| 6null 
5.15   10:27:40.892 finishCliqSolveChe 0 | upsolved    P 2null        C ---- |S| 6null 
6.14   10:27:40.899 rmUpLikeliSaveSubF 0 | upsolved    P 2null        C ---- |S| 5upsolved 
6.15   10:27:40.904 finishCliqSolveChe 0 | upsolved    P 2null        C ---- |S| 5upsolved 
3.16   10:27:40.955 determineCliqIfDow 0 | upsolved    P 1null        C ---- |S| 2null 4upsolved 
2.11   10:27:41.187 isCliqNull         0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.12   10:27:41.191 doesCliqNeeddownms 0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.13   10:27:41.194 checkIfCliqNullBlo 0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.14   10:27:41.198 blockCliqSiblingsP 0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.15   10:27:41.202 determineCliqNeedD 0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.16   10:27:41.206 getBetterName7b    0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
4.16   10:27:41.21  determineCliqIfDow 0 | upsolved    P 1null        C ---- |S| 2null 3upsolved 
5.16   10:27:41.215 determineCliqIfDow 0 | upsolved    P 2null        C ---- |S| 6upsolved 
6.16   10:27:41.221 determineCliqIfDow 0 | upsolved    P 2null        C ---- |S| 5upsolved 
2.17   10:27:41.311 attemptCliqInitUp  0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.18   10:27:41.327 mustInitUpCliq     0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.19   10:27:42.015 doCliqUpSolveIniti 0 | null        P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.20   10:27:42.99  rmUpLikeliSaveSubF 0 | upsolved    P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
2.21   10:27:43.061 finishCliqSolveChe 0 | upsolved    P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
1.9    10:27:43.065 isCliqNull         0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.10   10:27:43.069 doesCliqNeeddownms 0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.11   10:27:43.074 determineCliqNeedD 0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.12   10:27:43.086 getBetterName7b    0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
2.22   10:27:43.091 determineCliqIfDow 0 | upsolved    P 1null        C 5upsolved 6upsolved |S| 3upsolved 4upsolved 
1.13   10:27:43.193 attemptCliqInitUp  0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.14   10:27:43.201 mustInitUpCliq     0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.15   10:27:43.211 doCliqUpSolveIniti 0 | null        P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.16   10:27:44.979 rmUpLikeliSaveSubF 0 | upsolved    P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.17   10:27:44.983 finishCliqSolveChe 0 | upsolved    P ----         C 2upsolved 3upsolved 4upsolved |S| 
1.18   10:27:44.999 determineCliqIfDow 0 | upsolved    P ----         C 2upsolved 3upsolved 4upsolved |S| 
3.17   10:27:45.195 doCliqDownSolve    0 | upsolved    P 1downsolved  C ---- |S| 2upsolved 4upsolved 
4.17   10:27:46.427 doCliqDownSolve    0 | upsolved    P 1downsolved  C ---- |S| 2upsolved 3downsolved 
2.23   10:27:47.372 doCliqDownSolve    0 | upsolved    P 1downsolved  C 5upsolved 6upsolved |S| 3downsolved 4downsolved 
1.19   10:27:48.323 exitStateMachine   0 | downsolved  P ----         C 2downsolved 3downsolved 4downsolved |S| 
3.18   10:27:48.35  exitStateMachine   0 | downsolved  P 1downsolved  C ---- |S| 2downsolved 4downsolved 
5.17   10:27:48.355 doCliqDownSolve    0 | upsolved    P 2downsolved  C ---- |S| 6upsolved 
6.17   10:27:49.298 doCliqDownSolve    0 | upsolved    P 2downsolved  C ---- |S| 5downsolved 
4.18   10:27:50.247 exitStateMachine   0 | downsolved  P 1downsolved  C ---- |S| 2downsolved 3downsolved 
2.24   10:27:50.251 exitStateMachine   0 | downsolved  P 1downsolved  C 5downsolved 6downsolved |S| 3downsolved 4downsolved 
5.18   10:27:50.257 exitStateMachine   0 | downsolved  P 2downsolved  C ---- |S| 6downsolved 
6.18   10:27:50.348 exitStateMachine   0 | downsolved  P 2downsolved  C ---- |S| 5downsolved 

@dehann
Copy link
Member Author

dehann commented Aug 20, 2020

The most pressing developer tool here is to better manage how CSM terminates on error. The two options I have come up with are either (sure there are more options available):

  • Propagate the error all the way (less desirable)
  • Build a dedicated error state in CSM which records and closes out a CSM on error and then notify other CSMs to stop as well.

see #758

@dehann
Copy link
Member Author

dehann commented Aug 20, 2020

reminder that the revamped getSolverParams(fg).useMsgLikelihoods = true is default false at this time and will be made default true in future.

@dehann
Copy link
Member Author

dehann commented Aug 20, 2020

Another trick that doesn't come up too often but none the less: getSolverParams(fg).multiproc = false -- is default true. Similarly, you can skip up or down stages by getSolverParams(fg).dodownsolve = false. All the parameter options are:

getSolverParams(fg) |> typeof |> fieldnames

@dehann
Copy link
Member Author

dehann commented Aug 20, 2020

Also worth stating again here, the purpose of recordcliqs is to allow repeat steps:

# either from hist4 = hist[4] or
hist4 = fetch(cliqtask4)

# the next function that would have been called at step 11
fnc11! = hist4[11][3]
csmc11_ = deepcopy(hist4[11][4])

# must reset the condition
getSolveCondition(csmc11_.cliq) = Condition()

# now redo that step (likely with debugger)
fnc12_ = fnc11!(csmc11_)

# note the csmc11_ object will now be changed after fnc11!
## The design intends
fnc13_ = fnc12_(csmc11_)

# ...

@dehann
Copy link
Member Author

dehann commented Aug 29, 2020

added a new tool limittercliqs::Vector{Pair{Symbol, Int}} that allows early stopping of any specific CSMs per user desire, making it much easier to debug how CSMs work:

tree, smt, hist = solveTree!(fg, recordcliqs=[:x0;], limititercliqs=[(:x0=>2);])

@dehann
Copy link
Member Author

dehann commented Sep 10, 2020

@dehann
Copy link
Member Author

dehann commented Sep 11, 2020

Commits 426e98f , 1dbe967 , 030d75b add a new swim lane type CSM printout:

julia> IIF.printCSMHistoryLogical(hists)
    |     x0             |     x4             |     l1             |     x1             |     x5             |     x3             
----+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------
1   | 1   testCliqC null | 10  testCliqC null | 21  testCliqC null | 32  testCliqC null | 43  testCliqC null | 54  testCliqC null 
2   | 2   testCliqC null | 11  testCliqC null | 22  testCliqC null | 33  testCliqC null | 44  testCliqC null | 55  testCliqC null 
3   | 3   isCliqUpS null | 12  isCliqUpS null | 23  isCliqUpS null | 34  isCliqUpS null | 45  isCliqUpS null | 56  isCliqUpS null 
4   | 4   buildCliq null | 13  buildCliq null | 24  buildCliq null | 35  buildCliq null | 46  buildCliq null | 57  buildCliq null 
5   | 5   canCliqMa null | 14  canCliqMa null | 25  canCliqMa null | 36  canCliqMa null | 47  canCliqMa null | 58  canCliqMa null 
6   | 6   blockUnti null | 15  blockUnti null | 26  blockUnti null | 37  blockUnti null | 48  blockUnti null | 59  blockUnti null 
7   | 7   trafficRe null | 16  trafficRe null | 27  trafficRe null | 38  trafficRe null | 49  trafficRe null | 60  trafficRe null 
8   | 8   determine null | 17  checkIfCl null | 28  checkIfCl null | 39  checkIfCl null | 50  checkIfCl null | 61  checkIfCl null 
9   | 9   childCliq null | 18  blockCliq null | 29  blockCliq null | 40  blockCliq null | 51  blockCliq null | 62  blockCliq null 
10  | 103 trafficRe null | 19  determine null | 30  determine null | 41  determine null | 52  determine null | 63  determine null 
11  | 104 determine null | 20  childCliq null | 31  towardUpO null | 42  towardUpO null | 53  towardUpO null | 64  towardUpO null 
12  | 105 towardUpO null | 85  trafficRe null | 65  attemptCl null | 68  attemptCl null | 71  attemptCl null | 74  attemptCl null 
...

@dehann
Copy link
Member Author

dehann commented Sep 13, 2020

as of FSM v0.2.8 a new injectDelayBefore as well as a watchdog timeout features have been added to help with the debugging process. Note that watchdog timeout is not really meant for mainstream use in CSM just yet, more to help development when CSMs don't terminate by themselves.

@dehann
Copy link
Member Author

dehann commented Sep 18, 2020

Just copying a recent example here for the record:

using Caesar, RoME


fg = generateCanonicalFG_Hexagonal(graphinit=false)

getSolverParams(fg).treeinit = true
getSolverParams(fg).graphinit = false
getSolverParams(fg).limititers = 100

getSolverParams(fg).drawtree = true
getSolverParams(fg).showtree = true
getSolverParams(fg).dbg = true
getSolverParams(fg).async = true


# limitcliqs = [:x0=>8;:x4=>12;:l1=>21;:x1=>21;:x5=>50;:x3=>50] # breaks
# limitcliqs = [:x0=>8;:x4=>13;:l1=>21;:x1=>21;:x5=>60;:x3=>60] # 50 # doesnt break, blocks

# injectDelayBefore=[2=>(canCliqMargRecycle_StateMachine=>5); ] # step 8
# injectDelayBefore=[5=>(canCliqMargRecycle_StateMachine=>5); ]
# injectDelayBefore=[6=>(towardUpOrDwnSolve_StateMachine=>10); ]
# injectDelayBefore = nothing


mkpath(getLogPath(fg))
verbosefid = open(joinLogPath(fg, "csmVerbose.log"),"w")
# verbosefid = stdout
tree, smt, hists = solveTree!(fg, recordcliqs=ls(fg), verbose=true, verbosefid=verbosefid, timeout=50 ) #, timeout=40 , injectDelayBefore=injectDelayBefore ) #, limititercliqs=limitcliqs);

flush(verbosefid)
close(verbosefid)
# for .async = true (because .drawTree=true)
smt[7] |> x->schedule(x, InterruptException(), error=true)

open(joinLogPath(fg, "csmLogicalReconstructMax.log"),"w") do io
  IIF.reconstructCSMHistoryLogical(getLogPath(fg), fid=io)
end

# async case
fetchCliqHistoryAll!(smt, hists)

open(joinLogPath(fg, "csmSequ.log"),"w") do fid
  printCliqHistorySequential(hists, nothing, fid)
end

open(joinLogPath(fg, "csmLogi.log"),"w") do fid
  printCSMHistoryLogical(hists, fid)
end

# printCliqHistorySequential(hists)
# printCliqHistorySequential(hists, 1=>10)
# printCliqHistorySequential(hists, [1,4,6]=>11:15)
# printCliqHistorySequential(hists, [1=>9:16; 2=>20:34; 4=>29:34])
# printCliqHistorySequential(hists, [5=>12:21;6=>12:21])

printCSMHistoryLogical(hists)

# also see dbg logs at this path for more info
# @show getLogPath(fg)

using Images
csmAnimateSideBySide(tree, hists, encode=true, nvenc=true, show=true)

@Affie
Copy link
Member

Affie commented Sep 28, 2020

Enable clique history recording and provide smtasks, then retrieve with fetchCliqHistoryAll! even if an exception is thrown.

smtasks = Task[]
solveTree!(fg; smtasks, verbose=true, timeout=10, recordcliqs=ls(fg));
hists = fetchCliqHistoryAll!(smtasks);

We can automate it further in the future.

@dehann
Copy link
Member Author

dehann commented Sep 29, 2020

Will the internal fetchCliqHistoryAll! work in general or should we definitely call it after a solve?
See it is commented out:

# assignTreeHistory!(tree, cliqHistories)

@Affie
Copy link
Member

Affie commented Sep 29, 2020

Will the internal fetchCliqHistoryAll! work in general or should we definitely call it after a solve?

For now, it has to be called after the solve with smtasks passed in by reference. I was thinking to still return it if dbg is on.
So it would not throw an error but @error and return the normal tree, smtasks, hists

I don't know about the commented out call.

@Affie
Copy link
Member

Affie commented Oct 27, 2020

Everything logged with logCSM as debug level can be displayed on the console by enabling debug printing on the clique, for example, enable CSM 2 with ENV["JULIA_DEBUG"] = :csm_2

@Affie
Copy link
Member

Affie commented Feb 25, 2021

Another usefull workflow for debugging CSM freezes is:

getSolverParams(fg).async = true
smtasks = Task[]
solveTree!(fg; smtasks)

Then if it hangs call:

IIF.throwIntExcToAllTasks(smtasks)

@Affie
Copy link
Member

Affie commented Feb 25, 2021

getSolverParams(fg).dbg=true also saves the csmc and statemachine in task storage.
it is saved in:
smtasks[i].storage[:csmc]
smtasks[i].storage[:statemachine]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants