Update for DynamicPPL 0.39 #2715
base: breaking
Conversation
Turing.jl documentation for PR #2715 is available at:
Codecov Report: ❌ Patch coverage is

@@            Coverage Diff             @@
##           breaking    #2715    +/-   ##
============================================
- Coverage     86.33%   85.44%    -0.90%
============================================
  Files            21       21
  Lines          1383     1257     -126
============================================
- Hits           1194     1074     -120
+ Misses          189      183       -6
```julia
######################
# Default Transition #
######################
```
All this stuff has basically been upstreamed to DynamicPPL and/or AbstractMCMC.
```julia
accs = DynamicPPL.AccumulatorTuple((
    DynamicPPL.ValuesAsInModelAccumulator(true),
    DynamicPPL.LogPriorAccumulator(),
    DynamicPPL.LogLikelihoodAccumulator(),
))
vi = DynamicPPL.OnlyAccsVarInfo(accs)
_, vi = DynamicPPL.init!!(rng, model, vi, DynamicPPL.InitFromPrior())
return DynamicPPL.ParamsWithStats(vi), nothing
```
Actually quite neat that the whole Prior sampler is just defined with DynamicPPL stuff now.
As has been the case in past PRs of this sort, this file provides a gentle introduction to the kinds of changes being made.
Generally, the current status is this: MCMC states often bundle a varinfo, not because it needs to be an accurate varinfo, but more as a 'home' to unflatten a vector of parameters into (see #2642). The logp is usually not updated, because the only thing the next iteration needs is to be able to call `vi[:]`.
This PR generally attempts to remove these varinfos from states, and only ever store the parameter vector + the LDF. Often the only reason we carried around a varinfo was so that we could re-evaluate with ValuesAsInModelAcc. However, because ParamsWithStats now has a method that takes the vector + LDF and returns the values-as-in-model, we can use that without needing a varinfo.
... That's the ideal, at least ...
The reality is that most samplers still need to carry around a varinfo, specifically so that samplers can be used inside Gibbs. (DynamicHMCExt doesn't need to, because it's not 'Gibbs-enabled'.) This suggests that a potential, and immediate, way of decoupling varinfo from the individual samplers would be to have Gibbs handle this extra varinfo overhead (i.e. make gibbs store a (varinfo, state) tuple, rather than just state).
That's probably one for the (near-ish) future. For now, at least the scope of the varinfo argument has been reduced by quite a bit: it's no longer used in the actual AbstractMCMC.step implementations of most samplers.
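To make that target state concrete, here is a minimal sketch (hypothetical type and function names; only the fact that `ParamsWithStats` has a method taking a parameter vector plus an LDF comes from this PR, and the argument order here is an assumption):

```julia
using DynamicPPL

# Hypothetical sampler state: stores only the flat parameter vector, with no
# varinfo bundled in. The LDF lives alongside it in the sampler/model setup.
struct MySamplerState{V<:AbstractVector{<:Real}}
    θ::V
end

# Reconstruct the transition on demand from the vector and the
# LogDensityFunction; this stands in for the ParamsWithStats(vector, ldf)
# method described above (argument order assumed).
make_transition(state::MySamplerState, ldf) = DynamicPPL.ParamsWithStats(state.θ, ldf)
```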
```julia
n_walkers = _get_n_walkers(spl)
chains = map(1:n_walkers) do i
    this_walker_samples = [s[i] for s in samples]
    AbstractMCMC.bundle_samples(
        this_walker_samples, model, spl, state, chain_type; kwargs...
    )
end
```
This is probably more inefficient than the old code, but I am not particularly fussed since chain construction isn't a bottleneck, and it's also way cleaner.
| """ | ||
| set_namedtuple!(vi::VarInfo, nt::NamedTuple) | ||
| Places the values of a `NamedTuple` into the relevant places of a `VarInfo`. | ||
| """ | ||
| function set_namedtuple!(vi::DynamicPPL.VarInfoOrThreadSafeVarInfo, nt::NamedTuple) | ||
| for (n, vals) in pairs(nt) | ||
| vns = vi.metadata[n].vns | ||
| if vals isa AbstractVector | ||
| vals = unvectorize(vals) | ||
| end | ||
| if length(vns) == 1 | ||
| # Only one variable, assign the values to it | ||
| DynamicPPL.setindex!(vi, vals, vns[1]) | ||
| else | ||
| # Spread the values across the variables | ||
| length(vns) == length(vals) || error("Unequal number of variables and values") | ||
| for (vn, val) in zip(vns, vals) | ||
| DynamicPPL.setindex!(vi, val, vn) | ||
| end | ||
| end | ||
| end | ||
| end |
This function is exactly the same as the one that's 'deleted' above. GitHub diffs being weird, sorry.
Unfortunately I had to do some surgery on the optimisation interface. I would have preferred to leave it for another time, but the optim interface frequently assumed that the LDF carried a varinfo with it.
I believe there is still a Mooncake failure on 1.12 with ADTypeCheckContext, but otherwise everything on CI should pass, unless I messed something up terribly. My suspicion is that it's a Mooncake issue, not Turing; however, I'll only look into this later.

Edit: Confirmed locally, the test passes on 1.11 and fails on 1.12. chalk-lab/Mooncake.jl#871
Still needs a changelog (also more bullet points for the changelog are welcome), but the code can be reviewed :)
mhauru
left a comment
Happy with the code, just needs the HISTORY.md entry. A few small questions.
Do I understand correctly that the old Transition and the new ParamsWithStats will (typically?) cause the same number of evaluations, though the latter may be a bit faster due to use of OnlyAccsVarInfo? So a small positive performance change would be expected from that?
| """ | ||
| throw(ArgumentError(msg)) | ||
| end | ||
| function get_gibbs_global_varinfo(context::GibbsContext) |
Any particular reason for this change other than code style?
NodeTrait is gone, so I had to rewrite it and I think this is just what I ended up with. Separating the GibbsContext method is optional though - would you rather keep that inside the AbstractParentContext method?
Ah, I was thinking of the refactor from if-else to method dispatch. There's something elegant about doing it with dispatch, but I sometimes find it more readable when there's just a single method with an if-else logic (that gets compiled away). Curious if you have a reason to prefer one. Regardless, happy to leave this as-is.
If the intent is for the compiler to optimise it away, then I think that method dispatch is a more direct way of expressing that. Is it a documented guarantee that the compiler will optimise `if x isa T` branches away?
In terms of style, I think I gravitate towards method dispatch because it's more declarative than imperative. Same reason why I prefer `return if foo; x; else y; end` over `if foo; return x; else return y; end`.
> Is it a documented guarantee that the compiler will optimise `if x isa T` branches away?
Good point, I think not. In practice I strongly expect it to happen in simple cases like these, but there's no guarantee.
I think "more declarative" is a more precise way of expressing what I meant by "more elegant". The downside is that you may have to read the code in a weird order, and the different declarations could be scattered all over your code base.
Anyway, good chat, but no code changes needed.
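For illustration, a minimal self-contained sketch of the two styles under discussion (hypothetical types, not the PR's actual Gibbs code):

```julia
abstract type AbstractCtx end
struct LeafCtx <: AbstractCtx end
struct ParentCtx <: AbstractCtx
    child::AbstractCtx
end

# Style 1: method dispatch -- declarative, but the definitions can end up
# scattered across the code base.
isleaf(::LeafCtx) = true
isleaf(::ParentCtx) = false

# Style 2: a single method with an isa-branch; in simple cases like this the
# compiler typically folds the branch away, though (as noted above) that is
# not a documented guarantee.
isleaf2(ctx::AbstractCtx) = ctx isa LeafCtx
```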
test/mcmc/is.jl (Outdated)

```julia
@test all(isone, chains[:x])
@test chains.logevidence ≈ -2 * log(2)
logevidence = log(mean(exp.(chains[:loglikelihood])))
```
Isn't this the same as the above `logsumexp(chain[:loglikelihood]) - log(N)`, but maybe less numerically stable and slower?
Yes, it's probably both less numerically stable and slower. But it's imo clearer, avoids pulling in an extra import, and doesn't cause an issue in the test.
I have a slight preference for the specialised function just because it's generally good practice to use it and the tests could lead by example. This is at disagreement level 1.5.
Haha to me it's a microoptimisation rather than good practice 😅
I'll change it but leave in a comment saying that "this is equivalent to .... but more numerically stable"
Seems it makes very little difference for speed, but just guards against over- and underflow:
```julia
julia> function f(x)
           display(logsumexp(x))
           display(@b logsumexp(x))
           display(log(sum(exp.(x))))
           display(@b log(sum(exp.(x))))
           return nothing
       end
f (generic function with 1 method)

julia> f(randn(10_000))
9.67801864521181
37.958 μs
9.678018645211806
35.291 μs (3 allocs: 96.062 KiB)

julia> f(randn(10_000)*1000)
3682.3316144326304
43.666 μs
Inf
77.375 μs (3 allocs: 96.062 KiB)
```
Yeah, but the chain isn't generating those values. Not a big deal, changed now.
Yup, that's right. The performance difference is probably not very big - in my opinion the nice thing is that the behaviour is encapsulated in DPPL.
Changelog added.

... Darn, forgot to mention the logevidence thing.
mhauru
left a comment
One optional addition to HISTORY.md, happy to approve.
I'm very excited for fast LDF to hit the streets and see people go screaming.
- your model has other kinds of parallelism but does not include tilde-statements inside;
- or you are using `MCMCThreads()` or `MCMCDistributed()` to sample multiple chains in parallel, but your model itself does not use parallelism.
Suggested addition: If your model does include parallelised tilde-statements or `@addlogprob!` calls, and you evaluate it/sample from it without setting `setthreadsafe(model, true)`, then you may get statistically incorrect results without any warnings or errors.
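For context, here is a sketch of the kind of model this entry covers (illustrative only: the model is made up, and `setthreadsafe` is taken from the suggested text above, so treat the exact API as an assumption):

```julia
using Turing

# Hypothetical model with a parallelised tilde-statement: observations are
# looped over on multiple threads inside the model body.
@model function para(y)
    m ~ Normal()
    Threads.@threads for i in eachindex(y)
        y[i] ~ Normal(m)
    end
end

# Per the entry above, such a model must opt in to thread-safe evaluation;
# `setthreadsafe` here mirrors the suggested changelog text (assumed API).
model = setthreadsafe(para(randn(100)), true)
```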
When sampling using MCMCChains, the chain object will no longer have its `chain.logevidence` field set. Instead, you can calculate this yourself from the log-likelihoods stored in the chain. For SMC samplers, the log-evidence of the entire trajectory is stored in `chain[:logevidence]` (which is the same for every particle in the 'chain').
Just to check, is this only for SMC or also for PG?
Having thought about this for 3 more seconds, this probably makes no sense for PG. Ignore me.
The main change in DPPL 0.39 is OnlyAccsVarInfo and faster evaluation. This PR uses fast evaluation in MCMC sampling where it can. MCMC sampling mostly works, as can be seen from the tests.

Of note: the chain object no longer has its `chain.logevidence` field set. The reason is that we now use the `bundle_samples` method in DynamicPPL, which has no way of reliably determining the log-evidence from the transition. If we wanted to fix this, we would have to add a `getlogevidence` function in AbstractMCMC. I personally don't consider this a problem. The reason why log-evidence used to be stored was because chains did not provide enough information for people to calculate it themselves (specifically, chains only stored logp, not loglikelihood). Now that `chn[:loglikelihood]` contains the log-likelihood, it should be OK for people to calculate this themselves.
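For illustration, one way to calculate it yourself (a minimal sketch; `loglikes` is a dummy stand-in for `vec(chn[:loglikelihood])` from an importance-sampling run):

```julia
using StatsFuns: logsumexp

# Dummy stand-in for vec(chn[:loglikelihood]).
loglikes = [-2.1, -1.9, -2.0, -2.2]

# log-evidence = log(mean(exp.(loglikes))), computed via logsumexp to guard
# against overflow/underflow (cf. the review discussion above).
logevidence = logsumexp(loglikes) - log(length(loglikes))
```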