TTree/RNTuple attribute to keep track of the loaded branches #1489

@ikrommyd

Description

In Coffea, we would like a TTree/RNTuple attribute that keeps track of the loaded branches, similar to how we can already track the number of bytes requested from a file:

@dataclasses.dataclass
class SourcePerformanceCounters:
    """Container for performance counters"""

    num_requested_bytes: int
    num_requests: int
    num_requested_chunks: int

    def asdict(self) -> dict[str, int]:
        return dataclasses.asdict(self)

This branch deserialization log should be ordered chronologically, in the order the branches were deserialized, and it should contain duplicates: if a branch is deserialized twice, it should appear in the log twice. However, I don't think a branch should be added again when it is served from the uproot cache. Overall, the log should list, in order, every branch that uproot actually spent time deserializing. The simplest option, I believe, would be a list of strings; alternatively, it could be a richer object that also tracks extra information, such as the number of bytes for each deserialization.
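To make the request concrete, here is a rough sketch of the behaviour I have in mind. Everything in it is hypothetical: `BranchAccess`, `BranchAccessLog`, `load_branch`, the plain `dict` cache, and the fake deserializer are made-up names for illustration and are not existing uproot API. The branch names are only examples.

```python
from __future__ import annotations

import dataclasses


@dataclasses.dataclass
class BranchAccess:
    """One deserialization event (hypothetical record type)."""

    branch_name: str
    num_bytes: int | None = None  # optional extra info per deserialization


@dataclasses.dataclass
class BranchAccessLog:
    """Ordered, duplicate-preserving log of deserialized branches (hypothetical)."""

    entries: list[BranchAccess] = dataclasses.field(default_factory=list)

    def record(self, branch_name: str, num_bytes: int | None = None) -> None:
        # Append-only, so the list order is the order of deserialization.
        self.entries.append(BranchAccess(branch_name, num_bytes))

    def names(self) -> list[str]:
        # The "simplest thing": a plain list of branch names.
        return [entry.branch_name for entry in self.entries]


def load_branch(name, cache, log, deserialize):
    """Return the data for `name`, logging only real deserializations."""
    if name in cache:
        return cache[name]  # cache hit: no work done, nothing logged
    data = deserialize(name)
    log.record(name, num_bytes=len(data))
    cache[name] = data
    return data


def fake_deserialize(name):
    # Stand-in for the real deserialization work; returns 8 dummy bytes.
    return bytes(8)


log = BranchAccessLog()
cache = {}
load_branch("Muon_pt", cache, log, fake_deserialize)
load_branch("Jet_pt", cache, log, fake_deserialize)
load_branch("Muon_pt", cache, log, fake_deserialize)  # cache hit, not logged
cache.clear()  # simulate the cached array being evicted
load_branch("Muon_pt", cache, log, fake_deserialize)  # deserialized again, logged again
print(log.names())  # ['Muon_pt', 'Jet_pt', 'Muon_pt']
```

The second `Muon_pt` access is skipped because it came from the cache, while the access after eviction is logged again, so the log reflects only the work that was actually done, in order.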

cc @pfackeldey @lgray

Labels

feature: New feature or request
