Description
In Coffea, we would like a TTree/RNTuple attribute to keep track of the loaded branches similar to how we can track the number of requested bytes from a file:
uproot5/src/uproot/source/chunk.py
Lines 45 to 54 in b684d67
```python
@dataclasses.dataclass
class SourcePerformanceCounters:
    """Container for performance counters"""

    num_requested_bytes: int
    num_requests: int
    num_requested_chunks: int

    def asdict(self) -> dict[str, int]:
        return dataclasses.asdict(self)
```
This branch deserialization log should be ordered by the time at which the branches were deserialized, and it should contain duplicates: if a branch is deserialized twice, it should appear twice in the log. However, I don't think a branch should be logged again when it is served from the uproot cache. Overall, the log should contain, in order, every branch that uproot actually spent time deserializing. The simplest option, I believe, would be a list of strings. Alternatively, it could be a richer object that also tracks, for example, the number of bytes of each deserialization and any other extra info you may want.
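
To make the request concrete, here is a minimal sketch of what such a log could look like. Everything in it is hypothetical and not part of uproot: the names `BranchDeserializationRecord`, `BranchDeserializationLog`, and `read_branch` are made up for illustration, the cache is a plain `dict`, and the "deserialization" is a stand-in that just builds some bytes. It only illustrates the intended semantics: entries in time order, duplicates on repeated deserialization, and no entry on a cache hit.

```python
from __future__ import annotations

import dataclasses


@dataclasses.dataclass
class BranchDeserializationRecord:
    """One deserialization event (hypothetical; not an uproot class)."""

    branch_name: str
    num_bytes: int | None = None  # optional extra info per event


@dataclasses.dataclass
class BranchDeserializationLog:
    """Time-ordered log of deserialized branches (hypothetical)."""

    records: list[BranchDeserializationRecord] = dataclasses.field(
        default_factory=list
    )

    def record(self, branch_name: str, num_bytes: int | None = None) -> None:
        # Append-only, so the list order is the deserialization order.
        self.records.append(BranchDeserializationRecord(branch_name, num_bytes))

    def branch_names(self) -> list[str]:
        # The "simplest thing": a plain list of strings, duplicates included.
        return [r.branch_name for r in self.records]


def read_branch(name: str, cache: dict[str, bytes], log: BranchDeserializationLog) -> bytes:
    """Stand-in for reading a branch; not how uproot reads data."""
    if name in cache:
        # Cache hit: no time spent deserializing, so nothing is logged.
        return cache[name]
    data = f"deserialized:{name}".encode()  # placeholder for real work
    log.record(name, num_bytes=len(data))
    cache[name] = data
    return data
```

With this sketch, reading `"Muon_pt"`, then `"Jet_pt"`, then `"Muon_pt"` again with a warm cache leaves two entries in the log. Clearing the cache and reading `"Muon_pt"` once more adds a third entry, a duplicate, because the branch really was deserialized a second time.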