
Optimize inner policy maps with BPF_F_NO_PREALLOC to reduce memory usage #4249

@kyledong-suse

Description

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem?

As mentioned in #4191, while testing Tetragon’s policy filter implementation (pkg/policyfilter/map.go), I noticed that the inner per-policy maps (policy_%d_map) are currently created with a fixed size (32768) and the default preallocated hash-map mode:

// newPolicyMap adds and initializes a new policy map
func (m PfMap) newPolicyMap(polID PolicyID, cgIDs []CgroupID) (polMap, error) {
	name := fmt.Sprintf("policy_%d_map", polID)
	innerSpec := &ebpf.MapSpec{
		Name:       name,
		Type:       ebpf.Hash,
		KeySize:    uint32(unsafe.Sizeof(CgroupID(0))),
		ValueSize:  uint32(1),
		MaxEntries: uint32(polMapSize), // currently const = 32768
	}
	...
}

This causes significant memory preallocation even when the number of tracked cgroups per policy is small.

I tested using the following tracing policy:

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "policy-1"
spec:
  podSelector:
    matchLabels:
      app: "ubuntu"
  kprobes:
  - call: "security_bprm_creds_for_exec"
    syscall: false
    args:
    - index: 0
      type: "linux_binprm"
    selectors:
    - matchArgs:
      - index: 0
        operator: "NotEqual"
        values:
        - "/usr/bin/sleep"
        - "/usr/bin/cat"
        - "/usr/bin/my-server-1"
      matchActions:
      - action: Override
        argError: -1
  options:
  - name: disable-kprobe-multi
    value: "1"

After applying the policy, the resulting map allocation was:

278096: hash  name policy_1_map  flags 0x0
	key 8B  value 1B  max_entries 32768  memlock 2622752B
	pids tetragon(1208075)

This corresponds to ~2.6 MB of memory (roughly 80 bytes per entry) preallocated for a single inner map, which is excessive given that only a handful of cgroups are typically tracked per policy.

Describe the feature you would like

We want these inner maps to allocate memory only for entries that are actually inserted, rather than preallocating the full table, reducing the per-policy memory footprint.

Describe your proposed solution

I wonder if we can enable BPF_F_NO_PREALLOC for these inner maps, similar to how Tetragon already handles certain maps in https://github.com/cilium/tetragon/blob/main/pkg/sensors/tracing/selectors.go.

This change ensures that each inner policy map (policy_%d_map) uses lazy allocation instead of preallocating all hash-table elements upfront. It will not affect outer map creation or map-of-maps semantics.
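
For reference, a minimal sketch of the proposed spec, assuming the surrounding package types (PolicyID, CgroupID, polMapSize) from pkg/policyfilter and that unix is golang.org/x/sys/unix; the helper name newPolicyMapSpec is hypothetical, and only the Flags field differs from the current code:

package policyfilter

import (
	"fmt"
	"unsafe"

	"github.com/cilium/ebpf"
	"golang.org/x/sys/unix"
)

// newPolicyMapSpec builds the inner map spec. It is identical to the
// current one except for the Flags field, which requests lazy per-entry
// allocation instead of preallocating all polMapSize elements upfront.
func newPolicyMapSpec(polID PolicyID) *ebpf.MapSpec {
	return &ebpf.MapSpec{
		Name:       fmt.Sprintf("policy_%d_map", polID),
		Type:       ebpf.Hash,
		KeySize:    uint32(unsafe.Sizeof(CgroupID(0))),
		ValueSize:  uint32(1),
		MaxEntries: uint32(polMapSize),      // still 32768
		Flags:      unix.BPF_F_NO_PREALLOC, // allocate elements on insert
	}
}

If the outer map's inner-map template spec is defined separately, it would presumably need the same flag, since the kernel compares inner-map attributes against the template when maps are inserted.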

With this optimization, memory usage improves significantly. For example, with the same configuration and tracing policy:

279555: hash  name policy_1_map  flags 0x1
	key 8B  value 1B  max_entries 32768  memlock 525312B
	pids tetragon(1350594)

The flags 0x1 output confirms that BPF_F_NO_PREALLOC (0x1) is set. This represents only ~0.5 MB of memory, compared to ~2.6 MB without the flag, an approximately 80% reduction in memory preallocation per policy map in this example.
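
If helpful, the flag could also be asserted in a unit test. A minimal sketch with the same imports as the sketch above, assuming inner is the *ebpf.Map created from the new spec (the verifyNoPrealloc helper is illustrative, not existing Tetragon code):

// verifyNoPrealloc checks that BPF_F_NO_PREALLOC is set on a created map.
func verifyNoPrealloc(inner *ebpf.Map) error {
	info, err := inner.Info()
	if err != nil {
		return fmt.Errorf("map info: %w", err)
	}
	if info.Flags&unix.BPF_F_NO_PREALLOC == 0 {
		return fmt.Errorf("expected BPF_F_NO_PREALLOC on map %s", info.Name)
	}
	return nil
}

This corresponds to the flags 0x1 value reported by bpftool above.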

Code of Conduct

  • I agree to follow this project's Code of Conduct
