runtime failed with core dumped #59

@ZJLi2013

Problem Description

  • env:
GPU Marketing Name:          AMD Radeon Graphics
GPU Name:                    amdgcn-amd-amdhsa--gfx942:sramecc+:xnack-
base image: rocm/pytorch-private:vllm0.4.3_06_21_alibaba_fixwvsplitk_ll2_train_csrikris_06_25
  • runtime steps
chmod +x /usr/local/lib/runTracer.sh
cd rocpd_python/
python3 -m pip install .
runTracer.sh python3 /workspace/vllm/benchmarks/benchmark_latency.py --model /workspace/Qwen2-7B/ --input-len 32 --output-len 128  --batch-size 16 --dtype float16 --enforce-eager --num-iters-warmup 2 --num-iters 10 
  • output errors:
...
rocpd_op: 0
rocpd_api_ops: 0
rocpd_kernelapi: 0
rocpd_copyapi: 0
rocpd_api: 0
rocpd_string: 0
rpd_tracer: finalized in 1221.910687 ms
double free or corruption (!prev)
/usr/local/bin/runTracer.sh: line 43:  5233 Aborted                 (core dumped) LD_PRELOAD=librpd_tracer.so "$@"

The run aborts with a core dump ("double free or corruption (!prev)") right after rpd_tracer reports that it finalized. Can someone help verify? Thanks a lot.
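
To help narrow this down, here is a minimal sketch for checking whether anything was written to the trace before the abort. It assumes the tracer's output is an SQLite file and that it is named trace.rpd in the working directory; both are assumptions on my side, as is reading the figures in the log above as row counts. The table names are copied from that log, and count_rows is a hypothetical helper, not part of rocpd_python.

```python
import os
import sqlite3
import sys

# Table names copied from the tracer's finalize log in this report.
TABLES = [
    "rocpd_op",
    "rocpd_api_ops",
    "rocpd_kernelapi",
    "rocpd_copyapi",
    "rocpd_api",
    "rocpd_string",
]


def count_rows(db_path, tables=TABLES):
    """Return {name: row_count}; the value is None if the table/view is absent."""
    counts = {}
    conn = sqlite3.connect(db_path)
    try:
        # Collect the tables and views that actually exist in the file.
        existing = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
            )
        }
        for name in tables:
            if name in existing:
                query = f'SELECT COUNT(*) FROM "{name}"'
                counts[name] = conn.execute(query).fetchone()[0]
            else:
                counts[name] = None
    finally:
        conn.close()
    return counts


if __name__ == "__main__":
    # "trace.rpd" is an assumed default output name; pass the real path if it differs.
    path = sys.argv[1] if len(sys.argv) > 1 else "trace.rpd"
    if os.path.isfile(path):
        for name, n in count_rows(path).items():
            print(f"{name}: {n}")
    else:
        print(f"no trace file found at {path}")
```

If every table comes back as 0, the workload probably was not traced at all, which would point at the preload/setup step and not only at the crash during shutdown.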

Operating System

Ubuntu 20.04.6 LTS (Focal Fossa)

CPU

Intel(R) Xeon(R) Platinum 8480C

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.1.0

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
