The component must:
- accept a GpuMat, CuPy array, or Torch GPU tensor as a batched input;
- build and load model engines;
- apply an image preprocessor to the input batch;
- apply an output postprocessor to the raw model output.
This is beneficial for complex inference scenarios where we want to run inference on historical data already resident in GPU memory rather than on streaming data.
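
Below is a minimal sketch of how such a component could be shaped. The class and callback names (`GPUInference`, `preprocess`, `postprocess`) are illustrative assumptions rather than an existing API, and a TorchScript module stands in for a real model engine; the point is only to show a batched GPU tensor entering, a preprocessor and postprocessor wrapping the engine call, and no round trip through host memory.

```python
# Hypothetical sketch of the interface described above; names are
# assumptions for illustration, not an existing API.
from typing import Callable, Union

import cupy as cp
import torch

# A batched GPU input may arrive as a Torch CUDA tensor or a CuPy array.
# (A cv2.cuda.GpuMat would first be wrapped into one of these forms.)
GpuBatch = Union[torch.Tensor, cp.ndarray]


class GPUInference:
    """Runs a model engine on batches that already live in GPU memory."""

    def __init__(
        self,
        engine_path: str,
        preprocess: Callable[[torch.Tensor], torch.Tensor],
        postprocess: Callable[[torch.Tensor], object],
    ):
        # Build/load the model engine; a TorchScript module is used here
        # only as a stand-in for whatever engine format the component uses.
        self.engine = torch.jit.load(engine_path).cuda().eval()
        self.preprocess = preprocess
        self.postprocess = postprocess

    def __call__(self, batch: GpuBatch) -> object:
        # Normalize the input to a Torch CUDA tensor without copying through
        # host memory (CuPy arrays expose __cuda_array_interface__).
        if isinstance(batch, cp.ndarray):
            batch = torch.as_tensor(batch, device="cuda")
        # Image preprocessing (e.g. normalization) also runs on the GPU.
        batch = self.preprocess(batch)
        with torch.inference_mode():
            raw = self.engine(batch)
        # Output postprocessing turns raw tensors into user-level results.
        return self.postprocess(raw)
```

A caller holding historical frames as a CuPy or Torch batch on the GPU would then simply invoke `GPUInference(...)(batch)`, supplying its own preprocessing and postprocessing callables.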