**Describe the feature**
- Support OpenCLIP embeddings so that text and images are embedded in a single shared vector space, enabling cross-modal search (for example, retrieving images from a text query).

**Motivation and use case**
- Multi-modal search

**Additional context**
- To be filled
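
To illustrate the requested behavior, here is a minimal sketch of cross-modal search over a shared embedding space. It assumes the text and image vectors have already been produced by the same CLIP-style model; the vectors below are made-up toy values rather than real OpenCLIP outputs, and the names `l2_normalize`, `cross_modal_search`, and `image_index` are hypothetical, not part of any existing API.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cross_modal_search(
    query_vec: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank indexed items by cosine similarity to the query embedding.

    The query and the indexed items may come from different modalities
    (e.g. a text query against image embeddings), as long as both were
    embedded into the same vector space.
    """
    q = l2_normalize(query_vec)
    scored = []
    for item_id, vec in index.items():
        v = l2_normalize(vec)
        scored.append((item_id, sum(a * b for a, b in zip(q, v))))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy image embeddings (assumption: stand-ins for image-encoder outputs).
image_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Toy text embedding (assumption: stand-in for a text-encoder output).
text_query_vec = [0.8, 0.2, 0.1]

results = cross_modal_search(text_query_vec, image_index, top_k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

Because both modalities share one space, the same function would also cover image-to-text or image-to-image lookup by swapping which embeddings serve as the query and which as the index.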