Douglas Hanley
llama : allow for user specified embedding pooling type (#5849)
LLM inference in C/C++
C++
master