veureu-svision 🦎 Process images and videos to generate descriptions, face embeddings, and scene cuts