Scalable LLM & RAG Deployment

Pending

Budget: USD 500–2000 | Client: Unknown | Posted: 18d ago | Status: new

Required Skills

Cloud Computing, PostgreSQL, DevOps, Large Language Model, AI Development, Retrieval-Augmented Generation (RAG)

Project Description

I need a solution for an LLM (selected by the client) plus RAG, deployed on our own server (recommended by the freelancer), that scales automatically to 1,000 or more simultaneous conversations. Instances/pods should be added and removed automatically to save costs. For now this covers only online dedicated servers/clouds; later it should become a hybrid of an on-premises GPU server plus online servers. Currently, additional information about users is stored only in PostgreSQL, and we want to give users the option to talk with the RAG data and the LLM. The system should also count usage and store in our database when each conversation started and finished. If there is a better recommended solution for talking with the data, I am open to it. In the future I would like to add sending voice to this server and getting voice back (in addition to text). Please share the price and time plan for text only and for text + voice, plus full support during testing and after going live, and documentation.
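To make the usage-tracking and autoscaling requirements more concrete, here is a minimal Python sketch, not a proposed implementation. It assumes a hypothetical `conversations` table and a hypothetical per-replica capacity figure; neither comes from the posting. To keep the example self-contained it uses an in-memory SQLite database as a stand-in for PostgreSQL. The replica calculation is the simple "load divided by capacity, rounded up and clamped" rule that autoscalers commonly apply, and all limits shown are illustrative.

```python
import math
import sqlite3
from datetime import datetime, timezone


def desired_replicas(active_conversations: int, per_replica_capacity: int,
                     min_replicas: int = 1, max_replicas: int = 50) -> int:
    """Replicas needed for the current load, clamped to [min, max].

    per_replica_capacity, min_replicas and max_replicas are assumed
    values for illustration; real figures depend on the chosen LLM
    and hardware.
    """
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    needed = math.ceil(active_conversations / per_replica_capacity)
    return max(min_replicas, min(max_replicas, needed))


class ConversationLog:
    """Records when each conversation starts and finishes, plus usage.

    Uses sqlite3 as a stand-in for PostgreSQL; the table and column
    names are hypothetical.
    """

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            """CREATE TABLE IF NOT EXISTS conversations (
                   id INTEGER PRIMARY KEY,
                   user_id INTEGER NOT NULL,
                   started_at TEXT NOT NULL,
                   finished_at TEXT,
                   tokens_used INTEGER NOT NULL DEFAULT 0
               )"""
        )

    def start(self, user_id: int) -> int:
        """Insert a new conversation row and return its id."""
        now = datetime.now(timezone.utc).isoformat()
        cur = self.conn.execute(
            "INSERT INTO conversations (user_id, started_at) VALUES (?, ?)",
            (user_id, now),
        )
        self.conn.commit()
        return cur.lastrowid

    def add_usage(self, conversation_id: int, tokens: int) -> None:
        """Add to the running usage counter for one conversation."""
        self.conn.execute(
            "UPDATE conversations SET tokens_used = tokens_used + ? WHERE id = ?",
            (tokens, conversation_id),
        )
        self.conn.commit()

    def finish(self, conversation_id: int) -> None:
        """Stamp the finish time once; repeated calls change nothing."""
        now = datetime.now(timezone.utc).isoformat()
        self.conn.execute(
            "UPDATE conversations SET finished_at = ? "
            "WHERE id = ? AND finished_at IS NULL",
            (now, conversation_id),
        )
        self.conn.commit()

    def active_count(self) -> int:
        """Number of conversations that have started but not finished."""
        row = self.conn.execute(
            "SELECT COUNT(*) FROM conversations WHERE finished_at IS NULL"
        ).fetchone()
        return row[0]


if __name__ == "__main__":
    log = ConversationLog(sqlite3.connect(":memory:"))
    first = log.start(user_id=1)
    log.start(user_id=2)
    log.add_usage(first, tokens=120)
    log.finish(first)
    # Feed the live conversation count into the scaling rule.
    print(desired_replicas(log.active_count(), per_replica_capacity=40))
```

In a real deployment the count of open conversations (or a similar metric such as queue depth or GPU utilization) would be exported to whatever autoscaler the freelancer recommends, rather than computed inside the application.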
