Project Description
Title: Set up offline AI research and drafting tool on macOS (Ollama + Kotaemon + local RAG)
Project — "Legal X"
I am building a fully offline, zero-API-cost AI research and drafting assistant for legal work, called Legal X. It will run entirely on my M-series MacBook Air (16 GB unified memory, 10-core GPU, 512 GB storage), with no cloud dependency and no ongoing subscription cost.
The system needs to do three things over a private corpus of approximately 40,000 readable PDF and HTML files (~5 GB):
1. Answer research queries with numbered footnotes, citing the exact source file and highlighting the passage in the original document for verification.
2. Produce drafts grounded in retrieved source material, in a consistent style.
3. Support long-form writing (articles, book chapters) drawing from the same corpus as source material.
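To make the footnote requirement in item 1 concrete, here is a rough sketch of the output shape I have in mind. It is an illustration only, not Kotaemon's own format: the `Passage` class, its field names, and the sample file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    """One retrieved passage. All field names here are hypothetical."""
    source_file: str  # exact file the passage came from
    locator: str      # where in the file, e.g. "p. 3" or a section heading
    text: str         # the passage to highlight for verification


def render_answer(claims, passages):
    """Render sentences with numbered footnotes that cite source passages.

    `claims` is a list of (sentence, passage_index) pairs. Footnote numbers
    are assigned in order of first use, and a passage cited twice keeps
    the same number.
    """
    numbers = {}  # passage index -> footnote number
    parts = []
    for sentence, idx in claims:
        n = numbers.setdefault(idx, len(numbers) + 1)
        parts.append(f"{sentence} [{n}]")
    notes = [
        f'[{n}] {passages[idx].source_file} ({passages[idx].locator}): '
        f'"{passages[idx].text}"'
        for idx, n in numbers.items()
    ]
    return " ".join(parts) + "\n\n" + "\n".join(notes)


# Made-up sample data, for illustration only.
sample = [
    Passage("contracts/lease_2019.pdf", "p. 3", "The tenant shall..."),
    Passage("notes/summary.html", "section 2", "Notice must be given..."),
]
print(render_answer([("Notice is required.", 1), ("The tenant is bound.", 0)], sample))
```

The point is that every footnote resolves to one file and one passage that I can open and check.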
I need a freelancer to set this up remotely on my machine. I will provide remote access (AnyDesk / TeamViewer / Chrome Remote Desktop) and full cooperation throughout.
Scope of work
1. Install and configure Ollama. Pull and verify three models: qwen2.5:14b, mistral, nomic-embed-text. Confirm Metal GPU acceleration is active.
2. Install Kotaemon using the run_macos.sh installer from the official GitHub release. Change the default credentials.
3. Connect Kotaemon to local Ollama: register Qwen 2.5 14B as primary LLM, Mistral as secondary, and nomic-embed-text as the embedding model. Configure and test hybrid (BM25 + vector) search.
4. Create a dedicated project workspace inside Kotaemon. Run a sanity-test index on 50 sample files and verify highlighted-passage citations work end to end.
5. Run full indexing of the ~40,000-file corpus (PDF + HTML). Confirm that indexing completes successfully and that the index is stored on disk.
6. Tune retrieval settings — chunk size, top-K, hybrid weighting. Run 10 test queries I will provide and confirm at least 8/10 produce accurate footnoted answers with verifiable source highlighting.
7. Set up a preferences/style file inside the index (content I will provide) so outputs reflect my drafting style and citation format.
8. Pull and register deepseek-r1:14b as a third selectable LLM.
9. Write a short handover document: how to start and stop the system, how to add new files, how to re-index, how to switch models, common troubleshooting.
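As an example of the kind of check I would like to see for steps 1 and 8, here is a minimal sketch that asks the local Ollama server which models are installed and how much of each loaded model sits in GPU memory. It assumes Ollama is listening on its default address and that its `/api/tags` and `/api/ps` endpoints return `name`, `size` and `size_vram` fields; treating `size_vram` equal to `size` as "fully on the GPU" is my assumption, not a guarantee. The helper names are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address (assumed unchanged)

# Models named in the scope of work (steps 1 and 8).
REQUIRED_MODELS = ["qwen2.5:14b", "mistral", "nomic-embed-text", "deepseek-r1:14b"]


def normalize(name):
    """Ollama lists untagged pulls as 'name:latest'; make both forms comparable."""
    return name if ":" in name else f"{name}:latest"


def missing_models(required, installed):
    """Return the required models that do not appear in the installed list."""
    have = {normalize(n) for n in installed}
    return [m for m in required if normalize(m) not in have]


def gpu_fraction(entry):
    """Share of a loaded model reported as held in GPU memory (0.0 to 1.0)."""
    size = entry.get("size", 0)
    return entry.get("size_vram", 0) / size if size else 0.0


def fetch_json(path):
    """GET a JSON document from the local Ollama server."""
    with urllib.request.urlopen(OLLAMA_URL + path, timeout=10) as resp:
        return json.load(resp)


def main():
    installed = [m["name"] for m in fetch_json("/api/tags").get("models", [])]
    print("missing models:", missing_models(REQUIRED_MODELS, installed) or "none")
    # /api/ps only lists models that are currently loaded, so run a prompt first.
    for entry in fetch_json("/api/ps").get("models", []):
        print(f"{entry['name']}: {gpu_fraction(entry):.0%} in GPU memory")
```

Call `main()` with Ollama running and at least one model loaded. A similar check, or the equivalent terminal commands, would fit well in the handover document.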
Out of scope
Cross-session memory layer (Letta / MemGPT) — to be added later in a separate engagement.
Deliverables
- A fully working Legal X installation on my machine, with all three use cases tested.
- A plain-English handover document (Word or Markdown).
- One 30-minute video walkthrough at the end of the engagement.
Requirements
- Hands-on experience with Ollama on Apple Silicon.
- Prior Kotaemon, AnythingLLM, LangChain, or LlamaIndex deployment experience — please cite a specific past project.
- Comfort with macOS Terminal, Python virtual environments, and remote-access setup.
What I provide
- Remote access to the machine throughout.
- The full corpus, already on disk.
- The 10 test queries.
- The preferences file content.
- Prompt responses on Slack/WhatsApp during your working hours.
Budget and bidding
Fixed-price preferred. Please quote your price, expected total hours, and your timezone. Do not bid if you have not deployed a local RAG system end to end before — I will ask for proof.
To apply
In your first message, tell me:
1. One prior local-RAG project you built, and what stack.
2. Your plan for verifying Metal GPU acceleration on Ollama.
3. How you would handle a failed embedding step mid-indexing without restarting from zero.
Generic AI-written pitches will be ignored.