End-to-End Email Spam Classifier

Pending

Budget: USD 250–750 | Client: Unknown | Posted: 16d ago | Status: new

Required Skills

Python, Machine Learning (ML), Data Science, FastAPI, Natural Language Processing

Project Description

I will hand you a two-column CSV of real email bodies and their spam/ham labels. From that single file I need a full production-ready pipeline:

• Text preprocessing that cleans each message, tokenises it and converts it to TF-IDF vectors.
• Training loops for Logistic Regression, Multinomial Naïve Bayes and linear-kernel SVM, with code that automatically picks the best performer.
• A FastAPI service exposing /predict so any caller can POST raw email text and receive the predicted class plus a probability score.
• A clear evaluation notebook or script that prints accuracy, F1, ROC-AUC and a confusion matrix so I can verify performance.
• Repository structured for easy hand-off, including requirements.txt and a concise README explaining setup, training and API usage.

I would appreciate (but do not strictly require) a small Streamlit dashboard that lets me paste an email and see the prediction in real time; feel free to propose how you would add that.

Acceptance is based on:

1. Re-running train.py on my machine reproduces the best model.
2. Calling /predict with sample emails returns correct spam/ham flags with confidences.
3. metrics_report.txt (or notebook) matches or exceeds the scores you quote.
4. Project installs cleanly inside a fresh virtual environment with one pip install -r requirements.txt.

If any clarifications around the email domain data are needed, just ask; otherwise please outline your approach and timeline and we can get started right away.
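For orientation, the TF-IDF and model-selection steps described above might be sketched roughly as follows. This is only an illustration under several assumptions that the posting does not state: scikit-learn as the library, cross-validated F1 as the selection criterion, a calibration wrapper so the linear SVM can return probabilities, and a tiny made-up dataset standing in for the real CSV. The names `select_best`, `make_candidates`, `texts` and `labels` are hypothetical.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up toy data standing in for the two-column CSV (1 = spam, 0 = ham).
texts = [
    "win a free prize now", "cheap loans approved instantly",
    "claim your free gift card", "exclusive offer click here",
    "you won the lottery claim cash", "free money no risk guaranteed",
    "meeting moved to three pm", "please review the attached report",
    "lunch tomorrow with the team", "notes from the project call",
    "can you send the invoice", "see you at the office monday",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def make_candidates():
    """The three model families named in the posting (settings are assumptions)."""
    return {
        "logreg": LogisticRegression(max_iter=1000),
        "naive_bayes": MultinomialNB(),
        # LinearSVC has no predict_proba, so it is wrapped in a calibrator.
        "linear_svm": CalibratedClassifierCV(LinearSVC(random_state=0), cv=2),
    }


def select_best(texts, labels, cv=3):
    """Score each candidate by mean cross-validated F1 and refit the winner."""
    candidates = make_candidates()
    scores = {}
    for name, clf in candidates.items():
        pipe = make_pipeline(TfidfVectorizer(lowercase=True), clf)
        scores[name] = cross_val_score(
            pipe, texts, labels, cv=cv, scoring="f1"
        ).mean()
    best_name = max(scores, key=scores.get)
    best_model = make_pipeline(
        TfidfVectorizer(lowercase=True), candidates[best_name]
    )
    best_model.fit(texts, labels)
    return best_name, best_model, scores


best_name, model, scores = select_best(texts, labels)
# Column 1 of predict_proba corresponds to class 1 (spam).
spam_probability = model.predict_proba(["free prize, click now"])[0, 1]
print(best_name, scores, round(float(spam_probability), 3))
```

Keeping the vectoriser and classifier in one pipeline means a /predict endpoint could pass raw text straight to the fitted object, and fixed seeds plus unshuffled folds are one way to support the reproducibility criterion.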
