Project Description
I have a clean, ready-to-use dataset and now need a complete predictive analysis pipeline built around it. The job is straightforward: take my data, explore it, engineer the right features, train and tune an appropriate model, and hand back production-ready code that I can rerun whenever new records arrive.
Python is my preferred language, and I’m comfortable with common libraries such as pandas, scikit-learn, XGBoost, TensorFlow or PyTorch—feel free to choose whichever stack best fits the problem once you’ve inspected the data. Please keep everything reproducible: a single script or notebook, a clear requirements.txt, and comments that explain why each step was taken.
Acceptance criteria
• End-to-end code executes without errors on my machine.
• Model delivers a measurable accuracy, F1, or other agreed metric that clearly improves on a naïve baseline.
• Brief read-me outlines assumptions, feature choices, and how to retrain the model with fresh data.
If you’re able to turn solid predictive insight out of raw numbers and document the process clearly, this should be a quick, rewarding engagement for both of us.