Project Description
AI-Enabled Data Scraping Engineer – Junior / Mid Level
Experience: 1 to 4 Years
Location: Remote (Work from Home) / Bangalore / India
Mode of Engagement: Full-time
No. of Positions: 3
Educational Qualification: B.E / B.Tech / MCA / Computer Science / IT
Industry: AI / Data Engineering / Automation / SaaS
Notice Period: Immediate / 15 Days Preferred
What We Are Looking For:
1–4 years of experience in Python-based web scraping, browser automation, and data extraction projects.
Hands-on experience with Scrapy, Selenium, Playwright, Requests, BeautifulSoup, or similar scraping frameworks.
Basic to intermediate understanding of AI/LLM-powered automation workflows using ChatGPT, OpenAI APIs, Claude, Gemini, or LangChain.
Experience handling dynamic websites, login sessions, and cookies, and extracting both structured and unstructured data.
Familiarity with APIs, JSON/XML handling, databases, automation scripting, Git, Docker, or Linux environments.
Good analytical, debugging, and problem-solving skills with the ability to work in fast-paced environments.
Responsibilities:
Develop and maintain web scraping and browser automation scripts for extracting structured and unstructured web data.
Build scraping workflows using Scrapy, Selenium, Playwright, APIs, and Python automation libraries.
Assist in AI-powered data extraction and enrichment workflows using LLMs and automation tools.
Perform data cleaning, validation, transformation, and storage for downstream analytics and AI applications.
Monitor scraping jobs, debug failures, optimize crawlers, and maintain data quality standards.
Collaborate with AI teams, product teams, and senior engineers on scalable data acquisition projects.
Qualifications:
Bachelor’s degree in Computer Science, Engineering, IT, or a related field.
Strong hands-on knowledge of Python programming and scraping frameworks such as Scrapy, Selenium, Playwright, or BeautifulSoup.
Good understanding of APIs, automation workflows, databases, JSON/XML handling, and cloud concepts.
Familiarity with AI tools, LLM APIs, browser automation, and modern scraping techniques is an added advantage.
Familiarity with Git, Docker, Linux, or cloud platforms is a plus.