Data Acquisition
Selected scraping and automation systems.
Recorded proof of production work: multi-source extraction, JS-heavy browser automation, source fallback, operator tooling, and downstream structured outputs.
Contact: kanad.rishiraj@gmail.com
Featured Case Study
NomNomy
End-to-end restaurant discovery, platform mapping, parallel menu scraping, and operator-side consolidation.
Built to handle JS-heavy pages and multi-platform schema differences, with extraction-reliability safeguards and operator-side review before finalization.
Step 1: Discovery, Validation, and Platform Mapping
This run starts from a seed restaurant, discovers nearby restaurants, extracts and validates Google Maps details, persists normalized restaurant records, and then discovers DoorDash, Grubhub, and Uber Eats platform URLs for downstream scraping.
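The discovery step can be sketched as a normalized restaurant record plus per-platform search URLs. This is a minimal illustration: the record fields are representative, and the URL patterns are placeholders, not the platforms' actual routes (the real run resolves them by matching search results).

```python
from dataclasses import dataclass
from urllib.parse import quote_plus

@dataclass
class Restaurant:
    """Normalized record persisted after Google Maps validation."""
    name: str
    address: str
    lat: float
    lng: float

def platform_search_urls(r: Restaurant) -> dict:
    """Build per-platform search URLs used to locate the restaurant's
    storefront on each delivery platform. URL shapes are illustrative;
    matching the correct storefront happens downstream."""
    q = quote_plus(f"{r.name} {r.address}")
    return {
        "doordash": f"https://www.doordash.com/search/store/{q}",
        "grubhub": f"https://www.grubhub.com/search?queryText={q}",
        "ubereats": f"https://www.ubereats.com/search?q={q}",
    }
```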
Step 2: Parallel Multi-Platform Menu Extraction
The platform scrapers run in parallel to extract prices, item details, and image assets from DoorDash, Uber Eats, and Grubhub. Image assets are downloaded directly and stored in Amazon S3 so the system retains stable artifacts instead of repeatedly hotlinking platform images.
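The parallel fan-out and image persistence can be sketched as below. The per-platform scrapers here are stubs standing in for the real browser-automation extractors, and the S3 upload is reduced to deriving a stable object key (the production system uploads the bytes with an AWS client).

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Stubs standing in for the real per-platform extractors; each returns
# menu items carrying platform-hosted image URLs.
def scrape_doordash(url):
    return [{"item": "Burrito", "price": 9.50,
             "image_url": "https://img.example/dd/burrito.jpg"}]

def scrape_ubereats(url):
    return [{"item": "Burrito", "price": 9.75,
             "image_url": "https://img.example/ue/burrito.jpg"}]

def scrape_grubhub(url):
    return [{"item": "Burrito", "price": 9.25,
             "image_url": "https://img.example/gh/burrito.jpg"}]

def stored_image_key(image_url: str) -> str:
    """Derive a stable object key so the same source image always maps
    to the same stored artifact instead of a hotlinked platform URL."""
    return f"menu-images/{hashlib.sha1(image_url.encode()).hexdigest()}.jpg"

def scrape_all(urls: dict) -> dict:
    """Run all three platform scrapers in parallel, then rewrite
    hotlinked image URLs to stable stored keys."""
    scrapers = {"doordash": scrape_doordash,
                "ubereats": scrape_ubereats,
                "grubhub": scrape_grubhub}
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {p: pool.submit(fn, urls[p]) for p, fn in scrapers.items()}
        results = {p: f.result() for p, f in futures.items()}
    for items in results.values():
        for item in items:
            item["stored_image"] = stored_image_key(item.pop("image_url"))
    return results
```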
Step 3: Streamlit Finalization and Consolidation
After scraping, the operator-facing Streamlit editor preselects prices, images, and details from the scraped platform data, while still allowing manual adjustments. The lower section exposes consolidated per-platform details for the item so the final output stays traceable.
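The preselection behind the editor can be sketched as a priority merge: pick each field's default from the highest-priority platform that has it, while keeping the full per-platform detail attached for traceability. Field and platform names here are illustrative, not the exact schema.

```python
def preselect(per_platform: dict,
              priority=("doordash", "ubereats", "grubhub")) -> dict:
    """Choose default values for the operator editor from scraped
    platform data; the operator can still override any field. The raw
    per-platform detail is kept under 'sources' for traceability."""
    final = {"sources": per_platform}
    for field in ("price", "description", "stored_image"):
        for platform in priority:
            value = per_platform.get(platform, {}).get(field)
            if value is not None:
                final[field] = value
                final[f"{field}_source"] = platform  # record provenance
                break
    return final
```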
Production Workflow Proof
MovieSaints
Production scraping workflows tied to real operator use: social lead discovery and FX data collection with source fallback.
Instagram Outreach Search to Google Sheets
This workflow starts from a target hashtag, logs into Instagram with humanized waits and scrolling, inspects posts, extracts creator profile details plus related hashtags, and then updates the Google Sheet-backed outreach queue with newly discovered leads and search terms.
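Two pieces of this workflow can be sketched in isolation: the humanized pacing between browser actions, and the dedupe step that appends only genuinely new leads to the sheet-backed queue. Both are minimal sketches; the real system drives a logged-in browser session and writes rows via the Google Sheets API.

```python
import random
import time

def humanized_wait(base: float = 2.0, jitter: float = 1.5) -> float:
    """Sleep for a randomized interval so action timing does not form a
    fixed, bot-like cadence; returns the delay actually used."""
    delay = base + random.uniform(0, jitter)
    time.sleep(delay)
    return delay

def merge_leads(existing_rows: list, discovered: list) -> list:
    """Append only leads whose profile handle is not already queued
    (row shape is illustrative; 'handle' keys each lead)."""
    seen = {row["handle"] for row in existing_rows}
    return existing_rows + [l for l in discovered if l["handle"] not in seen]
```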
FX Rate Extraction with Source Fallback
This run fetches cross-currency FX rates with x-rates as the primary source and Wise as fallback. When one source fails or returns incomplete data, the scraper shifts to the alternative source and still produces structured rate output for reliable business-side FX ingestion.
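The fallback behavior can be sketched as an ordered source chain: each source either raises or returns a payload, and a failure or incomplete result shifts the run to the next source. The fetchers below are stubs for the real x-rates and Wise scrapers, and the rate value is invented for illustration.

```python
def fetch_from_xrates(pair):
    # Stub for the primary x-rates scraper; simulated outage here.
    raise ConnectionError("primary source unavailable")

def fetch_from_wise(pair):
    # Stub for the Wise fallback scraper; rate value is illustrative.
    return {"pair": pair, "rate": 1.0865, "source": "wise"}

def fetch_rate(pair, sources=(fetch_from_xrates, fetch_from_wise)):
    """Try each source in order; a failure or incomplete payload shifts
    the run to the next source so structured output is still produced."""
    last_error = None
    for source in sources:
        try:
            result = source(pair)
            if result and result.get("rate") is not None:
                return result
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"all FX sources failed for {pair}") from last_error
```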
Applied Workflow Automation
FitJobs
Browser-side job extraction that turns a live posting into structured fields for downstream review.
LinkedIn Job Extraction and Extension Auto-Population
This browser-side FitJobs flow reads the active LinkedIn job page, extracts structured fields such as title, company, location, URL, and description, and auto-populates the extension UI so the result can be reviewed or passed into downstream fit-analysis logic.
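The extraction step can be approximated in Python as pulling structured fields out of page markup by CSS class. This is a sketch only: the real flow runs as a browser extension reading the live DOM, and the class names below are hypothetical, not LinkedIn's actual markup.

```python
from html.parser import HTMLParser

class JobFieldParser(HTMLParser):
    """Collects text from elements tagged with illustrative CSS classes;
    the real extension reads equivalent nodes from the live page DOM."""
    FIELDS = {"job-title": "title", "company-name": "company",
              "job-location": "location"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        for css, field in self.FIELDS.items():
            if css in classes.split():
                self._current = field

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

# Hypothetical page snippet standing in for the live job page.
snippet = ('<h1 class="job-title">Data Engineer</h1>'
           '<span class="company-name">Acme</span>'
           '<span class="job-location">Remote</span>')
parser = JobFieldParser()
parser.feed(snippet)
```

The resulting `parser.fields` dict is the shape that gets auto-populated into the extension UI for review.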
More
Across these systems, I focus on resilience under upstream changes: selector fallbacks, source failover, structured normalization, and human review where needed.
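The selector-fallback pattern mentioned above can be sketched generically: try selectors in priority order so that an upstream layout change breaking the newest selector falls through to older alternatives instead of failing the run. `query` is any callable mapping a selector to a result or None (e.g. a Selenium or BeautifulSoup lookup wrapped to swallow misses).

```python
def query_with_fallback(query, selectors):
    """Return the first non-None result plus the selector that produced
    it; (None, None) if every selector misses."""
    for selector in selectors:
        result = query(selector)
        if result is not None:
            return result, selector
    return None, None
```

Returning the selector that matched alongside the result makes it easy to log when a run is surviving on a fallback, which is the signal to update the primary selector.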
Code examples and additional technical detail are available on request. Relevant GitHub repo: selenium-web-automation-utils