Exhaustive Job Extractor
Chrome Extension for Multi-Source Job Scraping
12+ Job Boards
LinkedIn, Indeed, Glassdoor, Dice, and more
AI-Powered
Automatic skill extraction and enrichment
Export Ready
JSON, CSV, and bulk operations
Installation
- Download the Extension
      git clone https://github.com/yourusername/get-me-the-job.git
      cd get-me-the-job/exhaustive-job-extractor
      npm install && npm run build
- Open Chrome Extensions Page
  Navigate to chrome://extensions/ in your browser
- Enable Developer Mode
  Toggle the switch in the top-right corner
- Load Unpacked Extension
  Click "Load unpacked" and select the dist/ folder inside exhaustive-job-extractor/
- Pin the Extension
  Click the puzzle icon in Chrome toolbar and pin "Exhaustive Job Extractor"
Authentication & Sync Setup
Connect the extension to your tracker so captured jobs sync automatically.
Step 1 — Get your API token
- Sign in to your tracker at http://localhost:8000/login
- After signing in, visit http://localhost:8000/auth/profile and click Copy API Token. The token looks like eyJ....
Step 2 — Configure the extension
- Click the extension icon → gear icon (Options).
- Set API Base URL to your tracker address:
  - Local development: http://127.0.0.1:8000
  - Production: http://localhost:8000
- Paste your token into the Bearer Token field and click Save.
- Click Test Connection — you should see a green ✓ confirming the tracker is reachable and your token is valid.
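The troubleshooting notes ask for a base URL without a trailing slash and a non-empty bearer token. The following is a minimal sketch of how those two settings could be combined into a request. The names `TrackerRequest` and `buildTrackerRequest` are hypothetical, and the caller supplies the path because the tracker's API routes are not documented here.

```typescript
// Hypothetical helper, not the extension's actual code.
interface TrackerRequest {
  url: string;
  headers: Record<string, string>;
}

function buildTrackerRequest(baseUrl: string, token: string, path: string): TrackerRequest {
  // Strip trailing slashes so "http://host:8000/" and "http://host:8000" behave the same.
  const trimmedBase = baseUrl.trim().replace(/\/+$/, "");
  const trimmedToken = token.trim();
  if (trimmedToken === "") {
    throw new Error("Not authenticated: no bearer token saved in Options");
  }
  // Make sure the path starts with exactly one slash.
  const normalizedPath = path.charAt(0) === "/" ? path : "/" + path;
  return {
    url: trimmedBase + normalizedPath,
    headers: {
      Authorization: "Bearer " + trimmedToken,
      "Content-Type": "application/json",
    },
  };
}
```

The returned `url` and `headers` can be passed straight to `fetch`. Failing early on a blank token mirrors the "Not authenticated" state in the troubleshooting table.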
Step 3 — Capture and sync a job
- Navigate to any supported job page (e.g., a LinkedIn job detail page).
- Click the extension icon and press Save to Tracker.
- Open your tracker at http://localhost:8000/jobs. The job should appear within seconds.
If your token has expired, sign in again at /login and update the token in Options.
The extension caches captured jobs in chrome.storage.local, which has a hard cap of 10 MB. If the cache fills up, the extension falls back to in-memory storage for that session. Export or sync regularly to avoid hitting this limit.
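The fallback decision can be sketched as a size check before each write. This is an illustration under assumptions, not the extension's code: `utf8ByteLength` and `chooseCacheTarget` are hypothetical names, and the byte count of the JSON-serialized job is only an approximation of how the browser accounts for storage.

```typescript
const STORAGE_CAP_BYTES = 10 * 1024 * 1024; // the 10 MB cap described above

// Count UTF-8 bytes without relying on browser-only APIs.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte character
      i++;        // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

type CacheTarget = "chrome.storage.local" | "memory";

// Decide where the next job should go, given the bytes already used.
function chooseCacheTarget(
  usedBytes: number,
  job: Record<string, unknown>,
  capBytes: number = STORAGE_CAP_BYTES
): CacheTarget {
  const jobBytes = utf8ByteLength(JSON.stringify(job));
  return usedBytes + jobBytes <= capBytes ? "chrome.storage.local" : "memory";
}
```

In an extension, the `usedBytes` figure could come from `chrome.storage.local.getBytesInUse`. It is passed in here so the sketch stays runnable outside the browser.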
Clearing the extension cache from Options empties the chrome.storage.local job cache, removing any locally cached job data (title, company, description) from your browser. Your tracker data on the server is unaffected.
Supported Job Boards
Major Platforms
- LinkedIn - Single job & multi-page search
- Indeed - Comprehensive extraction
- Glassdoor - Company reviews included
- Dice - Tech jobs specialist
- FreelancerMap - Multi-language (EN/DE/ES)
Specialized Platforms
- Built In - Startup jobs
- Wellfound - Startup equity info
- DevJobsScanner - Developer jobs
- Rete Informatica Lavoro - Italian IT jobs
- IProgrammatori - Italian dev jobs
- The Local - Expat jobs
How to Use
- Navigate to any supported job posting (e.g., LinkedIn job detail page)
- Click the extension icon in your browser toolbar
- Click the "Extract Job" button
- Wait for extraction to complete (~3 seconds)
- Review the extracted data in the side panel
- Click "Download JSON" to save the job data
To crawl multi-page search results:
- Go to LinkedIn and perform a job search
- Click the extension icon
- Click "Crawl Search Results"
- When prompted, enter how many pages to scrape (1-40)
- The extension will:
  - Phase 1: Collect all job URLs from multiple pages
  - Phase 2: Extract detailed data from each job
- Click "Download All" when complete
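The two-phase flow can be sketched as follows. This is a simplified, synchronous illustration: `crawlSearchResults`, `collectUrls`, and `extractJob` are hypothetical names, and a real crawler would be asynchronous and would wait between requests according to the speed setting.

```typescript
interface CrawlResult<T> {
  urls: string[];
  jobs: T[];
}

function crawlSearchResults<T>(
  requestedPages: number,
  collectUrls: (page: number) => string[],
  extractJob: (url: string) => T
): CrawlResult<T> {
  // Clamp to the 1-40 range the page prompt accepts.
  const pages = Math.min(40, Math.max(1, Math.floor(requestedPages)));

  // Phase 1: collect job URLs from every page, skipping repeats.
  const urls: string[] = [];
  for (let page = 1; page <= pages; page++) {
    for (const url of collectUrls(page)) {
      if (urls.indexOf(url) === -1) {
        urls.push(url);
      }
    }
  }

  // Phase 2: extract detailed data from each collected URL.
  const jobs = urls.map((url) => extractJob(url));
  return { urls, jobs };
}
```

Collecting every URL before extracting means the total job count is known up front, which makes progress reporting and pause/resume simpler.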
Enhance your job data with AI-powered analysis using Google Gemini:
- Get a free API key from Google AI Studio
- Click the extension icon and go to "Options" (gear icon)
- Paste your Gemini API key
- Enable the "AI enrichment" checkbox in the side panel
- Extract jobs as normal - AI analysis will be added automatically
AI Enrichment adds:
- Enhanced skill extraction
- Experience level detection
- Required vs. preferred qualifications
- Job responsibilities breakdown
- Company culture insights
Save time by filtering jobs before extracting full details:
- Enable the "Filter jobs before scraping" checkbox
- Set your filters:
  - Required skills: e.g., "python, react"
  - Excluded keywords: e.g., "senior, lead"
  - Min/Max salary: e.g., $80K - $120K
  - Location: e.g., "Remote, San Francisco"
- Start crawling - only matching jobs will be extracted
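A pre-scrape filter along these lines could look like the sketch below. The matching rules are assumptions, since the extension's exact semantics are not documented here: every required skill must appear, any excluded keyword rejects the job, any listed location is enough, and matching is a case-insensitive substring test. The names `JobSummary`, `JobFilters`, `parseList`, and `matchesFilters` are hypothetical.

```typescript
interface JobSummary {
  title: string;
  description: string;
  location: string;
  salaryMin?: number;
  salaryMax?: number;
}

interface JobFilters {
  requiredSkills: string[];
  excludedKeywords: string[];
  locations: string[];
  minSalary?: number;
  maxSalary?: number;
}

// Turn "python, react" into ["python", "react"], dropping empty entries.
function parseList(input: string): string[] {
  return input
    .split(",")
    .map((item) => item.trim().toLowerCase())
    .filter((item) => item.length > 0);
}

function containsIgnoreCase(haystack: string, needle: string): boolean {
  return haystack.toLowerCase().indexOf(needle.toLowerCase()) !== -1;
}

function matchesFilters(job: JobSummary, filters: JobFilters): boolean {
  const text = job.title + " " + job.description;
  // Assumption: every required skill must appear in the title or description.
  const hasSkills = filters.requiredSkills.every((s) => containsIgnoreCase(text, s));
  // Assumption: a single excluded keyword rejects the job.
  const hasExcluded = filters.excludedKeywords.some((k) => containsIgnoreCase(text, k));
  // Salary checks are skipped when the posting or the filter has no value.
  const aboveMin =
    filters.minSalary === undefined || job.salaryMax === undefined || job.salaryMax >= filters.minSalary;
  const belowMax =
    filters.maxSalary === undefined || job.salaryMin === undefined || job.salaryMin <= filters.maxSalary;
  // An empty location list means any location is accepted.
  const locationOk =
    filters.locations.length === 0 || filters.locations.some((l) => containsIgnoreCase(job.location, l));
  return hasSkills && !hasExcluded && aboveMin && belowMax && locationOk;
}
```

Substring matching is deliberately simple and has a known weakness: an excluded keyword such as "lead" also rejects postings that mention "leadership". Word-boundary matching would be stricter.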
Export Formats:
JSON Format
Complete data structure with all fields
CSV Format
Spreadsheet-ready format
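For the CSV export, the main pitfall is escaping: job titles and descriptions often contain commas and quotes. The sketch below shows one conventional way to handle that, following RFC 4180 quoting. `escapeCsvField` and `toCsv` are hypothetical names, and the column list is chosen by the caller rather than taken from the extension.

```typescript
// Quote a field if it contains a comma, a double quote, or a line break.
function escapeCsvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return '"' + value.replace(/"/g, '""') + '"';
  }
  return value;
}

// Build a CSV document with a header row; missing values become empty cells.
function toCsv(rows: Array<Record<string, string>>, columns: string[]): string {
  const header = columns.map((col) => escapeCsvField(col)).join(",");
  const lines = rows.map((row) =>
    columns.map((col) => escapeCsvField(row[col] || "")).join(",")
  );
  return [header].concat(lines).join("\r\n");
}
```

For example, a row with the title `Engineer, Backend` is written as `"Engineer, Backend"`, so spreadsheet tools keep it in a single cell.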
Filename Verification:
- Checked: Chrome will ask where to save and let you rename files
- Unchecked: Files auto-save to Downloads folder with timestamps
The extension automatically detects duplicate jobs:
- After scraping, duplicates are highlighted in yellow
- Click "View Duplicates" to see details
- Options:
  - Delete duplicates: Keep only unique jobs
  - Export report: Save duplicate analysis as JSON
  - Keep all: Proceed with duplicates included
Detection methods:
- Job ID matching (most reliable)
- URL matching
- Title + Company similarity (fuzzy matching)
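The three checks can be sketched as a cascade from the most to the least reliable signal. The fuzzy step here uses Jaccard similarity over words with a 0.8 threshold. Both are assumptions chosen for illustration, since the extension's actual similarity measure is not documented. `JobRecord`, `toWords`, `jaccard`, and `isDuplicate` are hypothetical names.

```typescript
interface JobRecord {
  jobId?: string;
  url?: string;
  title: string;
  company: string;
}

// Lowercase, drop punctuation, and split into words.
// Note: this simple version also drops non-ASCII letters.
function toWords(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .split(" ")
    .filter((word) => word.length > 0);
}

// Jaccard similarity: shared unique words divided by total unique words.
function jaccard(a: string[], b: string[]): number {
  const uniqueA = a.filter((word, i) => a.indexOf(word) === i);
  const uniqueB = b.filter((word, i) => b.indexOf(word) === i);
  const shared = uniqueA.filter((word) => uniqueB.indexOf(word) !== -1).length;
  const union = uniqueA.length + uniqueB.length - shared;
  return union === 0 ? 0 : shared / union;
}

function isDuplicate(a: JobRecord, b: JobRecord, threshold: number = 0.8): boolean {
  // 1. Job ID matching (most reliable).
  if (a.jobId !== undefined && a.jobId === b.jobId) return true;
  // 2. URL matching.
  if (a.url !== undefined && a.url === b.url) return true;
  // 3. Fuzzy title + company similarity.
  const wordsA = toWords(a.title + " " + a.company);
  const wordsB = toWords(b.title + " " + b.company);
  return jaccard(wordsA, wordsB) >= threshold;
}
```

The fuzzy step still runs when job IDs differ, because the same posting can appear on two boards under different IDs.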
Key Features
Speed Control
Adjust scraping speed (Fast/Medium/Slow) to avoid rate limits
Pause/Resume
Pause multi-page scraping and resume later
Local Storage
All scraped jobs saved locally - view anytime
Search & Filter
Search through saved jobs by title, company, or location
Tips & Best Practices
Do's:
- Use "Medium" or "Slow" speed for large scraping jobs
- Enable filters to reduce unnecessary API calls
- Regularly export your data as backup
- Check for duplicates before downloading
- Use AI enrichment for better skill extraction
Don'ts:
- Don't scrape more than 50 jobs at once without delays
- Don't use "Fast" speed on LinkedIn (risk of rate limiting)
- Don't close the extension during multi-page scraping
- Don't scrape the same search results repeatedly
- Don't share your Gemini API key publicly
Troubleshooting
Sync failure states
| Extension message | What it means | Fix |
|---|---|---|
| Not authenticated | No token saved in Options, or token field is blank. | Open Options → paste your bearer token from /auth/profile. |
| Token expired (401) | Your 30-minute token has expired. | Sign in again at /login → copy the new token → update Options. |
| Network unreachable | The extension cannot reach the configured API Base URL. | Check the app is running at http://localhost:8000 and the URL in Options is correct (no trailing slash). |
| Bad payload (422) | The extension sent a payload the server could not parse (schema mismatch). | Update the extension to the latest version — the payload contract may have changed. Check Options for the current schema version. |
| Rate limited (429) | The tracker API is rate-limiting your IP (100 req/min). | Wait 60 seconds and retry. Avoid bulk syncing at high speed. |
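The table above can be expressed as a small lookup, sketched below. `SyncFailure` and `describeSyncFailure` are hypothetical names. A `null` status stands for a request that never received a response, and the fallback branch for other status codes is an assumption, not documented behavior.

```typescript
interface SyncFailure {
  message: string;
  fix: string;
}

function describeSyncFailure(hasToken: boolean, status: number | null): SyncFailure {
  if (!hasToken) {
    return { message: "Not authenticated", fix: "Open Options and paste your bearer token from /auth/profile." };
  }
  if (status === null) {
    return { message: "Network unreachable", fix: "Check the app is running and the API Base URL in Options is correct (no trailing slash)." };
  }
  switch (status) {
    case 401:
      return { message: "Token expired (401)", fix: "Sign in again at /login, copy the new token, and update Options." };
    case 422:
      return { message: "Bad payload (422)", fix: "Update the extension to the latest version." };
    case 429:
      return { message: "Rate limited (429)", fix: "Wait 60 seconds and retry." };
    default:
      // Assumption: anything not in the table is reported generically.
      return { message: "Unexpected response (" + status + ")", fix: "Check the tracker logs." };
  }
}
```

Checking for a missing token before looking at the status avoids sending a request that is certain to fail.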
General issues
| Issue | Solution |
|---|---|
| "This page is not supported" | Make sure you're on a job detail page, not a search results page |
| Extension icon doesn't appear | Go to chrome://extensions/ and click "Reload" |
| No data extracted | Refresh the job page and wait 3 seconds before extracting |
| AI enrichment not working | Verify your Gemini API key in Options and check console for errors |
| LinkedIn rate limiting | Wait 1 hour, use slower speed, or scrape in smaller batches |
| CSV export empty | Make sure you have jobs in the extension storage first |
| Storage full warning | The 10 MB chrome.storage.local cap has been reached. Sync or export jobs, then clear the extension cache from Options. |
Technical Details
Technology Stack:
- Manifest V3 Chrome Extension
- Content Scripts for data extraction
- Service Worker for background tasks
- Google Gemini AI API integration
- Local Chrome Storage API
Data Extracted:
- Job title, company, location
- Salary range (if available)
- Skills and technologies
- Experience requirements
- Company details and size
- Full job description (HTML)
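As a rough picture of the exported record, the fields listed above could be typed as shown below. Every field name and the sample values are hypothetical. The real JSON export may use different keys and nesting.

```typescript
// Hypothetical shape; optional fields are those described as "if available".
interface ExtractedJob {
  title: string;
  company: string;
  location: string;
  salaryRange?: { min: number; max: number; currency: string };
  skills: string[];
  experienceRequirements?: string;
  companyDetails?: { size?: string; description?: string };
  descriptionHtml: string; // full job description as HTML
}

// Illustrative sample only, not real extracted data.
const sampleJob: ExtractedJob = {
  title: "Backend Engineer",
  company: "Example Corp",
  location: "Remote",
  skills: ["python", "react"],
  descriptionHtml: "<p>Build and maintain services.</p>",
};
```

Marking salary and company details as optional reflects that not every job board exposes them.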