Exhaustive Job Extractor

Chrome Extension for Multi-Source Job Scraping

Version 1.8.0

12+ Job Boards

LinkedIn, Indeed, Glassdoor, Dice, and more

AI-Powered

Automatic skill extraction and enrichment

Export Ready

JSON, CSV, and bulk operations

Installation

Note: This extension is currently in development. Follow the steps below to install it locally.
  1. Download and Build the Extension

    git clone https://github.com/yourusername/get-me-the-job.git
    cd get-me-the-job/exhaustive-job-extractor
    npm install && npm run build

  2. Open Chrome Extensions Page

    Navigate to chrome://extensions/ in your browser

  3. Enable Developer Mode

    Toggle the switch in the top-right corner

  4. Load Unpacked Extension

    Click "Load unpacked" and select the dist/ folder inside exhaustive-job-extractor/

  5. Pin the Extension

    Click the puzzle icon in the Chrome toolbar and pin "Exhaustive Job Extractor"

Authentication & Sync Setup

Connect the extension to your tracker so captured jobs sync automatically.

Step 1 — Get your API token
  1. Sign in to your tracker at http://localhost:8000/login
  2. After signing in, visit http://localhost:8000/auth/profile and click Copy API Token. The token looks like eyJ....
Step 2 — Configure the extension
  1. Click the extension icon → gear icon (Options).
  2. Set API Base URL to your tracker address:
    • Local development: http://127.0.0.1:8000
    • Production: http://localhost:8000
  3. Paste your token into the Bearer Token field and click Save.
  4. Click Test Connection — you should see a green ✓ confirming the tracker is reachable and your token is valid.
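The two Options values above can be sanity-checked before they are saved. The following is a minimal sketch, not the extension's actual code: `normalizeBaseUrl` and `buildAuthHeaders` are hypothetical helper names. It strips a trailing slash from the API Base URL and builds a standard `Authorization: Bearer` header from the pasted token.

```typescript
// Hypothetical helpers; names are illustrative, not the extension's actual API.

/** Trim whitespace and trailing slashes, and require an http(s) scheme. */
function normalizeBaseUrl(raw: string): string {
  const trimmed = raw.trim().replace(/\/+$/, "");
  if (!/^https?:\/\/[^\s/]+/i.test(trimmed)) {
    throw new Error(`Invalid API Base URL: "${raw}"`);
  }
  return trimmed;
}

/** Build request headers carrying the bearer token from Options. */
function buildAuthHeaders(token: string): Record<string, string> {
  const clean = token.trim();
  if (clean === "") {
    throw new Error("Not authenticated: bearer token is blank");
  }
  return {
    Authorization: `Bearer ${clean}`,
    "Content-Type": "application/json",
  };
}
```

For example, `normalizeBaseUrl("http://127.0.0.1:8000/")` yields the URL without the trailing slash, and a blank token is rejected before any request is made.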
Step 3 — Capture and sync a job
  1. Navigate to any supported job page (e.g., a LinkedIn job detail page).
  2. Click the extension icon and press Save to Tracker.
  3. Open your tracker at http://localhost:8000/jobs — the job should appear within seconds.
Token lifetime: Tokens expire after 30 minutes. When a sync fails with "401 Unauthorized", sign in again at /login and update the token in Options.
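One way to avoid a surprise 401 is to track how old the saved token is. This is a rough sketch under two assumptions: the function names are hypothetical, and the 30-minute lifetime is counted from the moment the token was saved in Options. The server counts from when the token was issued, so treat the result as an upper bound.

```typescript
const TOKEN_LIFETIME_MS = 30 * 60 * 1000; // 30 minutes, per the note above

/** Milliseconds left before a token saved at `savedAtMs` is expected to expire. */
function tokenTimeLeftMs(savedAtMs: number, nowMs: number): number {
  return Math.max(0, savedAtMs + TOKEN_LIFETIME_MS - nowMs);
}

/** True when the token should be refreshed before the next sync. */
function shouldRefreshToken(
  savedAtMs: number,
  nowMs: number,
  marginMs: number = 60000 // hypothetical one-minute safety margin
): boolean {
  return tokenTimeLeftMs(savedAtMs, nowMs) <= marginMs;
}
```

Calling `shouldRefreshToken(savedAt, Date.now())` before a bulk sync lets you prompt for a fresh token up front rather than failing partway through.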
Storage limit: The extension caches jobs in chrome.storage.local, which has a hard cap of 10 MB. If the cache fills up, the extension falls back to in-memory storage for the rest of that session, and jobs held only in memory are lost when the session ends. Export or sync regularly to avoid hitting this limit.
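The fallback behavior described above can be sketched as a small wrapper. This is an illustration under stated assumptions, not the extension's implementation: `PersistentStore` and `JobCache` are hypothetical names, the store is synchronous for brevity (the real chrome.storage.local API is asynchronous and offers `getBytesInUse` for the usage figure), and the size estimate is a rough character count.

```typescript
// Hypothetical wrapper; `PersistentStore` stands in for chrome.storage.local.
const QUOTA_BYTES = 10 * 1024 * 1024; // 10 MB cap described above

interface PersistentStore {
  bytesInUse(): number;
  set(key: string, value: string): void; // may throw when the quota is exceeded
}

class JobCache {
  private memory = new Map<string, string>(); // session-only fallback
  private store: PersistentStore;

  constructor(store: PersistentStore) {
    this.store = store;
  }

  /** Save a job and report where it ended up. */
  save(jobId: string, job: object): "persistent" | "memory" {
    const serialized = JSON.stringify(job);
    // Rough estimate: key length plus serialized value length.
    const estimate = jobId.length + serialized.length;
    if (this.store.bytesInUse() + estimate <= QUOTA_BYTES) {
      try {
        this.store.set(jobId, serialized);
        return "persistent";
      } catch {
        // The write was rejected; fall through to in-memory storage.
      }
    }
    this.memory.set(jobId, serialized);
    return "memory";
  }

  /** Fraction of the quota in use, e.g. 0.9 means 90% full. */
  usage(): number {
    return this.store.bytesInUse() / QUOTA_BYTES;
  }
}
```

A caller could warn the user once `usage()` passes a chosen threshold, well before writes start landing in memory.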
Logging out: Clicking Log out in the extension Options clears both the saved bearer token and the chrome.storage.local job cache, removing any locally cached job data (title, company, description) from your browser. Your tracker data on the server is unaffected.

Supported Job Boards

Major Platforms
Specialized Platforms

How to Use

  1. Navigate to any supported job posting (e.g., a LinkedIn job detail page)
  2. Click the extension icon in your browser toolbar
  3. Click the "Extract Job" button
  4. Wait for extraction to complete (~3 seconds)
  5. Review the extracted data in the side panel
  6. Click "Download JSON" to save the job data
Tip: The extension automatically extracts skills, salary, company details, and more!

  1. Go to LinkedIn and perform a job search
  2. Click the extension icon
  3. Click "Crawl Search Results"
  4. When prompted, enter how many pages to scrape (1-40)
  5. The extension will:
    • Phase 1: Collect all job URLs from multiple pages
    • Phase 2: Extract detailed data from each job
  6. Click "Download All" when complete
Note: LinkedIn may rate-limit after ~50 requests. Use the Medium or Slow speed setting and scrape in smaller batches to stay under that threshold.
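The hand-off between the two phases can be sketched as follows. This is a hedged illustration with hypothetical names (`clampPageCount`, `planBatches`): it keeps the page count inside the 1-40 range and splits the URLs collected in Phase 1 into batches of at most 50, so each batch stays near the rate-limit threshold mentioned above.

```typescript
const MIN_PAGES = 1;
const MAX_PAGES = 40;  // page range offered by the prompt
const BATCH_SIZE = 50; // approximate LinkedIn rate-limit threshold noted above

/** Clamp the requested page count into the supported 1-40 range. */
function clampPageCount(requested: number): number {
  if (!Number.isFinite(requested)) return MIN_PAGES;
  return Math.min(MAX_PAGES, Math.max(MIN_PAGES, Math.floor(requested)));
}

/** Phase 1 output to Phase 2 input: drop repeated URLs and split into batches. */
function planBatches(urls: string[], batchSize: number = BATCH_SIZE): string[][] {
  const unique = Array.from(new Set(urls));
  const batches: string[][] = [];
  for (let i = 0; i < unique.length; i += batchSize) {
    batches.push(unique.slice(i, i + batchSize));
  }
  return batches;
}
```

Pausing between batches, rather than between every request only, is one simple way to spread a large crawl over time.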

Enhance your job data with AI-powered analysis using Google Gemini:

  1. Get a free API key from Google AI Studio
  2. Click the extension icon and go to "Options" (gear icon)
  3. Paste your Gemini API key
  4. Enable the "AI enrichment" checkbox in the side panel
  5. Extract jobs as normal; AI analysis is added automatically
AI Enrichment adds:
  • Enhanced skill extraction
  • Experience level detection
  • Required vs. preferred qualifications
  • Job responsibilities breakdown
  • Company culture insights

Save time by filtering jobs before extracting full details:

  1. Enable the "Filter jobs before scraping" checkbox
  2. Set your filters:
    • Required skills: e.g., "python, react"
    • Excluded keywords: e.g., "senior, lead"
    • Min/Max salary: e.g., $80K - $120K
    • Location: e.g., "Remote, San Francisco"
  3. Start crawling. Only matching jobs are extracted.
Filters are applied during Phase 1 (URL collection), so non-matching jobs are skipped before the slower Phase 2 detail extraction. This significantly reduces scraping time.
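A filter check along these lines could look like the sketch below. The types and function names are hypothetical, and the matching rules are assumptions: every required skill must appear, any excluded keyword rejects the job, any listed location is enough, and jobs without salary data are not rejected on salary. Matching is by plain substring, so "java" would also match "javascript".

```typescript
interface JobSummary {
  title: string;
  description: string;
  location: string;
  salaryMin?: number; // annual, in dollars
  salaryMax?: number;
}

interface JobFilters {
  requiredSkills: string[];
  excludedKeywords: string[];
  locations: string[];
  minSalary?: number;
  maxSalary?: number;
}

/** Split a comma-separated filter field such as "python, react". */
function parseList(input: string): string[] {
  return input
    .split(",")
    .map((s) => s.trim().toLowerCase())
    .filter((s) => s !== "");
}

/** Return true when the job passes every active filter. */
function matchesFilters(job: JobSummary, f: JobFilters): boolean {
  const text = `${job.title} ${job.description}`.toLowerCase();
  const location = job.location.toLowerCase();

  if (!f.requiredSkills.every((skill) => text.includes(skill))) return false;
  if (f.excludedKeywords.some((word) => text.includes(word))) return false;
  if (f.locations.length > 0 && !f.locations.some((loc) => location.includes(loc))) {
    return false;
  }
  // Salary: reject only when the job's range lies entirely outside the filter range.
  if (f.minSalary !== undefined && job.salaryMax !== undefined && job.salaryMax < f.minSalary) {
    return false;
  }
  if (f.maxSalary !== undefined && job.salaryMin !== undefined && job.salaryMin > f.maxSalary) {
    return false;
  }
  return true;
}
```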

Export Formats:
JSON Format

Complete data structure with all fields

CSV Format

Spreadsheet-ready format
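For readers post-processing the JSON export into their own spreadsheets, the sketch below shows one common way to produce a CSV with RFC 4180 style quoting. It is not the extension's exporter, and the column names in the usage note are hypothetical.

```typescript
type Row = Record<string, string | number | undefined>;

/** Quote a field when it contains a comma, a double quote, or a line break. */
function escapeCsvField(value: string | number | undefined): string {
  const text = value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

/** Serialize rows to CSV with a header line, using the given column order. */
function toCsv(rows: Row[], columns: string[]): string {
  const header = columns.map((col) => escapeCsvField(col)).join(",");
  const lines = rows.map((row) =>
    columns.map((col) => escapeCsvField(row[col])).join(",")
  );
  return [header, ...lines].join("\r\n");
}
```

A call such as `toCsv(jobs, ["title", "company", "salary", "location"])` leaves missing values as empty cells and keeps titles that contain commas in a single column.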

Filename Verification:
  • Checked: Chrome will ask where to save and let you rename files
  • Unchecked: Files auto-save to the Downloads folder with timestamps

The extension automatically detects duplicate jobs:

  1. After scraping, duplicates are highlighted in yellow
  2. Click "View Duplicates" to see details
  3. Options:
    • Delete duplicates: Keep only unique jobs
    • Export report: Save duplicate analysis as JSON
    • Keep all: Proceed with duplicates included
Duplicate Detection Methods:
  • Job ID matching (most reliable)
  • URL matching
  • Title + Company similarity (fuzzy matching)
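The three methods above can be tried in order of reliability. The sketch below is an assumption-laden illustration: the extension's actual fuzzy matching algorithm is not documented here, so a Dice coefficient over character bigrams stands in for it, and the 0.85 threshold and all names are hypothetical.

```typescript
interface JobRecord {
  jobId?: string;
  url: string;
  title: string;
  company: string;
}

/** Lowercase and collapse punctuation/whitespace so near-identical strings compare equal. */
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

/** Dice coefficient over character bigrams: 1 means identical, 0 means nothing shared. */
function similarity(a: string, b: string): number {
  const x = normalize(a);
  const y = normalize(b);
  if (x === y) return 1;
  if (x.length < 2 || y.length < 2) return 0;
  const counts = new Map<string, number>();
  for (let i = 0; i < x.length - 1; i++) {
    const gram = x.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  let shared = 0;
  for (let i = 0; i < y.length - 1; i++) {
    const gram = y.slice(i, i + 2);
    const left = counts.get(gram) ?? 0;
    if (left > 0) {
      counts.set(gram, left - 1);
      shared++;
    }
  }
  return (2 * shared) / (x.length - 1 + (y.length - 1));
}

/** Drop the query string, fragment, and trailing slash before comparing URLs. */
function stripQuery(url: string): string {
  return url.replace(/[?#].*$/, "").replace(/\/+$/, "");
}

type DuplicateReason = "job-id" | "url" | "title-company" | null;

/** Check the most reliable signal first, then fall back to weaker ones. */
function duplicateReason(a: JobRecord, b: JobRecord, threshold: number = 0.85): DuplicateReason {
  if (a.jobId !== undefined && a.jobId === b.jobId) return "job-id";
  if (stripQuery(a.url) === stripQuery(b.url)) return "url";
  const score = similarity(`${a.title} ${a.company}`, `${b.title} ${b.company}`);
  return score >= threshold ? "title-company" : null;
}
```

Returning the reason rather than a plain boolean makes it easy to show in a duplicate report why two jobs were grouped.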

Key Features

Speed Control

Adjust scraping speed (Fast/Medium/Slow) to avoid rate limits

Pause/Resume

Pause multi-page scraping and resume later

Local Storage

All scraped jobs are saved locally, so you can view them anytime

Search & Filter

Search through saved jobs by title, company, or location
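To make the Speed Control setting concrete, the sketch below maps each speed to a delay between requests. The delay values are purely hypothetical (the extension's real timings are not documented here), and the added random jitter is an assumption about how one might avoid a perfectly regular request rhythm.

```typescript
type Speed = "Fast" | "Medium" | "Slow";

// Hypothetical base delays in milliseconds; the real values may differ.
const BASE_DELAY_MS: Record<Speed, number> = { Fast: 500, Medium: 2000, Slow: 5000 };

/**
 * Delay before the next request: the base delay plus up to 50% random jitter.
 * `random` is injectable so the function can be tested deterministically.
 */
function nextDelayMs(speed: Speed, random: () => number = Math.random): number {
  const base = BASE_DELAY_MS[speed];
  return Math.round(base + base * 0.5 * random());
}
```

With these assumed values, "Slow" waits between 5 and 7.5 seconds per request, which trades crawl time for a lower chance of being rate-limited.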

Tips & Best Practices

Do's:
  • Use "Medium" or "Slow" speed for large scraping jobs
  • Enable filters to reduce unnecessary API calls
  • Regularly export your data as backup
  • Check for duplicates before downloading
  • Use AI enrichment for better skill extraction
Don'ts:
  • Don't scrape more than 50 jobs at once without delays
  • Don't use "Fast" speed on LinkedIn (risk of rate limiting)
  • Don't close the extension during multi-page scraping
  • Don't scrape the same search results repeatedly
  • Don't share your Gemini API key publicly

Troubleshooting

Sync failure states
Extension message | What it means | Fix
Not authenticated | No token saved in Options, or token field is blank. | Open Options → paste your bearer token from /auth/profile.
Token expired (401) | Your 30-minute token has expired. | Sign in again at /login → copy the new token → update Options.
Network unreachable | The extension cannot reach the configured API Base URL. | Check the app is running at http://localhost:8000 and the URL in Options is correct (no trailing slash).
Bad payload (422) | The extension sent a payload the server could not parse (schema mismatch). | Update the extension to the latest version; the payload contract may have changed. Check Options for the current schema version.
Rate limited (429) | The tracker API is rate-limiting your IP (100 req/min). | Wait 60 seconds and retry. Avoid bulk syncing at high speed.
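The failure states above map onto a small classification function. This is a sketch with hypothetical names, not the extension's code. It assumes a missing token is detected before any request is sent and that a request which never reaches the server is reported as a `null` status. The "Unknown error" case is an addition for statuses the table does not cover.

```typescript
type SyncFailure =
  | "Not authenticated"
  | "Token expired (401)"
  | "Network unreachable"
  | "Bad payload (422)"
  | "Rate limited (429)"
  | "Unknown error"; // not in the table; catch-all for other statuses

/**
 * Classify a failed sync attempt.
 * `status` is the HTTP status code, or null when the request never reached the server.
 */
function classifySyncFailure(token: string, status: number | null): SyncFailure {
  if (token.trim() === "") return "Not authenticated";
  if (status === null) return "Network unreachable";
  switch (status) {
    case 401:
      return "Token expired (401)";
    case 422:
      return "Bad payload (422)";
    case 429:
      return "Rate limited (429)";
    default:
      return "Unknown error";
  }
}

/** Seconds to wait before retrying, or null when waiting alone will not help. */
function retryAfterSeconds(failure: SyncFailure): number | null {
  return failure === "Rate limited (429)" ? 60 : null;
}
```

Only the 429 case is worth retrying automatically; the others need a new token, a corrected URL, or an updated extension.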
General issues
Issue | Solution
"This page is not supported" | Make sure you're on a job detail page, not a search results page.
Extension icon doesn't appear | Go to chrome://extensions/ and click "Reload".
No data extracted | Refresh the job page and wait 3 seconds before extracting.
AI enrichment not working | Verify your Gemini API key in Options and check the console for errors.
LinkedIn rate limiting | Wait 1 hour, use a slower speed, or scrape in smaller batches.
CSV export empty | Make sure you have jobs in the extension storage first.
Storage full warning | The 10 MB chrome.storage.local cap has been reached. Sync or export jobs, then clear the extension cache from Options.

Technical Details

Technology Stack:
  • Manifest V3 Chrome Extension
  • Content Scripts for data extraction
  • Service Worker for background tasks
  • Google Gemini AI API integration
  • Chrome Storage API (chrome.storage.local)
Data Extracted:
  • Job title, company, location
  • Salary range (if available)
  • Skills and technologies
  • Experience requirements
  • Company details and size
  • Full job description (HTML)
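If you consume the JSON export in your own scripts, a type along these lines can help. The field names are hypothetical and chosen only to mirror the list above, so check a real export for the actual keys. The guard function verifies the required fields before the data is used.

```typescript
// Hypothetical shape; field names are illustrative and may not match the real export.
interface ExtractedJob {
  title: string;
  company: string;
  location: string;
  salary?: { min: number; max: number; currency: string }; // only if listed
  skills: string[];
  experience?: string;
  companyDetails?: { size?: string };
  descriptionHtml: string;
}

/** Check that a parsed JSON value has the required fields before using it. */
function isExtractedJob(value: unknown): value is ExtractedJob {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const skills = v["skills"];
  return (
    typeof v["title"] === "string" &&
    typeof v["company"] === "string" &&
    typeof v["location"] === "string" &&
    typeof v["descriptionHtml"] === "string" &&
    Array.isArray(skills) &&
    skills.every((s) => typeof s === "string")
  );
}
```

Optional fields such as `salary` are left unchecked here because a posting may simply not list them.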
Privacy: Scraped data is stored locally in your browser. It leaves your browser only when you sync jobs to your tracker (the API Base URL set in Options) or use optional AI enrichment via the Gemini API.
