# Job Scraper with AI Filtering
This project enables you to scrape job postings from LinkedIn, Glassdoor, Indeed, and ZipRecruiter. The job descriptions are sent to an AI LLM to determine their suitability for you, and all jobs are organized for you in an Excel file.
## Features
- Multi-platform Job Scraping: Automatically scrape jobs from LinkedIn, Glassdoor, Indeed, and ZipRecruiter.
- AI-based Filtering: The scraped job descriptions are sent to an AI model for evaluation based on your predefined criteria.
- Duplicate Prevention: Built-in mechanism to prevent sending the same job to the AI more than once.
- Simple and Adjustable: The code is straightforward and easy to modify to suit your needs.
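The duplicate-prevention idea from the feature list can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the `seen_jobs.json` file name and the use of the job URL as the deduplication key are assumptions.

```python
# Minimal sketch of duplicate prevention: keep a set of job URLs already
# sent to the AI, persisted to a local file between runs.
# NOTE: file name and URL-as-key are illustrative assumptions.
import json
from pathlib import Path

SEEN_FILE = Path("seen_jobs.json")

def load_seen() -> set:
    """Load the set of already-processed job URLs, if any."""
    if SEEN_FILE.exists():
        return set(json.loads(SEEN_FILE.read_text(encoding="utf-8")))
    return set()

def filter_new_jobs(jobs: list[dict], seen: set) -> list[dict]:
    """Return only jobs not processed before, and persist the updated set."""
    fresh = [job for job in jobs if job["url"] not in seen]
    seen.update(job["url"] for job in fresh)
    SEEN_FILE.write_text(json.dumps(sorted(seen)), encoding="utf-8")
    return fresh
```

Jobs that survive this filter are the only ones that need to be sent to the AI, which keeps API costs down on repeated runs.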
## Installation

1. Install the required packages (tested with Python 3.12):
   `pip install -r requirements.txt`
2. Obtain your OpenAI API key.
3. Create a `.env` file:
   - Use the `.env-example` file provided as a template.
   - Replace the placeholders with your actual values.
4. Write instructions for the AI:
   - Write your criteria or preferences for the AI in the `instructions.txt` file.
   - An example is provided in the `instructions-example.txt` file.
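The `.env-example` file in the repository defines the actual variable names; as a rough illustration only (both names below are assumptions, not confirmed by the project), a configured `.env` might look like:

```
# Hypothetical .env layout — check .env-example for the real variable names.
OPENAI_API_KEY=sk-...
ASSISTANT_ID=...
```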
## Usage

1. Run the scraping and AI filtering process:
   `python jobs.py`
2. Output: the results will be saved in a `jobs.xlsx` file in the project directory.
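As a rough sketch of how scraped jobs could end up in `jobs.xlsx`, the snippet below writes a list of job records to Excel with pandas. The column names are assumptions for illustration, not the project's actual schema, and `DataFrame.to_excel` additionally requires the `openpyxl` package.

```python
# Illustrative sketch: organize job records into an Excel file.
# Column names are assumptions; the project's real schema may differ.
import pandas as pd

jobs = [
    {"title": "Data Engineer", "company": "Acme", "site": "LinkedIn",
     "url": "https://example.com/job/1", "ai_verdict": "suitable"},
    {"title": "QA Analyst", "company": "Globex", "site": "Indeed",
     "url": "https://example.com/job/2", "ai_verdict": "not suitable"},
]

df = pd.DataFrame(jobs)
df.to_excel("jobs.xlsx", index=False)  # requires openpyxl to be installed
```

One row per job keeps the spreadsheet easy to sort and filter by the AI's verdict column.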
## Notes

- Make sure your `.env` and `instructions.txt` files are properly configured before running the script.
- You can reuse the AI assistant across runs: print the assistant instance ID in the main function on the first run, then configure it in the `.env` file for subsequent runs.
- LinkedIn typically blocks scraping after a certain period. Try running it again, or adjust the `scrape_and_filter_ai` function by modifying the offset or the loop. You can also save the results to a temporary file to avoid losing data if the program gets stuck.
- The project is designed to be easily customizable, so feel free to adjust the scraping and filtering logic as needed.
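The temporary-file suggestion in the notes can be implemented as a simple checkpoint: append each processed job to a JSON Lines file as you go, so a crash or a LinkedIn block does not lose earlier results. This is a sketch under assumptions — the file name and record shape are illustrative, not part of the project.

```python
# Sketch of crash-safe checkpointing: one JSON record per line, appended
# immediately after each job is processed.
# NOTE: file name and record shape are illustrative assumptions.
import json
from pathlib import Path

CHECKPOINT = Path("jobs_partial.jsonl")

def checkpoint_job(job: dict) -> None:
    """Append one processed job to the checkpoint file."""
    with CHECKPOINT.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(job) + "\n")

def load_checkpoint() -> list[dict]:
    """Recover all jobs saved so far (e.g. after a crash or a block)."""
    if not CHECKPOINT.exists():
        return []
    lines = CHECKPOINT.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines]
```

Appending line by line (rather than rewriting one big file) means every completed job is on disk the moment it finishes.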
## Support
If you found this project helpful, please consider giving it a ⭐ on GitHub. Your support is much appreciated!
If you'd like to buy me a coffee, you can do so here. ☕
