LinkedinScraper
A Python Scrapy project that parses people's profiles from LinkedIn search results and exports the content to Excel and JSON files
Install / Use
LinkedIn Scraper using Scrapy

- Scrape a chosen number of profiles from the results of a LinkedIn search URL.
- Export the scraped profile content to Excel and JSON files.
Installation
- Clone the project:
git clone https://github.com/khaleddallah/GoogleImageScrapyDownloader.git
- Use the package manager pip to install the requirements (Anaconda recommended):
cd LinkedinScraperProject
pip install -r requirements.txt
Usage
- Get into the project directory:
cd LinkedinScraperProject
- To get help:
python LinkedinScraper -h
<pre>
<b>usage:</b>
python LinkedinScraper [-h] [-n NUM] [-o OUTPUT] [-p] [-f FORMAT] [-m EXCELMODE] (searchUrl or profilesUrl)
<b>positional arguments:</b>
  searchUrl     URL of a LinkedIn search, or one or more profile URLs
<b>optional arguments:</b>
  -h, --help    show this help message and exit
  -n NUM        number of profiles to parse
                ** must be less than or equal to the number of search results
                'page' parses the profiles of one results page (10 profiles) (default)
  -o OUTPUT     output file name
  -p            enable parsing of single profile URLs
  -f FORMAT     json   JSON output file
                excel  Excel output file
                all    both JSON and Excel output files
  -m EXCELMODE  1  put each profile on one row of the Excel file
                m  spread each profile across multiple rows of the Excel file
</pre>
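The two Excel modes differ only in how a profile's fields are laid out. As a rough illustration of the one-row mode (<b>-m 1</b>), each profile can be flattened into a single spreadsheet row; the field names below are assumptions for illustration, not the scraper's actual schema:

```python
# One-row mode (-m 1): flatten each profile dict into a single row,
# one cell per column. Missing fields become empty cells.
# Column names here are illustrative; the real scraper defines its own.
def profile_to_row(profile, columns):
    return [profile.get(col, "") for col in columns]

columns = ["name", "headline", "location"]
row = profile_to_row({"name": "Ada", "headline": "Engineer"}, columns)
# row == ["Ada", "Engineer", ""]
```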
Examples
- Parse the <b>(</b> https://www.linkedin.com/in/khaled-dallah/ and https://www.linkedin.com/in/linustorvalds/ <b>) profiles</b> and export the result content to <b>ABC.xlsx</b> and <b>ABC.json</b>. <br><b>-p</b> is required because these are single profile URLs, not a search URL.
python LinkedinScraper -p -o 'ABC' 'https://www.linkedin.com/in/khaled-dallah/' 'https://www.linkedin.com/in/linustorvalds/'
- Parse <b>23</b> profiles from the search URL https://www.linkedin.com/.../?keywords=Robotic&...& <br>If you don't set an output name with <b>-o</b>, the result files are named after the value of keywords (<b>Robotic</b>).
python LinkedinScraper -n 23 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
- Parse <b>17</b> profiles from the search URL https://www.linkedin.com/.../?keywords=Robotic&...& <br>export the result as an <b>Excel</b> file and put the information of each profile in <b>one row</b>.
python LinkedinScraper -n 17 -f excel -m 1 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
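Once a run finishes, the exported JSON can be post-processed with a few lines of Python. A minimal sketch, assuming the default output layout (a JSON array of profile objects); the file name and field names are assumptions taken from the examples above, so adjust them to your own output:

```python
import json

def load_profiles(path):
    """Load profiles exported with -f json (or -f all) as a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Example: the second run above writes its results to Robotic.json
# profiles = load_profiles("Robotic.json")
# for p in profiles:
#     print(p.get("name"))  # key names depend on the scraper's schema
```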
Built with
- Python 3.7
- Scrapy
- openpyxl
Author
- Khaled Dallah - Software Engineer | Python/C++ Developer
khaled.dallah0@gmail.com
Issues
Report bugs and feature requests here.
Contribute
Contributions are always welcome!
License
This project is licensed under the LGPL-3.0 License - see the LICENSE.md file for details