LinkedinScraper
A Python Scrapy project that parses people's profiles from LinkedIn search results and exports the content to Excel and JSON files
Install / Use
LinkedIn Scraper using Scrapy

- Scrape a chosen number of profiles from the results of a LinkedIn search URL.
- Export the scraped profile content to Excel and JSON files.
Installation
- Clone the project:
git clone https://github.com/khaleddallah/GoogleImageScrapyDownloader.git
- Use the package manager pip to install the requirements (Anaconda recommended):
cd LinkedinScraperProject
pip install -r requirements.txt
Usage
- Get into the project directory:
cd LinkedinScraperProject
- To get help:
python LinkedinScraper -h
<pre>
<b>usage:</b>
python LinkedinScraper [-h] [-n NUM] [-o OUTPUT] [-p] [-f FORMAT] [-m EXCELMODE] (searchUrl or profilesUrl)
<b>positional arguments:</b>
  searchUrl     URL of a LinkedIn search, or one or more profile URLs
<b>optional arguments:</b>
  -h, --help    show this help message and exit
  -n NUM        number of profiles to parse
                ** must be less than or equal to the number of search results
                'page' parses the profiles of one results page (10 profiles) (default)
  -o OUTPUT     output file name
  -p            enable parsing of single profile URLs
  -f FORMAT     json   JSON output file
                excel  Excel output file
                all    both JSON and Excel output files
  -m EXCELMODE  1  put each profile on one row of the Excel file
                m  spread each profile across multiple rows of the Excel file
</pre>
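The two Excel modes differ only in how a profile's fields are laid out. As a rough illustration of the one-row mode (<b>-m 1</b>), each profile can be flattened into a single spreadsheet row; the field names below are assumptions for illustration, not the scraper's actual schema:

```python
# One-row mode (-m 1): flatten each profile dict into a single row,
# one cell per column. Missing fields become empty cells.
# Column names here are illustrative; the real scraper defines its own.
def profile_to_row(profile, columns):
    return [profile.get(col, "") for col in columns]

columns = ["name", "headline", "location"]
row = profile_to_row({"name": "Ada", "headline": "Engineer"}, columns)
# row == ["Ada", "Engineer", ""]
```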
Examples
- Parse the <b>(</b> https://www.linkedin.com/in/khaled-dallah/ and https://www.linkedin.com/in/linustorvalds/ <b>) profiles</b> and export the result content to <b>ABC.xlsx</b> and <b>ABC.json</b>. <br><b>-p</b> is required because these are single profile URLs, not a search URL.
python LinkedinScraper -p -o 'ABC' 'https://www.linkedin.com/in/khaled-dallah/' 'https://www.linkedin.com/in/linustorvalds/'
- Parse <b>23</b> profiles from the search URL https://www.linkedin.com/.../?keywords=Robotic&...& <br>If you don't set an output name with <b>-o</b>, the result files are named after the value of keywords (<b>Robotic</b>).
python LinkedinScraper -n 23 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
- Parse <b>17</b> profiles from the search URL https://www.linkedin.com/.../?keywords=Robotic&...& <br>export the result as an <b>Excel</b> file and put the information of each profile in <b>one row</b>.
python LinkedinScraper -n 17 -f excel -m 1 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
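Once a run finishes, the exported JSON can be post-processed with a few lines of Python. A minimal sketch, assuming the default output layout (a JSON array of profile objects); the file name and field names are assumptions taken from the examples above, so adjust them to your own output:

```python
import json

def load_profiles(path):
    """Load profiles exported with -f json (or -f all) as a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Example: the second run above writes its results to Robotic.json
# profiles = load_profiles("Robotic.json")
# for p in profiles:
#     print(p.get("name"))  # key names depend on the scraper's schema
```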
Built with
- Python 3.7
- Scrapy
- openpyxl
Author
- Khaled Dallah - Software Engineer | Python/C++ Developer
khaled.dallah0@gmail.com
Issues
Report bugs and feature requests here.
Contribute
Contributions are always welcome!
License
This project is licensed under the LGPL-3.0 License - see the LICENSE.md file for details