EdaSQL

edaSQL is a Python library that bridges SQL and Exploratory Data Analysis (EDA): you connect to a database, run your queries, and pass the query results to the EDA tool to gain further insight into the data.


<p align="center"> <img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/sql_logo_smaller.png" width="70%" height="70%" > <br><br> </p>

<img src="https://img.shields.io/pypi/v/edaSQL"> <img src="https://img.shields.io/readthedocs/edasql"> <img src="https://img.shields.io/static/v1?label=license&message=MIT&color=green"> <img src="https://img.shields.io/pypi/wheel/edaSQL"> <img src = "https://img.shields.io/pypi/pyversions/edaSQL"> <img src = "https://img.shields.io/github/commit-activity/w/selva221724/edaSQL"> <img src = "https://img.shields.io/github/languages/code-size/selva221724/edaSQL">

SQL Bridge Tool to Exploratory Data Analysis

edaSQL is a library that links SQL to Exploratory Data Analysis and, beyond that, to data engineering work. It is intended to address limitations of the SQL studios currently available. Use the SQL query language to get your table results.

Installation

Install the dependency packages before installing edaSQL:

pip install pyodbc
pip install ipython

Optional dependency for better visualization - Jupyter Notebook

pip install notebook

Now install edaSQL using pip. Official Python package here!

pip install edaSQL

(OR)

Clone this repository, then run the following from the root directory to install:

python setup.py install

Documentation

<img src="https://blog.readthedocs.com/_static/logo-opengraph.png" width="20%" height="20%">

Read the detailed documentation on readthedocs.io (still under development).

License

edaSQL is released under the MIT license.

Need help?

Stuck on your edaSQL code or problem? Any other questions? Don't hesitate to send me an email (selva221724@gmail.com).

edaSQL Jupyter NoteBook Tutorial

Access the sample Jupyter Notebook here!!

Access the Sample Data Used in this Repo

edaSQL for DataFrame: if your source is a CSV or Excel file, read it with pandas and start from step 3, Data Overview.
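A minimal sketch of the DataFrame route, assuming a small CSV source. The column names and values are hypothetical placeholders, and an in-memory string stands in for a real file path. For Excel files, `pd.read_excel` plays the same role (it needs an engine such as openpyxl installed).

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """EmpNumber,Age,Salary
E1001,32,52000
E1002,47,
E1003,29,61000
"""

# pd.read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)         # rows and columns that were read
print(data.isna().sum())  # missing values per column

# With edaSQL installed, this frame can be handed to the EDA class
# as in step 3: edaSQL.EDA(dataFrame=data)
```

No database connection is needed on this route, so steps 1 and 2 can be skipped.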

Import Packages

import edaSQL
import pandas as pd

1. Connect to the DataBase

edasql = edaSQL.SQL()
edasql.connectToDataBase(server='your server name', 
                         database='your database', 
                         user='username', 
                         password='password',
                         sqlDriver='ODBC Driver 17 for SQL Server')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/db_connected.png">
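For orientation, the five settings above map naturally onto an ODBC connection string of the kind `pyodbc.connect` accepts. The sketch below only builds that string. The helper name `build_connection_string` is hypothetical, and whether edaSQL assembles its connection in exactly this way is an assumption, not something this README states.

```python
def build_connection_string(server, database, user, password,
                            sql_driver="ODBC Driver 17 for SQL Server"):
    """Assemble an ODBC connection string from the five settings (hypothetical helper)."""
    parts = {
        "DRIVER": "{" + sql_driver + "}",  # the driver name is wrapped in braces
        "SERVER": server,
        "DATABASE": database,
        "UID": user,
        "PWD": password,
    }
    # Dictionaries keep insertion order, so the keywords appear as listed above.
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = build_connection_string(
    server="your server name",
    database="your database",
    user="username",
    password="password",
)
print(conn_str)

# With pyodbc installed and a reachable server, the string would be used as:
#   connection = pyodbc.connect(conn_str)
```

The driver name must match an ODBC driver that is actually installed on the machine, otherwise the connection attempt fails.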

2. Query Data

sampleQuery = "select  * from INX"
data = pd.read_sql(sampleQuery, edasql.dbConnection)
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/data_sample.png"> <div id="Chapter1"></div>
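To rehearse the query step without a SQL Server instance, the same `pd.read_sql` call can be tried against an in-memory SQLite database. This is only a stand-in sketch: the `INX` columns and rows below are hypothetical, and the SQLite connection takes the place of `edasql.dbConnection`.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the SQL Server connection.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE INX (EmpNumber TEXT, Age INTEGER, Salary REAL)")
connection.executemany(
    "INSERT INTO INX VALUES (?, ?, ?)",
    [("E1001", 32, 52000.0), ("E1002", 47, None), ("E1003", 29, 61000.0)],
)
connection.commit()

# Same call shape as above; the filter value is bound as a parameter
# instead of being pasted into the SQL text.
sampleQuery = "SELECT * FROM INX WHERE Age > ? ORDER BY EmpNumber"
data = pd.read_sql(sampleQuery, connection, params=(30,))
connection.close()

print(data)
```

Binding values through `params` keeps user-supplied input out of the query string. Note that the placeholder style (`?` here) depends on the database driver.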

3. Data Overview

insights = edaSQL.EDA(dataFrame=data, HTMLDisplay=True)
dataInsights = insights.dataInsights()
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/1.png">
deepInsights = insights.deepInsights()
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/2.png">

4. Correlation

eda = edaSQL.EDA(dataFrame=data)
eda.pearsonCorrelation()
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/3.png">
eda.spearmanCorrelation()
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/4.png">
eda.kendallCorrelation()
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/5.png">
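As a reminder of what the three coefficients measure, here is a dependency-free sketch for a single pair of columns. It is not edaSQL's implementation: the function names and sample values are hypothetical, and the rank-based versions assume there are no tied values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Linear correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks 1..n in ascending order; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs; assumes no ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)


# Hypothetical columns: salary rises with age, but not linearly.
age = [25, 32, 41, 47, 53]
salary = [30, 45, 50, 80, 200]

print(round(pearson(age, salary), 3))
print(spearman(age, salary))
print(kendall(age, salary))
```

On this sample the two rank-based coefficients reach 1.0 because the relationship is perfectly monotonic, while Pearson stays below 1.0 because it is not linear. That difference is the usual reason to compare the three.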

5. Missing Values

eda.missingValuesPlot(plot='matrix')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/6.png">
eda.missingValuesPlot(plot='bar')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/7.png">
eda.missingValuesPlot(plot='heatmap')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/8.png">
eda.missingValuesPlot(plot='dendrogram')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/9.png">
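The plots above are all views of the same underlying information: which cells are empty. The dependency-free sketch below computes that information by hand for a few hypothetical rows. It illustrates the idea only and makes no claim about how edaSQL derives its plots.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"EmpNumber": "E1001", "Age": 32, "Salary": 52000},
    {"EmpNumber": "E1002", "Age": 47, "Salary": None},
    {"EmpNumber": "E1003", "Age": None, "Salary": None},
]
columns = list(rows[0])

# Nullity matrix: True where a cell is missing, one inner list per row.
nullity = [[row[col] is None for col in columns] for row in rows]

# Missing-value tally per column, the kind of summary a bar-style view reports.
missing_per_column = {
    col: sum(flags[i] for flags in nullity) for i, col in enumerate(columns)
}

# Rows where Age and Salary are missing together, the kind of co-occurrence
# that heatmap- and dendrogram-style views are meant to surface.
both_missing = sum(
    1 for row in rows if row["Age"] is None and row["Salary"] is None
)

print(missing_per_column)
print(both_missing)
```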

6. Outliers

eda.outliersVisualization(plot='box')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/10.png">
eda.outliersVisualization(plot='scatter')
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/11.png">
outliers = eda.getOutliers()
<img src="https://raw.githubusercontent.com/selva221724/edaSQL/main/readme_src/notebook_results/12.png">
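This README does not say which rule `getOutliers()` applies. A common choice, and the one a box plot draws, is the 1.5 × IQR rule, so the sketch below assumes that rule for a single numeric column. The function name and sample values are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical salary column (in thousands) with one extreme value.
salaries = [48, 50, 52, 53, 55, 57, 60, 150]
print(iqr_outliers(salaries))
```

Raising `k` (for example to 3.0) flags only the more extreme points. Different quartile conventions can shift the fences slightly, so results may differ from other tools on small samples.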
