SkillAgentSearch skills...

SQLQueryLineage

SQLQueryLineage is a tool that parses SQL stored procedures and generates column-level lineage in JSON format. The tool extracts information from the stored procedures and outputs a JSON file that contains information about the columns and their dependencies.

Install / Use

/learn @SimenNielsen/SQLQueryLineage
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

SQL Query Lineage

.NET Workflow

This project is a .NET console application that parses MS SQL stored procedures and generates column-level lineage in JSON format. The tool extracts information from the stored procedures and outputs a JSON file that contains information about the columns and their dependencies.

Be sure to check out the demo.

Table of Contents

Installation

Download that latest release from Releases.

Usage

Sample usage:

sqllineage parse --filepath="D:\dev\sp_sample2.sql" --schema=testschema --database=testdb --output-filepath="C:\test.json"

Will parse the file specified with the -file parameter and output the result to the file specified with the -output-file parameter. The -schema and -database parameters are optional and will default to dbo and master respectively.

The json contains a list of events found in a specified file. Event types are:

  • SELECT = 0
  • SELECT INTO = 1
  • INSERT = 2
  • UPDATE = 3
  • EXECUTE = 4
  • MERGE = 5

Some queries require a connection string to the query database, e.g. when the upstream is unclear when column alias is not mentioned. To allow connection to a specific database we need to set these 4 environment variables:

  • SOURCE_HOST
  • SOURCE_PORT
  • SOURCE_USER
  • SOURCE_PASS <br>

To set environment variables in Windows Powershell: $env:SOURCE_HOST="localhost".

Sample Output

The output file generated by the tool will contain a JSON array with information about each column and its dependencies. Given following query, here is the output json content.

Input

Using the following parameters: sqllineage parse --filepath="C:\Users\User\Documents\SQL\testing\sample_query.sql" --schema="dbo" --database="TestDB" --output-filepath="C:\Users\User\Documents\SQL\testing\output\out.json" sample_query.sql:

CREATE PROCEDURE get_order_info @order_id INT
AS
BEGIN
    SELECT o.id AS order_id,
           o.customer_id,
           c.name AS customer_name,
           c.email AS customer_email,
           o.date AS order_date
    FROM orders o
    INNER JOIN customers c ON o.customer_id = c.id
    WHERE o.id = @order_id
END

Output

{
  "ProcedureEvents": [
    {
      "Type": 1,
      "Columns": [
        {
          "name": "id",
          "alias": "order_id",
          "tableAlias": {
            "alias": "o",
            "tableName": "orders",
            "schemaName": "dbo",
            "databaseName": "testdb"
          },
          "logic": null,
          "sourceColumns": [],
          "upstreamReferenceAlias": "o"
        },
        {
          "name": "customer_id",
          "alias": "customer_id",
          "tableAlias": {
            "alias": "o",
            "tableName": "orders",
            "schemaName": "dbo",
            "databaseName": "testdb"
          },
          "logic": null,
          "sourceColumns": [],
          "upstreamReferenceAlias": "o"
        },
        {
          "name": "name",
          "alias": "customer_name",
          "tableAlias": {
            "alias": "c",
            "tableName": "customers",
            "schemaName": "dbo",
            "databaseName": "testdb"
          },
          "logic": null,
          "sourceColumns": [],
          "upstreamReferenceAlias": "c"
        },
        {
          "name": "email",
          "alias": "customer_email",
          "tableAlias": {
            "alias": "c",
            "tableName": "customers",
            "schemaName": "dbo",
            "databaseName": "testdb"
          },
          "logic": null,
          "sourceColumns": [],
          "upstreamReferenceAlias": "c"
        },
        {
          "name": "date",
          "alias": "order_date",
          "tableAlias": {
            "alias": "o",
            "tableName": "orders",
            "schemaName": "dbo",
            "databaseName": "testdb"
          },
          "logic": null,
          "sourceColumns": [],
          "upstreamReferenceAlias": "o"
        }
      ],
      "Target": null
    }
  ]
}

Contributing

Contributions to this project are welcome. If you have any suggestions or would like to report a bug, please open an issue on the GitHub repository.

License

This project is licensed under the MIT License. See the LICENSE file for details.

View on GitHub
GitHub Stars5
CategoryData
Updated2y ago
Forks0

Languages

C#

Security Score

75/100

Audited on Mar 16, 2024

No findings