Ecommercetools

EcommerceTools is a Python data science toolkit for ecommerce, marketing science, and technical SEO analysis and modelling. It was created by Matt Clarke.

EcommerceTools is a data science toolkit for those working in ecommerce, marketing science, and technical SEO. It includes a wide range of features to aid analysis and model building. The package is written in Python, is designed to be used with Pandas, and works both in a Jupyter notebook environment and in standalone Python projects.

Installation

You can install EcommerceTools and its dependencies from PyPI by entering pip3 install ecommercetools in your terminal, or !pip3 install ecommercetools in a Jupyter notebook cell.
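To confirm that the install worked, one option is to look up the installed version from Python. The sketch below uses only the standard library; the helper name installed_version is hypothetical and is not part of EcommerceTools.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("ecommercetools"))
```

If this prints None in a notebook, the package was probably installed into a different Python environment from the one running the kernel.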


Modules


Transactions

  1. Load sample transaction items data

If you want to get started with the transactions, products, and customers features, you can use the load_sample_data() function to load a set of real-world data. This imports the transaction items from the widely used Online Retail dataset and reformats them ready for use by EcommerceTools.

from ecommercetools import utilities

transaction_items = utilities.load_sample_data()
transaction_items.head()
<table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>order_id</th> <th>sku</th> <th>description</th> <th>quantity</th> <th>order_date</th> <th>unit_price</th> <th>customer_id</th> <th>country</th> <th>line_price</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>536365</td> <td>85123A</td> <td>WHITE HANGING HEART T-LIGHT HOLDER</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>2.55</td> <td>17850.0</td> <td>United Kingdom</td> <td>15.30</td> </tr> <tr> <th>1</th> <td>536365</td> <td>71053</td> <td>WHITE METAL LANTERN</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>3.39</td> <td>17850.0</td> <td>United Kingdom</td> <td>20.34</td> </tr> <tr> <th>2</th> <td>536365</td> <td>84406B</td> <td>CREAM CUPID HEARTS COAT HANGER</td> <td>8</td> <td>2010-12-01 08:26:00</td> <td>2.75</td> <td>17850.0</td> <td>United Kingdom</td> <td>22.00</td> </tr> <tr> <th>3</th> <td>536365</td> <td>84029G</td> <td>KNITTED UNION FLAG HOT WATER BOTTLE</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>3.39</td> <td>17850.0</td> <td>United Kingdom</td> <td>20.34</td> </tr> <tr> <th>4</th> <td>536365</td> <td>84029E</td> <td>RED WOOLLY HOTTIE WHITE HEART.</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>3.39</td> <td>17850.0</td> <td>United Kingdom</td> <td>20.34</td> </tr> </tbody> </table>
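In the sample output above, line_price appears to be quantity multiplied by unit_price. The sketch below rebuilds the first three rows by hand and checks that relationship with Pandas. The rounding to two decimals is an assumption drawn from the displayed values, not a documented behaviour of load_sample_data().

```python
import pandas as pd

# The first three rows of the sample output shown above, typed in by hand.
sample = pd.DataFrame({
    "sku": ["85123A", "71053", "84406B"],
    "quantity": [6, 6, 8],
    "unit_price": [2.55, 3.39, 2.75],
    "line_price": [15.30, 20.34, 22.00],
})

# Assumption: line_price = quantity * unit_price, rounded to two decimals.
recomputed = (sample["quantity"] * sample["unit_price"]).round(2)

# Compare with a small tolerance to avoid floating-point noise.
matches = bool((recomputed - sample["line_price"]).abs().lt(0.005).all())
print(matches)
```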
  2. Create a transaction items dataframe

The utilities module includes a range of tools that allow you to format data, so it can be used within other EcommerceTools functions. The load_transaction_items() function is used to create a Pandas dataframe of formatted transactional item data. When loading your transaction items data, all you need to do is define the column mappings, and the function will reformat the dataframe accordingly.

import pandas as pd
from ecommercetools import utilities

transaction_items = utilities.load_transaction_items(
    'transaction_items_non_standard_names.csv',
    date_column='InvoiceDate',
    order_id_column='InvoiceNo',
    customer_id_column='CustomerID',
    sku_column='StockCode',
    quantity_column='Quantity',
    unit_price_column='UnitPrice',
)
transaction_items.to_csv('transaction_items.csv', index=False)
print(transaction_items.head())
<table> <thead> <tr style="text-align: right;"> <th></th> <th>order_id</th> <th>sku</th> <th>description</th> <th>quantity</th> <th>order_date</th> <th>unit_price</th> <th>customer_id</th> <th>country</th> <th>line_price</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>536365</td> <td>85123A</td> <td>WHITE HANGING HEART T-LIGHT HOLDER</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>2.55</td> <td>17850.0</td> <td>United Kingdom</td> <td>15.30</td> </tr> <tr> <th>1</th> <td>536365</td> <td>71053</td> <td>WHITE METAL LANTERN</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>3.39</td> <td>17850.0</td> <td>United Kingdom</td> <td>20.34</td> </tr> <tr> <th>2</th> <td>536365</td> <td>84406B</td> <td>CREAM CUPID HEARTS COAT HANGER</td> <td>8</td> <td>2010-12-01 08:26:00</td> <td>2.75</td> <td>17850.0</td> <td>United Kingdom</td> <td>22.00</td> </tr> <tr> <th>3</th> <td>536365</td> <td>84029G</td> <td>KNITTED UNION FLAG HOT WATER BOTTLE</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>3.39</td> <td>17850.0</td> <td>United Kingdom</td> <td>20.34</td> </tr> <tr> <th>4</th> <td>536365</td> <td>84029E</td> <td>RED WOOLLY HOTTIE WHITE HEART.</td> <td>6</td> <td>2010-12-01 08:26:00</td> <td>3.39</td> <td>17850.0</td> <td>United Kingdom</td> <td>20.34</td> </tr> </tbody> </table>
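To make the column mapping idea concrete, here is a minimal Pandas sketch of the kind of reformatting described above. It is not the library's implementation: the helper standardise_columns is hypothetical, and the line_price calculation is an assumption based on the output table.

```python
import pandas as pd


def standardise_columns(df, column_map):
    """Rename source columns to standard names and add a line_price column."""
    out = df.rename(columns=column_map)  # rename returns a new dataframe
    out["order_date"] = pd.to_datetime(out["order_date"])
    # Assumption: line_price is quantity * unit_price, rounded to two decimals.
    out["line_price"] = (out["quantity"] * out["unit_price"]).round(2)
    return out


# Two rows using the non-standard column names from the example above.
raw = pd.DataFrame({
    "InvoiceNo": [536365, 536365],
    "StockCode": ["85123A", "71053"],
    "Quantity": [6, 6],
    "InvoiceDate": ["2010-12-01 08:26:00", "2010-12-01 08:26:00"],
    "UnitPrice": [2.55, 3.39],
    "CustomerID": [17850.0, 17850.0],
})

column_map = {
    "InvoiceNo": "order_id",
    "StockCode": "sku",
    "Quantity": "quantity",
    "InvoiceDate": "order_date",
    "UnitPrice": "unit_price",
    "CustomerID": "customer_id",
}

formatted = standardise_columns(raw, column_map)
print(formatted.columns.tolist())
```

Keeping the mapping in a dictionary makes it easy to reuse the same helper for exports from different platforms, each with its own column names.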
  3. Create a transactions dataframe

The get_transactions() function takes the formatted Pandas dataframe of transaction items and returns a Pandas dataframe of aggregated transaction data, with one row per order. The result includes an order_number column, which counts each customer's orders in sequence.

import pandas as pd
from ecommercetools import transactions

# Load the formatted transaction items saved in the previous step
transaction_items = pd.read_csv('transaction_items.csv')

# Aggregate the item rows into one row per order
transactions_df = transactions.get_transactions(transaction_items)
transactions_df.to_csv('transactions.csv', index=False)
print(transactions_df.head())
<table> <thead> <tr style="text-align: right;"> <th></th> <th>order_id</th> <th>order_date</th> <th>customer_id</th> <th>skus</th> <th>items</th> <th>revenue</th> <th>replacement</th> <th>order_number</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>536365</td> <td>2010-12-01 08:26:00</td> <td>17850.0</td> <td>7</td> <td>40</td> <td>139.12</td> <td>0</td> <td>1</td> </tr> <tr> <th>1</th> <td>536366</td> <td>2010-12-01 08:28:00</td> <td>17850.0</td> <td>2</td> <td>12</td> <td>22.20</td> <td>0</td> <td>2</td> </tr> <tr> <th>2</th> <td>536367</td> <td>2010-12-01 08:34:00</td> <td>13047.0</td> <td>12</td> <td>83</td> <td>278.73</td> <td>0</td> <td>1</td> </tr> <tr> <th>3</th> <td>536368</td> <td>2010-12-01 08:34:00</td> <td>13047.0</td> <td>4</td> <td>15</td> <td>70.05</td> <td>0</td> <td>2</td> </tr> <tr> <th>4</th> <td>536369</td> <td>2010-12-01 08:35:00</td> <td>13047.0</td> <td>1</td> <td>3</td> <td>17.85</td> <td>0</td> <td>3</td> </tr> </tbody> </table>
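The aggregation above can be approximated in plain Pandas. This is a sketch under stated assumptions, not the library's code: the helper summarise_orders is hypothetical, the rows are toy data rather than rows from the Online Retail dataset, skus is assumed to be a count of distinct SKUs, and the replacement column is not reproduced.

```python
import pandas as pd


def summarise_orders(items):
    """Aggregate item rows into one row per order (illustrative only)."""
    orders = (
        items.groupby("order_id")
        .agg(
            order_date=("order_date", "min"),
            customer_id=("customer_id", "first"),
            skus=("sku", "nunique"),       # distinct products in the order
            items=("quantity", "sum"),     # total units in the order
            revenue=("line_price", "sum"),
        )
        .reset_index()
        .sort_values(["order_date", "order_id"])
        .reset_index(drop=True)
    )
    # Number each customer's orders in date order: 1, 2, 3, ...
    orders["order_number"] = orders.groupby("customer_id").cumcount() + 1
    return orders


# Toy rows for illustration only.
toy_items = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "order_date": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03"]
    ),
    "customer_id": [10, 10, 20, 10],
    "sku": ["A", "B", "A", "C"],
    "quantity": [2, 1, 5, 3],
    "line_price": [4.0, 3.5, 10.0, 9.0],
})

orders = summarise_orders(toy_items)
print(orders)
```

Customer 10 places orders 1 and 3, so those rows receive order_number values of 1 and 2, while customer 20's single order receives 1.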

Products

1. Get product data from transaction items

from ecommercetools import products

# Summarise the transaction items into one row per SKU
products_df = products.get_products(transaction_items)
products_df.head()
<table> <thead> <tr style="text-align: right;"> <th></th> <th>sku</th> <th>first_order_date</th> <th>last_order_date</th> <th>customers</th> <th>orders</th> <th>items</th> <th>revenue</th> <th>avg_unit_price</th> <th>avg_quantity</th> <th>avg_revenue</th> <th>avg_orders</th> <th>product_tenure</th> <th>product_recency</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>10002</td> <td>2010-12-01 08:45:00</td> <td>2011-04-28 15:05:00</td> <td>40</td> <td>73</td> <td>1037</td> <td>759.89</td> <td>1.056849</td> <td>14.205479</td> <td>10.409452</td> <td>1.82</td> <td>3749</td> <td>3600</td> </tr> <tr> <th>1</th> <td>10080</td> <td>2011-02-27 13:47:00</td> <td>2011-11-21 17:04:00</td> <td>19</td> <td>24</td> <td>495</td> <td>119.09</td> <td>0.376667</td> <td>20.625000</td> <td>4.962083</td> <td>1.26</td> <td>3660</td> <td>3393</td> </tr> <tr> <th>2</th> <td>10120</td> <td>2010-12-03 11:19:00</td> <td>2011-12-04 13:15:00</td> <td>25</td> <td>29</td> <td>193</td> <td>40.53</td> <td>0.210000</td> <td>6.433333</td> <td>1.351000</td> <td>1.16</td> <td>3746</td> <td>3380</td> </tr> <tr> <th>3</th> <td>10123C</td> <td>2010-12-03 11:19:00</td> <td>2011-07-15 15:05:00</td> <td>3</td> <td>4</td> <td>-13</td> <td>3.25</td> <td>0.487500</td> <td>-3.250000</td> <td>0.812500</td> <td>1.33</td> <td>3746</td> <td>3522</td> </tr> <tr> <th>4</th> <td>10123G</td> <td>2011-04-08 11:13:00</td> <td>2011-04-08 11:13:00</td> <td>0</td> <td>1</td> <td>-38</td> <td>0.00</td> <td>0.000000</td> </tr> </tbody> </table>
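Several of the product columns above can be approximated with a Pandas group-by on sku. The sketch below is not the library's implementation: the helper summarise_products is hypothetical, the rows are toy data, and the avg_orders formula (orders divided by customers, rounded to two decimals) is an assumption inferred from the displayed values.

```python
import pandas as pd


def summarise_products(items):
    """Aggregate item rows into one row per SKU (illustrative only)."""
    summary = (
        items.groupby("sku")
        .agg(
            first_order_date=("order_date", "min"),
            last_order_date=("order_date", "max"),
            customers=("customer_id", "nunique"),
            orders=("order_id", "nunique"),
            items=("quantity", "sum"),
            revenue=("line_price", "sum"),
        )
        .reset_index()
    )
    # Assumption: avg_orders is orders per distinct customer.
    # A SKU with zero known customers would produce inf here.
    summary["avg_orders"] = (summary["orders"] / summary["customers"]).round(2)
    return summary


# Toy rows for illustration only.
toy_items = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "order_date": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03"]
    ),
    "customer_id": [10, 10, 20, 10],
    "sku": ["A", "B", "A", "A"],
    "quantity": [2, 1, 5, 1],
    "line_price": [4.0, 3.5, 10.0, 2.0],
})

product_summary = summarise_products(toy_items)
print(product_summary)
```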