.. image:: https://raw.githubusercontent.com/mars-project/mars/master/docs/source/images/mars-logo-title.png

|PyPI version| |Docs| |Build| |Coverage| |Quality| |License|

Mars is a tensor-based unified framework for large-scale data computation that scales NumPy, pandas, scikit-learn, and many other libraries.

Documentation, Chinese documentation (中文文档)

Installation
------------

Install Mars from PyPI with pip:

.. code-block:: bash

    pip install pymars

Installation for Developers
~~~~~~~~~~~~~~~~~~~~~~~~~~~

To contribute code to Mars, install it in development mode by following the steps below:

.. code-block:: bash

    git clone https://github.com/mars-project/mars.git
    cd mars
    pip install -e ".[dev]"

More details about installing Mars can be found in the
`installation <https://mars-project.readthedocs.io/en/latest/installation/index.html>`_ section of
the Mars documentation.


Architecture Overview
---------------------

.. image:: https://raw.githubusercontent.com/mars-project/mars/master/docs/source/images/architecture.png


Getting Started
---------------

Start a new runtime locally with:

.. code-block:: python

    >>> import mars
    >>> mars.new_session()

Or connect to a Mars cluster that is already running:

.. code-block:: python

    >>> import mars
    >>> mars.new_session('http://<web_ip>:<ui_port>')


Mars Tensor
-----------

Mars tensor provides a familiar NumPy-like interface.

+-----------------------------------------------+-----------------------------------------------+
| **NumPy**                                     | **Mars tensor**                               |
+-----------------------------------------------+-----------------------------------------------+
|.. code-block:: python                         |.. code-block:: python                         |
|                                               |                                               |
|    import numpy as np                         |    import mars.tensor as mt                   |
|    N = 200_000_000                            |    N = 200_000_000                            |
|    a = np.random.uniform(-1, 1, size=(N, 2))  |    a = mt.random.uniform(-1, 1, size=(N, 2))  |
|    print((np.linalg.norm(a, axis=1) < 1)      |    print(((mt.linalg.norm(a, axis=1) < 1)     |
|          .sum() * 4 / N)                      |            .sum() * 4 / N).execute())         |
|                                               |                                               |
+-----------------------------------------------+-----------------------------------------------+
|.. code-block::                                |.. code-block::                                |
|                                               |                                               |
|    3.14174502                                 |     3.14161908                                |
|    CPU times: user 11.6 s, sys: 8.22 s,       |     CPU times: user 966 ms, sys: 544 ms,      |
|               total: 19.9 s                   |                total: 1.51 s                  |
|    Wall time: 22.5 s                          |     Wall time: 3.77 s                         |
|                                               |                                               |
+-----------------------------------------------+-----------------------------------------------+

Mars can leverage multiple cores, even on a laptop, and can be even faster in a distributed setting.
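
Both columns of the comparison estimate π by Monte Carlo: points are drawn uniformly from the square [-1, 1] × [-1, 1], and the fraction that lands inside the unit circle approaches π/4. The sketch below shows the same estimator using only the Python standard library. It is illustrative rather than part of Mars: ``estimate_pi`` is a hypothetical helper, and the sample size is far smaller than in the benchmark.

```python
import math
import random


def estimate_pi(num_points, seed=0):
    """Estimate pi from the share of random points inside the unit circle."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    inside = 0
    for _ in range(num_points):
        x = rng.uniform(-1, 1)
        y = rng.uniform(-1, 1)
        # Same test as `norm(a, axis=1) < 1` in the tensor example.
        if math.hypot(x, y) < 1:
            inside += 1
    # The circle covers pi/4 of the square, so scale the fraction by 4.
    return inside * 4 / num_points


if __name__ == "__main__":
    print(estimate_pi(200_000))
```

The error of this estimator shrinks roughly with the square root of the number of points, which is why the benchmark uses hundreds of millions of samples.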


Mars DataFrame
--------------

Mars DataFrame provides a familiar pandas-like interface.

+-----------------------------------------+-----------------------------------------+
| **pandas**                              | **Mars DataFrame**                      |
+-----------------------------------------+-----------------------------------------+
|.. code-block:: python                   |.. code-block:: python                   |
|                                         |                                         |
|    import numpy as np                   |    import mars.tensor as mt             |
|    import pandas as pd                  |    import mars.dataframe as md          |
|    df = pd.DataFrame(                   |    df = md.DataFrame(                   |
|        np.random.rand(100000000, 4),    |        mt.random.rand(100000000, 4),    |
|        columns=list('abcd'))            |        columns=list('abcd'))            |
|    print(df.sum())                      |    print(df.sum().execute())            |
|                                         |                                         |
+-----------------------------------------+-----------------------------------------+
|.. code-block::                          |.. code-block::                          |
|                                         |                                         |
|    CPU times: user 10.9 s, sys: 2.69 s, |    CPU times: user 1.21 s, sys: 212 ms, |
|               total: 13.6 s             |               total: 1.42 s             |
|    Wall time: 11 s                      |    Wall time: 2.75 s                    |
+-----------------------------------------+-----------------------------------------+


Mars Learn
----------

Mars learn provides a familiar interface like scikit-learn.

+---------------------------------------------+----------------------------------------------------+
| **Scikit-learn**                            | **Mars learn**                                     |
+---------------------------------------------+----------------------------------------------------+
|.. code-block:: python                       |.. code-block:: python                              |
|                                             |                                                    |
|    from sklearn.datasets import make_blobs  |    from mars.learn.datasets import make_blobs      |
|    from sklearn.decomposition import PCA    |    from mars.learn.decomposition import PCA        |
|    X, y = make_blobs(                       |    X, y = make_blobs(                              |
|        n_samples=100000000, n_features=3,   |        n_samples=100000000, n_features=3,          |
|        centers=[[3, 3, 3], [0, 0, 0],       |        centers=[[3, 3, 3], [0, 0, 0],              |
|                 [1, 1, 1], [2, 2, 2]],      |                  [1, 1, 1], [2, 2, 2]],            |
|        cluster_std=[0.2, 0.1, 0.2, 0.2],    |        cluster_std=[0.2, 0.1, 0.2, 0.2],           |
|        random_state=9)                      |        random_state=9)                             |
|    pca = PCA(n_components=3)                |    pca = PCA(n_components=3)                       |
|    pca.fit(X)                               |    pca.fit(X)                                      |
|    print(pca.explained_variance_ratio_)     |    print(pca.explained_variance_ratio_)            |
|    print(pca.explained_variance_)           |    print(pca.explained_variance_)                  |
|                                             |                                                    |
+---------------------------------------------+----------------------------------------------------+

Mars learn also integrates with many libraries:

- `TensorFlow <https://mars-project.readthedocs.io/en/latest/user_guide/learn/tensorflow.html>`_
- `PyTorch <https://mars-project.readthedocs.io/en/latest/user_guide/learn/pytorch.html>`_
- `XGBoost <https://mars-project.readthedocs.io/en/latest/user_guide/learn/xgboost.html>`_
- `LightGBM <https://mars-project.readthedocs.io/en/latest/user_guide/learn/lightgbm.html>`_
- `Joblib <https://mars-project.readthedocs.io/en/latest/user_guide/learn/joblib.html>`_
- `Statsmodels <https://mars-project.readthedocs.io/en/latest/user_guide/learn/statsmodels.html>`_

Mars Remote
-----------

Mars remote allows users to execute functions in parallel.
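
In plain Python terms this is a fan-out/fan-in pattern: independent calls are submitted without waiting on each other, and their results are gathered for a final reduction. The sketch below illustrates that pattern with the standard library only. It is not Mars code: ``ThreadPoolExecutor`` merely stands in for a runtime, ``count_inside`` and ``estimate_pi`` are hypothetical names, and threads do not speed up CPU-bound pure-Python work, so it shows the submission pattern rather than a performance gain.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_inside(n, seed):
    """Count how many of n random points fall inside the unit circle."""
    rng = random.Random(seed)  # one independent, seeded generator per chunk
    hits = 0
    for _ in range(n):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y < 1:
            hits += 1
    return hits


def estimate_pi(total, chunk, max_workers=4):
    """Fan out one task per chunk, then reduce the partial counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Fan-out: submit every chunk without waiting for earlier ones.
        futures = [pool.submit(count_inside, chunk, i)
                   for i in range(total // chunk)]
        # Fan-in: collect the partial counts and combine them.
        hits = sum(f.result() for f in futures)
    return hits * 4 / total


if __name__ == "__main__":
    print(estimate_pi(total=200_000, chunk=20_000))
```

The comparison that follows expresses the same computation with ``mars.remote``.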

+-------------------------------------------+--------------------------------------------+
| **Vanilla function calls**                | **Mars remote**                            |
+-------------------------------------------+--------------------------------------------+
|.. code-block:: python                     |.. code-block:: python                      |
|                                           |                                            |
|    import numpy as np                     |    import numpy as np                      |
|                                           |    import mars.remote as mr                |
|                                           |                                            |
|    def calc_chunk(n, i):                  |    def calc_chunk(n, i):                   |
|        rs = np.random.RandomState(i)      |        rs = np.random.RandomState(i)       |
|        a = rs.uniform(-1, 1, size=(n, 2)) |        a = rs.uniform(-1, 1, size=(n, 2))  |
|        d = np.linalg.norm(a, axis=1)      |        d = np.linalg.norm(a, axis=1)       |
|        return (d < 1).sum()               |        return (d < 1).sum()                |
|                                           |                                            |
|    def calc_pi(fs, N):                    |    def calc_pi(fs, N):                     |
|        return sum(fs) * 4 / N             |        return sum(fs) * 4 / N              |
|                                           |                                            |
|    N = 200_000_000                        |    N = 200_000_000                         |
|    n = 10_000_000                         |    n = 10_000_000                          |
|                                           |                                            |
|    fs = [calc_chunk(n, i)                 |    fs = [mr.spawn(calc_chunk, args=(n, i)) |
|          for i in range(N // n)]          |          for i in range(N // n)]           |
|    pi = calc_pi(fs, N)                    |    pi = mr.spawn(calc_
