LightCurvesClassifier

Package for machine learning of astronomical objects such as light curves

Light Curves Classifier

Introduction

The Light Curves Classifier is a Python package for classifying astronomical objects. Classification is based mainly on light curves (time series), but any other attribute of a star can be used as well. The package can be used for several tasks:

  • Download light curves from the implemented databases using a common query interface
  • Create a pipeline for extracting features from data
  • Train filters on a training sample
  • Run a systematic search using a filter to find new objects of interest
  • Show the distribution of objects of interest in a chosen feature space
  • Visualize the natural separation of data using unsupervised clustering

New filters, database connectors or classifiers can be easily implemented thanks to class interfaces (see the "Implementing new classes" section), although many are already included. The package can be used in several ways:

  • Using the package
  • Using Web Interface
  • Running the web interface locally via docker image
  • Via command line API

The easiest way to start is to use the Web Interface. There is also a "Guide" section with instructions on how to use the site. For more sophisticated tasks, however, it is better to use the package directly as a Python package. The package has been designed to be easy to develop further, so there are no limitations.

Release notes

Please note that the package is still in development.

19.04.2018: MR cli_fix:

  • CLI is now working
  • CLI tests

16.04.2018: MR python3_comp:

  • Package refactored to Python 3.6
  • CLI still needs to be refactored
  • Merged with the project for the web interface

Installation

Pypi

pip install lcc

The lcc entry point is also installed into PATH, so CLI commands are accessible from any directory. See the CLI part of the README below.

Docker

A Docker image with a running web interface can be launched with:

docker run -d -p 80:80 mavrix93/lcc_web

The website is then available at http://localhost/lcc. A default user admin with password nimda is created.

The Dockerfile is part of the git repo, so the image can be rebuilt if needed. It is also possible to use the Docker container as an environment for lcc: docker run -it mavrix93/lcc_web python.

Philosophy of the program

Let's say that one has data on objects of interest and would like to find more such objects in huge databases. No matter what these objects are and what they have in common, all we have to do is specify a few parameters and the program will do all the magic for us.

Workflow

Description of the stars

Stars can be described by many attributes, such as distance, temperature, coordinates, variance, dissimilarity from a template curve, color indexes etc. For a particular task these "properties of interest" have to be chosen. For example, to classify members of a star cluster one would use distance and coordinates as the values which describe particular stars. Another example is distinguishing variable stars from non-variable ones; for this task one could use the variance, or the slopes of linear functions fitted to the light curves (with reduced dimension).
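As a rough illustration of the two features mentioned for the variable-star example, the sketch below computes the variance of the magnitudes and the slope of a single least-squares line. It does not use the lcc API; the function names and the sample values are hypothetical, and fitting one line to the whole curve is a simplification of fitting a dimension-reduced light curve.

```python
from statistics import fmean, pvariance


def magnitude_variance(mags):
    """Population variance of the magnitudes."""
    return pvariance(mags)


def linear_slope(times, mags):
    """Slope a of the least-squares line mag = a * time + b."""
    t_mean = fmean(times)
    m_mean = fmean(mags)
    covariance = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, mags))
    spread = sum((t - t_mean) ** 2 for t in times)
    return covariance / spread


# Hypothetical light curve: magnitude increasing linearly with time.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
mags = [15.0, 15.1, 15.2, 15.3, 15.4]

# Each star becomes one point in the chosen feature space.
features = (magnitude_variance(mags), linear_slope(times, mags))
```

Each star is reduced to one tuple of numbers, which is the form a classifier can work with.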

Descriptors

Descriptors are objects/tools which obtain features for an inspected object from the given data. Example descriptors:

Curves Shape Descriptor

Light curves are transformed into words by SAX (Symbolic Aggregate approXimation) and compared to the template light curves. The dissimilarity between the two light curves is assigned as the feature to the inspected star.

[Figures: good_lc, bad_lc]
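The idea of turning a light curve into a word can be sketched as follows. This is a minimal, standard-library illustration of the usual SAX steps (z-normalization, piecewise averaging, discretization with equal-probability Gaussian breakpoints), not the package's own implementation. The function names are hypothetical, and the letter-gap distance is a simple stand-in for whatever dissimilarity measure the package actually uses.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def sax_word(values, word_length=4, alphabet="abcd"):
    """Convert a series into a SAX word: z-normalize, average chunks, discretize."""
    n = len(values)
    if n < word_length:
        raise ValueError("series must have at least word_length points")

    mean, std = fmean(values), pstdev(values)
    # A constant series has zero spread; map it to zeros to avoid dividing by zero.
    normalized = [(v - mean) / std if std > 0 else 0.0 for v in values]

    # Piecewise aggregate approximation: mean of each of word_length chunks.
    bounds = [(i * n) // word_length for i in range(word_length + 1)]
    segments = [fmean(normalized[bounds[i]:bounds[i + 1]])
                for i in range(word_length)]

    # Breakpoints splitting the standard normal distribution into equal-probability bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, s)] for s in segments)


def word_dissimilarity(word_a, word_b):
    """Sum of letter gaps between two equal-length words (assumes consecutive letters)."""
    return sum(abs(ord(a) - ord(b)) for a, b in zip(word_a, word_b))


# Hypothetical template (rising brightness) and inspected curve (falling brightness).
template_word = sax_word([1, 1, 2, 2, 3, 3, 4, 4])
inspected_word = sax_word([4, 4, 3, 3, 2, 2, 1, 1])
feature = word_dissimilarity(template_word, inspected_word)
```

A curve with the same shape as the template yields a dissimilarity of 0, while an opposite trend yields a large value, which is what makes the number usable as a feature.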

Histogram Shape Descriptor

Histograms of light curves are shifted to have mean magnitude 0 and scaled to have standard deviation 1. They are then transformed into words by SAX and compared to the template histograms. The dissimilarity between the two histograms is assigned as the feature to the inspected star.

[Figures: good_hist, bad_hist]
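The normalization step can be illustrated with a short sketch. It standardizes the magnitudes to mean 0 and standard deviation 1 before counting them into bins, so stars of different brightness and amplitude become comparable. The function name, the bin count and the value range are assumptions made for this example, not settings taken from the package.

```python
from statistics import fmean, pstdev


def normalized_histogram(mags, bins=5, value_range=(-2.5, 2.5)):
    """Fraction of standardized magnitudes falling into each bin."""
    mean, std = fmean(mags), pstdev(mags)
    if std == 0:
        raise ValueError("a constant light curve has no histogram shape")

    low, high = value_range
    width = (high - low) / bins
    counts = [0] * bins
    for mag in mags:
        z = (mag - mean) / std
        # Values outside the range are clamped into the edge bins.
        index = min(max(int((z - low) // width), 0), bins - 1)
        counts[index] += 1
    return [count / len(mags) for count in counts]


# Two hypothetical stars with different brightness and amplitude but the same shape.
faint_star = normalized_histogram([14.0, 15.0, 15.0, 16.0])
bright_star = normalized_histogram([10.0, 20.0, 20.0, 30.0])
```

Both stars produce the same histogram, so only the shape of the magnitude distribution is compared with the template.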

Variogram Shape Descriptor

The variogram is a time series which represents the variation of brightness at different time lags. It is also transformed into words by SAX and compared with the template variogram.

[Figures: good_vario, bad_vario]
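One common way to build such a series is to group all pairs of observations by their time lag and average the squared magnitude differences in each group. The sketch below does exactly that; it is an assumption about what "variation of brightness" means here, and the function name and lag bins are hypothetical rather than taken from the package.

```python
from itertools import combinations


def variogram(times, mags, lag_edges):
    """Mean squared magnitude difference of point pairs, grouped by time lag."""
    n_bins = len(lag_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (t1, m1), (t2, m2) in combinations(zip(times, mags), 2):
        lag = abs(t2 - t1)
        for i in range(n_bins):
            if lag_edges[i] <= lag < lag_edges[i + 1]:
                sums[i] += (m2 - m1) ** 2
                counts[i] += 1
                break
    # Bins without any pair are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical light curve alternating between two brightness levels.
times = [0.0, 1.0, 2.0, 3.0]
mags = [0.0, 1.0, 0.0, 1.0]
curve_variogram = variogram(times, mags, lag_edges=[0.5, 1.5, 2.5, 3.5])
```

For this alternating curve the variation is high at lags 1 and 3 and zero at lag 2, which reflects its period of two time units.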

Classifying

Data of "stars of interest" and some other contamination data can be used as a training sample. By choosing descriptive properties of stars we can transform all stars into parametric coordinates. These values can be used for training supervised machine learning methods, which are then able to decide whether an inspected star belongs to the searched group of stars.
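The training step can be pictured with a deliberately simple stand-in: a nearest-centroid classifier written with the standard library only. It is not one of the package's classifiers, and the function names, labels and feature values are hypothetical; it only shows how labelled feature coordinates turn into a decision for a new star.

```python
from math import dist
from statistics import fmean


def train_centroids(features, labels):
    """Return the mean feature vector of each class in the training sample."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: tuple(fmean(axis) for axis in zip(*vectors))
            for label, vectors in grouped.items()}


def classify(centroids, vector):
    """Assign the class whose centroid is closest to the given feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Hypothetical training sample: two features per star, e.g. two dissimilarities.
train_features = [(0.1, 0.2), (0.3, 0.0), (2.0, 1.9), (2.2, 2.1)]
train_labels = ["interest", "interest", "contamination", "contamination"]

centroids = train_centroids(train_features, train_labels)
decision = classify(centroids, (0.0, 0.1))
```

Any supervised method can replace the centroid rule; the input and output shape stays the same.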

Searching

There are many connectors to astronomical databases, such as OgleII, Kepler, Asas, Corot and Macho. All one needs to do is specify the queries for the selected database.

Systematic searches can be run with StarsSearcher, which executes sequentially, or with StarsSearcherRedis, which uses a redis queue (rq). The redis option requires a running redis server and rq worker:

$ redis-server
$ rq worker lcc

Installation

The package can be easily installed via pip:

pip install lcc

Package

Fundamental objects

The basic object for processing data is the "Star" object (lcc.entities.star.Star). It carries all available information about a particular astronomical body. The main attributes are:

ident : dict
    Dictionary of identifiers of the star. Each key of the dict
    is the name of a database and its value is another dict of database
    identifiers for the star (e.g. 'name') which can be used
    as a unique identifier for querying the star. For example:
        ident = {"OgleII" : {"name" : "LMC_SC1_1",
                             "db_ident" : {"field_num" : 1,
                                           "starid" : 1,
                                           "target" : "lmc"},
                                           ...}
    Please keep the convention shown above. A star can
    be queried again automatically if the ident key is the name of
    a database connector and it contains a dictionary called
    "db_ident". This dictionary contains the unique query for
    the star in the database.

name : str
    Optional name of the star across all databases

coo : astropy.coordinates.sky_coordinate.SkyCoord
    Coordinate of the star

more : dict
    Additional information about the star in a dictionary. This
    attribute can be considered a container. These parameters
    can then be used for filtering. For example it can contain
    color indexes:
        more = { "b_mag" : 17.56, "v_mag" : 16.23 }

star_class : str
    Name of category of the star e.g. 'cepheid', 'RR Lyrae', etc.

light_curves : list
    Light curve objects of the star


"Star" objects are the standard input/output of all methods working with star-like data. This unification makes the whole package compatible with any kind of data (it does not even have to be star data). Stars can be loaded from dat or fits files (the first extension contains metadata and the second, binary extension contains the light curve). They can also be downloaded using database connectors or created manually.

Creating a Star object manually and exporting to FITS

import numpy as np

from lcc.entities.star import Star
from lcc.utils.stars import saveStars

## Preparation of data of the star
# Name of the star
star_name = "LMC_SC_1_1"

# Identifier of the star (names of the same object in different databases)
# In our example no counterpart in other catalogs is known, so just one entry is saved
# The "db_ident" key is a query dict which can be used to query the object in a particular database
ident = {"OgleII" : {"name" : "LMC_SC_1_1",
                     "db_ident" : {"field_num" : 1,
                                   "starid" : 1,
                                   "target" : "lmc"}}}

# Coordinates of the star in degrees. It can also be an astropy SkyCoord object
coordinates = (83.2372045, -70.55790)

# All other information about the object
# These values are just demonstrative (not real)
other_info = {"b_mag" : 14.28,
              "i_mag" : 13.54,
              "mass_sun" : 1.12,
              "distance_pc" : 346.12,
              "period_days" : 16.57}

# Light curve created from 3 arrays (lists or other iterables)
time = np.linspace(1, 200, 20)
mag = np.sin(time)
error = np.random.random_sample(20)

# Create Star object
star = Star(name=star_name, ident=ident, coo=coordinates, more=other_info)

# Put light curve into the star object
star