LightCurvesClassifier
Package for machine learning of astronomical objects such as light curves
Light Curves Classifier
Introduction
The Light Curve Classifier is a Python package for classifying astronomical objects. Classification is based mainly on light curves (time series), but any other attribute of the stars can be used as well. The package can be used for several tasks:
- Download light curves from the implemented databases using a common query interface
- Create a pipeline for extracting features from data
- Train filters on a training sample
- Run a systematic search, using a filter to find new objects of interest
- Show the distribution of objects of interest in a chosen feature space
- Visualize the natural separation of data using unsupervised clustering
New filters, database connectors, and classifiers can be easily implemented thanks to class interfaces (see the "Implementing new classes" section), although many are already included. The package can be used in two ways:
- Using the package
- Using Web Interface
- Running the web interface locally via docker image
- Via command line API
The easiest way to start is to use the Web Interface. Its "Guide" section contains instructions on how to use the site. For more sophisticated tasks, however, it is better to use the package directly from Python. The package has been designed to be easy to extend, so there are no limitations.
Release notes
Please note that the package is still in development.
19.04.2018: MR cli_fix:
- CLI is now working
- CLI tests
16.04.2018: MR python3_comp:
- Package refactored to Python 3.6
- CLI still needs to be refactored
- Merged with project for web interface
Installation
Pypi
pip install lcc
The lcc entry point is also installed into PATH, so CLI commands are accessible from any directory.
See the CLI part of the README below.
Docker
A Docker image with the running web interface can be launched by:
docker run -d -p 80:80 mavrix93/lcc_web
The website is then available at http://localhost/lcc. A default user admin with password nimda is created.
The Dockerfile is part of the git repo, so the image can be rebuilt if needed. It is also possible to use the Docker container as
an environment for lcc: docker run -it mavrix93/lcc_web python.
Philosophy of the program
Let's say that one has data for objects of interest and would like to find more such objects in huge databases. No matter what these objects are or what they have in common, all we have to do is specify a few parameters and the program will do all the magic for us.

Description of the stars
Stars can be described by many attributes, such as distance, temperature, coordinates, variance, dissimilarity from a template curve, color indexes, etc. For a particular task, the relevant "properties of interest" have to be chosen. For example, to classify members of a star cluster one would use distance and coordinates as the values that describe each star. Another example is distinguishing variable stars from non-variable ones: for this task one could use the variance, or the slopes of linear functions fitted to light curves (with reduced dimension).
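As a rough illustration of the second example, the sketch below computes a variance and a least-squares slope for a light curve given as plain lists. The helper names linear_slope and simple_features are hypothetical and are not part of lcc; the package's own descriptors should be used for real work.

```python
from statistics import mean, pvariance


def linear_slope(times, mags):
    """Least-squares slope of magnitude against time."""
    t_mean, m_mean = mean(times), mean(mags)
    numerator = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, mags))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator


def simple_features(times, mags):
    """Hypothetical helper: two features that separate variable from constant stars."""
    return {"variance": pvariance(mags), "slope": linear_slope(times, mags)}


times = [0.0, 1.0, 2.0, 3.0, 4.0]
constant_star = [15.0, 15.0, 15.0, 15.0, 15.0]  # demonstrative values, not real data
fading_star = [15.0, 15.5, 16.0, 16.5, 17.0]    # demonstrative values, not real data

constant_features = simple_features(times, constant_star)
fading_features = simple_features(times, fading_star)
print(constant_features)
print(fading_features)
```

A constant star lands at the origin of this two-dimensional feature space, while the steadily fading one is displaced along both axes.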
Descriptors
Objects/tools which obtain features for an inspected object from the given data. Example descriptors:
Curves Shape Descriptor
Light curves are transformed into words by SAX (Symbolic Aggregate approXimation) and compared to the template light curves. The dissimilarity between the inspected light curve and the template is assigned as the feature to the inspected star.
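The idea of turning a light curve into a word can be sketched in a few lines. This is a minimal, generic SAX illustration with a four-letter alphabet, not the implementation used by lcc: the function names are hypothetical, and word_dissimilarity is a simple letter-index distance standing in for whatever dissimilarity measure the package applies.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Breakpoints that split the standard normal distribution into 4 equiprobable regions
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def sax_word(values, word_size):
    """Z-normalize, average over `word_size` segments (PAA), then map to letters."""
    if not 0 < word_size <= len(values):
        raise ValueError("word_size must be between 1 and len(values)")
    mu, sigma = mean(values), pstdev(values)
    normed = [0.0 if sigma == 0 else (v - mu) / sigma for v in values]
    n = len(normed)
    segments = [normed[i * n // word_size:(i + 1) * n // word_size]
                for i in range(word_size)]
    paa = [mean(segment) for segment in segments]
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, x)] for x in paa)


def word_dissimilarity(word_a, word_b):
    """Stand-in distance: sum of absolute letter-index differences."""
    return sum(abs(ALPHABET.index(a) - ALPHABET.index(b))
               for a, b in zip(word_a, word_b))


template_word = sax_word([0, 1, 2, 3, 4, 5, 6, 7], word_size=4)   # rising curve
inspected_word = sax_word([7, 6, 5, 4, 3, 2, 1, 0], word_size=4)  # falling curve
print(template_word, inspected_word, word_dissimilarity(template_word, inspected_word))
```

The resulting dissimilarity is a single number per star, which is what makes it usable as a coordinate in feature space.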

Histogram Shape Descriptor
Histograms of light curves are shifted to have mean magnitude 0 and scaled to have standard deviation 1. They are then transformed into words by SAX and compared to the template histograms. The dissimilarity between the inspected histogram and the template is assigned as the feature to the inspected star.
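The normalization step can be sketched as follows. This is an illustrative assumption about how such a histogram could be built, not lcc's own code; standardized_histogram, bins, and span are hypothetical names, and the later SAX step is left out.

```python
from statistics import mean, pstdev


def standardized_histogram(mags, bins=6, span=3.0):
    """Fraction of standardized magnitudes in `bins` equal-width bins over [-span, span]."""
    mu, sigma = mean(mags), pstdev(mags)
    if sigma == 0:
        raise ValueError("a constant light curve has no histogram shape")
    counts = [0] * bins
    width = 2 * span / bins
    for m in mags:
        z = (m - mu) / sigma  # mean 0, standard deviation 1
        index = int((z + span) / width)
        index = min(max(index, 0), bins - 1)  # clip outliers into the outermost bins
        counts[index] += 1
    return [count / len(mags) for count in counts]


bright_star = [14.0, 14.5, 15.0, 15.5, 16.0]   # demonstrative values, not real data
faint_star = [m + 3.0 for m in bright_star]    # same shape, 3 magnitudes fainter

bright_hist = standardized_histogram(bright_star)
faint_hist = standardized_histogram(faint_star)
print(bright_hist)
print(faint_hist)
```

Because of the shift to zero mean, the two stars produce the same histogram even though their brightness differs, so the descriptor responds to the shape of the distribution only.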

Variogram Shape Descriptor
A variogram is a time series that represents the variation of brightness at different time lags. It is also transformed into words by SAX and compared with a template variogram.
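One common way to estimate such a curve is to average the squared magnitude differences of all observation pairs that fall into the same lag bin. The sketch below assumes this definition; the function name and the lag_edges parameter are hypothetical, and lcc may bin or normalize differently.

```python
from itertools import combinations


def variogram(times, mags, lag_edges):
    """Mean squared magnitude difference per lag bin (None for empty bins)."""
    n_bins = len(lag_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (t1, m1), (t2, m2) in combinations(zip(times, mags), 2):
        lag = abs(t2 - t1)
        for i in range(n_bins):
            if lag_edges[i] <= lag < lag_edges[i + 1]:
                sums[i] += (m2 - m1) ** 2
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


times = [0.0, 1.0, 2.0, 3.0]
mags = [15.0, 16.0, 15.0, 16.0]  # demonstrative alternating signal, not real data
result = variogram(times, mags, lag_edges=[0.5, 1.5, 2.5, 3.5])
print(result)
```

For this alternating signal, the variation is large at lags of one and three time units and vanishes at a lag of two, which is the periodicity showing up in the variogram.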

Classifying
Data of "stars of interest" and some contamination data can be used as a training sample. By choosing descriptive properties of the stars, we can transform all stars into coordinates in a feature space. These values can be used for training supervised machine learning methods. Once trained, these methods can decide whether an inspected star belongs to the searched group of stars.
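To make the workflow concrete, here is a minimal sketch using a nearest-centroid rule as the supervised method. It is a deliberately simple stand-in, not one of the classifiers shipped with lcc; the function names, labels, and feature values are all hypothetical.

```python
from math import dist
from statistics import mean


def train_centroids(samples):
    """samples maps a label to a list of feature vectors; returns one centroid per label."""
    return {label: [mean(column) for column in zip(*vectors)]
            for label, vectors in samples.items()}


def classify(centroids, features):
    """Assign the label whose centroid is closest in feature space."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Each vector is (feature_1, feature_2), e.g. two descriptor outputs (made-up values)
train_sample = {
    "searched": [[0.1, 0.9], [0.2, 0.8]],
    "contamination": [[0.9, 0.1], [0.8, 0.3]],
}
centroids = train_centroids(train_sample)

print(classify(centroids, [0.15, 0.85]))
print(classify(centroids, [0.7, 0.2]))
```

Any method that learns a decision boundary in the chosen feature space can take the place of the centroid rule here.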
Searching
There are many connectors to astronomical databases such as OgleII, Kepler, Asas, Corot and Macho. All one needs to do is specify the queries for the selected database.
For systematic searches one can use either StarsSearcher, which executes queries sequentially, or StarsSearcherRedis, which uses a Redis queue (rq).
The Redis option requires a running Redis server and an rq worker:
$ redis-server
$ rq worker lcc
Installation
The package can be easily installed via pip:
pip install lcc
Package
Fundamental objects
The basic object for processing data is the "Star" object (lcc.entities.star.Star). It carries all available information about a particular astronomical body. The main attributes are:
ident : dict
Dictionary of identifiers of the star. Each key of the dict
is the name of a database and its value is another dict of database
identifiers for the star (e.g. 'name') which can be used
as a unique identifier for querying the star. For example:
ident = {"OgleII" : {"name" : "LMC_SC1_1",
"db_ident" : {"field_num" : 1,
"starid" : 1,
"target" : "lmc"},
...}
Please keep the convention shown above. A star can
be queried again automatically if the ident key is the name of
a database connector and its value contains a dictionary called
"db_ident". This dictionary contains the unique query for
the star in the database.
name : str
Optional name of the star across all databases
coo : astropy.coordinates.sky_coordinate.SkyCoord
Coordinate of the star
more : dict
Additional information about the star in a dictionary. This
attribute can be considered a container. These parameters
can then be used for filtering. For example, it can contain
color indexes:
more = { "b_mag" : 17.56, "v_mag" : 16.23 }
star_class : str
Name of category of the star e.g. 'cepheid', 'RR Lyrae', etc.
light_curves : list
Light curve objects of the star
The "Star" object is the standard input/output of all methods working with star-like data. This unification makes the whole package compatible with any kind of data (it does not even have to be star data). Star objects can be loaded from dat or fits files (the first extension contains metadata and the second, binary extension contains the light curve). They can also be downloaded using database connectors or created manually.
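The ident convention described above can be checked with a few lines of plain Python. The helper below is hypothetical and not part of lcc; it only illustrates which entries of an ident dictionary carry the "db_ident" query that allows a star to be queried again. The "Macho" entry is made up for the example.

```python
def requeryable_databases(ident):
    """Hypothetical helper: names of databases whose entry has a "db_ident" query dict."""
    return [database for database, entry in ident.items()
            if isinstance(entry.get("db_ident"), dict)]


ident = {
    "OgleII": {"name": "LMC_SC1_1",
               "db_ident": {"field_num": 1, "starid": 1, "target": "lmc"}},
    "Macho": {"name": "made_up_name"},  # no "db_ident", so it cannot be re-queried
}

databases = requeryable_databases(ident)
print(databases)
```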
Creating a Star object manually and exporting to FITS
import numpy as np
from lcc.entities.star import Star
from lcc.utils.stars import saveStars
## Preparation of data of the star
# Name of the star
star_name = "LMC_SC_1_1"
# Identifier of the star (names of the same object in different databases)
# In our example no counterpart in other catalogs is known, so just one entry is saved
# "db_ident" key is query dict which can be used to query the object in particular databases
ident = {"OgleII" : {"name" : "LMC_SC_1_1",
"db_ident" : {"field_num" : 1,
"starid" : 1,
"target" : "lmc"}}}
# Coordinates of the star in degrees. Also it can be astropy SkyCoord object
coordinates = (83.2372045, -70.55790)
# All other information about the object
# These values are just demonstrative (not real)
other_info = {"b_mag" : 14.28,
"i_mag" : 13.54,
"mass_sun" : 1.12,
"distance_pc" : 346.12,
"period_days" : 16.57}
# Light curve created from 3 arrays (lists or other iterables)
time = np.linspace(1, 200, 20)
mag = np.sin(time)
error = np.random.random_sample(20)
# Create Star object
star = Star(name=star_name, ident=ident, coo=coordinates, more=other_info)
# Put light curve into the star object
star
