Starparser

Manipulate and mine Relion .star files

starparser

Use this package to manipulate Relion star files, including counting, modifying, plotting, and sifting the data. At the very least, this is a useful alternative to awk commands, which can get awkward. Below is a description of the command-line options with some examples. Alternatively, use starparser within Relion or load the modules in your own Python scripts.

Some of the options below are already available in Relion with "relion_star_handler".

  1. Installation
  2. Command-line options
  3. Tips
  4. Limitations
  5. Relion GUI usage
  6. Scripting
  7. Examples
  8. License

Installation

  • Set up a fresh conda environment with Python >= 3.8: conda create -n sp python=3.8

  • Activate the environment: conda activate sp

  • Install starparser: pip install starparser

Command-line options

Usage

starparser [input] [options]

Typically, you just need to pass the star file with --i (e.g. starparser --i input.star), followed by the desired option and its arguments.

starparser --i particles.star --count
starparser --i particles.star --list_column OriginX

For some options, a second star file can also be passed as input with --f secondfile.star.

starparser --i particles1.star --find_shared MicrographName --f particles2.star
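
If you call the command line from a Python script, the commands above can be assembled as argument lists. This is a minimal sketch that uses only the flags shown above; the helper name starparser_cmd is hypothetical, and nothing is executed unless you pass the list to subprocess.run yourself.

```python
import shlex


def starparser_cmd(star_file, option, *args, second_file=None):
    """Build an argument list for the starparser CLI (hypothetical helper)."""
    cmd = ["starparser", "--i", star_file, option, *args]
    if second_file is not None:
        # Some options take a second star file through --f.
        cmd += ["--f", second_file]
    return cmd


count_cmd = starparser_cmd("particles.star", "--count")
shared_cmd = starparser_cmd(
    "particles1.star", "--find_shared", "MicrographName",
    second_file="particles2.star",
)

# Show the command as it would be typed in a terminal.
print(shlex.join(shared_cmd))

# To actually run it (requires starparser to be installed):
# import subprocess; subprocess.run(shared_cmd, check=True)
```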

The options are organized into Data Mining, Modifications, and Plots. Arguments that are not required are surrounded by parentheses in the descriptions below. Do not include the parentheses in your arguments.

Input

--i filename.star

Path to the star file. This is a required input.

--f filename.star

Path to a second file, if necessary.

Data Mining Options

--extract --c column --q query (--e)

Find particles that match a column header --c and query --q (see the Querying options) and write them to a new star file (default output.star, or specified with --o).

--limit column/comparator/value

Extract particles that match a specific comparison (lt for less than, gt for greater than, le for less than or equal to, ge for greater than or equal to). The argument to pass is "column/comparator/value" (e.g. DefocusU/lt/40000 for defocus values less than 40000).
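
To make the "column/comparator/value" argument concrete, here is a small Python sketch of how such an argument can be interpreted. It is an illustration only, not starparser's own code: the function names are hypothetical and particles are represented as plain dictionaries.

```python
import operator

# The four comparators described above.
COMPARATORS = {
    "lt": operator.lt,
    "gt": operator.gt,
    "le": operator.le,
    "ge": operator.ge,
}


def parse_limit(argument):
    """Split 'column/comparator/value' into its three parts."""
    column, comparator, value = argument.split("/")
    return column, COMPARATORS[comparator], float(value)


def apply_limit(particles, argument):
    """Keep the particles whose column value satisfies the comparison."""
    column, compare, value = parse_limit(argument)
    return [p for p in particles if compare(float(p[column]), value)]


particles = [
    {"DefocusU": "35000.0"},
    {"DefocusU": "42000.0"},
    {"DefocusU": "40000.0"},
]
kept = apply_limit(particles, "DefocusU/lt/40000")
print(kept)
```

Note that lt is strict, so the particle with a defocus of exactly 40000 is not kept; le would include it.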

--count (--c column --q query (--e))

Count the number of particles and display the result. Optionally, this can be used with --c and --q to only count a subset of particles that match the query (see the Querying options), otherwise counts all.

--count_mics (--c column --q query (--e))

Count the number of unique micrographs and display the result. Optionally, this can be used with --c and --q to only count the micrographs of particles that match the query (see the Querying options); otherwise, all micrographs are counted.

--list_column column-name(s) (--c column --q query (--e))

Write all values of a column to a file. For example, passing MicrographName will write all values to MicrographName.txt. To output multiple columns, separate the column names with a slash (for example, MicrographName/CoordinateX outputs MicrographName.txt and CoordinateX.txt). Optionally, this can be used with --c and --q to only consider particles that match the query (see the Querying options), otherwise it lists all values.

--find_shared column-name --f otherfile.star

Find particles that are shared between the input star file and the one provided by --f based on the column provided here. Two new star files will be output, one with the shared particles and one with the unique particles.
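
The shared/unique split can be pictured with the sketch below. It assumes that "unique" means particles of the input file whose column value has no match in the second file; the function name and the dictionary representation of particles are hypothetical.

```python
def find_shared(particles, other_particles, column):
    """Split particles by whether their column value occurs in the other file."""
    other_values = {p[column] for p in other_particles}
    shared = [p for p in particles if p[column] in other_values]
    unique = [p for p in particles if p[column] not in other_values]
    return shared, unique


first = [
    {"MicrographName": "mic_001.mrc"},
    {"MicrographName": "mic_002.mrc"},
    {"MicrographName": "mic_003.mrc"},
]
second = [
    {"MicrographName": "mic_002.mrc"},
    {"MicrographName": "mic_004.mrc"},
]
shared, unique = find_shared(first, second, "MicrographName")
print(len(shared), "shared,", len(unique), "unique")
```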

--match_mics

Keep only particles from micrographs that also exist in a second star file provided by --f. Output will be written to output.star (or specified with --o).

--extract_optics

Find optics groups that match a column header --c and query --q (see the Querying options) and write the corresponding particles to a new star file. Output will be written to output.star (or specified with --o).

--extract_min minimum-value

Find the micrographs that have this minimum number of particles in them and extract all the particles belonging to them.
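
As an illustration of the selection rule, the following sketch counts particles per micrograph and keeps those from micrographs that reach the minimum. The function name is hypothetical and particles are plain dictionaries; it is not taken from starparser's source.

```python
from collections import Counter


def extract_min(particles, minimum):
    """Keep particles from micrographs with at least `minimum` particles."""
    counts = Counter(p["MicrographName"] for p in particles)
    return [p for p in particles if counts[p["MicrographName"]] >= minimum]


particles = [
    {"MicrographName": "mic_001.mrc"},
    {"MicrographName": "mic_001.mrc"},
    {"MicrographName": "mic_001.mrc"},
    {"MicrographName": "mic_002.mrc"},
]
kept = extract_min(particles, 2)
print(len(kept))
```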

--extract_if_nearby distance --f otherfile.star

For every particle in the input star file, check the nearest particle in a second star file provided by --f; particles that have a neighbor closer than the distance (in pixels) provided here will be written to particles_close.star, and those that don't will be written to particles_far.star. Particles that couldn't be matched to a neighbor will be skipped (i.e. if the second star file lacks particles in that micrograph). It will also output a histogram of nearest distances to Particles_distances.png (use --t to change the file type; see the Output options).
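
The close/far/skipped logic described above can be sketched in plain Python as follows. This is a brute-force illustration under stated assumptions: coordinates are read from CoordinateX and CoordinateY, "closer than" is treated as a strict comparison, and the function name is hypothetical. The histogram output is not reproduced.

```python
import math


def split_by_nearest(particles, others, distance):
    """Sort particles into close/far by their nearest neighbor in `others`."""
    # Group the second file's coordinates by micrograph.
    by_mic = {}
    for o in others:
        by_mic.setdefault(o["MicrographName"], []).append(
            (o["CoordinateX"], o["CoordinateY"])
        )

    close, far = [], []
    for p in particles:
        neighbors = by_mic.get(p["MicrographName"])
        if not neighbors:
            # No particles in this micrograph in the second file: skip.
            continue
        nearest = min(
            math.hypot(p["CoordinateX"] - x, p["CoordinateY"] - y)
            for x, y in neighbors
        )
        (close if nearest < distance else far).append(p)
    return close, far


particles = [
    {"MicrographName": "mic_001.mrc", "CoordinateX": 100.0, "CoordinateY": 100.0},
    {"MicrographName": "mic_001.mrc", "CoordinateX": 500.0, "CoordinateY": 500.0},
    {"MicrographName": "mic_002.mrc", "CoordinateX": 0.0, "CoordinateY": 0.0},
]
others = [
    {"MicrographName": "mic_001.mrc", "CoordinateX": 103.0, "CoordinateY": 104.0},
]
close, far = split_by_nearest(particles, others, 10)
print(len(close), "close,", len(far), "far")
```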

--extract_clusters threshold-distance/minimum-number

Extract particles that have a minimum number of neighbors within a given radius. For example, passing 400/4 extracts particles with at least 4 neighbors within 400 pixels.
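
A brute-force sketch of the neighbor-counting idea is shown below. It assumes that neighbors are counted within the same micrograph, that a particle is not its own neighbor, and that the radius is inclusive; these details and the function name are assumptions for illustration, not confirmed starparser behavior.

```python
import math


def extract_clusters(particles, argument):
    """Keep particles with enough neighbors inside the threshold distance."""
    threshold_text, minimum_text = argument.split("/")
    threshold, minimum = float(threshold_text), int(minimum_text)

    kept = []
    for i, p in enumerate(particles):
        neighbors = sum(
            1
            for j, q in enumerate(particles)
            if i != j
            and q["MicrographName"] == p["MicrographName"]
            and math.hypot(
                p["CoordinateX"] - q["CoordinateX"],
                p["CoordinateY"] - q["CoordinateY"],
            ) <= threshold
        )
        if neighbors >= minimum:
            kept.append(p)
    return kept


def particle(x, y, mic="mic_001.mrc"):
    return {"MicrographName": mic, "CoordinateX": x, "CoordinateY": y}


particles = [particle(0, 0), particle(100, 0), particle(0, 100), particle(5000, 5000)]
clustered = extract_clusters(particles, "400/2")
print(len(clustered))
```

The pairwise loop is quadratic in the number of particles, which is fine for a demonstration but slow for large star files.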

--extract_indices --f indices.txt

Extract particles with indices that match a list in a second file (specified by --f). The second file must be a single column list of numbers with values between 1 and the last particle index of the star file. The result is written to output.star (or specified with --o).
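
The indexing convention (1 for the first particle, up to the last particle index) is illustrated by the sketch below. It assumes the output keeps the order of the star file rather than the order of the list; the function name is hypothetical.

```python
def extract_indices(particles, index_lines):
    """Select particles by 1-based index from a single-column list."""
    indices = [int(line) for line in index_lines if line.strip()]
    for index in indices:
        if not 1 <= index <= len(particles):
            raise ValueError(f"index {index} is outside 1..{len(particles)}")
    wanted = set(indices)
    # Particle numbering starts at 1, not 0.
    return [p for n, p in enumerate(particles, start=1) if n in wanted]


particles = [{"ImageName": f"{n:06d}@stack.mrcs"} for n in range(1, 6)]
lines = ["1\n", "4\n", "\n"]  # as read from a text file
selected = extract_indices(particles, lines)
print([p["ImageName"] for p in selected])
```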

--extract_random number-of-particles (--c column --q query (--e))

Get a random set of particles totaling the number provided here. Optionally, use --c and --q to extract a random set of each passed query in the specified column (see the Querying options); in this case, the output star files will have the name(s) of the query(ies). Otherwise, a random set from all particles will be written to output.star (or specified with --o).

--remove_poses

Remove poses based on the AngleRot and AngleTilt columns using an interactive scatter plot. Use the lasso tool to select poses to remove then press enter to remove them. Continue removing and then press "e" to save and exit. Output will be written to output.star (or specified with --o).

--split number-of-files

Split the input star file into the number of star files passed here, making sure not to separate particles that belong to the same micrograph. The files will have the input file name with the suffix "_split-#". Note that they will not necessarily contain exactly the same number of particles.
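
The sketch below shows why the resulting files need not be the same size: whole micrographs are assigned to files, never individual particles. The greedy strategy used here (largest micrograph first, always into the currently smallest file) is an assumption chosen for illustration and may differ from how starparser distributes micrographs.

```python
from collections import defaultdict


def split_by_micrograph(particles, n_files):
    """Distribute particles over n_files without splitting any micrograph."""
    groups = defaultdict(list)
    for p in particles:
        groups[p["MicrographName"]].append(p)

    chunks = [[] for _ in range(n_files)]
    # Place the largest micrographs first, each into the smallest chunk so far.
    for group in sorted(groups.values(), key=len, reverse=True):
        min(chunks, key=len).extend(group)
    return chunks


particles = (
    [{"MicrographName": "a.mrc"}] * 3
    + [{"MicrographName": "b.mrc"}] * 2
    + [{"MicrographName": "c.mrc"}]
)
chunks = split_by_micrograph(particles, 2)
print([len(chunk) for chunk in chunks])
```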

--split_classes

Split the input star file into independent star files for each class. The files will have the names "Class_#.star".

--split_optics

Split the input star file into independent star files for each optics group. The files will have the names of the optics group.

--sort_by column-name(/n)

Sort the particles in ascending order according to the values in the column passed here. Outputs a new file to output.star (or specified with --o). Add a slash followed by "n" if the column contains numeric values (e.g. ClassNumber/n); otherwise, it will sort the values as text.
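
The difference between text and numeric sorting matters more than it may seem, as this short sketch shows. The function name is hypothetical and particles are plain dictionaries with string values, as they would be when read from a text file.

```python
def sort_by(particles, argument):
    """Sort by a column; a trailing '/n' requests numeric comparison."""
    column, _, flag = argument.partition("/")
    if flag == "n":
        return sorted(particles, key=lambda p: float(p[column]))
    return sorted(particles, key=lambda p: str(p[column]))


rows = [{"ClassNumber": "10"}, {"ClassNumber": "9"}, {"ClassNumber": "2"}]

as_text = [r["ClassNumber"] for r in sort_by(rows, "ClassNumber")]
as_numbers = [r["ClassNumber"] for r in sort_by(rows, "ClassNumber/n")]

print(as_text)     # text order puts "10" before "2"
print(as_numbers)  # numeric order puts 2 before 9 before 10
```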

Modification Options

--operate column-name[operator]value

Perform an operation on all values of a column. The argument to pass is column[operator]value (without the brackets and without any spaces); operators include "*", "/", "+", and "-" (e.g. HelicalTrackLength*0.25). The result is written to a new star file (default output.star, or specified with --o). If your terminal throws an error, try surrounding the argument with quotation marks (e.g. "HelicalTrackLength*0.25").
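
To show how an argument such as HelicalTrackLength*0.25 breaks down into column, operator, and value, here is a minimal Python sketch. It is an illustration with a hypothetical function name, not starparser's own parser, and it assumes column names contain only letters, digits, and underscores.

```python
import operator
import re

OPERATORS = {
    "*": operator.mul,
    "/": operator.truediv,
    "+": operator.add,
    "-": operator.sub,
}


def operate(particles, argument):
    """Apply 'column[operator]value' to every particle."""
    match = re.fullmatch(r"(\w+)([*/+\-])(.+)", argument)
    if match is None:
        raise ValueError(f"cannot parse {argument!r}")
    column, symbol, value = match.groups()
    apply = OPERATORS[symbol]
    # Return new dictionaries so the input is left unchanged.
    return [{**p, column: apply(float(p[column]), float(value))} for p in particles]


rows = [{"HelicalTrackLength": "100.0"}]
scaled = operate(rows, "HelicalTrackLength*0.25")
print(scaled)
```

The quoting advice above applies because characters such as * are interpreted by many shells before the program sees them.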

--operate_columns column1[operator]column2=newcolumn

Perform an operation between two columns and write the result to a new column. The argument to pass is column1[operator]column2=newcolumn (without the brackets and without any spaces); operators include "*", "/", "+", and "-" (e.g. CoordinateX+OriginX=ShiftedX). If your terminal throws an error, try surrounding the argument with quotation marks (e.g. "CoordinateX+OriginX=ShiftedX").
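
The two-column form can be sketched in the same spirit. Again, the function name is hypothetical, particles are plain dictionaries, and column names are assumed to contain only letters, digits, and underscores.

```python
import operator
import re

OPERATORS = {
    "*": operator.mul,
    "/": operator.truediv,
    "+": operator.add,
    "-": operator.sub,
}


def operate_columns(particles, argument):
    """Apply 'column1[operator]column2=newcolumn' to every particle."""
    match = re.fullmatch(r"(\w+)([*/+\-])(\w+)=(\w+)", argument)
    if match is None:
        raise ValueError(f"cannot parse {argument!r}")
    first, symbol, second, new_column = match.groups()
    apply = OPERATORS[symbol]
    return [
        {**p, new_column: apply(float(p[first]), float(p[second]))}
        for p in particles
    ]


rows = [{"CoordinateX": "1200.0", "OriginX": "-3.5"}]
shifted = operate_columns(rows, "CoordinateX+OriginX=ShiftedX")
print(shifted[0]["ShiftedX"])
```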

--remove_column column-name(s)

Remove a column (e.g. MicrographName), renumber the headers, and write to a new star file (default output.star, or specified with --o). To remove multiple columns, separate them with a slash (e.g. MicrographName/CoordinateX). Note that "relion_star_handler --remove_column" also does this.

--remove_particles --c column --q query (--e)

Remove particles that match a query (specified with --q) within a column header (specified with --c; see the Querying options), and write to a new star file (default output.star, or specified with --o).

--remove_duplicates column-name

Remove duplicate particles based on the column provided here (e.g. ImageName). One instance of each duplicate is retained.
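
A minimal sketch of column-based deduplication follows. It assumes the first occurrence is the instance that is retained, which the description above does not specify; the function name is hypothetical.

```python
def remove_duplicates(particles, column):
    """Keep the first particle for each distinct value in `column`."""
    seen = set()
    kept = []
    for p in particles:
        if p[column] not in seen:
            seen.add(p[column])
            kept.append(p)
    return kept


particles = [
    {"ImageName": "000001@stack.mrcs", "ClassNumber": "1"},
    {"ImageName": "000002@stack.mrcs", "ClassNumber": "1"},
    {"ImageName": "000001@stack.mrcs", "ClassNumber": "3"},
]
deduplicated = remove_duplicates(particles, "ImageName")
print(len(deduplicated))
```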

--remove_mics_list --f micrographs.txt

Remove particles that belong to micrographs that have a match in a second file provided by --f, and write to a new star file (default output.star, or specified with --o). You only need to have the micrograph names and not necessarily the full paths in the second file.

--keep_mics_list --f micrographs.txt

Keep particles that belong to micrographs that have a match in a second file provided by --f, and write to a new star file (default output.star, or specified with --o). You only need to have the micrograph names and not necessarily the full paths in the second file.
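
Both list-based options rest on matching micrograph names without requiring full paths. The sketch below illustrates one way to do that by comparing file names only; whether starparser matches in exactly this way is an assumption, and the function name is hypothetical.

```python
import posixpath


def filter_by_mic_list(particles, listed_names, keep):
    """Keep (keep=True) or remove (keep=False) particles from listed micrographs."""
    # Compare file names only, so paths in either source are ignored.
    listed = {
        posixpath.basename(name.strip())
        for name in listed_names
        if name.strip()
    }
    return [
        p
        for p in particles
        if (posixpath.basename(p["MicrographName"]) in listed) == keep
    ]


particles = [
    {"MicrographName": "MotionCorr/job002/Micrographs/mic_001.mrc"},
    {"MicrographName": "MotionCorr/job002/Micrographs/mic_002.mrc"},
]
listed_names = ["mic_002.mrc\n"]  # as read from micrographs.txt

kept = filter_by_mic_list(particles, listed_names, keep=True)
removed = filter_by_mic_list(particles, listed_names, keep=False)
print(len(kept), len(removed))
```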

**```--in
