97 skills found · Page 1 of 4
kthohr / StatsA C++ header-only library of statistical distribution functions.
erdogant / Distfitdistfit is a python library for probability density fitting.
python-windrose / WindroseA Python Matplotlib, Numpy library to manage wind data, draw windrose (also known as a polar rose plot), draw probability density function and fit Weibull distribution
zfit / ZfitModel manipulation and fitting library based on TensorFlow and optimised for simple and direct manipulation of probability density functions. Its main focus is on scalability, parallelisation and user friendly experience.
reddyprasade / Machine Learning With Scikit Learn Python 3.xIn general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number and, for instance, a multi-dimensional entry (aka multivariate data), it is said to have several attributes or features. Learning problems fall into a few categories: supervised learning, in which the data comes with additional attributes that we want to predict (Click here to go to the scikit-learn supervised learning page).This problem can be either: classification: samples belong to two or more classes and we want to learn from already labeled data how to predict the class of unlabeled data. An example of a classification problem would be handwritten digit recognition, in which the aim is to assign each input vector to one of a finite number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning where one has a limited number of categories and for each of the n samples provided, one is to try to label them with the correct category or class. regression: if the desired output consists of one or more continuous variables, then the task is called regression. An example of a regression problem would be the prediction of the length of a salmon as a function of its age and weight. unsupervised learning, in which the training data consists of a set of input vectors x without any corresponding target values. The goal in such problems may be to discover groups of similar examples within the data, where it is called clustering, or to determine the distribution of data within the input space, known as density estimation, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization (Click here to go to the Scikit-Learn unsupervised learning page).
wavefunction91 / GauXCGauXC is a modern, modular C++ library for the evaluation of quantities related to the exchange-correlation (XC) energy (e.g. potential, etc) in the Gaussian basis set discretization of Kohn-Sham density function theory (KS-DFT) on heterogenous architectures.
wwwtyro / Isosurface GeneratorA JS generator function that returns a list of vertices describing an isosuface given a density and level.
AndreyAkinshin / GgwaterfallR package with functions for drawing density and frequency trail waterfall plots
AbInitioQHub / CoquiAb initio electronic structure beyond density function theory
y2kblog / NUCLEO L476RG DFSDM PDM MicThis project acquires the PDM (Pulse Density Modulation) microphone signal using DFSDM (Digital filter for Sigma-Delta modulators interface) function of STM32 MCU and outputs its frequency characteristics by using FFT.
limingado / NSCThe code is an implementation of the Nystrӧm-based spectral clustering with the K-nearest neighbour-based sampling (KNNS) method (Pang et al. 2021). It is aimed for individual tree segmentation using airborne LiDAR point cloud data. When using the code, please cite as: Yong Pang, Weiwei Wang, Liming Du, Zhongjun Zhang, Xiaojun Liang, Yongning Li, Zuyuan Wang (2021) Nystrӧm-based spectral clustering using airborne LiDAR point cloud data for individual tree segmentation, International Journal of Digital Earth Code files: ‘segmentation.py’: the main function, including deriving local maximum from Canopy Height Model (CHM); ‘VNSC.py’: other functions for the algorithm, including mean-shift voxelization, similarity graph construction, KNNS sampling, eigendecomposition, k-means clustering, as well as the computation and writing of individual tree parameters. Key parameters: When using the code, users can adjust the values of local maximum window, gap (the upper limit of the number of final clusters), knn (the number of k-nearest neighbours in the similarity graph) and quantile in meanshift method based specific data characteristics. Currently, the value of local maximum window is 3m ×3m, the value of gap is defined as the 1.5 times of the local maximum detected from CHM. Parameter knn can be defined as a constant value (40 in the code) based on the data characteristics, or be determined through the relationship between it and the number of voxels. The default setting of quantile in meanshift method is the average density of point clouds. More details can be found in Pang et al. (2021). Test data: ‘ALS_pointclouds.txt’: point cloud data; ‘ALS_CHM.tif’: CHM of the point cloud data; ‘Reference_tree.csv’: field measurements for algorithm validation. The position was measured using differential GNSS. The tree height of each tree in this file is obtained by regression estimation. Outputs: ‘Data_seg.csv’: coordinate of each point (x, y, z) as well as its cluster label after segmentation; ‘Parameter.csv’: individual tree parameters (TreeID, Position_X, Position_Y, Crown, Height) based on the calculation described in Pang et al. (2021).
sayantann11 / Clustering Modelsfor MLlustering in Machine Learning Introduction to Clustering It is basically a type of unsupervised learning method . An unsupervised learning method is a method in which we draw references from datasets consisting of input data without labelled responses. Generally, it is used as a process to find meaningful structure, explanatory underlying processes, generative features, and groupings inherent in a set of examples. Clustering is the task of dividing the population or data points into a number of groups such that data points in the same groups are more similar to other data points in the same group and dissimilar to the data points in other groups. It is basically a collection of objects on the basis of similarity and dissimilarity between them. For ex– The data points in the graph below clustered together can be classified into one single group. We can distinguish the clusters, and we can identify that there are 3 clusters in the below picture. It is not necessary for clusters to be a spherical. Such as : DBSCAN: Density-based Spatial Clustering of Applications with Noise These data points are clustered by using the basic concept that the data point lies within the given constraint from the cluster centre. Various distance methods and techniques are used for calculation of the outliers. Why Clustering ? Clustering is very much important as it determines the intrinsic grouping among the unlabeled data present. There are no criteria for a good clustering. It depends on the user, what is the criteria they may use which satisfy their need. For instance, we could be interested in finding representatives for homogeneous groups (data reduction), in finding “natural clusters” and describe their unknown properties (“natural” data types), in finding useful and suitable groupings (“useful” data classes) or in finding unusual data objects (outlier detection). This algorithm must make some assumptions which constitute the similarity of points and each assumption make different and equally valid clusters. Clustering Methods : Density-Based Methods : These methods consider the clusters as the dense region having some similarity and different from the lower dense region of the space. These methods have good accuracy and ability to merge two clusters.Example DBSCAN (Density-Based Spatial Clustering of Applications with Noise) , OPTICS (Ordering Points to Identify Clustering Structure) etc. Hierarchical Based Methods : The clusters formed in this method forms a tree-type structure based on the hierarchy. New clusters are formed using the previously formed one. It is divided into two category Agglomerative (bottom up approach) Divisive (top down approach) examples CURE (Clustering Using Representatives), BIRCH (Balanced Iterative Reducing Clustering and using Hierarchies) etc. Partitioning Methods : These methods partition the objects into k clusters and each partition forms one cluster. This method is used to optimize an objective criterion similarity function such as when the distance is a major parameter example K-means, CLARANS (Clustering Large Applications based upon Randomized Search) etc. Grid-based Methods : In this method the data space is formulated into a finite number of cells that form a grid-like structure. All the clustering operation done on these grids are fast and independent of the number of data objects example STING (Statistical Information Grid), wave cluster, CLIQUE (CLustering In Quest) etc. Clustering Algorithms : K-means clustering algorithm – It is the simplest unsupervised learning algorithm that solves clustering problem.K-means algorithm partition n observations into k clusters where each observation belongs to the cluster with the nearest mean serving as a prototype of the cluster . Applications of Clustering in different fields Marketing : It can be used to characterize & discover customer segments for marketing purposes. Biology : It can be used for classification among different species of plants and animals. Libraries : It is used in clustering different books on the basis of topics and information. Insurance : It is used to acknowledge the customers, their policies and identifying the frauds. City Planning: It is used to make groups of houses and to study their values based on their geographical locations and other factors present. Earthquake studies: By learning the earthquake-affected areas we can determine the dangerous zones. References : Wiki Hierarchical clustering Ijarcs matteucc analyticsvidhya knowm
olafmersmann / TruncnormDensity, probability, quantile and random number generation functions for the truncated normal distribution.
jaskarth / RhoOptimizing density function compiler
PavanAnanthSharma / Breeden Litzenberger Formula For Risk Neutral DensitiesThe Breeden-Litzenberger formula, proposed by Douglas T. Breeden and Robert H. Litzenberger in 1978, is a method used to extract the implied risk-neutral probability density function from observed option prices
yanji84 / Keras MdnImplementation of mixture density network layer and objective function in Keras
ChrKoenig / R Marginal PlotFunction to plot xy-plot with marginal density functions
CenterForAssessment / SGPFunctions to calculate student growth percentiles and percentile growth projections/trajectories for students using large scale, longitudinal assessment data. Functions use quantile regression to estimate the conditional density associated with each student's achievement history. Percentile growth projections/trajectories are calculated using the coefficient matrices derived from the quantile regression analyses and specify what percentile growth is required for students to reach future achievement targets.
bearloga / TinydensRA set of RStudio add-ins for playing with distribution parameters and visualizing the resulting probability density and mass functions.
InseeFr / BtbAn R package which provides functions dedicated to urban analysis and kernel density estimation.