91 skills found · Page 1 of 4
vector-ai / VectoraiVector AI — A platform for building vector based applications. Encode, query and analyse data using vectors.
Aastha2104 / Parkinson Disease PredictionIntroduction Parkinson’s Disease is the second most prevalent neurodegenerative disorder after Alzheimer’s, affecting more than 10 million people worldwide. Parkinson’s is characterized primarily by the deterioration of motor and cognitive ability. There is no single test which can be administered for diagnosis. Instead, doctors must perform a careful clinical analysis of the patient’s medical history. Unfortunately, this method of diagnosis is highly inaccurate. A study from the National Institute of Neurological Disorders finds that early diagnosis (having symptoms for 5 years or less) is only 53% accurate. This is not much better than random guessing, but an early diagnosis is critical to effective treatment. Because of these difficulties, I investigate a machine learning approach to accurately diagnose Parkinson’s, using a dataset of various speech features (a non-invasive yet characteristic tool) from the University of Oxford. Why speech features? Speech is very predictive and characteristic of Parkinson’s disease; almost every Parkinson’s patient experiences severe vocal degradation (inability to produce sustained phonations, tremor, hoarseness), so it makes sense to use voice to diagnose the disease. Voice analysis gives the added benefit of being non-invasive, inexpensive, and very easy to extract clinically. Background Parkinson's Disease Parkinson’s is a progressive neurodegenerative condition resulting from the death of the dopamine containing cells of the substantia nigra (which plays an important role in movement). Symptoms include: “frozen” facial features, bradykinesia (slowness of movement), akinesia (impairment of voluntary movement), tremor, and voice impairment. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated, and 80% of striatal dopamine have been depleted. Performance Metrics TP = true positive, FP = false positive, TN = true negative, FN = false negative Accuracy: (TP+TN)/(P+N) Matthews Correlation Coefficient: 1=perfect, 0=random, -1=completely inaccurate Algorithms Employed Logistic Regression (LR): Uses the sigmoid logistic equation with weights (coefficient values) and biases (constants) to model the probability of a certain class for binary classification. An output of 1 represents one class, and an output of 0 represents the other. Training the model will learn the optimal weights and biases. Linear Discriminant Analysis (LDA): Assumes that the data is Gaussian and each feature has the same variance. LDA estimates the mean and variance for each class from the training data, and then uses properties of statistics (Bayes theorem , Gaussian distribution, etc) to compute the probability of a particular instance belonging to a given class. The class with the largest probability is the prediction. k Nearest Neighbors (KNN): Makes predictions about the validation set using the entire training set. KNN makes a prediction about a new instance by searching through the entire set to find the k “closest” instances. “Closeness” is determined using a proximity measurement (Euclidean) across all features. The class that the majority of the k closest instances belong to is the class that the model predicts the new instance to be. Decision Tree (DT): Represented by a binary tree, where each root node represents an input variable and a split point, and each leaf node contains an output used to make a prediction. Neural Network (NN): Models the way the human brain makes decisions. Each neuron takes in 1+ inputs, and then uses an activation function to process the input with weights and biases to produce an output. Neurons can be arranged into layers, and multiple layers can form a network to model complex decisions. Training the network involves using the training instances to optimize the weights and biases. Naive Bayes (NB): Simplifies the calculation of probabilities by assuming that all features are independent of one another (a strong but effective assumption). Employs Bayes Theorem to calculate the probabilities that the instance to be predicted is in each class, then finds the class with the highest probability. Gradient Boost (GB): Generally used when seeking a model with very high predictive performance. Used to reduce bias and variance (“error”) by combining multiple “weak learners” (not very good models) to create a “strong learner” (high performance model). Involves 3 elements: a loss function (error function) to be optimized, a weak learner (decision tree) to make predictions, and an additive model to add trees to minimize the loss function. Gradient descent is used to minimize error after adding each tree (one by one). Engineering Goal Produce a machine learning model to diagnose Parkinson’s disease given various features of a patient’s speech with at least 90% accuracy and/or a Matthews Correlation Coefficient of at least 0.9. Compare various algorithms and parameters to determine the best model for predicting Parkinson’s. Dataset Description Source: the University of Oxford 195 instances (147 subjects with Parkinson’s, 48 without Parkinson’s) 22 features (elements that are possibly characteristic of Parkinson’s, such as frequency, pitch, amplitude / period of the sound wave) 1 label (1 for Parkinson’s, 0 for no Parkinson’s) Project Pipeline pipeline Summary of Procedure Split the Oxford Parkinson’s Dataset into two parts: one for training, one for validation (evaluate how well the model performs) Train each of the following algorithms with the training set: Logistic Regression, Linear Discriminant Analysis, k Nearest Neighbors, Decision Tree, Neural Network, Naive Bayes, Gradient Boost Evaluate results using the validation set Repeat for the following training set to validation set splits: 80% training / 20% validation, 75% / 25%, and 70% / 30% Repeat for a rescaled version of the dataset (scale all the numbers in the dataset to a range from 0 to 1: this helps to reduce the effect of outliers) Conduct 5 trials and average the results Data a_o a_r m_o m_r Data Analysis In general, the models tended to perform the best (both in terms of accuracy and Matthews Correlation Coefficient) on the rescaled dataset with a 75-25 train-test split. The two highest performing algorithms, k Nearest Neighbors and the Neural Network, both achieved an accuracy of 98%. The NN achieved a MCC of 0.96, while KNN achieved a MCC of 0.94. These figures outperform most existing literature and significantly outperform current methods of diagnosis. Conclusion and Significance These robust results suggest that a machine learning approach can indeed be implemented to significantly improve diagnosis methods of Parkinson’s disease. Given the necessity of early diagnosis for effective treatment, my machine learning models provide a very promising alternative to the current, rather ineffective method of diagnosis. Current methods of early diagnosis are only 53% accurate, while my machine learning model produces 98% accuracy. This 45% increase is critical because an accurate, early diagnosis is needed to effectively treat the disease. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated, and 80% of striatal dopamine have been depleted. With an earlier diagnosis, much of this degradation could have been slowed or treated. My results are very significant because Parkinson’s affects over 10 million people worldwide who could benefit greatly from an early, accurate diagnosis. Not only is my machine learning approach more accurate in terms of diagnostic accuracy, it is also more scalable, less expensive, and therefore more accessible to people who might not have access to established medical facilities and professionals. The diagnosis is also much simpler, requiring only a 10-15 second voice recording and producing an immediate diagnosis. Future Research Given more time and resources, I would investigate the following: Create a mobile application which would allow the user to record his/her voice, extract the necessary vocal features, and feed it into my machine learning model to diagnose Parkinson’s. Use larger datasets in conjunction with the University of Oxford dataset. Tune and improve my models even further to achieve even better results. Investigate different structures and types of neural networks. Construct a novel algorithm specifically suited for the prediction of Parkinson’s. Generalize my findings and algorithms for all types of dementia disorders, such as Alzheimer’s. References Bind, Shubham. "A Survey of Machine Learning Based Approaches for Parkinson Disease Prediction." International Journal of Computer Science and Information Technologies 6 (2015): n. pag. International Journal of Computer Science and Information Technologies. 2015. Web. 8 Mar. 2017. Brooks, Megan. "Diagnosing Parkinson's Disease Still Challenging." Medscape Medical News. National Institute of Neurological Disorders, 31 July 2014. Web. 20 Mar. 2017. Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection', Little MA, McSharry PE, Roberts SJ, Costello DAE, Moroz IM. BioMedical Engineering OnLine 2007, 6:23 (26 June 2007) Hashmi, Sumaiya F. "A Machine Learning Approach to Diagnosis of Parkinson’s Disease."Claremont Colleges Scholarship. Claremont College, 2013. Web. 10 Mar. 2017. Karplus, Abraham. "Machine Learning Algorithms for Cancer Diagnosis." Machine Learning Algorithms for Cancer Diagnosis (n.d.): n. pag. Mar. 2012. Web. 20 Mar. 2017. Little, Max. "Parkinsons Data Set." UCI Machine Learning Repository. University of Oxford, 26 June 2008. Web. 20 Feb. 2017. Ozcift, Akin, and Arif Gulten. "Classifier Ensemble Construction with Rotation Forest to Improve Medical Diagnosis Performance of Machine Learning Algorithms." Computer Methods and Programs in Biomedicine 104.3 (2011): 443-51. Semantic Scholar. 2011. Web. 15 Mar. 2017. "Parkinson’s Disease Dementia." UCI MIND. N.p., 19 Oct. 2015. Web. 17 Feb. 2017. Salvatore, C., A. Cerasa, I. Castiglioni, F. Gallivanone, A. Augimeri, M. Lopez, G. Arabia, M. Morelli, M.c. Gilardi, and A. Quattrone. "Machine Learning on Brain MRI Data for Differential Diagnosis of Parkinson's Disease and Progressive Supranuclear Palsy."Journal of Neuroscience Methods 222 (2014): 230-37. 2014. Web. 18 Mar. 2017. Shahbakhi, Mohammad, Danial Taheri Far, and Ehsan Tahami. "Speech Analysis for Diagnosis of Parkinson’s Disease Using Genetic Algorithm and Support Vector Machine."Journal of Biomedical Science and Engineering 07.04 (2014): 147-56. Scientific Research. July 2014. Web. 2 Mar. 2017. "Speech and Communication." Speech and Communication. Parkinson's Disease Foundation, n.d. Web. 22 Mar. 2017. Sriram, Tarigoppula V. S., M. Venkateswara Rao, G. V. Satya Narayana, and D. S. V. G. K. Kaladhar. "Diagnosis of Parkinson Disease Using Machine Learning and Data Mining Systems from Voice Dataset." SpringerLink. Springer, Cham, 01 Jan. 1970. Web. 17 Mar. 2017.
himanshub1007 / Alzhimers Disease Prediction Using Deep Learning# AD-Prediction Convolutional Neural Networks for Alzheimer's Disease Prediction Using Brain MRI Image ## Abstract Alzheimers disease (AD) is characterized by severe memory loss and cognitive impairment. It associates with significant brain structure changes, which can be measured by magnetic resonance imaging (MRI) scan. The observable preclinical structure changes provides an opportunity for AD early detection using image classification tools, like convolutional neural network (CNN). However, currently most AD related studies were limited by sample size. Finding an efficient way to train image classifier on limited data is critical. In our project, we explored different transfer-learning methods based on CNN for AD prediction brain structure MRI image. We find that both pretrained 2D AlexNet with 2D-representation method and simple neural network with pretrained 3D autoencoder improved the prediction performance comparing to a deep CNN trained from scratch. The pretrained 2D AlexNet performed even better (**86%**) than the 3D CNN with autoencoder (**77%**). ## Method #### 1. Data In this project, we used public brain MRI data from **Alzheimers Disease Neuroimaging Initiative (ADNI)** Study. ADNI is an ongoing, multicenter cohort study, started from 2004. It focuses on understanding the diagnostic and predictive value of Alzheimers disease specific biomarkers. The ADNI study has three phases: ADNI1, ADNI-GO, and ADNI2. Both ADNI1 and ADNI2 recruited new AD patients and normal control as research participants. Our data included a total of 686 structure MRI scans from both ADNI1 and ADNI2 phases, with 310 AD cases and 376 normal controls. We randomly derived the total sample into training dataset (n = 519), validation dataset (n = 100), and testing dataset (n = 67). #### 2. Image preprocessing Image preprocessing were conducted using Statistical Parametric Mapping (SPM) software, version 12. The original MRI scans were first skull-stripped and segmented using segmentation algorithm based on 6-tissue probability mapping and then normalized to the International Consortium for Brain Mapping template of European brains using affine registration. Other configuration includes: bias, noise, and global intensity normalization. The standard preprocessing process output 3D image files with an uniform size of 121x145x121. Skull-stripping and normalization ensured the comparability between images by transforming the original brain image into a standard image space, so that same brain substructures can be aligned at same image coordinates for different participants. Diluted or enhanced intensity was used to compensate the structure changes. the In our project, we used both whole brain (including both grey matter and white matter) and grey matter only. #### 3. AlexNet and Transfer Learning Convolutional Neural Networks (CNN) are very similar to ordinary Neural Networks. A CNN consists of an input and an output layer, as well as multiple hidden layers. The hidden layers are either convolutional, pooling or fully connected. ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These then make the forward function more efficient to implement and vastly reduce the amount of parameters in the network. #### 3.1. AlexNet The net contains eight layers with weights; the first five are convolutional and the remaining three are fully connected. The overall architecture is shown in Figure 1. The output of the last fully-connected layer is fed to a 1000-way softmax which produces a distribution over the 1000 class labels. AlexNet maximizes the multinomial logistic regression objective, which is equivalent to maximizing the average across training cases of the log-probability of the correct label under the prediction distribution. The kernels of the second, fourth, and fifth convolutional layers are connected only to those kernel maps in the previous layer which reside on the same GPU (as shown in Figure1). The kernels of the third convolutional layer are connected to all kernel maps in the second layer. The neurons in the fully connected layers are connected to all neurons in the previous layer. Response-normalization layers follow the first and second convolutional layers. Max-pooling layers follow both response-normalization layers as well as the fifth convolutional layer. The ReLU non-linearity is applied to the output of every convolutional and fully-connected layer.  The first convolutional layer filters the 224x224x3 input image with 96 kernels of size 11x11x3 with a stride of 4 pixels (this is the distance between the receptive field centers of neighboring neurons in a kernel map). The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5x5x48. The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers. The third convolutional layer has 384 kernels of size 3x3x256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth convolutional layer has 384 kernels of size 3x3x192 , and the fifth convolutional layer has 256 kernels of size 3x3x192. The fully-connected layers have 4096 neurons each. #### 3.2. Transfer Learning Training an entire Convolutional Network from scratch (with random initialization) is impractical[14] because it is relatively rare to have a dataset of sufficient size. An alternative is to pretrain a Conv-Net on a very large dataset (e.g. ImageNet), and then use the ConvNet either as an initialization or a fixed feature extractor for the task of interest. Typically, there are three major transfer learning scenarios: **ConvNet as fixed feature extractor:** We can take a ConvNet pretrained on ImageNet, and remove the last fully-connected layer, then treat the rest structure as a fixed feature extractor for the target dataset. In AlexNet, this would be a 4096-D vector. Usually, we call these features as CNN codes. Once we get these features, we can train a linear classifier (e.g. linear SVM or Softmax classifier) for our target dataset. **Fine-tuning the ConvNet:** Another idea is not only replace the last fully-connected layer in the classifier, but to also fine-tune the parameters of the pretrained network. Due to overfitting concerns, we can only fine-tune some higher-level part of the network. This suggestion is motivated by the observation that earlier features in a ConvNet contains more generic features (e.g. edge detectors or color blob detectors) that can be useful for many kind of tasks. But the later layer of the network becomes progressively more specific to the details of the classes contained in the original dataset. **Pretrained models:** The released pretrained model is usually the final ConvNet checkpoint. So it is common to see people use the network for fine-tuning. #### 4. 3D Autoencoder and Convolutional Neural Network We take a two-stage approach where we first train a 3D sparse autoencoder to learn filters for convolution operations, and then build a convolutional neural network whose first layer uses the filters learned with the autoencoder.  #### 4.1. Sparse Autoencoder An autoencoder is a 3-layer neural network that is used to extract features from an input such as an image. Sparse representations can provide a simple interpretation of the input data in terms of a small number of \parts by extracting the structure hidden in the data. The autoencoder has an input layer, a hidden layer and an output layer, and the input and output layers have same number of units, while the hidden layer contains more units for a sparse and overcomplete representation. The encoder function maps input x to representation h, and the decoder function maps the representation h to the output x. In our problem, we extract 3D patches from scans as the input to the network. The decoder function aims to reconstruct the input form the hidden representation h. #### 4.2. 3D Convolutional Neural Network Training the 3D convolutional neural network(CNN) is the second stage. The CNN we use in this project has one convolutional layer, one pooling layer, two linear layers, and finally a log softmax layer. After training the sparse autoencoder, we take the weights and biases of the encoder from trained model, and use them a 3D filter of a 3D convolutional layer of the 1-layer convolutional neural network. Figure 2 shows the architecture of the network. #### 5. Tools In this project, we used Nibabel for MRI image processing and PyTorch Neural Networks implementation.
mljs / DistanceDistance functions to compare vectors
Xiaoyang-Rebecca / PatternRecognition MatlabFeature reduction projections and classifier models are learned by training dataset and applied to classify testing dataset. A few approaches of feature reduction have been compared in this paper: principle component analysis (PCA), linear discriminant analysis (LDA) and their kernel methods (KPCA,KLDA). Correspondingly, a few approaches of classification algorithm are implemented: Support Vector Machine (SVM), Gaussian Quadratic Maximum Likelihood and K-nearest neighbors (KNN) and Gaussian Mixture Model(GMM).
gptechday / Openai Academy Rag Comparison DemoA NextJS demo of comparing pure vector RAG vs Graph RAG on the Synthea dataset for Open AI Academy
Shauqi / Attack And Anomaly Detection In IoT Sensors In IoT Sites Using Machine Learning ApproachesAttack and Anomaly detection in the Internet of Things (IoT) infrastructure is a rising concern in the domain of IoT. With the increased use of IoT infrastructure in every domain, threats and attacks in these infrastructures are also growing commensurately. Denial of Service, Data Type Probing, Malicious Control, Malicious Operation, Scan, Spying and Wrong Setup are such attacks and anomalies which can cause an IoT system failure. In this paper, performances of several machine learning models have been compared to predict attacks and anomalies on the IoT systems accurately. The machine learning (ML) algorithms that have been used here are Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Artificial Neural Network (ANN). The evaluation metrics used in the comparison of performance are accuracy, precision, recall, f1 score, and area under the Receiver Operating Characteristic Curve. The system obtained 99.4% test accuracy for Decision Tree, Random Forest, and ANN. Though these techniques have the same accuracy, other metrics prove that Random Forest performs comparatively better.
shruti821 / Leaf Disease Detection Using Image ProcessingAgricultural productivity is something on which economy highly depends. This is the one of the reasons that disease detection in plants plays an important role in agriculture field, as having disease in plants are quite natural. If proper care is not taken in this area then it causes serious effects on plants and due to which respective product quality, quantity or productivity is affected. For instance a disease named little leaf disease is a hazardous disease found in pine trees in United States. Detection of plant disease through some automatic technique is beneficial as it reduces a large work of monitoring in big farms of crops, and at very early stage itself it detects the symptoms of diseases i.e. when they appear on plant leaves. This paper introduces an efficient approach to identify healthy and diseased or an infected leaf using image processing and machine learning techniques. Various diseases damage the chlorophyll of leaves and affect with brown or black marks on the leaf area. These can be detected using image prepossessing, image segmentation. Support Vector Machine (SVM) is one of the machine learning algorithms is used for classification. The Convolutional Neural Network (CNN) resulted in a improved accuracy of recognition compared to the SVM approach.
prrao87 / Lancedb StudyComparing LanceDB and Elasticsearch for full-text search and vector search performance
metinaktas / Acoustic Direction Finding Using Single Acoustic Vector Sensor Under High ReverberationWe propose a novel and robust method for acoustic direction finding, which is solely based on acoustic pressure and pressure gradient measurements from single Acoustic Vector Sensor (AVS). We do not make any stochastic and sparseness assumptions regarding the signal source and the environmental characteristics. Hence, our method can be applied to a wide range of wideband acoustic signals including the speech and noise-like signals in various environments. Our method identifies the “clean” time frequency bins that are not distorted by multipath signals and noise, and estimates the 2D-DOA angles at only those identified bins. Moreover, the identification of the clean bins and the corresponding DOA estimation are performed jointly in one framework in a computationally highly efficient manner. We mathematically and experimentally show that the false detection rate of the proposed method is zero, i.e., none of the time-frequency bins with multiple sources are wrongly labeled as single-source, when the source directions do not coincide. Therefore, our method is significantly more reliable and robust compared to the competing state-of-the-art methods that perform the time-frequency bin selection and the DOA estimation separately. The proposed method, for performed simulations, estimates the source direction with high accuracy (less than 1 degree error) even under significantly high reverberation conditions.
praweshd / Speech Emotion RecognitionIn this project, the performance of speech emotion recognition is compared between two methods (SVM vs Bi-LSTM RNN).Conventional classifiers that uses machine learning algorithms has been used for decades in recognizing emotions from speech. However, in recent years, deep learning methods have taken the center stage and have gained popularity for their ability to perform well without any input hand-crafted features. Speech emotion on sets obtained from RAVDESS corpus is classified using a conventionally used Support Vector Machine (SVM) and its performance is compared to that of a bidirectional long short-term memory (LSTM).
SinghAbhi1998 / Stock Market Price PredictionStock Market Price Prediction: Used machine learning algorithms such as Linear Regression, Logistics Regression, Naive Bayes, K Nearest Neighbor, Support Vector Machine, Decision Tree, and Random Forest to identify which algorithm gives better results. Used Neural Networks such as Auto ARIMA, Prophet(Time-Series), and LSTM(Long Term-Short Memory) then compare make Inferences about the model.
AmirhosseinHonardoust / TFIDF Vs Word2VecA detailed educational guide explaining two essential NLP techniques, TF-IDF and Word2Vec. Learn how text is transformed into numerical vectors, compare their mathematical foundations, explore real-world use cases, and implement both methods in Python for text analysis and machine learning.
hardik04021996 / Human Activity Recognition Using Machine LearningArtificial Neural Networks (ANN), k-Nearest Neighbors, Random Forest classifier and Support Vector Machines (SVM) were trained over a HAR dataset on Python and the accuracies achieved by each of the algorithms were compared with each other.
Ahamasaleh / Deep Learning For Intrusion Detection Using Recurrent Neural Network RNNDeep Learning techniques can be implemented in the field of cybersecurity to handle the issues related to intrusion just as they have been successfully implemented in the areas such as computer vision and natural language processing (NLP). RNN model is compared with J48, Artificial Neural Network, Random Forest, Support Vector Machine and other machine learning techniques to detect malicious attacks in terms of binary and multiclass classifications.
dsgiitr / Kge GraphRAGKnowledge Graph Embeddings (KGE) for RAG-LLMs. Our goal was to compare the mathematical differences between Traditional Static Multimodal Vector Embeddings (TVE) from Word2Vec and CLIP encoders for {text:image} datasets, and Knowledge Graph Embeddings (KGE) generated with REBEL triplets trained on PyKeen.
arpit3043 / Extractive Text SummerizationSummarization systems often have additional evidence they can utilize in order to specify the most important topics of document(s). For example, when summarizing blogs, there are discussions or comments coming after the blog post that are good sources of information to determine which parts of the blog are critical and interesting. In scientific paper summarization, there is a considerable amount of information such as cited papers and conference information which can be leveraged to identify important sentences in the original paper. How text summarization works In general there are two types of summarization, abstractive and extractive summarization. Abstractive Summarization: Abstractive methods select words based on semantic understanding, even those words did not appear in the source documents. It aims at producing important material in a new way. They interpret and examine the text using advanced natural language techniques in order to generate a new shorter text that conveys the most critical information from the original text. It can be correlated to the way human reads a text article or blog post and then summarizes in their own word. Input document → understand context → semantics → create own summary. 2. Extractive Summarization: Extractive methods attempt to summarize articles by selecting a subset of words that retain the most important points. This approach weights the important part of sentences and uses the same to form the summary. Different algorithm and techniques are used to define weights for the sentences and further rank them based on importance and similarity among each other. Input document → sentences similarity → weight sentences → select sentences with higher rank. The limited study is available for abstractive summarization as it requires a deeper understanding of the text as compared to the extractive approach. Purely extractive summaries often times give better results compared to automatic abstractive summaries. This is because of the fact that abstractive summarization methods cope with problems such as semantic representation, inference and natural language generation which is relatively harder than data-driven approaches such as sentence extraction. There are many techniques available to generate extractive summarization. To keep it simple, I will be using an unsupervised learning approach to find the sentences similarity and rank them. One benefit of this will be, you don’t need to train and build a model prior start using it for your project. It’s good to understand Cosine similarity to make the best use of code you are going to see. Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space that measures the cosine of the angle between them. Since we will be representing our sentences as the bunch of vectors, we can use it to find the similarity among sentences. Its measures cosine of the angle between vectors. Angle will be 0 if sentences are similar. All good till now..? Hope so :) Next, Below is our code flow to generate summarize text:- Input article → split into sentences → remove stop words → build a similarity matrix → generate rank based on matrix → pick top N sentences for summary.
dimanzt / Game Theoretic Network SimulatorGTNS is a discrete-event network simulator targeted primarily for research and educational use. GTNS is written in Visual C++ programming language and supports different network topologies. This simulator was first produced to implement locally multipath adaptive routing (LMAR) protocol, classified as a new reactive distance vector routing protocol for MANETs. LMAR can find an ad-hoc path without selfish nodes and wormholes using an exhaustive search algorithm in polynomial time. Also when the primary path fails, it discovers an alternative safe path if network graph remains connected after eliminating selfish/malicious nodes. The key feature of LMAR to seek safe route free of selfish and malicious nodes in polynomial time is its searching algorithm and flooding stage that its generated traffic is equi-loaded compared to single-path routing protocols but its security efficiency to bypass the attacks is much better than the other multi-path routing protocols. LMAR concept is introduced to provide the security feature known as availability and a simulator has been developed to analyze its behavior in complex network environments [1]. Then we have added detection mechanism to the simulator, which can detect selfish nodes in network. The proposed algorithm is resilient against collision and can be used in networks which wireless nodes use directional antennas and it also defend against an attack that malicious nodes try to break communications by relaying the packets in a specific direction. Some game theoretic strategies to enforce cooperation in network have been implemented in GTNS, for example Forwarding-Ratio Strategy, TFT-Strategy and ERTFT. This tutorial helps new users to get familiar with GTNS and run different network scenarios.
Sission / Coupled VAE Improved Robustness And Accuracy Of A Variational AutoencoderWe present a coupled Variational Auto-Encoder (VAE) method that improves the accuracy and robustness of the probabilistic inferences on represented data. The new method models the dependency between input feature vectors (images) and weighs the outliers with a higher penalty by generalizing the original loss function to the coupled entropy function, using the principles of nonlinear statistical coupling. We evaluate the performance of the coupled VAE model using the MNIST dataset. Compared with the traditional VAE algorithm, the output images generated by the coupled VAE method are clearer and less blurry. The visualization of the input images embedded in 2D latent variable space provides a deeper insight into the structure of new model with coupled loss function: the latent variable has a smaller deviation and the output values are generated by a more compact latent space. We analyze the histograms of probabilities for the input images using the generalized mean metrics, in which increased geometric mean illustrates that the average likelihood of input data is improved. Increases in the -2/3 mean, which is sensitive to outliers, indicates improved robustness. The decisiveness, measured by the arithmetic mean of the likelihoods, is unchanged and -2/3 mean shows that the new model has better robustness.
bhaveshjaggi / PestDetectionPEST DETECTION USING IMAGE PROCESSING e The principal idea which empowered us to work on the project PEST DETECTION USING IMAGE PROCESSING is to ensure improved and better farming techniques for farmers. Our Solution: The techniques of image analysis are extensively applied to agricultural science, and it provides maximum protection to crops and also much less use of pesticides which can ultimately lead to better crop management and production. The following softwares are required for the project: OpenCV with C++/Python : It is a library which is designed for computational efficiency with a strong focus on real time applications. Pest Detection System Following are the image processing steps which are used in the proposed system. >Color Image to Gray Image Conversion Therefore, images are converted into gray scale images so that they can be handled easily and require less storage. The following equation shows how images are converted into gray scale images. I(x,y)=0.2989*B +0.5870*G +0.1140*B > Image Filtering The PSNR value is calculated for both the average and median resulting images .The average filter provides better result as compared to the median filter. So this paper uses average filter for further processing. > Image Segmentation To detect the pests from the images, the image background is calculated using morphological operators which is most critical after this image is subtracted from the original image. So the resulting image will only have the objects with pixel values 1 and background pixel values 0. >Noise Removal Noise contains dew drops, dust and other visible parts of leaves. As only the object of interest was to be visible on the images,so the aim was to remove the noise to get better and effective results. The Erosion algorithm has been used to remove isolated noisy pixels and to smoothen object boundaries . After noise removal,the next goal was to enhance the detected pests after segmentation which was performed by using the dilation algorithm. >Feature Extraction Different properties of the images are calculated on the basis of those attributes using which image is classified. For image properties, gray level co-occurrence matrix and regional properties of the images are calculated. These properties are used to train the support vector machine to classify images. >Counting of the pests on the leaves is the main purpose, so that it can give an idea of how much pests are there on a leaf.It uses Moore neighborhood tracing algorithm and Jacob's stopping criterion Feasibility: The present framework of pest detection is quite tedious and laborious for the farmers as they have to carry out their acre-acres surveys themselves and it requires a lot of vigorous efforts to achieve the same.Image analysis provides a realistic opportunity for the automation of insect pest detection.Through this system, crop technicians can easily count the pests from the collected specimens, and right pests’ management can be applied to increase both the quantity and quality of production. Using the automated system, crop technicians can make the monitoring process easier. So in order to bring enhancements in the system,we came up with more productive and well organised system with our idea .Due to this automaton applied,lucrativeness increases and labour is reduced.