sayantann11 / All Classification Templetes For ML

Classification - Machine Learning

This is the 'Classification' tutorial, which is part of the Machine Learning course offered by Simplilearn. In this tutorial we will learn about classification algorithms, the types of classification algorithms, Support Vector Machines (SVM), Naive Bayes, the Decision Tree classifier, and the Random Forest classifier.

Objectives

Let us look at some of the objectives covered under this section of the Machine Learning tutorial.

- Define Classification and list its algorithms
- Describe Logistic Regression and Sigmoid Probability
- Explain K-Nearest Neighbors and KNN classification
- Understand Support Vector Machines, Polynomial Kernel, and Kernel Trick
- Analyze Kernel Support Vector Machines with an example
- Implement the Naïve Bayes Classifier
- Demonstrate the Decision Tree Classifier
- Describe the Random Forest Classifier

Classification: Meaning

Classification is a type of supervised learning. It specifies the class to which data elements belong and is best used when the output has finite and discrete values. It predicts a class for an input variable. There are two types of classification:

- Binomial
- Multi-Class

Classification: Use Cases

Some of the key areas where classification is used:

- To find whether an email received is spam or ham
- To identify customer segments
- To find whether a bank loan should be granted
- To identify whether a student will pass or fail an examination

Classification: Example

Social media sentiment analysis has two potential outcomes, positive or negative, as displayed by the chart given below.
https://www.simplilearn.com/ice9/free_resources_article_thumb/classification-example-machine-learning.JPG

This chart shows the classification of the Iris flower dataset into its three sub-species, indicated by codes 0, 1, and 2.
https://www.simplilearn.com/ice9/free_resources_article_thumb/iris-flower-dataset-graph.JPG

The test-set dots represent the assignment of new test data points to one class or the other based on the trained classifier model.

Types of Classification Algorithms

Let's have a quick look at the types of classification algorithms.

Linear models:
- Logistic Regression
- Support Vector Machines

Nonlinear models:
- K-nearest Neighbors (KNN)
- Kernel Support Vector Machines (SVM)
- Naïve Bayes
- Decision Tree Classification
- Random Forest Classification

Logistic Regression: Meaning

Logistic Regression is a regression model that is used for classification. It is widely used for binary classification problems and can also be extended to multi-class classification problems. Here, the dependent variable is categorical: y ∈ {0, 1}.

A binary dependent variable can have only two values, like 0 or 1, win or lose, pass or fail, healthy or sick, etc.

In this case, you model the probability that the output y is 1 or 0. This is called the sigmoid probability (σ). If σ(θᵀx) > 0.5, set y = 1, else set y = 0.

Unlike Linear Regression (and its Normal Equation solution), there is no closed-form solution for finding the optimal weights of Logistic Regression. Instead, you solve it with maximum likelihood estimation, which finds the parameter values that make the observed data most probable. It can be used to calculate the probability of a given outcome in a binary model, like the probability of being classified as sick or of passing an exam.
https://www.simplilearn.com/ice9/free_resources_article_thumb/logistic-regression-example-graph.JPG
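As an illustration, here is a minimal scikit-learn sketch of binary logistic regression on a made-up pass/fail example (the hours-studied numbers are invented purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied vs. pass (1) / fail (0)
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# Sigmoid probability sigma(theta^T x) for a new student who studied 2.2 hours
print(clf.predict_proba([[2.2]]))  # [[P(y=0), P(y=1)]]
print(clf.predict([[2.2]]))        # 1 if sigma(theta^T x) > 0.5, else 0
```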
Sigmoid Probability

The probability in logistic regression is often represented by the sigmoid function (also called the logistic function or the S-curve):
https://www.simplilearn.com/ice9/free_resources_article_thumb/sigmoid-function-machine-learning.JPG

In this equation, t represents the input value (for example, the number of hours studied) and S(t) represents the probability of passing the exam.

Assume the sigmoid function:
https://www.simplilearn.com/ice9/free_resources_article_thumb/sigmoid-probability-machine-learning.JPG

g(z) tends toward 1 as z → ∞, and g(z) tends toward 0 as z → −∞.

K-nearest Neighbors (KNN)

The K-nearest Neighbors algorithm assigns a data point to a class based on a similarity measurement. It is a supervised method for classification. The steps of the KNN algorithm are given below:
https://www.simplilearn.com/ice9/free_resources_article_thumb/knn-distribution-graph-machine-learning.JPG

1. Choose the number k and a distance metric (k = 5 is common).
2. Find the k nearest neighbors of the sample that you want to classify.
3. Assign the class label by majority vote.

KNN Classification

A new input point is classified into the category from which it has the most neighbors. For example:
https://www.simplilearn.com/ice9/free_resources_article_thumb/knn-classification-machine-learning.JPG

- Classify a patient as high risk or low risk.
- Mark an email as spam or ham.
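A minimal sketch using scikit-learn's KNeighborsClassifier on the Iris dataset, with k = 5 as noted above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# k = 5 neighbors with the default Euclidean (Minkowski p=2) distance metric
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

print(knn.score(X_test, y_test))  # fraction of correctly classified test samples
```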
Support Vector Machine (SVM)

Let us understand Support Vector Machines (SVMs) in detail below.

SVMs are classification algorithms used to assign data to various classes. They involve detecting hyperplanes that segregate data into classes. SVMs are very versatile and are capable of performing linear or nonlinear classification, regression, and outlier detection. Once the ideal hyperplane is discovered, new data points can be easily classified.
https://www.simplilearn.com/ice9/free_resources_article_thumb/support-vector-machines-graph-machine-learning.JPG

The optimization objective is to find the "maximum margin hyperplane": the hyperplane that is farthest from the closest points of the two classes (these closest points are called support vectors). In the given figure, the middle line represents the hyperplane.

SVM Example

Let's look at the image below to get an idea of SVMs in general. Hyperplanes with larger margins have lower generalization error. The positive and negative hyperplanes are represented by:
https://www.simplilearn.com/ice9/free_resources_article_thumb/positive-negative-hyperplanes-machine-learning.JPG

Classification of any new input sample x_test:

- If w0 + wᵀx_test > 1, the sample x_test is in the class on the side of the positive hyperplane.
- If w0 + wᵀx_test < -1, the sample x_test is in the class on the side of the negative hyperplane.

When you subtract the two hyperplane equations, you get:
https://www.simplilearn.com/ice9/free_resources_article_thumb/equation-subtraction-machine-learning.JPG

The length of the vector w (its L2 norm) is:
https://www.simplilearn.com/ice9/free_resources_article_thumb/length-of-vector-machine-learning.JPG

You normalize by the length of w to arrive at:
https://www.simplilearn.com/ice9/free_resources_article_thumb/normalize-equation-machine-learning.JPG

SVM: Hard Margin Classification

Given below are some points to understand Hard Margin Classification.

The left side of equation SVM-1 given above can be interpreted as the distance between the positive (+ve) and negative (-ve) hyperplanes; in other words, it is the margin, which is to be maximized. Hence the objective is to maximize the margin, with the constraint that the samples are classified correctly, which is represented as:
https://www.simplilearn.com/ice9/free_resources_article_thumb/hard-margin-classification-machine-learning.JPG

This means that you are minimizing ‖w‖. It also means that all positive samples lie on one side of the positive hyperplane and all negative samples lie on the other side of the negative hyperplane. This can be written concisely as:
https://www.simplilearn.com/ice9/free_resources_article_thumb/hard-margin-classification-formula.JPG

Minimizing ‖w‖ is the same as minimizing ½‖w‖², and the latter form is preferable because it is differentiable even at w = 0. The approach listed above is called the "hard margin linear SVM classifier."

SVM: Soft Margin Classification

Given below are some points to understand Soft Margin Classification.

To relax the linear constraints for data that is not linearly separable, a slack variable ξ(i) is introduced. ξ(i) measures how much the i-th instance is allowed to violate the margin. The slack variable is simply added to the linear constraints:
https://www.simplilearn.com/ice9/free_resources_article_thumb/soft-margin-calculation-machine-learning.JPG

Subject to the above constraints, the new objective to be minimized becomes:
https://www.simplilearn.com/ice9/free_resources_article_thumb/soft-margin-calculation-formula.JPG

You now have two conflicting objectives: minimizing the slack variables to reduce margin violations, and minimizing ½‖w‖² to increase the margin. The hyperparameter C defines this trade-off. Large values of C correspond to larger error penalties (and hence smaller margins), whereas smaller values of C tolerate more misclassification errors and give larger margins.

SVM: Regularization

The parameter C acts inversely to regularization: higher C means lower regularization, which lowers the bias and raises the variance (risking overfitting).
https://www.simplilearn.com/ice9/free_resources_article_thumb/concept-of-c-graph-machine-learning.JPG

IRIS Data Set

The Iris dataset contains measurements of 150 iris flowers from three different species:

- Setosa
- Versicolor
- Virginica

Each row represents one sample, and the flower measurements in centimeters are stored as columns. These are called features.

IRIS Data Set: SVM

Let's train an SVM model using scikit-learn for the Iris dataset:
https://www.simplilearn.com/ice9/free_resources_article_thumb/svm-model-graph-machine-learning.JPG

Nonlinear SVM Classification

There are two ways to solve nonlinear SVMs:

- by adding polynomial features
- by adding similarity features

Polynomial features can be added to a dataset; in some cases, this makes the dataset linearly separable.
https://www.simplilearn.com/ice9/free_resources_article_thumb/nonlinear-classification-svm-machine-learning.JPG

In the figure on the left, there is only one feature, x1, and the dataset is not linearly separable. If you add a second feature x2 = (x1)², the data becomes linearly separable (figure on the right).

Polynomial Kernel

In scikit-learn, you can use the Pipeline class to create the polynomial features. Classification results for the Moons dataset are shown in the figure.
https://www.simplilearn.com/ice9/free_resources_article_thumb/polynomial-kernel-machine-learning.JPG
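A minimal sketch of this pipeline approach, assuming scikit-learn's make_moons helper and illustrative hyperparameter values:

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=100, noise=0.15, random_state=42)

# Add degree-3 polynomial features, scale them, then fit a linear SVM
polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),
    ("scaler", StandardScaler()),
    ("svm_clf", LinearSVC(C=10, loss="hinge", max_iter=10000)),
])
polynomial_svm_clf.fit(X, y)
print(polynomial_svm_clf.predict([[1.5, -0.5]]))  # class for a new point
```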
Polynomial Kernel with Kernel Trick

Let us look at the image below and understand the kernel trick in detail.
https://www.simplilearn.com/ice9/free_resources_article_thumb/polynomial-kernel-with-kernel-trick.JPG

For high-dimensional datasets, adding too many polynomial features can slow down the model. Instead, you can apply the kernel trick, which gives the same effect as polynomial features without actually adding them. The code shown below (using the SVC class) trains an SVM classifier with a 3rd-degree polynomial kernel via the kernel trick.
https://www.simplilearn.com/ice9/free_resources_article_thumb/polynomial-kernel-equation-machine-learning.JPG

The hyperparameter coef0 controls the influence of the high-degree polynomial terms.

Kernel SVM

Let us understand Kernel SVMs in detail.

Kernel SVMs are used for the classification of nonlinear data. In the chart, the nonlinear data is projected into a higher-dimensional space via a mapping function, where it becomes linearly separable.
https://www.simplilearn.com/ice9/free_resources_article_thumb/kernel-svm-machine-learning.JPG

In the higher dimension, a linear separating hyperplane can be derived and used for classification. A reverse projection back to the original feature space returns the decision boundary to its nonlinear shape.

As mentioned previously, SVMs can be kernelized to solve nonlinear classification problems. You can create a sample dataset for an XOR gate (a nonlinear problem) with NumPy: 100 samples are assigned the class label 1, and 100 samples are assigned the class label -1.
https://www.simplilearn.com/ice9/free_resources_article_thumb/kernel-svm-graph-machine-learning.JPG

As you can see, this data is not linearly separable.
https://www.simplilearn.com/ice9/free_resources_article_thumb/kernel-svm-non-separable.JPG

You now use the kernel trick to classify the XOR dataset created earlier.
https://www.simplilearn.com/ice9/free_resources_article_thumb/kernel-svm-xor-machine-learning.JPG
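A minimal sketch of the kernel trick on such an XOR dataset. The tutorial's own code is only shown as an image, so this is a stand-in; the RBF kernel and the gamma/C values are illustrative choices:

```python
import numpy as np
from sklearn.svm import SVC

# XOR dataset: label 1 where the signs of the two features differ, else -1
rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = np.where(np.logical_xor(X[:, 0] > 0, X[:, 1] > 0), 1, -1)

# Kernelized SVM: no explicit feature expansion is ever computed
svm = SVC(kernel="rbf", gamma=0.10, C=10.0)
svm.fit(X, y)
print(svm.score(X, y))  # training accuracy on the nonlinear XOR data

# The 3rd-degree polynomial kernel variant, with coef0 as discussed above
poly_svm = SVC(kernel="poly", degree=3, coef0=1, C=5)
poly_svm.fit(X, y)
```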
Naïve Bayes Classifier

What is a Naive Bayes Classifier? Have you ever wondered how your mail provider implements spam filtering, how online news channels perform news text classification, or how companies perform sentiment analysis of their audience on social media? All of this and more can be done with a machine learning algorithm called the Naive Bayes Classifier.

Naive Bayes

The method is named after Thomas Bayes, who first formulated the underlying theorem in the 1700s. The Naive Bayes classifier works on the principle of conditional probability, as given by Bayes' theorem.

Advantages of Naive Bayes Classifier

Listed below are six benefits of the Naive Bayes Classifier:

- Very simple and easy to implement
- Needs less training data
- Handles both continuous and discrete data
- Highly scalable with the number of predictors and data points
- Fast, so it can be used for real-time predictions
- Not sensitive to irrelevant features

Bayes Theorem

According to the Bayes model, the conditional probability P(Y|X) can be calculated as:

P(Y|X) = P(X|Y) P(Y) / P(X)

Estimating P(X|Y) directly is impractical even for a relatively small feature vector X: for a Boolean Y and 30 Boolean attributes in the X vector, you would have to estimate billions of probabilities P(X|Y), one for each of the 2³⁰ possible attribute combinations per class. To make it practical, the Naïve Bayes classifier assumes that the attributes of X are conditionally independent of each other, given the value of Y. This reduces the number of probability estimates to 2 × 30 = 60 in the above example.

Naïve Bayes Classifier for SMS Spam Detection

Consider a labeled SMS database of 5,574 messages, like those shown below:
https://www.simplilearn.com/ice9/free_resources_article_thumb/naive-bayes-spam-machine-learning.JPG

Each message in the dataset is marked as spam or ham. Let's train a model with the Naïve Bayes algorithm to distinguish spam from ham. The message lengths and their frequency in the training dataset are shown below:
https://www.simplilearn.com/ice9/free_resources_article_thumb/naive-bayes-spam-spam-detection.JPG

The logic used to train the spam detector:

1. Split each message into individual words/tokens (a bag of words).
2. Lemmatize the data (reduce each word to its base form, e.g. "walking" or "walked" becomes "walk").
3. Convert the data to vectors using the scikit-learn CountVectorizer.
4. Apply TF-IDF to down-weight common words like "is," "are," and "and."
5. Apply scikit-learn's MultinomialNB Naïve Bayes model to get the spam detector.

This spam detector can then be used to classify a random new message as spam or ham. Next, the accuracy of the spam detector is checked using the confusion matrix. For the SMS spam example above, the confusion matrix is shown on the right:

Accuracy Rate = Correct / Total = (4827 + 592) / 5574 = 97.21%
Error Rate = Wrong / Total = (155 + 0) / 5574 = 2.78%
https://www.simplilearn.com/ice9/free_resources_article_thumb/confusion-matrix-machine-learning.JPG

Although the confusion matrix is useful, precision and recall provide more focused metrics.
https://www.simplilearn.com/ice9/free_resources_article_thumb/precision-recall-matrix-machine-learning.JPG

Precision refers to the accuracy of positive predictions.
https://www.simplilearn.com/ice9/free_resources_article_thumb/precision-formula-machine-learning.JPG

Recall refers to the ratio of positive instances that are correctly detected by the classifier (also known as the true positive rate, or TPR).
https://www.simplilearn.com/ice9/free_resources_article_thumb/recall-formula-machine-learning.JPG

Precision/Recall Trade-off

To detect age-appropriate videos for kids, you need high precision (low recall is acceptable) to ensure that only safe videos make the cut, even though a few safe videos may be left out. High recall (with low precision acceptable) is needed in store surveillance to catch shoplifters: a few false alarms are acceptable, but all shoplifters must be caught.
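A minimal sketch of steps 3-5 above, assuming a list of (message, label) pairs is already loaded; the two toy messages here are invented for illustration, and the real model would be fit on the full 5,574-message corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the SMS corpus
messages = ["WINNER!! Claim your free prize now", "Are we still meeting for lunch?"]
labels = ["spam", "ham"]

spam_detector = Pipeline([
    ("bow", CountVectorizer()),        # bag-of-words token counts
    ("tfidf", TfidfTransformer()),     # down-weight common words
    ("classifier", MultinomialNB()),   # Naive Bayes classifier
])
spam_detector.fit(messages, labels)
print(spam_detector.predict(["Free entry: text WIN to claim"]))
```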
Decision Tree Classifier

Some aspects of the Decision Tree classifier are mentioned below.

- Decision Trees (DT) can be used both for classification and regression.
- The advantage of decision trees is that they require very little data preparation; they do not require feature scaling or centering at all.
- They are also the fundamental components of Random Forests, one of the most powerful ML algorithms.
- Unlike Random Forests and Neural Networks (which do black-box modeling), Decision Trees are white-box models, meaning their inner workings can be clearly understood.

In the case of classification, the data is segregated based on a series of questions. Any new data point is assigned to the selected leaf node.
https://www.simplilearn.com/ice9/free_resources_article_thumb/decision-tree-classifier-machine-learning.JPG

Start at the tree root and split the data on the feature that results in the largest information gain (IG). This splitting procedure is repeated iteratively at each child node until the leaves are pure, i.e. until the samples at each node all belong to the same class. In practice, you can set a limit on the depth of the tree to prevent overfitting; purity is compromised, as the final leaves may then still have some impurity. The figure shows the classification of the Iris dataset.
https://www.simplilearn.com/ice9/free_resources_article_thumb/decision-tree-classifier-graph.JPG

IRIS Decision Tree

Let's build a Decision Tree using scikit-learn for the Iris flower dataset and visualize it using the export_graphviz API.
https://www.simplilearn.com/ice9/free_resources_article_thumb/iris-decision-tree-machine-learning.JPG

The output of export_graphviz can be converted into PNG format:
https://www.simplilearn.com/ice9/free_resources_article_thumb/iris-decision-tree-output.JPG

- The samples attribute gives the number of training instances the node applies to.
- The value attribute gives the number of training instances of each class the node applies to.
- Gini impurity measures the node's impurity. A node is "pure" (gini = 0) if all training instances it applies to belong to the same class.
https://www.simplilearn.com/ice9/free_resources_article_thumb/impurity-formula-machine-learning.JPG

For example, for the Versicolor node (green), the Gini impurity is 1 − (0/54)² − (49/54)² − (5/54)² ≈ 0.168.
https://www.simplilearn.com/ice9/free_resources_article_thumb/iris-decision-tree-sample.JPG

Decision Boundaries

Let us learn about decision boundaries below.

For the first node (depth 0), the solid line splits the data (Iris-Setosa on the left). Gini is 0 for the Setosa node, so no further split is possible there. The second node (depth 1) splits the data into Versicolor and Virginica. If max_depth were set to 3, a third split would happen (the vertical dotted line).
https://www.simplilearn.com/ice9/free_resources_article_thumb/decision-tree-boundaries.JPG

For a sample with petal length 5 cm and petal width 1.5 cm, the tree reaches the depth-2 left node, so the probability predictions for this sample are 0% for Iris-Setosa (0/54), 90.7% for Iris-Versicolor (49/54), and 9.3% for Iris-Virginica (5/54).

CART Training Algorithm

Scikit-learn uses the Classification and Regression Trees (CART) algorithm to train Decision Trees. The CART algorithm splits the data into two subsets using a single feature k and a threshold t_k (for example, petal length < 2.45 cm), and does so recursively for each node. k and t_k are chosen such that they produce the purest subsets (weighted by their size). The objective is to minimize the cost function given below:
https://www.simplilearn.com/ice9/free_resources_article_thumb/cart-training-algorithm-machine-learning.JPG

The algorithm stops when one of the following occurs:

- max_depth is reached
- No further split can be found for any node

Other hyperparameters may be used to stop the tree's growth:

- min_samples_split
- min_samples_leaf
- min_weight_fraction_leaf
- max_leaf_nodes
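A minimal sketch of the Iris tree discussed above (depth limited to 2, petal features only), assuming the Graphviz `dot` tool is available for rendering:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

# Write the tree in Graphviz .dot format; convert with: dot -Tpng iris_tree.dot -o iris_tree.png
export_graphviz(
    tree_clf,
    out_file="iris_tree.dot",
    feature_names=["petal length (cm)", "petal width (cm)"],
    class_names=iris.target_names,
    rounded=True,
    filled=True,
)

# Class probabilities for petal length 5 cm, petal width 1.5 cm
print(tree_clf.predict_proba([[5.0, 1.5]]))  # ~ [[0.0, 0.907, 0.093]]
```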
Gini Impurity or Entropy

Entropy is another measure of impurity and can be used in place of Gini.
https://www.simplilearn.com/ice9/free_resources_article_thumb/gini-impurity-entrophy.JPG

Entropy is a measure of uncertainty, and Information Gain is the reduction in entropy as one traverses down the tree. Entropy is zero for a DT node when the node contains instances of only one class. The entropy for the depth-2 left node in the example given above is:
https://www.simplilearn.com/ice9/free_resources_article_thumb/entrophy-for-depth-2.JPG

Gini and entropy both lead to similar trees.

DT: Regularization

The following figure shows two decision trees trained on the Moons dataset.
https://www.simplilearn.com/ice9/free_resources_article_thumb/dt-regularization-machine-learning.JPG

The decision tree on the right is restricted by min_samples_leaf = 4. The model on the left is overfitting, while the model on the right generalizes better.

Random Forest Classifier

Let us get an understanding of the Random Forest classifier below. A random forest can be considered an ensemble of decision trees (ensemble learning).

The Random Forest algorithm:

1. Draw a random bootstrap sample of size n (randomly choose n samples from the training set, with replacement).
2. Grow a decision tree from the bootstrap sample. At each node, randomly select d features, and split the node using the feature that provides the best split according to the objective function, for instance by maximizing information gain.
3. Repeat steps 1 and 2 k times (k is the number of trees you want to create, each built from a subset of samples).
4. Aggregate the predictions of the trees for a new data point and assign the class label by majority vote (pick the class selected by the most trees and assign the new data point to it).

Random Forests are opaque, which means it is difficult to visualize their inner workings.
https://www.simplilearn.com/ice9/free_resources_article_thumb/random-forest-classifier-graph.JPG

However, the advantages outweigh this limitation, since you do not have to worry about hyperparameters except k, the number of decision trees to be created. RF is quite robust to noise from the individual decision trees, so you need not prune the individual trees. The larger the number of decision trees, the more accurate the Random Forest prediction is (this, however, comes at a higher computational cost).

Key Takeaways

Let us quickly run through what we have learned so far in this Classification tutorial.

- Classification algorithms are supervised learning methods that split data into classes. They can work on linear as well as nonlinear data.
- Logistic Regression classifies data using weighted parameters and a sigmoid conversion to calculate the probability of each class.
- The K-nearest Neighbors (KNN) algorithm uses feature similarity to classify data.
- Support Vector Machines (SVMs) classify data by detecting the maximum-margin hyperplane between data classes.
- Naïve Bayes, a simplified Bayes model, classifies data using conditional probability models.
- Decision Trees are powerful classifiers that split the data until pure (or nearly pure) leaf-node classes are attained.
- Random Forests apply ensemble learning to Decision Trees for more accurate classification predictions.

Conclusion

This completes the 'Classification' tutorial. In the next tutorial, we will learn 'Unsupervised Learning with Clustering.'
himanshub1007 / Alzhimers Disease Prediction Using Deep Learning

# AD-Prediction

Convolutional Neural Networks for Alzheimer's Disease Prediction Using Brain MRI Images

## Abstract

Alzheimer's disease (AD) is characterized by severe memory loss and cognitive impairment. It is associated with significant brain structure changes, which can be measured by magnetic resonance imaging (MRI) scans. These observable preclinical structural changes provide an opportunity for early AD detection using image classification tools, such as convolutional neural networks (CNNs). However, most AD-related studies to date have been limited by sample size, so finding an efficient way to train an image classifier on limited data is critical. In our project, we explored different CNN-based transfer-learning methods for predicting AD from structural brain MRI images. We found that both a pretrained 2D AlexNet with a 2D-representation method and a simple neural network with a pretrained 3D autoencoder improved prediction performance compared to a deep CNN trained from scratch. The pretrained 2D AlexNet performed even better (**86%**) than the 3D CNN with the autoencoder (**77%**).

## Method

#### 1. Data

In this project, we used public brain MRI data from the **Alzheimer's Disease Neuroimaging Initiative (ADNI)** study. ADNI is an ongoing, multicenter cohort study that started in 2004. It focuses on understanding the diagnostic and predictive value of Alzheimer's disease-specific biomarkers. The ADNI study has three phases: ADNI1, ADNI-GO, and ADNI2. Both ADNI1 and ADNI2 recruited new AD patients and normal controls as research participants. Our data included a total of 686 structural MRI scans from the ADNI1 and ADNI2 phases, with 310 AD cases and 376 normal controls. We randomly divided the sample into a training dataset (n = 519), a validation dataset (n = 100), and a testing dataset (n = 67).

#### 2. Image preprocessing

Image preprocessing was conducted using Statistical Parametric Mapping (SPM) software, version 12. The original MRI scans were first skull-stripped and segmented using a segmentation algorithm based on 6-tissue probability mapping, and then normalized to the International Consortium for Brain Mapping template of European brains using affine registration. Other preprocessing steps included bias, noise, and global intensity normalization. The standard preprocessing pipeline outputs 3D image files with a uniform size of 121x145x121. Skull-stripping and normalization ensure comparability between images by transforming the original brain image into a standard image space, so that the same brain substructures are aligned at the same image coordinates for different participants. Diluted or enhanced intensity was used to compensate for the structural changes. In our project, we used both the whole brain (including both grey matter and white matter) and grey matter only.

#### 3. AlexNet and Transfer Learning

Convolutional Neural Networks (CNNs) are very similar to ordinary neural networks. A CNN consists of an input and an output layer, as well as multiple hidden layers. The hidden layers are either convolutional, pooling, or fully connected. ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These make the forward function more efficient to implement and vastly reduce the number of parameters in the network.

#### 3.1. AlexNet

The network contains eight layers with weights; the first five are convolutional and the remaining three are fully connected.
The overall architecture is shown in Figure 1. The output of the last fully-connected layer is fed to a 1000-way softmax, which produces a distribution over the 1000 class labels. AlexNet maximizes the multinomial logistic regression objective, which is equivalent to maximizing the average across training cases of the log-probability of the correct label under the prediction distribution. The kernels of the second, fourth, and fifth convolutional layers are connected only to those kernel maps in the previous layer that reside on the same GPU (as shown in Figure 1). The kernels of the third convolutional layer are connected to all kernel maps in the second layer. The neurons in the fully connected layers are connected to all neurons in the previous layer. Response-normalization layers follow the first and second convolutional layers. Max-pooling layers follow both response-normalization layers as well as the fifth convolutional layer. The ReLU non-linearity is applied to the output of every convolutional and fully-connected layer.

The first convolutional layer filters the 224x224x3 input image with 96 kernels of size 11x11x3 with a stride of 4 pixels (this is the distance between the receptive field centers of neighboring neurons in a kernel map). The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5x5x48. The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers. The third convolutional layer has 384 kernels of size 3x3x256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth convolutional layer has 384 kernels of size 3x3x192, and the fifth convolutional layer has 256 kernels of size 3x3x192. The fully-connected layers have 4096 neurons each.

#### 3.2. Transfer Learning

Training an entire convolutional network from scratch (with random initialization) is impractical[14] because it is relatively rare to have a dataset of sufficient size. An alternative is to pretrain a ConvNet on a very large dataset (e.g. ImageNet), and then use the ConvNet either as an initialization or as a fixed feature extractor for the task of interest. Typically, there are three major transfer learning scenarios:

**ConvNet as fixed feature extractor:** We can take a ConvNet pretrained on ImageNet, remove the last fully-connected layer, and treat the remaining structure as a fixed feature extractor for the target dataset. In AlexNet, this yields a 4096-D vector for each image. We usually call these features CNN codes. Once we have these features, we can train a linear classifier (e.g. a linear SVM or softmax classifier) on our target dataset.

**Fine-tuning the ConvNet:** Another idea is to not only replace the last fully-connected layer of the classifier, but to also fine-tune the weights of the pretrained network. Due to overfitting concerns, we may fine-tune only some higher-level portion of the network. This is motivated by the observation that the earlier layers of a ConvNet contain more generic features (e.g. edge detectors or color blob detectors) that are useful for many kinds of tasks, while the later layers become progressively more specific to the details of the classes contained in the original dataset.

**Pretrained models:** The released pretrained model is usually the final ConvNet checkpoint, so it is common to see people use such a network for fine-tuning.
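As an illustration of the fixed-feature-extractor versus fine-tuning split, here is a minimal PyTorch sketch (this is not our project's actual training code; the two-way output simply reflects the AD-vs-normal-control task):

```python
import torch.nn as nn
from torchvision import models

# Load AlexNet pretrained on ImageNet
model = models.alexnet(pretrained=True)

# Scenario 1: fixed feature extractor - freeze all convolutional weights
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final 1000-way layer with a 2-way output (AD vs. normal control)
model.classifier[6] = nn.Linear(4096, 2)

# Scenario 2: fine-tuning - instead of freezing, leave requires_grad=True
# for the higher layers and train them with a small learning rate.
```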
#### 4. 3D Autoencoder and Convolutional Neural Network

We take a two-stage approach: we first train a 3D sparse autoencoder to learn filters for convolution operations, and then build a convolutional neural network whose first layer uses the filters learned by the autoencoder.

#### 4.1. Sparse Autoencoder

An autoencoder is a 3-layer neural network that is used to extract features from an input such as an image. Sparse representations can provide a simple interpretation of the input data in terms of a small number of "parts" by extracting the structure hidden in the data. The autoencoder has an input layer, a hidden layer, and an output layer; the input and output layers have the same number of units, while the hidden layer contains more units, for a sparse and overcomplete representation. The encoder function maps the input x to a representation h, and the decoder function maps the representation h back to an output that aims to reconstruct the input x from the hidden representation h. In our problem, we extract 3D patches from the scans as the input to the network.

#### 4.2. 3D Convolutional Neural Network

Training the 3D convolutional neural network (CNN) is the second stage. The CNN we use in this project has one convolutional layer, one pooling layer, two linear layers, and finally a log-softmax layer. After training the sparse autoencoder, we take the weights and biases of the encoder from the trained model and use them as the 3D filters of the 3D convolutional layer of this 1-convolutional-layer network. Figure 2 shows the architecture of the network.

#### 5. Tools

In this project, we used NiBabel for MRI image processing and PyTorch for the neural network implementations.
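A minimal PyTorch sketch of the two-stage idea described in Sections 4.1-4.2; the patch size and number of filters here are illustrative assumptions, not the project's exact configuration:

```python
import torch
import torch.nn as nn

PATCH = 7          # assumed 3D patch size (illustrative)
N_FILTERS = 8      # assumed number of learned filters (illustrative)

# Stage 1: sparse autoencoder over flattened 3D patches
class SparseAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(PATCH**3, N_FILTERS)
        self.decoder = nn.Linear(N_FILTERS, PATCH**3)

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))   # hidden representation h
        return self.decoder(h)               # reconstruction of x

ae = SparseAutoencoder()
# ... train ae with a reconstruction loss plus a sparsity penalty on h ...

# Stage 2: reuse the trained encoder weights as 3D convolution filters
conv = nn.Conv3d(1, N_FILTERS, kernel_size=PATCH)
with torch.no_grad():
    conv.weight.copy_(ae.encoder.weight.view(N_FILTERS, 1, PATCH, PATCH, PATCH))
    conv.bias.copy_(ae.encoder.bias)
```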
SOYJUN / FTP Implement Based On UDP

The aim of this assignment is to have you do UDP socket client/server programming, with a focus on two broad aspects:

1. Setting up the exchange between the client and server in a secure way despite the lack of a formal connection (as in TCP) between the two, so that 'outsider' UDP datagrams (broadcast, multicast, unicast - fortuitously or maliciously) cannot intrude on the communication.

2. Introducing application-layer protocol data-transmission reliability, flow control, and congestion control in the client and server using TCP-like ARQ sliding-window mechanisms.

The second item above is much more of a challenge to implement than the first, though neither is particularly trivial. But they are not tightly interdependent; each can be worked on separately at first and then integrated at a later stage.

Apart from the material in Chapters 8, 14 & 22 (especially Sections 22.5 - 22.7), and the experience you gained from the preceding assignment, you will also need to refer to the following:

- The ioctl function (Chapter 17).
- The get_ifi_info function (Section 17.6, Chapter 17). This function will be used by the server code to discover its node's network interfaces so that it can bind all its interface IP addresses (see Section 22.6).
- 'Race' conditions (Section 20.5, Chapter 20).

You also need a thorough understanding of how the TCP protocol implements reliable data transfer, flow control, and congestion control. Chapters 17-24 of TCP/IP Illustrated, Volume 1 by W. Richard Stevens give a good overview of TCP. Though somewhat dated in places (it was published in 1994), it remains, overall, a good basic reference.

Overview

This assignment asks you to implement a primitive file transfer protocol for Unix platforms, based on UDP, with TCP-like reliability added to the transfer operation using timeouts and sliding-window mechanisms, and implementing flow and congestion control. The server is a concurrent server which can handle multiple clients simultaneously. A client gives the server the name of a file. The server forks off a child which reads directly from the file and transfers the contents over to the client using UDP datagrams. The client prints the file contents, in order, with nothing missing and with no duplication of content, directly to stdout as they come in (via the receiver sliding window, of course, but with no other intermediate buffering). The file to be transferred can be of arbitrary length, but its contents are always straightforward ASCII text.

As an aside, assuming the file contents are ASCII is not as restrictive as it sounds. We can always pretend, for example, that binary files are base64 encoded ("ASCII armor"). A real file transfer protocol would, of course, have to worry about transferring files between heterogeneous platforms with different file structure conventions and semantics. The sender would first have to transform the file into a platform-independent, protocol-defined format (using, say, ASN.1 or some such standard), and the receiver would have to transform the received file into its platform's native file format. This kind of thing can be fairly time-consuming, and is certainly very tedious, to implement, with little educational value - it is not part of this assignment.

Arguments for the server

You should provide the server with an input file server.in from which it reads the following information, in the order shown, one item per line:

1. Well-known port number for the server.
2. Maximum sending sliding-window size (in datagram units).

You will not be handing in your server.in file. We shall create our own when we come to test your code. So it is important that you stick strictly to the file name and content conventions specified above. The same applies to the client.in input file below.

Arguments for the client

The client is to be provided with an input file client.in from which it reads the following information, in the order shown, one item per line (an example of both input files follows):

1. IP address of the server (not the hostname).
2. Well-known port number of the server.
3. Filename to be transferred.
4. Receiving sliding-window size (in datagram units).
5. Random generator seed value.
6. Probability p of datagram loss. This should be a real number in the range [0.0, 1.0] (the value 0.0 means no loss occurs; the value 1.0 means all datagrams are lost).
7. The mean µ, in milliseconds, of the exponential distribution controlling the rate at which the client reads received datagram payloads from its receive buffer.
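For concreteness, here is a hypothetical pair of input files matching the formats above. All values are illustrative only, and the comments are for exposition; the actual files contain only the bare values, one per line:

```
server.in:
    5555            # well-known server port
    8               # max sending window size, in datagrams

client.in:
    130.245.1.25    # server IP address (not hostname)
    5555            # well-known server port
    testfile.txt    # file to transfer
    10              # receiving window size, in datagrams
    12345           # random generator seed
    0.1             # probability p of datagram loss
    250             # mean of the exponential read-delay distribution, msec
```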
Operation

1. The server starts up and reads its arguments from the file server.in.

As we shall see, when a client communicates with the server, the server will want to know what IP address that client is using to identify the server (i.e., the destination IP address in the incoming datagram). Normally, this can be done relatively straightforwardly using the IP_RECVDESTADDR socket option, and picking up the information using the ancillary data ('control information') capability of the recvmsg function. Unfortunately, Solaris 2.10 does not support the IP_RECVDESTADDR option (nor, incidentally, does it support the msg_flags member of msghdr - see p.390). This considerably complicates things.

In the absence of IP_RECVDESTADDR, what the server has to do as part of its initialization phase is to bind each IP address it has (together with its well-known port number, which it has read from server.in) to a separate UDP socket. The code in Section 22.6, which uses the get_ifi_info function, shows you how to do that. However, there are important differences between that code and the version you want to implement:

- The code of Section 22.6 binds the IP addresses and forks off a child for each address that is bound to. We do not want to do that. Instead, you should have an array of socket descriptors. For each IP address, create a new socket and bind the address (and well-known port number) to the socket, without forking off child processes. Creating child processes comes later, when clients arrive.
- The code of Section 22.6 also attempts to bind broadcast addresses. We do not want to do this. It also binds a wildcard IP address, which we certainly do not want to do either. You should bind strictly only unicast addresses (including the loopback address).
- The get_ifi_info function has to be modified so that it also gets the network masks for the IP addresses of the node, and adds these to the information stored in the linked list of ifi_info structures (see Figure 17.5, p.471) it produces.

As you bind each IP address to a distinct socket, it will be useful for later processing to build your own array of structures, where each element records the following information for one socket:

- sockfd
- IP address bound to the socket
- network mask for that IP address
- subnet address (obtained by doing a bit-wise AND between the IP address and its network mask)

Report, in a ReadMe file which you hand in with your code, on the modifications you had to introduce to ensure that only unicast addresses are bound, and on your implementation of the array of structures described above. You should print out on stdout, with an appropriate message and appropriately formatted in dotted-decimal notation, the IP address, network mask, and subnet address for each socket in your array of structures (you do not need to print the sockfd).

2. The server now uses select to monitor the sockets it has created for incoming datagrams. When it returns from select, it must use recvfrom or recvmsg to read the incoming datagram (see the note in 6. below).

3. When a client starts, it first reads its arguments from the file client.in. The client then checks whether the server host is 'local' to its (extended) Ethernet. If so, all its communication with the server is to occur as MSG_DONTROUTE (or with the SO_DONTROUTE socket option). It determines whether the server host is 'local' as follows.

The first thing the client should do is use the modified get_ifi_info function to obtain all of its IP addresses and associated network masks. Print these out on stdout, in dotted-decimal notation and with an appropriate message. In the following, IPserver designates the IP address the client will use to identify the server, and IPclient designates the IP address the client will choose to identify itself.

The client checks whether the server is on the same host. If so, it should use the loopback address 127.0.0.1 for the server (i.e., IPserver = 127.0.0.1), and IPclient should also be set to the loopback address. Otherwise it proceeds as follows: IPserver is set to the IP address for the server given in the client.in file. Given IPserver and the (unicast) IP addresses and network masks for the client returned by get_ifi_info in the linked list of ifi_info structures, you should be able to figure out whether the server node is 'local' or not. This will be discussed in class; but let me remind you here that you should use 'longest prefix matching' where applicable. If there are multiple client addresses and the server host is 'local', the client chooses for itself an IP address, IPclient, which matches up as 'local' according to the examination above. If the server host is not 'local', then IPclient can be chosen arbitrarily. Print out on stdout the results of your examination as to whether the server host is 'local' or not, as well as the IPclient and IPserver addresses selected. (A small sketch of this locality test appears below.)

Note that this manner of determining whether the server is local or not is somewhat clumsy and 'over-engineered', and, as such, should be viewed more in the nature of a pedagogical exercise. Ideally, we would like to look up the server IP address(es) in the routing table (see Section 18.3). This requires that a routing socket be created, for which we need superuser privilege. Alternatively, we might want to dump the routing table, using the sysctl function for example (see Section 18.4), and examine it directly. Unfortunately, Solaris 2.10 does not support sysctl.
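Though the assignment itself is written in C against Stevens' get_ifi_info, the prefix-matching logic is easy to sketch. Here is a small illustration in Python (the standard ipaddress module is used purely for exposition, and the interface addresses are hypothetical):

```python
import ipaddress

def server_is_local(ip_server, client_interfaces):
    """client_interfaces: list of (unicast IP address, network mask) pairs,
    as obtained from the modified get_ifi_info. Returns (is_local, IPclient),
    choosing IPclient by longest prefix matching."""
    server = ipaddress.ip_address(ip_server)
    best = None  # (prefix length, candidate IPclient)
    for addr, mask in client_interfaces:
        net = ipaddress.ip_network(f"{addr}/{mask}", strict=False)  # subnet = addr AND mask
        if server in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, addr)
    return (best is not None, best[1] if best else None)

# Hypothetical interfaces: loopback plus one Ethernet address
print(server_is_local("130.245.1.25",
                      [("127.0.0.1", "255.0.0.0"), ("130.245.1.50", "255.255.255.0")]))
# -> (True, '130.245.1.50')
```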
Furthermore, note that there is a slight problem with the address 130.245.1.123/24 assigned to compserv3 (see the rightmost column of the file hosts, and note that this particular compserv3 address "overlaps" with the 130.245.1.x/28 addresses in that same column assigned to compserv1, compserv2 & compserv4). In particular, if the client is running on compserv3 and the server on any of the other three compservs, and if that server node is also being identified to the client by its /28 (rather than its /24) address, then the client will get a "false positive" when it tests whether the server node is local. In other words, the client will deem the server node to be local, whereas in fact it should not be considered local. Because of this, it is perhaps best simply not to use compserv3 to run the client (though it is o.k. to use it to run the server).

Finally, using MSG_DONTROUTE where possible would seem to gain us efficiency, inasmuch as the kernel does not need to consult the routing table for every datagram sent. But, in fact, that is not so. Recall that one effect of connect with UDP sockets is that routing information is obtained by the kernel at the time the connect is issued. That information is cached and used for subsequent sends from the connected socket (see p.255).

4. The client now creates a UDP socket and calls bind on IPclient, with 0 as the port number. This will cause the kernel to bind an ephemeral port to the socket. After the bind, use the getsockname function (Section 4.10) to obtain IPclient and the ephemeral port number that has been assigned to the socket, and print that information out on stdout, with an appropriate message and appropriately formatted. The client then connects its socket to IPserver and the well-known port number of the server. After the connect, use the getpeername function (Section 4.10) to obtain IPserver and the well-known port number of the server, and print that information out on stdout, with an appropriate message and appropriately formatted.

5. The client sends a datagram to the server giving the filename for the transfer. This send needs to be backed up by a timeout in case the datagram is lost.

6. Note that the incoming datagram from the client will be delivered to the server on the socket to which the destination IP address the datagram carries has been bound. Thus, the server can obtain that address (it is, of course, IPserver) and thereby achieve what IP_RECVDESTADDR would have given us had it been available. Furthermore, the server process can obtain the IP address (this will, of course, be IPclient) and the ephemeral port number of the client through the recvfrom or recvmsg functions.

7. The server forks off a child process to handle the client. The server parent process goes back to the select to listen for new clients. Hereafter, and unless otherwise stated, whenever we refer to the 'server', we mean the server child process handling the client's file transfer, not the server parent process.

8. Typically, the first thing a server child would be expected to do is close all the sockets it 'inherits' from its parent. However, that is not the case here. The server child does indeed close the sockets it inherited, but not the socket on which the client request arrived. It leaves that socket open for now; call this socket the 'listening' socket. The server (child) then checks whether the client host is local to its (extended) Ethernet. If so, all its communication with the client is to occur as MSG_DONTROUTE (or with the SO_DONTROUTE socket option).
If IPserver (obtained in 6. above) is the loopback address, then we are done. Otherwise, the server proceeds as follows: use the array of structures you built in 1. above, together with the addresses IPserver and IPclient, to determine whether the client is 'local'. Print out on stdout the result of this examination, as to whether the client host is 'local' or not.

9. The server (child) creates a UDP socket to handle the file transfer to the client; call this socket the 'connection' socket. It binds this socket to IPserver, with port number 0 so that the kernel assigns an ephemeral port. After the bind, use the getsockname function (Section 4.10) to obtain IPserver and the ephemeral port number that has been assigned to the socket, and print that information out on stdout, with an appropriate message and appropriately formatted. The server then connects this 'connection' socket to the client's IPclient and ephemeral port number.

10. The server now sends the client a datagram in which it passes the ephemeral port number of its 'connection' socket as the data payload. This datagram is sent using the 'listening' socket inherited from its parent; otherwise the client (whose socket is connected to the server's 'listening' socket at the latter's well-known port number) would reject it. This datagram must be backed up by the ARQ mechanism and retransmitted in the event of loss. Note that if this datagram is indeed lost, the client might well time out and retransmit its original request message (the one carrying the file name). In this event, you must somehow ensure that the parent server does not mistake this retransmitted request for a new client coming in, and spawn off yet another child to handle it. How do you do that? It is potentially more involved than it might seem. I will be discussing this in class, as well as 'race' conditions that could potentially arise, depending on how you code the mechanisms I present.

11. When the client receives the datagram carrying the ephemeral port number of the server's 'connection' socket, it reconnects its socket to the server's 'connection' socket, using IPserver and the ephemeral port number received in the datagram (see p.254). It now uses this reconnected socket to send the server an acknowledgment. Note that this implies that, in the event of the server timing out, it should retransmit two copies of its 'ephemeral port number' message, one on its 'listening' socket and the other on its 'connection' socket (why?).

12. When the server receives the acknowledgment, it closes the 'listening' socket it inherited from its parent. The server can now commence the file transfer through its 'connection' socket. The net effect of all these binds and connects at server and client is that no 'outsider' UDP datagram (broadcast, multicast, unicast - fortuitously or maliciously) can now intrude on the communication between server and client.

13. Starting with the first datagram sent out, the client behaves as follows. Whenever a datagram arrives, or an ACK is about to be sent out (or, indeed, the initial datagram to the server giving the filename for the transfer), the client uses some random number generator function random() (initialized with the client.in argument value seed) to decide, with probability p (another client.in argument value), whether the datagram or ACK should be discarded, by way of simulating transmission loss across the network. (I will briefly discuss in class how you do this.)
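One straightforward way to realize this loss simulation, sketched in Python for brevity (the assignment itself would use C's random()):

```python
import random

random.seed(12345)   # seed value from client.in
P_LOSS = 0.1         # probability p from client.in

def simulate_loss(p=P_LOSS):
    """Return True if this datagram/ACK should be silently discarded."""
    return random.random() < p

def recv_with_loss(sock):
    data = sock.recv(512)
    if simulate_loss():
        return None   # pretend the datagram was lost in the network
    return data
```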
Adding reliability to UDP

The mechanisms you are to implement are based on TCP Reno. These include:

- Reliable data transmission using ARQ sliding windows, with Fast Retransmit.
- Flow control via receiver window advertisements.
- Congestion control that implements:
  - Slow Start
  - Congestion Avoidance ('Additive-Increase/Multiplicative-Decrease' - AIMD)
  - Fast Recovery (but without the window-inflation aspect of Fast Recovery)

Only some, and by no means all, of the details for these are covered below. The rest will be presented in class, especially those concerning flow control and TCP Reno's congestion control mechanisms in general: Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery.

Implement a timeout mechanism on the sender (server) side. This is available to you from Stevens, Section 22.5. Note, however, that you will need to modify the basic driving mechanism of Figure 22.7 appropriately, since the situation at the sender side is not a repetitive cycle of send-receive, but rather a straightforward progression of send-send-send-send-... Also, modify the RTT and RTO mechanisms of Section 22.5 as specified below. I will be discussing the details of these modifications, and the reasons for them, in class.

- Modify the function rtt_stop (Fig. 22.13) so that it uses integer arithmetic rather than floating point. This will entail also modifying some of the variable and function parameter declarations throughout Section 22.5 from float to int, as appropriate.
- In the unprrt.h header file (Fig. 22.10), set:
  - RTT_RXTMIN to 1000 msec (1 sec, instead of the current value of 3 sec)
  - RTT_RXTMAX to 3000 msec (3 sec, instead of the current value of 60 sec)
  - RTT_MAXNREXMT to 12 (instead of the current value of 3)
- In the function rtt_timeout (Fig. 22.14), after doubling the RTO in line 86, pass its value through the function rtt_minmax of Fig. 22.11 (somewhat along the lines of what is done in line 77 of rtt_stop, Fig. 22.13).

Finally, note that with the modification to integer calculation of the smoothed RTT and its variation, and given the small RTT values you will experience on the cs/sbpub network, these calculations should probably now be done on a millisecond or even microsecond scale (rather than in seconds, as is the case with Stevens' code). Otherwise, small measured RTTs could show up as 0 on a scale of seconds, yielding a negative result when we subtract the smoothed RTT from the measured RTT (line 72 of rtt_stop, Fig. 22.13). Report the details of your modifications to the code of Section 22.5 in the ReadMe file which you hand in with your code.

You need a sender sliding-window mechanism for the retransmission of lost datagrams, and a receiver sliding window to ensure correct sequencing of received file contents and to provide some measure of flow control. You should implement something based on TCP Reno's mechanisms, with cumulative acknowledgments, receiver window advertisements, and a congestion control mechanism I will explain in detail in class. For a reference on TCP's mechanisms generally, see W. Richard Stevens, TCP/IP Illustrated, Volume 1, especially Sections 20.2 - 20.4 of Chapter 20, and Sections 21.1 - 21.8 of Chapter 21. Bear in mind that our sequence numbers count datagrams, not bytes as in TCP. Remember that the sender and receiver window sizes have to be set according to the argument values in server.in and client.in, respectively.
Whenever the sender window becomes full and so 'locks', the server should print a message to that effect on stdout. Similarly, whenever the receiver window 'locks', the client should print a message on stdout. Be aware of the potential for deadlock when the receiver window 'locks'. This situation is handled by having the receiver process send a duplicate ACK, which acts as a window update when its window opens again (see Figure 20.3 and the discussion around it in TCP/IP Illustrated). However, this is not enough, because ACKs are not backed up by a timeout mechanism in the event they are lost. So you will also need to implement a persist timer driving window probes in the sender process (see Sections 22.1 & 22.2 in Chapter 22 of TCP/IP Illustrated). Note that you do not have to worry about the Silly Window Syndrome discussed in Section 22.3 of TCP/IP Illustrated, since the receiver process consumes 'full-sized' 512-byte messages from the receiver buffer (see the consumer thread below).

Report on the details of the ARQ mechanism you implemented in the ReadMe file you hand in. Indeed, you should report on all the TCP mechanisms you implemented in the ReadMe file, both the ones discussed here and the ones I will be discussing in class.

Make your datagram payload a fixed 512 bytes, inclusive of the file transfer protocol header (which must, at the very least, carry: the sequence number of the datagram; ACKs; and advertised-window notifications).

The client reads the file contents from its receive buffer and prints them out on stdout using a separate thread. This thread sits in a repetitive loop until all the file contents have been printed out, doing the following: it samples from an exponential distribution with mean µ milliseconds (read from the client.in file); sleeps for that number of milliseconds; wakes up to read and print all in-order file contents available in the receive buffer at that point; samples again from the exponential distribution; sleeps; and so on. The formula -1 × µ × ln(random()), where ln is the natural logarithm, yields variates from an exponential distribution with mean µ, based on the uniformly-distributed variates over (0, 1) returned by random().

Note that you will need to implement some sort of mutual exclusion/semaphore mechanism on the client side, so that the thread that sleeps and wakes up to consume from the receive buffer is not updating the state variables of the buffer at the same time as the main thread, which reads from the socket and deposits into the buffer. Furthermore, you need to ensure that the main thread does not effectively monopolize the semaphore (and thus lock out, for prolonged periods of time, the sleeping thread when the latter wakes up). See the textbook, Section 26.7, 'Mutexes: Mutual Exclusion', pp.697-701. You might also find Section 26.8, 'Condition Variables', pp.701-705, useful. A sketch of this consumer loop follows.
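A sketch of that consumer thread, in Python for brevity (the assignment's actual implementation would use pthreads in C; the buffer representation and mean value here are invented for illustration):

```python
import math, random, threading, time

buffer_lock = threading.Lock()
receive_buffer = []          # in-order payloads deposited by the main thread
transfer_done = False        # set by the main thread after the EOF marker

MU_MSEC = 250.0              # mean of the exponential distribution, from client.in

def consumer():
    while True:
        # Exponential variate with mean mu: -1 * mu * ln(random());
        # 1.0 - random.random() lies in (0, 1], which avoids ln(0)
        delay_ms = -1.0 * MU_MSEC * math.log(1.0 - random.random())
        time.sleep(delay_ms / 1000.0)
        with buffer_lock:    # mutual exclusion against the depositing thread
            while receive_buffer:
                print(receive_buffer.pop(0), end="")
            if transfer_done:
                return

threading.Thread(target=consumer).start()
```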
You will need to devise some way by which the sender can notify the receiver that it has sent the last datagram of the file transfer, without the receiver mistaking that EOF marker for part of the file contents. (Also, note that the last data segment could be a "short" segment of less than 512 bytes - your client needs to be able to handle this correctly somehow.) When the sender receives an ACK for the last datagram of the transfer, the (child) server terminates. The parent server has to take care of cleaning up zombie children. Note that if we want a clean closing, the client process cannot simply terminate when the receiver ACKs the last datagram. This ACK could be lost, which would leave the (child) server process 'hanging', timing out, and retransmitting the last datagram. TCP deals with this problem by means of the TIME_WAIT state. You should have your receiver process behave similarly, sticking around in something akin to a TIME_WAIT state in case it needs to retransmit the ACK. In the ReadMe file you hand in, report on how you dealt with the issues raised here: the sender notifying the receiver of the last datagram, clean closing, and so on.

Output

Some of the output required from your program has been described in the section Operation above. I expect you to provide further output - clear, well-structured, well-laid-out, concise but sufficient and helpful - in the client and server windows, by means of which we can trace the correct evolution of your TCP's behaviour in all its intricacies: information (e.g., sequence numbers) on datagrams and ACKs sent and dropped, window advertisements, datagram retransmissions (and why: dup ACKs or RTO); entering/exiting Slow Start and Congestion Avoidance, ssthresh and cwnd values; sender and receiver windows locking/unlocking; etc.

The onus is on you to convince us that the TCP mechanisms you implemented are working correctly. Too many students do not put sufficient thought, creative imagination, time, or effort into this. It is not the TA's nor my responsibility to sit staring at an essentially blank screen, trying to summon up our paranormal psychology skills to figure out whether your TCP implementation is really working correctly in all its very intricate aspects, simply because the transferred file seems to be printing o.k. in the client window. Nor is it our responsibility to strain our eyes and our patience wading through a mountain of obscure, ill-structured, hyper-messy, debugging-style output because, for example, your effort-conserving concept of what is 'suitable' is to dump your debugging output on us - relevant, irrelevant, and everything in between.
bfaloona / Indy
Treat logs as a data structure, providing you with the ability to collect given segments, find particular messages, or make assertions for correctness.
Nikkitaseth / ProjectAlpha
PYTHON CODE WALKTHROUGH

Data Sourcing

In order to run a discounted cash flow (DCF) model, we needed data, so we found a free API that provided us with everything we needed. We wrote code that saved every financial statement of every company in a separate text file: for every ticker, it pinged the API's URL, opened a text file for one of the company's financial statements, dumped all the data returned into that file, and closed it. This process was repeated for every company in our company list and every statement we have code for. By doing so, we were able to store the data for every company locally and did not need to ping the API every time we ran our code. Once all the financial data for each company was stored in the form of a balance sheet, income statement, cash flow statement, and company profile text file, we needed to pick out the specific items required for our DCF model. Thus, in utils.py we defined functions that select all the required items from the respective financial statements of each company and assign them to variables.

Discounted Cash Flow Model

First of all, we needed to import the functions defined in utils.py before defining the DCF model function, which would run for every company in our list. Next, we ensured we had 5 consecutive years of past data to compute the averages: the first few lines of code checked whether the last year on record was 2019, from which point we would go back 5 years; if the last year was 2018, that year would be taken as the first data entry from which we would go back 5 years. This check matters because companies file their 10-K, i.e. their annual report, at different times throughout the year, so some companies may already have filed their reports while others have not. After this step, five-year averages of every item's percentage of revenue were calculated, as well as the average revenue growth over the same period. These items included EBIT, depreciation & amortization, capital expenditures, and the change in net working capital. Once that was done, only three variables were missing before the free cash flows for the next few years could be calculated: a discount or hurdle rate; industry-specific perpetual growth rates; and a tax rate. After these three variables were set up, the next step was to calculate the free cash flows to the firm (FCFF) for the next 5 years and determine the terminal value at the end of the period using the growth rate for the corresponding industry. For the former, we use a loop to calculate the FCFF for each year, discount it, and add it to one variable called fcffpv. Once the terminal value was calculated, these two numbers together captured the enterprise value of the firm. Since we were interested in the equity value, we subtracted debt and added cash, which left us with the equity value. In one final step, we divided this value by the number of shares to end up with an intrinsic value per share (a sketch of this arithmetic follows below). After calculating the intrinsic value per share, we compared it to the current share price with two additions. First, we added a buffer to minimize our downside risk from inaccuracies in the calculations, called the margin of safety: the intrinsic value should be at least 115% of the current share price. We also set an upper limit at 130%, to ensure we would not include companies with extraordinarily high valuations compared to their current price. If the calculated share price fell within this window, we added its ticker to a dataframe, which was the last step in the function.
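A minimal sketch of that valuation arithmetic (the function and argument names are mine; in the real code the inputs come from the parsed financial statements):

    def intrinsic_value_per_share(fcff, discount_rate, terminal_growth,
                                  debt, cash, shares_outstanding):
        """Discount each projected FCFF, add a Gordon-growth terminal value,
        convert enterprise value to equity value, divide by share count."""
        fcffpv = sum(cf / (1 + discount_rate) ** t
                     for t, cf in enumerate(fcff, start=1))
        terminal_value = (fcff[-1] * (1 + terminal_growth)
                          / (discount_rate - terminal_growth))
        terminal_pv = terminal_value / (1 + discount_rate) ** len(fcff)
        equity_value = fcffpv + terminal_pv - debt + cash
        return equity_value / shares_outstanding

    # Example: five years of projected FCFF ($M), 9% hurdle, 2.5% growth.
    print(intrinsic_value_per_share([100, 108, 117, 126, 136], 0.09, 0.025,
                                    debt=300, cash=150, shares_outstanding=50))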
As such, the DCF function would run for every company and produce a dataframe with the tickers of all the companies that were undervalued at the time and fell within the 115%-130% range.

Portfolio Optimization

The dataframe with the tickers of all the undervalued companies has now become the portfolio, which we converted into a list and used as the source for the optimization that follows. Some general inputs for the rest of the code were the start and end dates of the data we requested for optimization, as well as the risk-free rate and the number of simulations we wanted to run our optimizations for. Now that the general framework has been created, it is time to choose some conditioning variables to measure the performance of an investment in one sector or across a combination of some or all sectors. Project Alpha uses the following conditioning variables to optimize its portfolios:

• Sharpe Ratio: This measures the performance of an investment compared to the risk-free asset, i.e. the 10-year Treasury bond, after adjusting for its risk factor, the standard deviation. The Sharpe ratio would be given a higher weight for investors who have a higher risk tolerance. In terms of code, we used the bt package to retrieve the data between the predetermined start and end dates for the companies in our ticker list. This data was then used to find the portfolio with the highest Sharpe ratio: random weights were assigned to each company and the ratio was computed. After running the predetermined number of simulations, the weights with the highest Sharpe ratio are located using loc() and labeled 'sharpe_portfolio', a dataframe containing the excess return, the volatility, the Sharpe ratio, and the weights for every company (a sketch of this simulation loop follows below). We also located the portfolio with the lowest volatility and put it in a dataframe called 'min_volatility_port', which has the same attributes. The rest of this segment simply created a picture of all the portfolios generated, displaying the efficient frontier and highlighting the portfolios with the highest Sharpe ratio and the lowest volatility.

• Value at Risk (VaR): VaR was chosen as a diagnostic tool to assess the model. In our case, it indicates the percentage of time in which a loss greater than 1% would occur over a period of 5 years. Its limitation is that although it measures how bad the best of the bad outcomes is, it does not measure how bad things can get, meaning the worst of the worst. As regards the code, we first requested the adjusted closing prices for the companies in our ticker list over the chosen time horizon. We then retrieved the weights from our Sharpe portfolio and set the number of days we wanted to simulate, as well as the cutoff, before calculating the returns of every company in every period (here: daily). Thereafter, we created a new variable called 'sigma', a copy of our return variable, to ensure the right format and type for our Monte Carlo loop. The simulation itself is straightforward: it measures in how many runs the returns fall within the 1% boundary or outside of it. We weighted the resulting returns by each company's weight in the portfolio, and whenever the portfolio return fell outside the set boundary, that run counted as a 'bad simulation'. Once that was done, the number of bad simulations was divided by the total number of simulations to arrive at the percentage of simulations that were bad, which equals our VaR (a counting sketch follows below).
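A sketch of the Sharpe simulation loop; the real code keeps every draw in a dataframe and picks the winner with loc(), whereas this stripped-down version, with hypothetical names, only tracks the best draw:

    import numpy as np

    def best_sharpe_portfolio(mean_returns, cov_matrix, risk_free,
                              n_sims=10_000, seed=0):
        """Draw random normalized weights, compute each portfolio's excess
        return, volatility, and Sharpe ratio, and keep the best draw."""
        rng = np.random.default_rng(seed)
        mean_returns = np.asarray(mean_returns)
        best_sharpe, best_weights = -np.inf, None
        for _ in range(n_sims):
            w = rng.random(len(mean_returns))
            w /= w.sum()                            # weights sum to 1
            excess = w @ mean_returns - risk_free
            vol = np.sqrt(w @ cov_matrix @ w)
            if excess / vol > best_sharpe:
                best_sharpe, best_weights = excess / vol, w
        return best_sharpe, best_weights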
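And a sketch of the VaR counting loop. This is one plausible reading of the description above, with hypothetical names; the actual code works through its 'sigma' copy of the returns:

    import numpy as np

    def simulated_var(returns, weights, n_days, cutoff=-0.01,
                      n_sims=1_000, seed=0):
        """Resample joint daily returns, weight them into portfolio
        returns, and count the runs with a loss beyond the cutoff."""
        rng = np.random.default_rng(seed)
        returns = np.asarray(returns)               # shape: (periods, tickers)
        bad = 0
        for _ in range(n_sims):
            days = rng.integers(0, len(returns), size=n_days)
            portfolio = returns[days] @ weights     # daily portfolio returns
            if (portfolio < cutoff).any():          # loss beyond the boundary
                bad += 1
        return bad / n_sims                         # fraction of 'bad' runs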
• Treynor Ratio: For investors who already hold a perfectly diversified portfolio and would like to add more assets to it, there would be a higher weight on the Treynor ratio. It uses beta as the risk factor instead of the standard deviation used in the Sharpe ratio, because beta carries the risk relative to the market, meaning only systematic or non-diversifiable risk. For the code, we first calculated the portfolio's beta. For that, we defined a function 'beta' that reads the beta of every company and returns it. The next step is a loop that enters the beta of every company in our ticker list into a new dataframe. After setting the index equal to the tickers and transposing the Sharpe portfolio weights, we can concatenate the two, resulting in two columns: the beta of every company and the corresponding weight in the portfolio. We then created a third column as the product of the first two. The sum of all entries in that column is the portfolio beta, which was then used as the denominator of the ratio. The numerator was already calculated as 'Excess Return' in the Sharpe portfolio.

• Sortino Ratio: The Sortino ratio measures only the downside risk (downside deviation or semi-deviation) by measuring returns against a minimum acceptable return, 𝜏. Surprisingly, much of the industry ignores the total number of periods and calculates the downside deviation using only the periods with downside risk, which produces misleading results. Project Alpha uses all the periods for this calculation, so as to have an advantage over the robo-advisors/financial advisors that do not follow this practice. The alpha in the future would be generated by going long on companies with a high correct Sortino and a low incorrect Sortino, as these are undervalued, and shorting those with a low correct Sortino and a high incorrect Sortino, as these are overvalued. The Sortino ratio would be given more weight for investors who are more risk-averse. This part of the code started by retrieving the data for our benchmark, the S&P 500, for the period and then calculating the average daily and annual returns. After that, we calculated the portfolio returns, 'returns["Returns"]', by summing the products of every company's weight times its return, which gave us the portfolio return for every period. From here, we calculated the downside risk by comparing the portfolio return in every period to the daily average return of our benchmark in a for loop. Before doing that, we defined a new variable called 'semi', a data series that is filled with whatever comes out of the loop each time. If the portfolio return minus the average daily return of the benchmark was greater than 0 – meaning the portfolio earned more than the S&P 500 average – the value for that period was set to 0 and added to the 'semi' series. If it was exactly 0, which is extremely unlikely, the entry would also be 0. If it was less than 0, however, which indicates underperformance, we squared the shortfall, which already gives us the semi-variance needed for the next step. From here, we can simply take the square root of the average of the 'semi' series to get the daily downside risk and multiply it by the square root of 252, which gives us the annual number. After that, we have all the numbers needed to calculate the Sortino ratio (a sketch follows below).
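A sketch of the all-periods downside deviation (hypothetical names; 252 trading days per year is the usual annualization assumption):

    import numpy as np

    def annual_downside_deviation(portfolio_returns, benchmark_daily_mean):
        """Semi-deviation over *all* periods: days below the benchmark
        average contribute their squared shortfall, all others contribute 0."""
        shortfall = np.asarray(portfolio_returns) - benchmark_daily_mean
        semi = np.where(shortfall < 0, shortfall ** 2, 0.0)
        return np.sqrt(semi.mean()) * np.sqrt(252)   # annualize daily figure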
• Information Ratio: The information ratio measures the portfolio returns against the returns of a benchmark index, i.e. the S&P 500, after adjusting for the additional risk taken. It looks only at the excess return of the portfolio over the benchmark and the volatility, or risk, associated with it. We already had all the inputs needed to calculate this ratio. Thus, we simply created a new dataframe with the portfolio returns and the benchmark returns of every period. To find the per-period excess returns, we simply subtracted the latter from the former and assigned the result to a new variable, which we called 'excess_return'. The numerator of the ratio is then the average return of the portfolio minus the average return of the benchmark, and the denominator is the standard deviation of the 'excess_return' series (a sketch follows below). Finally, we printed short sentences with the results for every conditioning variable just described as output in the console.
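A sketch of that calculation (hypothetical names; ddof=1 gives the sample standard deviation):

    import numpy as np

    def information_ratio(portfolio_returns, benchmark_returns):
        """Average excess return over the benchmark divided by the standard
        deviation of the per-period excess returns (the tracking error)."""
        excess = np.asarray(portfolio_returns) - np.asarray(benchmark_returns)
        return excess.mean() / excess.std(ddof=1)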
georgid / AlignmentEvaluation
Scripts for computing common lyrics-to-audio alignment evaluation metrics. Usable for evaluating any token-based alignment (e.g. where a token is a word, phrase, note, section, etc.). Used for the evaluation of the MIREX Lyrics-to-audio alignment challenge.
fgirbal / Segment Select Correct
Code release for "Segment, Select, Correct: A Framework for Weakly-Supervised Referring Segmentation"
priyamittal15 / Implementation Of Different Deep Learning Algorithms For Fracture Detection Image Classification
Using-Deep-Learning-Techniques-perform-Fracture-Detection-Image-Processing
Using different image processing techniques, implementing fracture detection on X-ray images on a dataset of 8,000+ images.

Description of the Project: Bones are the stiff organs that protect vital organs such as the brain, heart, and lungs in the human body. There are 206 bones in the human body, each of which has a different shape, size, and structure. The femur is the largest bone, and the auditory ossicles are the smallest. Humans suffer bone fractures on a regular basis; they can happen as a result of an accident or any other situation in which the bones are put under a lot of pressure. Oblique, complex, comminuted, spiral, greenstick, and transverse bone fractures are among the many forms that can occur. X-ray, computed tomography (CT), magnetic resonance imaging (MRI), ultrasound, and other medical imaging techniques are available to detect various types of disorders. So we design architectures using different neural network models, compare their accuracies, and determine which model works best for our dataset, with its 10 classes, and delivers correct results on comparable datasets. Our main motive is to check which model works better on our dataset, so that in the future we have an idea of which model gives better accuracy on a dataset of this kind.

Proposed Method for the Project: We decided to make this project because we have often seen computer-generated reports produce errors, so we wanted to find out which model gives good accuracy and produces fewer errors. We therefore researched image processing and the libraries used for it, such as Keras, Matplotlib, TensorFlow, and Keras's ImageDataGenerator, used some of them, and applied them to different image classification models: CNN, the VGG-16 model, the ResNet50 model, and the InceptionV3 model. We then find the model that gives the best accuracy; for that, we generate a classification report with metrics such as precision, recall, R² score, and mean squared error, imported from scikit-learn.

Methodology of the Project:
Phase 1: Requirement analysis:
• Study the concepts of basic Python programming.
• Study TensorFlow, Keras, and the Python API interface.
• Study the basic algorithms of image processing, neural networks, and deep learning.
• Collect the dataset from different resources and divide it into different classes (5 fractured + 5 non-fractured).
Phase 2: Designing and development: The stages of design and development are further segmented. This step starts with data from the requirement-and-analysis phase, which leads to the model construction phase, where a model is created and an algorithm is devised. After the algorithm design phase is completed, the focus shifts to algorithm analysis and implementation.
Phase 3: Coding phase: Before real coding begins, the task is divided into modules/units and assigned to team members once the system design papers are received. Because code is developed during this phase, it is the developers' primary focus, and it will be the most time-consuming part of the project.
This project's implementation begins with the development of a program in the relevant programming language and the production of an error-free executable program.
Phase 4: Testing phase: In the testing phase, we can test each model on the classification report it generates, which contains a variety of metrics such as accuracy, F1 score, precision, and recall; we can also test a model on its training and testing accuracy.
Phase 5: Deployment phase: One of our goals is to bring all of the previous steps together and put them into practice. Another goal is to deploy our model in a Python-based interface application after comparing the classification reports and determining which model is best for our dataset (a sketch of such a comparison loop is given below).
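A minimal sketch of such a comparison loop, assuming the images sit under data/train and data/test with one subfolder per class; paths, image size, and epoch count are placeholders rather than the project's actual settings:

    import numpy as np
    from sklearn.metrics import classification_report
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16, ResNet50, InceptionV3
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    gen = ImageDataGenerator(rescale=1.0 / 255)
    train = gen.flow_from_directory("data/train", target_size=(224, 224),
                                    class_mode="categorical")
    test = gen.flow_from_directory("data/test", target_size=(224, 224),
                                   class_mode="categorical", shuffle=False)

    def build(backbone):
        """Frozen ImageNet backbone with a fresh 10-way softmax head."""
        base = backbone(include_top=False, weights="imagenet",
                        input_shape=(224, 224, 3), pooling="avg")
        base.trainable = False
        model = models.Sequential([base,
                                   layers.Dense(10, activation="softmax")])
        model.compile(optimizer="adam", loss="categorical_crossentropy",
                      metrics=["accuracy"])
        return model

    for name, backbone in [("VGG16", VGG16), ("ResNet50", ResNet50),
                           ("InceptionV3", InceptionV3)]:
        model = build(backbone)
        model.fit(train, epochs=3, verbose=0)
        predictions = np.argmax(model.predict(test), axis=1)
        print(name)
        print(classification_report(test.classes, predictions))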
https-github-com-Rama24 / Peretesan
<xsd:schema xmlns="http://www.springframework.org/schema/mvc" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:beans="http://www.springframework.org/schema/beans" xmlns:tool="http://www.springframework.org/schema/tool" targetNamespace="http://www.springframework.org/schema/mvc" elementFormDefault="qualified" attributeFormDefault="unqualified"> <xsd:import namespace="http://www.springframework.org/schema/beans" schemaLocation="https://www.springframework.org/schema/beans/spring-beans-4.3.xsd"/> <xsd:import namespace="http://www.springframework.org/schema/tool" schemaLocation="https://www.springframework.org/schema/tool/spring-tool-4.3.xsd"/> <xsd:element name="annotation-driven"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter"> <![CDATA[ Configures the annotation-driven Spring MVC Controller programming model. Note that this tag works in Web MVC only, not in Portlet MVC! See org.springframework.web.servlet.config.annotation.EnableWebMvc javadoc for details on code-based alternatives to enabling annotation-driven Spring MVC support. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:all minOccurs="0"> <xsd:element name="path-matching" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configures the path matching part of the Spring MVC Controller programming model. Like annotation-driven, code-based alternatives are also documented in EnableWebMvc javadoc. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="suffix-pattern" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to use suffix pattern match (".*") when matching patterns to requests. If enabled a method mapped to "/users" also matches to "/users.*". The default value is true. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="trailing-slash" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to match to URLs irrespective of the presence of a trailing slash. If enabled a method mapped to "/users" also matches to "/users/". The default value is true. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="registered-suffixes-only" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether suffix pattern matching should work only against path extensions explicitly registered when you configure content negotiation. This is generally recommended to reduce ambiguity and to avoid issues such as when a "." appears in the path for other reasons. The default value is false. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="path-helper" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The bean name of the UrlPathHelper to use for resolution of lookup paths. Use this to override the default UrlPathHelper with a custom subclass, or to share common UrlPathHelper settings across multiple HandlerMappings and MethodNameResolvers. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="path-matcher" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The bean name of the PathMatcher implementation to use for matching URL paths against registered URL patterns. Default is AntPathMatcher.
]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="message-converters" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configures one or more HttpMessageConverter types to use for converting @RequestBody method parameters and @ResponseBody method return values. Using this configuration element is optional. HttpMessageConverter registrations provided here will take precedence over HttpMessageConverter types registered by default. Also see the register-defaults attribute if you want to turn off default registrations entirely. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:choice maxOccurs="unbounded"> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation> <![CDATA[ An HttpMessageConverter bean definition. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation> <![CDATA[ A reference to an HttpMessageConverter bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:sequence> <xsd:attribute name="register-defaults" type="xsd:boolean" default="true"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether or not default HttpMessageConverter registrations should be added in addition to the ones provided within this element. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="argument-resolvers" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configures HandlerMethodArgumentResolver types to support custom controller method argument types. Using this option does not override the built-in support for resolving handler method arguments. To customize the built-in support for argument resolution configure RequestMappingHandlerAdapter directly. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:choice minOccurs="1" maxOccurs="unbounded"> <xsd:element ref="beans:bean" minOccurs="0" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ The HandlerMethodArgumentResolver (or WebArgumentResolver for backwards compatibility) bean definition. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref" minOccurs="0" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ A reference to a HandlerMethodArgumentResolver bean definition. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.web.method.support.HandlerMethodArgumentResolver"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:complexType> </xsd:element> <xsd:element name="return-value-handlers" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configures HandlerMethodReturnValueHandler types to support custom controller method return value handling. Using this option does not override the built-in support for handling return values. To customize the built-in support for handling return values configure RequestMappingHandlerAdapter directly. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:choice minOccurs="1" maxOccurs="unbounded"> <xsd:element ref="beans:bean" minOccurs="0" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ The HandlerMethodReturnValueHandler bean definition. 
]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref" minOccurs="0" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ A reference to a HandlerMethodReturnValueHandler bean definition. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.web.method.support.HandlerMethodReturnValueHandler"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:complexType> </xsd:element> <xsd:element name="async-support" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure options for asynchronous request processing. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:all minOccurs="0"> <xsd:element name="callable-interceptors" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ The ordered set of interceptors that intercept the lifecycle of concurrently executed requests, which start after a controller returns a java.util.concurrent.Callable. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element ref="beans:bean" minOccurs="1" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ Registers a CallableProcessingInterceptor. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="deferred-result-interceptors" minOccurs="0"> <xsd:annotation> <xsd:documentation> <![CDATA[ The ordered set of interceptors that intercept the lifecycle of concurrently executed requests, which start after a controller returns a DeferredResult. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element ref="beans:bean" minOccurs="1" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ Registers a DeferredResultProcessingInterceptor. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:all> <xsd:attribute name="task-executor" type="xsd:string"> <xsd:annotation> <xsd:documentation source="java:org.springframework.core.task.AsyncTaskExecutor"> <![CDATA[ The bean name of a default AsyncTaskExecutor to use when a controller method returns a {@link Callable}. Controller methods can override this default on a per-request basis by returning an AsyncTask. By default, a SimpleAsyncTaskExecutor is used which does not re-use threads and is not recommended for production. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.core.task.AsyncTaskExecutor"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> <xsd:attribute name="default-timeout" type="xsd:long"> <xsd:annotation> <xsd:documentation> <![CDATA[ Specify the amount of time, in milliseconds, before asynchronous request handling times out. In Servlet 3, the timeout begins after the main request processing thread has exited and ends when the request is dispatched again for further processing of the concurrently produced result. If this value is not set, the default timeout of the underlying implementation is used, e.g. 10 seconds on Tomcat with Servlet 3. 
]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> </xsd:all> <xsd:attribute name="conversion-service" type="xsd:string"> <xsd:annotation> <xsd:documentation source="java:org.springframework.core.convert.ConversionService"> <![CDATA[ The bean name of the ConversionService that is to be used for type conversion during field binding. This attribute is not required, and only needs to be specified if custom converters need to be configured. If not specified, a default FormattingConversionService is registered with converters to/from common value types. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.core.convert.ConversionService"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> <xsd:attribute name="validator" type="xsd:string"> <xsd:annotation> <xsd:documentation source="java:org.springframework.validation.Validator"> <![CDATA[ The bean name of the Validator that is to be used to validate Controller model objects. This attribute is not required, and only needs to be specified if a custom Validator needs to be configured. If not specified, JSR-303 validation will be installed if a JSR-303 provider is present on the classpath. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.validation.Validator"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> <xsd:attribute name="content-negotiation-manager" type="xsd:string"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.accept.ContentNegotiationManager"> <![CDATA[ The bean name of a ContentNegotiationManager that is to be used to determine requested media types. If not specified, a default ContentNegotiationManager is configured that checks the request path extension first and the "Accept" header second where path extensions such as ".json", ".xml", ".atom", and ".rss" are recognized if Jackson, JAXB2, or the Rome libraries are available. As a fallback option, the path extension is also used to perform a lookup through the ServletContext and the Java Activation Framework (if available). ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.web.accept.ContentNegotiationManager"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> <xsd:attribute name="message-codes-resolver" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The bean name of a MessageCodesResolver to use to build message codes from data binding and validation error codes. This attribute is not required. If not specified the DefaultMessageCodesResolver is used. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.validation.MessageCodesResolver"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> <xsd:attribute name="enable-matrix-variables" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Matrix variables can appear in any path segment, each matrix variable separated with a ";" (semicolon). For example "/cars;color=red;year=2012". By default, they're removed from the URL. If this property is set to true, matrix variables are not removed from the URL, and the request mapping pattern must use URI variable in path segments where matrix variables are expected. For example "/{cars}". 
Matrix variables can then be injected into a controller method with @MatrixVariable. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="ignore-default-model-on-redirect" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ By default, the content of the "default" model is used both during rendering and redirect scenarios. Alternatively a controller method can declare a RedirectAttributes argument and use it to provide attributes for a redirect. Setting this flag to true ensures the "default" model is never used in a redirect scenario even if a RedirectAttributes argument is not declared. Setting it to false means the "default" model may be used in a redirect if the controller method doesn't declare a RedirectAttributes argument. The default setting is false but new applications should consider setting it to true. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:complexType name="content-version-strategy"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ContentVersionStrategy"> <![CDATA[ A VersionStrategy that calculates an Hex MD5 hashes from the content of the resource and appends it to the file name, e.g. "styles/main-e36d2e05253c6c7085a91522ce43a0b4.css". ]]> </xsd:documentation> </xsd:annotation> <xsd:attribute name="patterns" type="xsd:string" use="required"/> </xsd:complexType> <xsd:complexType name="fixed-version-strategy"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.FixedVersionStrategy"> <![CDATA[ A VersionStrategy that relies on a fixed version applied as a request path prefix, e.g. reduced SHA, version name, release date, etc. ]]> </xsd:documentation> </xsd:annotation> <xsd:attribute name="version" type="xsd:string" use="required"/> <xsd:attribute name="patterns" type="xsd:string" use="required"/> </xsd:complexType> <xsd:complexType name="resource-version-strategy"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.VersionStrategy"> <![CDATA[ A strategy for extracting and embedding a resource version in its URL path. ]]> </xsd:documentation> </xsd:annotation> <xsd:choice minOccurs="1" maxOccurs="1"> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.VersionStrategy"> <![CDATA[ A VersionStrategy bean definition. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.VersionStrategy"> <![CDATA[ A reference to a VersionStrategy bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> <xsd:attribute name="patterns" type="xsd:string" use="required"/> </xsd:complexType> <xsd:complexType name="version-resolver"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.VersionResourceResolver"> <![CDATA[ Resolves request paths containing a version string that can be used as part of an HTTP caching strategy in which a resource is cached with a far future date (e.g. 1 year) and cached until the version, and therefore the URL, is changed. 
]]> </xsd:documentation> </xsd:annotation> <xsd:choice maxOccurs="unbounded"> <xsd:element type="content-version-strategy" name="content-version-strategy"/> <xsd:element type="fixed-version-strategy" name="fixed-version-strategy"/> <xsd:element type="resource-version-strategy" name="version-strategy"/> </xsd:choice> </xsd:complexType> <xsd:complexType name="resource-resolvers"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ResourceResolver"> <![CDATA[ A list of ResourceResolver beans definition and references. A ResourceResolver provides mechanisms for resolving an incoming request to an actual Resource and for obtaining the public URL path that clients should use when requesting the resource. ]]> </xsd:documentation> </xsd:annotation> <xsd:sequence> <xsd:choice maxOccurs="unbounded"> <xsd:element type="version-resolver" name="version-resolver"/> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ResourceResolver"> <![CDATA[ A ResourceResolver bean definition. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ResourceResolver"> <![CDATA[ A reference to a ResourceResolver bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:sequence> </xsd:complexType> <xsd:complexType name="resource-transformers"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ResourceTransformer"> <![CDATA[ A list of ResourceTransformer beans definition and references. A ResourceTransformer provides mechanisms for transforming the content of a resource. ]]> </xsd:documentation> </xsd:annotation> <xsd:sequence> <xsd:choice maxOccurs="unbounded"> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ResourceTransformer"> <![CDATA[ A ResourceTransformer bean definition. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.resource.ResourceTransformer"> <![CDATA[ A reference to a ResourceTransformer bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:sequence> </xsd:complexType> <xsd:complexType name="resource-chain"> <xsd:annotation> <xsd:documentation source="org.springframework.web.servlet.config.annotation.ResourceChainRegistration"> <![CDATA[ Assists with the registration of resource resolvers and transformers. Unless set to "false", the auto-registration adds default Resolvers (a PathResourceResolver) and Transformers (CssLinkResourceTransformer, if a VersionResourceResolver has been manually registered). The resource-cache attribute sets whether to cache the result of resource resolution/transformation; setting this to "true" is recommended for production (and "false" for development). A custom Cache can be configured if a CacheManager is provided as a bean reference in the "cache-manager" attribute, and the cache name provided in the "cache-name" attribute. 
]]> </xsd:documentation> </xsd:annotation> <xsd:sequence> <xsd:element name="resolvers" type="resource-resolvers" minOccurs="0" maxOccurs="1"/> <xsd:element name="transformers" type="resource-transformers" minOccurs="0" maxOccurs="1"/> </xsd:sequence> <xsd:attribute name="resource-cache" type="xsd:boolean" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether the resource chain should cache resource resolution. Note that the resource content itself won't be cached, but rather Resource instances. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="auto-registration" type="xsd:boolean" default="true" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to register automatically ResourceResolvers and ResourceTransformers. Setting this property to "false" means that it gives developers full control over the registration process. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-manager" type="xsd:string" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ The name of the Cache Manager to cache resource resolution. By default, a ConcurrentCacheMap will be used. Since Resources aren't serializable and can be dependent on the application host, one should not use a distributed cache but rather an in-memory cache. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-name" type="xsd:string" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ The cache name to use in the configured cache manager. Will use "spring-resource-chain-cache" by default. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> <xsd:complexType name="cache-control"> <xsd:annotation> <xsd:documentation source="org.springframework.web.cache.CacheControl"> <![CDATA[ Generates "Cache-Control" HTTP response headers. ]]> </xsd:documentation> </xsd:annotation> <xsd:attribute name="must-revalidate" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "must-revalidate" directive in the Cache-Control header. This indicates that caches should revalidate the cached response when it's become stale. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="no-cache" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "no-cache" directive in the Cache-Control header. This indicates that caches should always revalidate cached response with the server. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="no-store" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "no-store" directive in the Cache-Control header. This indicates that caches should never cache the response. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="no-transform" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "no-transform" directive in the Cache-Control header. This indicates that caches should never transform (i.e. compress, optimize) the response content. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-public" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "public" directive in the Cache-Control header. This indicates that any cache MAY store the response. 
]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-private" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "private" directive in the Cache-Control header. This indicates that the response is intended for a single user and may not be stored by shared caches. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="proxy-revalidate" type="xsd:boolean" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "proxy-revalidate" directive in the Cache-Control header. This directive has the same meaning as the "must-revalidate" directive, except it only applies to shared caches. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="max-age" type="xsd:int" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "max-age" directive in the Cache-Control header. This indicates that the response should be cached for the given number of seconds. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="s-maxage" type="xsd:int" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "s-maxage" directive in the Cache-Control header. This directive has the same meaning as the "max-age" directive, except it only applies to shared caches. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="stale-while-revalidate" type="xsd:int" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "stale-while-revalidate" directive in the Cache-Control header. This indicates that caches may serve the response after it becomes stale up to the given number of seconds. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="stale-if-error" type="xsd:int" use="optional"> <xsd:annotation> <xsd:documentation> <![CDATA[ Adds a "stale-if-error" directive in the Cache-Control header. When an error is encountered, a cached stale response may be used for the given number of seconds. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> <xsd:element name="resources"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.resource.ResourceHttpRequestHandler"> <![CDATA[ Configures a handler for serving static resources such as images, js, and, css files with cache headers optimized for efficient loading in a web browser. Allows resources to be served out of any path that is reachable via Spring's Resource handling. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="cache-control" type="cache-control" minOccurs="0" maxOccurs="1"/> <xsd:element name="resource-chain" type="resource-chain" minOccurs="0" maxOccurs="1"/> </xsd:sequence> <xsd:attribute name="mapping" use="required" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The URL mapping pattern within the current Servlet context to use for serving resources from this handler, such as "/resources/**" ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="location" use="required" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The resource location from which to serve static content, specified at a Spring Resource pattern. Each location must point to a valid directory. Multiple locations may be specified as a comma-separated list, and the locations will be checked for a given resource in the order specified. 
For example, a value of "/, classpath:/META-INF/public-web-resources/" will allow resources to be served both from the web app root and from any JAR on the classpath that contains a /META-INF/public-web-resources/ directory, with resources in the web app root taking precedence. For URL-based resources (e.g. files, HTTP URLs, etc) this property supports a special prefix to indicate the charset associated with the URL so that relative paths appended to it can be encoded correctly, e.g. "[charset=Windows-31J]https://example.org/path". ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-period" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Specifies the cache period for the resources served by this resource handler, in seconds. The default is to not send any cache headers but rather to rely on last-modified timestamps only. Set this to 0 in order to send cache headers that prevent caching, or to a positive number of seconds in order to send cache headers with the given max-age value. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="order" type="xsd:token"> <xsd:annotation> <xsd:documentation> <![CDATA[ Specifies the order of the HandlerMapping for the resource handler. The default order is Ordered.LOWEST_PRECEDENCE - 1. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="default-servlet-handler"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.resource.DefaultServletHttpRequestHandler"> <![CDATA[ Configures a handler for serving static resources by forwarding to the Servlet container's default Servlet. Use of this handler allows using a "/" mapping with the DispatcherServlet while still utilizing the Servlet container to serve static resources. This handler will forward all requests to the default Servlet. Therefore it is important that it remains last in the order of all other URL HandlerMappings. That will be the case if you use the "annotation-driven" element or alternatively if you are setting up your customized HandlerMapping instance be sure to set its "order" property to a value lower than that of the DefaultServletHttpRequestHandler, which is Integer.MAX_VALUE. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="default-servlet-name" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The name of the default Servlet to forward to for static resource requests. The handler will try to autodetect the container's default Servlet at startup time using a list of known names. If the default Servlet cannot be detected because of using an unknown container or because it has been manually configured, the servlet name must be set explicitly. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="interceptors"> <xsd:annotation> <xsd:documentation> <![CDATA[ The ordered set of interceptors that intercept HTTP Servlet Requests handled by Controllers. Interceptors allow requests to be pre/post processed before/after handling. Each interceptor must implement the org.springframework.web.servlet.HandlerInterceptor or org.springframework.web.context.request.WebRequestInterceptor interface. The interceptors in this set are automatically detected by every registered HandlerMapping. The URI paths each interceptor applies to are configurable. 
]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:choice maxOccurs="unbounded"> <xsd:choice> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Registers an interceptor that intercepts every request regardless of its URI path.. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation> <![CDATA[ Registers an interceptor that intercepts every request regardless of its URI path.. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> <xsd:element name="interceptor"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.handler.MappedInterceptor"> <![CDATA[ Registers an interceptor that interceptors requests sent to one or more URI paths. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="mapping" maxOccurs="unbounded"> <xsd:complexType> <xsd:attribute name="path" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ A path into the application intercepted by this interceptor. Exact path mapping URIs (such as "/myPath") are supported as well as Ant-stype path patterns (such as /myPath/**). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="exclude-mapping" minOccurs="0" maxOccurs="unbounded"> <xsd:complexType> <xsd:attribute name="path" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ A path into the application that should not be intercepted by this interceptor. Exact path mapping URIs (such as "/admin") are supported as well as Ant-stype path patterns (such as /admin/**). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:choice> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation> <![CDATA[ The interceptor's bean definition. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation> <![CDATA[ A reference to an interceptor bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:choice> <xsd:attribute name="path-matcher" type="xsd:string"> <xsd:annotation> <xsd:documentation source="java:org.springframework.util.PathMatcher"> <![CDATA[ The bean name of a PathMatcher implementation to use with nested interceptors. This is an optional, advanced property required only if using custom PathMatcher implementations that support mapping metadata other than the Ant path patterns supported by default. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:org.springframework.util.PathMatcher"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="view-controller"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.mvc.ParameterizableViewController"> <![CDATA[ Map a simple (logic-less) view controller to a specific URL path (or pattern) in order to render a response with a pre-configured status code and view. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="path" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The URL path (or pattern) the controller is mapped to. 
]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="view-name" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the view name to return. Optional. If not specified, the view controller will return null as the view name in which case the configured RequestToViewNameTranslator will select the view name. The DefaultRequestToViewNameTranslator for example translates "/foo/bar" to "foo/bar". ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="status-code" type="xsd:int"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the status code to set on the response. Optional. If not set the response status will be 200 (OK). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="redirect-view-controller"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.mvc.ParameterizableViewController"> <![CDATA[ Map a simple (logic-less) view controller to the given URL path (or pattern) in order to redirect to another URL. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="path" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The URL path (or pattern) the controller is mapped to. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="redirect-url" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ By default, the redirect URL is expected to be relative to the current ServletContext, i.e. as relative to the web application root. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="status-code" type="xsd:int"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the specific redirect 3xx status code to use. If not set, org.springframework.web.servlet.view.RedirectView will select MOVED_TEMPORARILY (302) by default. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="context-relative" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to interpret a given redirect URL that starts with a slash ("/") as relative to the current ServletContext, i.e. as relative to the web application root. The default is "true". ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="keep-query-params" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to propagate the query parameters of the current request through to the target redirect URL. The default is "false". ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="status-controller"> <xsd:annotation> <xsd:documentation source="java:org.springframework.web.servlet.mvc.ParameterizableViewController"> <![CDATA[ Map a simple (logic-less) controller to the given URL path (or pattern) in order to sets the response status to the given code without rendering a body. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="path" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The URL path (or pattern) the controller is mapped to. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="status-code" type="xsd:int" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The status code to set on the response. 
]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:complexType name="contentNegotiationType"> <xsd:all> <xsd:element name="default-views" minOccurs="0"> <xsd:complexType> <xsd:sequence> <xsd:choice maxOccurs="unbounded"> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation> <![CDATA[ A bean definition for an org.springframework.web.servlet.View class. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation> <![CDATA[ A reference to a bean for an org.springframework.web.servlet.View class. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:all> <xsd:attribute name="use-not-acceptable" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Indicate whether a 406 Not Acceptable status code should be returned if no suitable view can be found. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> <xsd:complexType name="urlViewResolverType"> <xsd:attribute name="prefix" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The prefix that gets prepended to view names when building a URL. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="suffix" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The suffix that gets appended to view names when building a URL. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-views" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Enable or disable thew caching of resolved views. Default is "true": caching is enabled. Disable this only for debugging and development. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="view-class" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The view class that should be used to create views. Configure this if you want to provide a custom View implementation, typically a ub-class of the expected View type. ]]> </xsd:documentation> <xsd:appinfo> <tool:annotation kind="ref"> <tool:expected-type type="java:java.lang.Class"/> </tool:annotation> </xsd:appinfo> </xsd:annotation> </xsd:attribute> <xsd:attribute name="view-names" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the view names (or name patterns) that can be handled by this view resolver. View names can contain simple wildcards such that 'my*', '*Report' and '*Repo*' will all match the view name 'myReport'. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> <xsd:element name="view-resolvers"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure a chain of ViewResolver instances to resolve view names returned from controllers into actual view instances to use for rendering. All registered resolvers are wrapped in a single (composite) ViewResolver with its order property set to 0 so that other external resolvers may be ordere ]]> <![CDATA[ d before or after it. When content negotiation is enabled the order property is set to highest priority instead with the ContentNegotiatingViewResolver encapsulating all other registered view resolver instances. That way the resolvers registered through the MVC namespace form self-encapsulated resolver chain. 
]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:choice minOccurs="1" maxOccurs="unbounded"> <xsd:element name="content-negotiation" type="contentNegotiationType"> <xsd:annotation> <xsd:documentation> <![CDATA[ Registers a ContentNegotiatingViewResolver with the list of all other registered ViewResolver instances used to set its "viewResolvers" property. See the javadoc of ContentNegotiatingViewResolver for more details. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element name="jsp" type="urlViewResolverType"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register an InternalResourceViewResolver bean for JSP rendering. By default, "/WEB-INF/" is registered as a view name prefix and ".jsp" as a suffix. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element name="tiles" type="urlViewResolverType"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a TilesViewResolver based on Tiles 3.x. To configure Tiles you must also add a top-level <mvc:tiles-configurer> element or declare a TilesConfigurer bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element name="freemarker" type="urlViewResolverType"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a FreeMarkerViewResolver. By default, ".ftl" is configured as a view name suffix. To configure FreeMarker you must also add a top-level <mvc:freemarker-configurer> element or declare a FreeMarkerConfigurer bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element name="groovy" type="urlViewResolverType"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a GroovyMarkupViewResolver. By default, ".tpl" is configured as a view name suffix. To configure the Groovy markup template engine you must also add a top-level <mvc:groovy-configurer> element or declare a GroovyMarkupConfigurer bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element name="script-template" type="urlViewResolverType"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a ScriptTemplateViewResolver. To configure the Script engine you must also add a top-level <mvc:script-template-configurer> element or declare a ScriptTemplateConfigurer bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element name="bean-name" maxOccurs="1"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a BeanNameViewResolver bean. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:bean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a ViewResolver as a direct bean declaration. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> <xsd:element ref="beans:ref"> <xsd:annotation> <xsd:documentation> <![CDATA[ Register a ViewResolver through references to an existing bean declaration. ]]> </xsd:documentation> </xsd:annotation> </xsd:element> </xsd:choice> <xsd:attribute name="order" type="xsd:int"> <xsd:annotation> <xsd:documentation> <![CDATA[ ViewResolver's registered through this element are encapsulated in an instance of org.springframework.web.servlet.view.ViewResolverComposite and follow the order of registration. 
This attribute determines the order of the ViewResolverComposite itself relative to any additional ViewResolvers (not registered through this element) present in the Spring configuration. By default this property is not set, which means the resolver is ordered at Ordered.LOWEST_PRECEDENCE unless content negotiation is enabled, in which case the order (if not set explicitly) is changed to Ordered.HIGHEST_PRECEDENCE. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="tiles-configurer"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure Tiles 3.x by registering a TilesConfigurer bean. This is a shortcut alternative to declaring a TilesConfigurer bean directly. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="definitions" minOccurs="0" maxOccurs="unbounded"> <xsd:complexType> <xsd:attribute name="location" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The location of a file containing Tiles definitions (or a Spring resource pattern). If no Tiles definitions are registered, then "/WEB-INF/tiles.xml" is expected to exist. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> </xsd:sequence> <xsd:attribute name="check-refresh" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to check Tiles definition files for a refresh at runtime. Default is "false". ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="validate-definitions" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether to validate the Tiles XML definitions. Default is "true". ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="definitions-factory" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The Tiles DefinitionsFactory class to use. Default is Tiles' default. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="preparer-factory" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The Tiles PreparerFactory class to use. Default is Tiles' default. Consider "org.springframework.web.servlet.view.tiles3.SimpleSpringPreparerFactory" or "org.springframework.web.servlet.view.tiles3.SpringBeanPreparerFactory" (see javadoc). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="freemarker-configurer"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure FreeMarker for view resolution by registering a FreeMarkerConfigurer bean. This is a shortcut alternative to declaring a FreeMarkerConfigurer bean directly. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="template-loader-path" minOccurs="0" maxOccurs="unbounded"> <xsd:complexType> <xsd:attribute name="location" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The location of a FreeMarker template loader path (or a Spring resource pattern). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="groovy-configurer"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure the Groovy markup template engine for view resolution by registering a GroovyMarkupConfigurer bean. This is a shortcut alternative to declaring a GroovyMarkupConfigurer bean directly.
]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="auto-indent" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether you want the template engine to render indents automatically. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="cache-templates" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ If enabled, templates are compiled once for each source (URL or File). It is recommended to keep this flag set to true unless you are in development mode and want automatic reloading of templates. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="resource-loader-path" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The Groovy markup template engine resource loader path via a Spring resource location. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="script-template-configurer"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure the script engine for view resolution by registering a ScriptTemplateConfigurer bean. This is a shortcut alternative to declaring a ScriptTemplateConfigurer bean directly. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="script" minOccurs="0" maxOccurs="unbounded"> <xsd:complexType> <xsd:attribute name="location" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The location of the script to be loaded by the script engine (library or user provided). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> </xsd:sequence> <xsd:attribute name="engine-name" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ The name of the script engine to be used by the view. The script engine must implement Invocable. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="render-object" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The object to which the render function belongs. For example, in order to call Mustache.render(), renderObject should be set to Mustache and renderFunction to render. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="render-function" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the render function name. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="content-type" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the content type to use for the response (text/html by default). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="charset" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Set the charset used to read script and template files (UTF-8 by default). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="resource-loader-path" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ The script engine resource loader path via a Spring resource location. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="shared-engine" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ When set to false, use thread-local ScriptEngine instances instead of one single shared instance.
This flag should be set to false for those using non-thread-safe script engines with templating libraries not designed for concurrency, like Handlebars or React running on Nashorn for example. In this case, Java 8u60 or greater is required due to this bug: https://bugs.openjdk.java.net/browse/JDK-8076099. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> <xsd:element name="cors"> <xsd:annotation> <xsd:documentation> <![CDATA[ Configure cross origin requests processing. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="mapping" minOccurs="1" maxOccurs="unbounded"> <xsd:annotation> <xsd:documentation> <![CDATA[ Enable cross origin requests processing on the specified path pattern. By default, all origins, the GET, HEAD and POST methods, all headers and credentials are allowed, and the max age is set to 30 minutes. ]]> </xsd:documentation> </xsd:annotation> <xsd:complexType> <xsd:attribute name="path" type="xsd:string" use="required"> <xsd:annotation> <xsd:documentation> <![CDATA[ A path into the application that should handle CORS requests. Exact path mapping URIs (such as "/admin") are supported as well as Ant-style path patterns (such as /admin/**). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="allowed-origins" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Comma-separated list of origins to allow, e.g. "https://domain1.com, https://domain2.com". The special value "*" allows all domains (default). Note that CORS checks use values from "Forwarded" (RFC 7239), "X-Forwarded-Host", "X-Forwarded-Port", and "X-Forwarded-Proto" headers, if present, in order to reflect the client-originated address. Consider using the ForwardedHeaderFilter in order to choose from a central place whether to extract and use such headers, or whether to discard them. See the Spring Framework reference for more on this filter. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="allowed-methods" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Comma-separated list of HTTP methods to allow, e.g. "GET, POST". The special value "*" allows all methods. By default GET, HEAD and POST methods are allowed. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="allowed-headers" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Comma-separated list of headers that a pre-flight request can list as allowed for use during an actual request. The special value of "*" allows actual requests to send any header (default). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="exposed-headers" type="xsd:string"> <xsd:annotation> <xsd:documentation> <![CDATA[ Comma-separated list of response headers other than simple headers (i.e. Cache-Control, Content-Language, Content-Type, Expires, Last-Modified, Pragma) that an actual response might have and can be exposed. Empty by default. ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="allow-credentials" type="xsd:boolean"> <xsd:annotation> <xsd:documentation> <![CDATA[ Whether user credentials are supported (true by default). ]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> <xsd:attribute name="max-age" type="xsd:long"> <xsd:annotation> <xsd:documentation> <![CDATA[ How long, in seconds, the response from a pre-flight request can be cached by clients. 1800 seconds (30 minutes) by default.
]]> </xsd:documentation> </xsd:annotation> </xsd:attribute> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:schema>
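For orientation, here is a minimal, hypothetical configuration snippet using the elements and attributes this schema defines; the paths, origin, and prefix/suffix values are illustrative assumptions, not taken from the schema itself:

```xml
<!-- Chain of view resolvers, wrapped in a ContentNegotiatingViewResolver -->
<mvc:view-resolvers>
    <mvc:content-negotiation use-not-acceptable="true"/>
    <!-- InternalResourceViewResolver; defaults would be "/WEB-INF/" and ".jsp" -->
    <mvc:jsp prefix="/WEB-INF/views/" suffix=".jsp"/>
    <mvc:bean-name/>
</mvc:view-resolvers>

<!-- Cross-origin request processing for one path pattern -->
<mvc:cors>
    <mvc:mapping path="/api/**"
                 allowed-origins="https://domain1.com"
                 allowed-methods="GET, POST"
                 max-age="3600"/>
</mvc:cors>
```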
ZulqarnainZilli / 9 Email Marketing Tips For Content Marketers9 Email Marketing Tips For Content Marketers Even "agnostics" of email marketing can't argue with the following evidence - the average ROI from this promotional practice is close to 3,800%. Vast opportunities to scale and relatively low cost compared to other outreach channels are the two reasons why email marketing is favored by businesses. However, this is not about price and reach alone. The chief advantage is better alignment of communication with customers. If you hope a certain content strategy will bring the desired results, overlooking the quality of your mailing messages will be a sorry pitfall. Always keep in mind that newsletters, welcome, retention, and other emails are not just a brand's facade, but a powerful tool for generating conversions. By aligning your email and content strategies, you can get synergy from both. In this guide, we'll cover a few recommendations for content marketers on how to write email messages that work. Tips for email marketing Segment your list Split the batch of email recipients into smaller groups based on chosen criteria, and mail distinct, relevant messages to each. You can use recipients' geography, demographic characteristics, or purchase history to distinguish homogeneous clusters and proceed with content planning. Segmentation is the basic premise for personalization, and if you still doubt whether to bother with the latter, here are just a few numbers we took from Instapage: 52% of customers claim they do care whether a message was tailor-made or not; 82% of marketers say that mail personalization increases the open rate; custom emails get 41% more unique clicks than mass-produced ones. To avoid a fragmented approach, use data from CRMs, website analytics tools, and other sources to define segments. Concerning phrasing, a good idea is to create buyer persona profiles, so you can choose the appropriate message length and wording. Say you design a newsletter to promote a paid subscription for an email validator service. You've decided to distinguish corporate clients based on company size and determined the following groups: #1 - B2Bs and #2 - sole entrepreneurs. Possible messages for the two: #1. Our "XXL" plan is perfect for agencies and enterprises. You can add unlimited users and conduct up to 100,000 checks per month. #2. With our "S" plan you get 1,000 credits and 5,000 unique recipients - for only $33 per month. Plus a 7-day free trial. Use interactive content The best content marketers know that interactive content came into vogue a long time ago. As for emails, here are the most common examples: CSS animated buttons If you include CTA buttons (which we hope you do), liven them up a bit. Add an animated hover effect, so that every time a recipient puts the cursor on a button, it changes shape, shade, color, or text. “Add hover to emphasise objects” This doesn't have to be anything dramatic - add tiny accents that will still grab the user's attention. star ratings “Add a star rating component to engage readers with content” Including a ranking or review widget in the email body is one of the most effective ways to engage the reader with the message. Ask recipients to rate your product or service with stars. Add a link to Google Forms if you want to receive an extended opinion on overall customer satisfaction.
pictures' rollovers “Use animated images to describe goods better”, source The effect is eagerly used by the ones who promote online stores. Using The rollover allows to show goods from different angles or even play with recipients, if relevant. Take into account that this feature only works on desktops - mobile mail users will see the very first picture only. images carousel “Add pieces of text directly on images”, source If you want to enhance goods cards with descriptive content, say - price and shipping details, use a carousel instead of a rollover. As so, you can add more info pictures to the email body and, hopefully, convert more recipients into customers. a countdown “Countdowns work well for limited in time offers”, source Again, this type of interactive content fits the online shopping niche. Animated clocks amplify urgency and theoretically increase conversions. But it's important to stay extremely careful and not to sound desperate - otherwise, the newsletter will end up in the recipient's "Spam". Improve design The attractiveness of an email is something granted on certain terms, indeed. Not all emails need to be flashy or include expensive designs. However, there are some prevailing common trends in the matter. By following them, you seem to show the recipient that your company is moving in step with the times, and not stuck in the 2000s. Here's the shortlist from the TOP email design trends list that a 99designs provides - as of 2021: magazine-styled “Make newsletters to look a bit editorial”, source More and more newsletters tend to look like a centerfold from good old printed media. With a strict following to the "Less is more" principle - clear fonts, short phrases, HD-quality images with a few objects on them, and short CTAs. hand-made illustrations “Unique pictures create a distinct flavour of your brand”, source Tailored icons or sketchy images - whatever fits your mailing purpose, just make sure it's not too bright, contrast, or overloaded with details. Give preference to clean colors. skeuomorphic objects This is when a design resembles a real object. To see an example - just open a reader App on your smartphone. “A skeuomorphic bookshelf”, source HD photographies “If you operate in the luxury segment, do not skimp on email visuals”, source These are expensive content, but if you work in fashion or other chick industries - it may be worth the effort. animated content Yeap, we've covered this in a previous tip. single scroll “Looks especially good on smartphones”, source Place the entire email content, including buttons, on the endless-looking long frame. Focus on conversions Stay focused on what's your mailing purpose. Don't forget that everybody expects to see a good ROI from email actions at the end of the reporting period. Craft effective CTAs - perceive these not as a sole button with a "Download now" text or so, but as an entire sense of a message that you write. To create a captivating CTA copy, adhere to the below advices: include win-win propositions Even though you’re not providing a customer with a discount or cash refund at the moment, your proposition may include a non-monetary incentive. New arrivals, selection of the latest news, free copies, advice from experts - the only rule here is to offer what’ll hold in high esteem. trigger on emotions Don't long-windedly list benefits. Instead, simulate a life situation and show how your product or service can help. 
use several CTAs throughout the email An email body may take several scrolls to view, especially on small mobile screens. If you put a call to action only at the beginning of the message, few users will scroll back to it after finishing the content, and you may lose potential conversions. Include several buttons throughout the email body, but don't sound repetitive - vary the calls' form and wording. Encourage readers to reply Driving recipients to reply is challenging yet doable. First, choose the proper writing tone. According to an extensive study of emails that didn't get a response, the most effective is a 3rd-grade reading level. “Too elementary or too proficient a tone may scare away readers” Of course, you must apply this recommendation with an eye on the recipient. If you mail a professor or a government agency, the "3rd-grade" rule isn't applicable. But all else being equal, simplify the lexicon to a level a schoolchild can understand. Another trick is to sound generally happy. Emails enhanced with positive emotions get 10-15% more replies, on average, than neutral ones. The best manner is a slightly warm tone; exaggerated excitement may look weird and even suspicious, especially when reaching out to business partners. And don't forget about courtesy. Few people will respond if you address them with a hair-raising "To whom it may concern". Make it personal Personification shouldn't be confused with personalization. The latter is about mailing fitting content from a commercial perspective, while the former is about addressing the recipient as a distinct personality. Personal emails start with the recipient's name - no other way. They include references to the user's interests or past actions. For example, if your tourist agency's client is interested in island vacations, you should approach him or her with matching offers. They should also contain personalized promotions, if any. The best way to extend this approach to hundreds or thousands of recipients is to launch trigger-based email campaigns. Create delivery scenarios for different segments or stages of the sales pipeline, then prepare a fitting sequence of relevant content for every single scenario. To give mailing a human face, you can also practice greetings. Birthdays, state holidays, anniversaries, a new status in the loyalty program - there are plenty of occasions to congratulate the customer on. Keep your emails out of spam folders It is better not to launch a mailing at all than to use an untrustworthy email database. The risks go far beyond a slew of undelivered messages - from harming the sender's reputation to being banned by mailing systems. So it's better to stay proactive: tidy away broken, misspelled, temporary, or otherwise worrisome emails from the database - either manually or with the help of software; collect only valid email addresses - through email finders; avoid spam-trigger words; establish double opt-in validation; set the correct mailing frequency. Make sure your emails look clean and crisp Newsletters should, after all, bring revenue - whether you want it or not.
But in the pursuit of quantity, don't lose the overall content integrity and sense: the subject line, pre-header, header, email body, and calls should be consistent with one another; the copy must be of a proper size - although length depends on many factors, stick to an "ideal" interval of 50 to 125 words; if you can, don't attach too many files or links to external websites - mailing filters are suspicious of these; adapt the layout to fit smaller screens - nothing looks worse than broken email elements when you open a message on mobile. Wrapping up It doesn't make much difference whether you create mailing content for personal or business purposes - these email marketing tips will serve both. No tricks here - the recipient's interest should be at the forefront. If you can hook him or her with the content using the techniques we've covered, you'll never fall short on conversions.
aiok03 / Final MiningDescriptive statistics and exploratory data analysis To get an idea of the received data, we look through our tables, transactions and train. The shape of train is 6000 rows and 2 columns (client_id and target - gender). We also checked the info of transactions and noticed that there are no empty values; all column counts equal 130039. After that we merged the two tables and called the result data. To display unique codes and types we used the 'unique' function and found 173 unique codes and 61 unique types. Using the 'describe' function we can see the minimum and maximum of code, type, and sum. The first hypothesis was to find which gender makes more requests. For convenience we used a for loop to convert values to percentages, and according to the barplot the biggest number of transactions are made by females. The second hypothesis was to find the code with the biggest sum. For that we grouped by code and computed the mean of the sums, then converted the result from a Series to a DataFrame for further work. The problem was that pandas interpreted the code as the index, so we fixed it with the 'reset_index' function. After that we plotted the graph and saw that the highest mean sum belongs to code 4722, which we verified with additional code below the graph. The third hypothesis was to find the distribution of sums relative to gender. The first graph didn't reveal this because the spread of the data is too high: the feature is not normally distributed and not symmetrical, making it hard to assess. So we grouped the data by gender and computed the mean of the sum, which showed that males spend more money than females. The same process with the median gave the same conclusion, and since the mean and median are not equal, our assumption about non-normal data was confirmed. The last hypothesis was to find the number of clients for each type and code - that is, the most popular request among clients. For that we applied 'str' to each parameter for correct visualization on the graph, counted the number of each request per type and code, and reflected it in the graphs: the most popular are type 1010 and code 6011. Lastly, for further work we converted type and code back to int. Feature engineering Client's balance condition We took every sum from the dataframe data, grouped by client, and found the total for each of them, i.e. the income and expenses per client. Clients with a negative total made more expenses than income; clients with a positive total received more income. Negative is encoded as 0, positive as 1. RFM In the RFM section we started with Recency. For each client we grouped the information and found the latest date on which a transaction was made. The datetime column consisted of two values - date and time - which we split into separate columns for the feature engineering ahead. We set the most recent day to 457 and, by subtraction from this value, computed the recency of the last transaction for each client. The next step is Frequency: we used the 'groupby' function and counted the appearances of each client in our database. The last step is Monetary (counting expenses): using 'groupby' with the condition that the sum is less than 0 (expenses are negative values), we counted the total expenses of each client and noticed one thing - some clients didn't spend any money at all.
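A minimal pandas sketch of the RFM computation described above; the column names (client_id, day, sum) and the merged table name data are assumptions based on the text:

```python
import pandas as pd

# 'data' is the merged transactions + train table; 'day' is the numeric day
# extracted from the datetime column, with day 457 as the most recent one.
g = data.groupby("client_id")

rfm = pd.DataFrame({
    "recency": 457 - g["day"].max(),  # days since the client's last transaction
    "frequency": g["day"].size(),     # number of transactions per client
})

# Monetary: total expenses only (sums below zero); clients who never spent
# anything get 0, matching the observation above.
expenses = data.loc[data["sum"] < 0].groupby("client_id")["sum"].sum().abs()
rfm["monetary"] = expenses.reindex(rfm.index, fill_value=0)
```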
Segmentation based on RFM We merged all the tables into one and ranked clients according to the best values in each segment using percentiles. Using this formula we assigned clients scores on a 5-point scale; with this data and the elbow method we plotted the graph, where 3 clusters were the optimal solution. With the KMeans library we plotted the k-means picture of clients according to their distance from randomly initialized centroids and showed the distribution of clients across clusters. After this we assembled the basic table with clusters, adding a prefix to each of them. Clustering for codes Now we work with the codes to create code clusters, using TF-IDF and k-means. We also employ lemmatization, tokenization, and stop-word elimination. We import the pymorphy2 library for lemmatization; lemmatization is when words are reduced to their base form. Tokenization by sentences is the process of dividing written language into its component sentences. We also need to delete stop words: a stop word is a commonly used word (such as "the", "a", "an", "in") that a search engine has been programmed to ignore, both when indexing entries for searching and when retrieving them as the result of a search query. We would not want these words to take up space in our database or consume valuable processing time. We also use the re library for regular expression operations. In this section we also use MorphAnalyzer(): morphological analysis is the identification of a word's features based on how it is spelt, without using information about nearby words; for morphological analysis of words pymorphy2 provides the MorphAnalyzer class. If we apply clustering directly to this matrix we will have issues: the matrix is very sparse and the computation of distances becomes a mess. What we can do is reduce the data to a dense matrix of dimension 156 by applying SVD. Singular Value Decomposition (SVD) is one of the widely used methods for dimensionality reduction, and we determined that 156 is the right number in our case. We used the silhouette score to evaluate the quality of the clusters created with k-means: by silhouette score we chose the number of clusters and performed k-means clustering on our TF-IDF matrix. Then we visualized our clusters with t-SNE, a tool for visualizing high-dimensional data, and added the cluster labels to the data and df dataframes. Finally, we created a word cloud for each cluster. Clustering for types Data cleaning for types Firstly, we noticed that there were 155 types in the description table, while the data contains 61 types. When we merge the data with those types, the total number of types becomes 58, which means 3 types have no description, so we replaced them with the mode value. We also found that some types have the description 'н.д' (meaning "no data"), 26 of them in total. We also noticed that the same description repeats for several types, so we dropped duplicates and replaced them with the first occurrence in the data. Creating clusters for types We manually divided the types into 5 categories according to some keywords in the description and merged them with our dataframe. Then we noticed outliers in recency and frequency. We found the 0.999 and 0.001 quantiles, the first considered the high and the second the low boundary; everything above 0.999 and below 0.001 is considered an outlier, and we removed such values for both recency and frequency.
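A sketch of the code-clustering pipeline just described, assuming code_descriptions is the list of preprocessed (lemmatized, stop-word-free) description strings; the 156 SVD components come from the text, while the candidate cluster range is an assumption:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# TF-IDF over the preprocessed descriptions (a sparse matrix)
tfidf = TfidfVectorizer().fit_transform(code_descriptions)

# The TF-IDF matrix is too sparse for distance-based clustering,
# so reduce it to 156 dense dimensions with SVD
dense = TruncatedSVD(n_components=156, random_state=42).fit_transform(tfidf)

# Choose the number of clusters by silhouette score, then cluster
best_k, best_score = 2, -1.0
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(dense)
    score = silhouette_score(dense, labels)
    if score > best_score:
        best_k, best_score = k, score

clusters = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit_predict(dense)
```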
After that we checked the dataframe with describe and concluded that everything became normal. Supervised learning Now it was time for prediction. We divided our dataframe into train and test sets and used KNN, Decision Tree, Random Forest, and Logistic Regression classifiers for the predictions. For KNN we investigated accuracy for the number of neighbors from 1 to 20 with step 2 on train and test, and built the plot: the best result is 58% accuracy at 19 neighbors. The Decision Tree gave us 54% on the test set, and the Random Forest's accuracy was 64%. We investigated feature importance for both and noticed that monetary had the most influence on the predictions. For grid search we manually set the hyperparameters and used four-fold cross-validation. The best estimator for the random forest classifier was found by grid search; with those parameters the same accuracy occurred, so the best accuracy for random forest came with the default hyperparameters. We built a confusion matrix and calculated recall, precision, and F1 score. We also built a logistic regression, but its accuracy was too small, so we plotted the ROC-AUC and precision-recall curves. Conclusion All the models showed that the available data was not sufficient and not the best for gender prediction. Steps to increase accuracy were taken, such as adding more features and removing outliers. According to this investigation, the best choice was random forest.
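As a companion to the supervised-learning step above, here is a minimal sketch of the KNN sweep and the random forest grid search; the feature matrix X and target y (gender) are assumed to hold the engineered features described earlier:

```python
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# KNN: accuracy for k = 1, 3, ..., 19 (best reported above: 19 neighbors)
for k in range(1, 20, 2):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(k, accuracy_score(y_test, knn.predict(X_test)))

# Random forest with manually set hyperparameters and four-fold cross-validation
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 5, 10]},
    cv=4,
)
grid.fit(X_train, y_train)
print(grid.best_estimator_)
print(accuracy_score(y_test, grid.predict(X_test)))
```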
tnc-ca-geo / Video ImporterTraverses directories to import video as segmented events with correct timestamps and metadata
akullpp / GroxySimple reverse proxy to proxy requests based on the first path segment to the correct service in order to substitute services partially in local development
ajayrock007 / Why Website Design And Development Is Important And How It Helps In Making Your Business Profitable With the arrival of new technology, it is very easy to miss out on valuable opportunities. The situation is even worse when one lacks the expertise to tap into these changes. Indeed, this is the case for businesses with limited knowledge of website development and design. Let's be honest: web services have hugely changed how business works. So, for serious entrepreneurs or companies wishing to understand the importance of websites, this article provides just that. Makes navigation easy When it comes to a successful online platform, the user should enjoy easy navigation. Simply put, the information provided on the site should be easy to access, and the pages are expected to load quickly. Accordingly, a site like an online car parts store needs to offer options that further aid navigation, including a search box: users type into the search tool and are quickly directed to the relevant section. It is through excellent web design that a site accomplishes this. Besides building the site, the developer should regularly test the pages for ease of navigation, to eliminate or resolve bugs that may hamper page loading. Remember, if a site navigates well, it is assured of more organic traffic. Winning with SEO Search engine optimization has become a major consideration for websites. With huge numbers of sites competing to top search engine results pages (SERPs), search engines had to introduce a way of ranking sites. It is through web development and design that one achieves a higher ranking. Here, parameters such as title tags, keyword usage, image optimization, and linking, among others, are considered; the site should satisfy all the guidelines required to rank at the top. It is also through optimization that the site becomes user-friendly, so besides having the site, webmasters will retain the much-needed users. For this, the web developer needs to include highlighting features (for example, bold text) and keep the page designs simple, which also improves average loading speeds. Through this optimization the site shows up when various queries are made, so it gets more clicks in search results. Provide visual content on the site Truth be told, selling unique products and services can be cumbersome, and it is even harder when a company provides only walls of text about its specialty. This is where web development spices things up. By contacting a professional web designer, the business owner can choose the images to use, as well as the number of promotional videos and pictures, guided by search engine optimization. The value of visual content is that it gives customers a clear idea of what the product looks like. Evidently, not all customers grasp the services or products from text alone, so including images makes it easy to drive the message home.
Other than this, using images on the site easily catches readers' attention. Before reading the text, users are often drawn to the picture first, which improves the odds of bringing more customers to the site. That said, webmasters are advised not to stuff the site with visual content: it makes the content hard for the user to interpret, and it also lowers the site's search ranking. So it is important to moderate the use of imagery. Increase sales Business prosperity is deeply anchored in the number of sales made, and creating a website can effectively help an enterprise attract more sales. According to Statista, e-commerce activity was expected to grow by 21.3% by 2019, which shows that sales on websites are attracting more customers. Nowadays, more business owners are rushing to conduct their transactions online because they have sensed the great opportunity in online sales. The increase in sales goes hand in hand with the growing number of customers. To further promote the business, webmasters are encouraged to roll out updates; it is through updates and upgrades that the site's functions are smoothed out, and it shows customers that the brand is committed to offering excellent services and information. Another way to improve sales is to include promotions. These create the much-needed buzz among customers, which translates into more sales, and they give customers the sense that they can get affordable products from the company. In this way, all activities on the site add value to the business in one way or another. Attract lifetime customers to your business As the company spreads its wings and expands, it is vital to have loyal customers. This can be a daunting task, especially when the entrepreneur uses poor methods to achieve it. This is where development and design of the site help out. The metrics retrieved from the site enable webmasters to monitor customer activity and highlight the customers who have consistently supported the brand. After pinpointing them, the business owner should use creative ways to retain these customers; one creative option is rewarding them with gift vouchers and prizes, giving them more reason to use your services or products. Remember, it is through the site that the owner ensures no loyal customer is left out. Another interesting thing about lifetime customers is that they can market the brand, indirectly working for the company, which also reduces the cost of marketing. Reach more customers One of the main goals of building an enterprise is to grow its customer base. There are plenty of ways to achieve this, but each has different outcomes. When it comes to web development and design, some significant milestones are reached. The first is that it puts the brand name out there: once the site is live on the World Wide Web, the company is on a global stage. This means a previously unknown enterprise can be found in searches and supply products to faraway customers. These services help shorten the distance customers must cover to reach the business.
Here, there are various options, such as buying or ordering the product on the site, while the company still stays in touch with local customers. Great, right? Improving customer engagement Traditionally, a business was hosted in a physical building, but times have changed as more services have gone digital. It is for this reason that entrepreneurs are urged to create great websites. On this platform, it is easy to maintain a good rapport with the end user, which involves collecting feedback on the services and products offered, so you can interact with customers and give proper responses to the questions asked. Furthermore, there is no restriction on hours of operation: by automating services on the site, clients are assured of round-the-clock service. Also under customer engagement, the blog or site owner can update customers frequently and consistently; for example, if new price tariffs are introduced, customers are among the first to know. Creative marketing and advertising For new businesses, getting products and services out there is central to making progress, and marketing strategies come in handy in selling the brand. Compared to methods such as mainstream media and billboards, website promotion is pocket-friendly. It is through this online platform that a company can display all the important information: products/services offered, location, pricing, reputation, and contacts, among others. The webmaster can conveniently post attractive offers on the site. Interestingly, it is easier to update amazing discounts and offers on the site, so there is no downtime waiting for an advert to be put up; the same applies when the company wishes to pull down a blog post or advert. Moreover, the enterprise can work within a given budget. What does this mean? Simply, through SEO the business can learn where to put more emphasis, and the site provides up-to-date information on the latest promotions on the market. Streamlining the brand When introducing a site for the company, it is critical that the brand name be consistent, and it is through site development and web architecture that this is achieved. Here, the webmaster creates a distinct brand name that is featured across all search engines, so there is no variation whether the site is found on Bing or Google. Moreover, the brand logo and name are the same throughout, which reduces the chance of confusion with competing brands. This also flows down to the issue of consistency: the company is expected to keep a steady following of its customers. In the event of rebranding, the webmaster should ensure that due process is followed; when this is taken care of, the search engines will automatically update their records, so when customers search for the brand they will find the right thing. One more way to see this is that the site can help inform customers of changes. As the company uses other channels such as social media, the site can also contribute; here, the webmaster can even tease readers with a new look before launch. All in all, these changes can be implemented throughout.
How Website Development and Web Design Help Enterprises Make Profits Saving on expenses Surprisingly, many startups and established companies fail in their ventures because of low profits, despite great expectations for the investment made. Part of the failure is attributed to poor business practices, such as neglect of web services. It should be understood that website development and design are affordable. By properly structuring the content, the webmaster saves a great deal of cost during development, and the cost-saving aspect extends to the future gains the site will bring to the business. Moreover, the site reduces the distance to be covered to reach customers; talking to potential customers in person would be resource-intensive, and this is where web services come in. Cutting such extra costs means the business keeps more profit. Allowing ads on the site Business owners have probably come across the many advertisements displayed on various sites. Indeed, this is one of the interesting ways an enterprise can draw in more profit. Essentially, the company will be approached by other businesses to run their ads on the site, and as part of marketing and advertising, the host site charges a certain amount to carry the advert. Accordingly, it is essential to build an interesting and popular site; by focusing on this, the webmaster puts the site in the spotlight. The big winners here are those whose sites attract more enterprises and ads. E-Commerce As mentioned previously, products and services have moved from physical stores to online platforms, and one of the major online venues is the website. Take the case of Amazon, which managed to contribute 44 percent of total e-commerce sales in the United States; Statista also noted that the company made $108.35 million in 2017. After this in-depth elaboration of the importance of a website, certain points come out clearly. The first is that business enterprises should seek to develop and design a custom site. It is also important to put the best foot forward, so the site or blog is expected to meet and surpass the guidelines. Having said this, it is up to the webmaster to take that bold step and build a site. Tag : website designing company in delhi, website development company in delhi