43 skills found · Page 1 of 2
VIDA-NYU / AcheACHE is a web crawler for domain-specific search.
bookworm52 / EthicalHackingFromScratchWelcome to my comprehensive course on python programming and ethical hacking. The course assumes you have NO prior knowledge in any of these topics, and by the end of it you'll be at a high intermediate level being able to combine both of these skills to write python programs to hack into computer systems exactly the same way that black hat hackers do. That's not all, you'll also be able to use the programming skills you learn to write any program even if it has nothing to do with hacking. This course is highly practical but it won't neglect the theory, we'll start with basics of ethical hacking and python programming and installing the needed software. Then we'll dive and start programming straight away. You'll learn everything by example, by writing useful hacking programs, no boring dry programming lectures. The course is divided into a number of sections, each aims to achieve a specific goal, the goal is usually to hack into a certain system! We'll start by learning how this system work and its weaknesses, then you'll lean how to write a python program to exploit these weaknesses and hack the system. As we write the program I will teach you python programming from scratch covering one topic at a time. By the end of the course you're going to have a number of ethical hacking programs written by yourself (see below) from backdoors, keyloggers, credential harvesters, network hacking tools, website hacking tools and the list goes on. You'll also have a deep understanding on how computer systems work, how to model problems, design an algorithm to solve problems and implement the solution using python. As mentioned in this course you will learn both ethical hacking and programming at the same time, here are some of the topics that will be covered in the course: Programming topics: Writing programs for python 2 and 3. Using modules and libraries. Variables, types ...etc. Handling user input. Reading and writing files. Functions. Loops. Data structures. Regex. Desiccation making. Recursion. Threading. Object oriented programming. Packet manipulation using scapy. Netfilterqueue. Socket programming. String manipulation. Exceptions. Serialisation. Compiling programs to binary executables. Sending & receiving HTTP requests. Parsing HTML. + more! Hacking topics: Basics of network hacking / penetration testing. Changing MAC address & bypassing filtering. Network mapping. ARP Spoofing - redirect the flow of packets in a network. DNS Spoofing - redirect requests from one website to another. Spying on any client connected to the network - see usernames, passwords, visited urls ....etc. Inject code in pages loaded by any computer connected to the same network. Replace files on the fly as they get downloaded by any computer on the same network. Detect ARP spoofing attacks. Bypass HTTPS. Create malware for Windows, OS X and Linux. Create trojans for Windows, OS X and Linux. Hack Windows, OS X and Linux using custom backdoor. Bypass Anti-Virus programs. Use fake login prompt to steal credentials. Display fake updates. Use own keylogger to spy on everything typed on a Windows & Linux. Learn the basics of website hacking / penetration testing. Discover subdomains. Discover hidden files and directories in a website. Run wordlist attacks to guess login information. Discover and exploit XSS vulnerabilities. Discover weaknesses in websites using own vulnerability scanner. Programs you'll build in this course: You'll learn all the above by implementing the following hacking programs mac_changer - changes MAC Address to anything we want. network_scanner - scans network and discovers the IP and MAC address of all connected clients. arp_spoofer - runs an arp spoofing attack to redirect the flow of packets in the network allowing us to intercept data. packet_sniffer - filters intercepted data and shows usernames, passwords, visited links ....etc dns_spoofer - redirects DNS requests, eg: redirects requests to from one domain to another. file_interceptor - replaces intercepted files with any file we want. code_injector - injects code in intercepted HTML pages. arpspoof_detector - detects ARP spoofing attacks. execute_command payload - executes a system command on the computer it gets executed on. execute_and_report payload - executes a system command and reports result via email. download_and_execute payload - downloads a file and executes it on target system. download_execute_and_report payload - downloads a file, executes it, and reports result by email. reverse_backdoor - gives remote control over the system it gets executed on, allows us to Access file system. Execute system commands. Download & upload files keylogger - records key-strikes and sends them to us by email. crawler - discovers hidden paths on a target website. discover_subdomains - discovers subdomains on target website. spider - maps the whole target website and discovers all files, directories and links. guess_login - runs a wordlist attack to guess login information. vulnerability_scanner - scans a target website for weaknesses and produces a report with all findings. As you build the above you'll learn: Setting up a penetration testing lab to practice hacking safely. Installing Kali Linux and Windows as virtual machines inside ANY operating system. Linux Basics. Linux terminal basics. How networks work. How clients communicate in a network. Address Resolution Protocol - ARP. Network layers. Domain Name System - DNS. Hypertext Transfer Protocol - HTTP. HTTPS. How anti-virus programs work. Sockets. Connecting devices over TCP. Transferring data over TCP. How website work. GET & POST requests. And more! By the end of the course you're going to have programming skills to write any program even if it has nothing to do with hacking, but you'll learn programming by programming hacking tools! With this course you'll get 24/7 support, so if you have any questions you can post them in the Q&A section and we'll respond to you within 15 hours. Notes: This course is created for educational purposes only and all the attacks are launched in my own lab or against devices that I have permission to test. This course is totally a product of Zaid Sabih & zSecurity, no other organisation is associated with it or a certification exam. Although, you will receive a Course Completion Certification from Udemy, apart from that NO OTHER ORGANISATION IS INVOLVED. What you’ll learn 170+ videos on Python programming & ethical hacking Install hacking lab & needed software (on Windows, OS X and Linux) Learn 2 topics at the same time - Python programming & Ethical Hacking Start from 0 up to a high-intermediate level Write over 20 ethical hacking and security programs Learn by example, by writing exciting programs Model problems, design solutions & implement them using Python Write programs in Python 2 and 3 Write cross platform programs that work on Windows, OS X & Linux Have a deep understanding on how computer systems work Have a strong base & use the skills learned to write any program even if its not related to hacking Understand what is Hacking, what is Programming, and why are they related Design a testing lab to practice hacking & programming safely Interact & use Linux terminal Understand what MAC address is & how to change it Write a python program to change MAC address Use Python modules and libraries Understand Object Oriented Programming Write object oriented programs Model & design extendable programs Write a program to discover devices connected to the same network Read, analyse & manipulate network packets Understand & interact with different network layers such as ARP, DNS, HTTP ....etc Write a program to redirect the flow of packets in a network (arp spoofer) Write a packet sniffer to filter interesting data such as usernames and passwords Write a program to redirect DNS requests (DNS Spoofer) Intercept and modify network packets on the fly Write a program to replace downloads requested by any computer on the network Analyse & modify HTTP requests and responses Inject code in HTML pages loaded by any computer on the same network Downgrade HTTPS to HTTP Write a program to detect ARP Spoofing attacks Write payloads to download a file, execute command, download & execute, download execute & report .....etc Use sockets to send data over TCP Send data reliably over TCP Write client-server programs Write a backdoor that works on Windows, OS X and Linux Implement cool features in the backdoor such as file system access, upload and download files and persistence Write a remote keylogger that can register all keystrikes and send them by Email Interact with files using python (read, write & modify) Convert python programs to binary executables that work on Windows, OS X and Linux Convert malware to torjans that work and function like other file types like an image or a PDF Bypass Anti-Virus Programs Understand how websites work, the technologies used and how to test them for weaknesses Send requests towebsites and analyse responses Write a program that can discover hidden paths in a website Write a program that can map a website and discover all links, subdomains, files and directories Extract and submit forms from python Run dictionary attacks and guess login information on login pages Analyse HTML using Python Interact with websites using Python Write a program that can discover vulnerabilities in websites Are there any course requirements or prerequisites? Basic IT knowledge No Linux, programming or hacking knowledge required. Computer with a minimum of 4GB ram/memory Operating System: Windows / OS X / Linux Who this course is for: Anybody interested in learning Python programming Anybody interested in learning ethical hacking / penetration testing Instructor User photo Zaid Sabih Ethical Hacker, Computer Scientist & CEO of zSecurity My name is Zaid Al-Quraishi, I am an ethical hacker, a computer scientist, and the founder and CEO of zSecurity. I just love hacking and breaking the rules, but don’t get me wrong as I said I am an ethical hacker. I have tremendous experience in ethical hacking, I started making video tutorials back in 2009 in an ethical hacking community (iSecuri1ty), I also worked as a pentester for the same company. In 2013 I started teaching my first course live and online, this course received amazing feedback which motivated me to publish it on Udemy. This course became the most popular and the top paid course in Udemy for almost a year, this motivated me to make more courses, now I have a number of ethical hacking courses, each focusing on a specific field, dominating the ethical hacking topic on Udemy. Now I have more than 350,000 students on Udemy and other teaching platforms such as StackSocial, StackSkills and zSecurity. Instructor User photo z Security Leading provider of ethical hacking and cyber security training, zSecurity is a leading provider of ethical hacking and cyber security training, we teach hacking and security to help people become ethical hackers so they can test and secure systems from black-hat hackers. Becoming an ethical hacker is simple but not easy, there are many resources online but lots of them are wrong and outdated, not only that but it is hard to stay up to date even if you already have a background in cyber security. Our goal is to educate people and increase awareness by exposing methods used by real black-hat hackers and show how to secure systems from these hackers. Video course
onion-liu / Arxiv Daily AigcAn AI-driven daily arXiv paper crawler, analyzer, and organizer tool, focusing on AIGC
PreferredAI / VenomYour preferred open source focused crawler for the deep web.
migalabs / ArmiarmaArmiarma is a Libp2p open-network crawler with a current focus on Ethereum's CL network
free-news-api / News CrawlersThis project compares five open-source news crawlers: "news-please", "fundus", "news-crawler", "news-crawl" and "newspaper4k" - focusing on features like extraction accuracy, supported sites, and ease of use, to help users choose the best tool for their needs.
neur0map / DocrawlDocs‑focused crawler that converts documentation sites to clean Markdown.
crawlerclub / CrawlerCrawler4U, a general purpose focused crawler
Wallace-Best / Best<!DOCTYPE html>Wallace-Best <html lang="en-us"> <head> <link rel="node" href="//a.wallace-bestcdn.com/1391808583/img/favicon16-32.ico" type="image/vnd.microsoft.icon"> <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> <meta http-equiv="Content-Language" content="en-us"> <meta name="keywords" content="Wallace Best, wallace-best.com, comments, blog, blogs, discussion"> <meta name="description" content="Wallace Best's Network is a global comment system that improves discussion on websites and connects conversations across the web."> <meta name="world" value="notranslate" /> <title> WB Admin | Sign-in </title> <script type="text/javascript" charset="utf-8"> document.domain = 'wallace-best.com'; if (window.context === undefined) { var context = {}; } context.wallace-bestUrl = 'https://wallace-best.com'; context.wallace-bestDomain = 'wallace-best.com'; context.mediaUrl = '//a.wallace-bestcdn.com/1391808583/'; context.uploadsUrl = '//a.wallace.bestcdn.com/uploads'; context.sslUploadsUrl = '//a.wallace-bestcdn.com/uploads'; context.loginUrl = 'https://wallace-best.com/profile/login/'; context.signupUrl = 'https://wallace-best.com/profile/signup/'; context.apiUrl = '//wallace-best.com/api/3.0/'; context.apiPublicKey = 'Y1S1wGIzdc63qnZ5rhHfjqEABGA4ZTDncauWFFWWTUBqkmLjdxloTb7ilhGnZ7z1'; context.forum = null; context.adminUrl = 'https://wallace-best.com'; context.switches = { "explore_dashboard_2":false, "partitions:api:posts/countPendin":false, "use_rs_paginator_30m":false, "inline_defaults_css":false, "evm_publisher_reports":true, "postsort":false, "enable_entropy_filtering":false, "exp_newnav":true, "organic_discovery_experiments":false, "realtime_for_oldies":false, "firehose_push":true, "website_addons":true, "addons_ab_test":false, "firehose_gnip_http":true, "community_icon":true, "pub_reporting_v2":true, "pd_thumbnail_settings":true, "algorithm_experiments":false, "discovery_log_to_browser":false, "is_last_modified":true, "embed_category_display":false, "partitions:api:forums/listPosts":false, "shardpost":true, "limit_get_posts_days_30d":true, "next_realtime_anim_disabled":false, "juggler_thread_onReady":true, "firehose_realertime":false, "loginas":true, "juggler_enabled":true, "user_onboarding":true, "website_follow_redirect":true, "raven_js":true, "shardpost:index":true, "filter_ads_by_country":true, "new_sort_paginator":true, "threadident_reads":true, "new_media":true, "enable_link_affiliation":true, "show_unapproved":false, "onboarding_profile_editing":true, "partitions":true, "dotcom_marketing":true, "discovery_analytics":true, "exp_newnav_disable":true, "new_community_nav_embed":true, "discussions_tab":true, "embed_less_refactor":false, "use_rs_paginator_60m":true, "embed_labs":false, "auto_flat_sort":false, "disable_moderate_ascending":true, "disable_realtime":true, "partitions:api":true, "digest_thread_votes":true, "shardpost:paginator":false, "debug_js":false, "exp_mn2":false, "limit_get_posts_days_7d":true, "pinnedcomments":false, "use_queue_b":true, "new_embed_profile":true, "next_track_links":true, "postsort:paginator":true, "simple_signup":true, "static_styles":true, "stats":true, "discovery_next":true, "override_skip_syslog":false, "show_captcha_on_links":true, "exp_mn2_force":false, "next_dragdrop_nag":true, "firehose_gnip":true, "firehose_pubsub":true, "rt_go_backend":false, "dark_jester":true, "next_logging":false, "surveyNotice":false, "tipalti_payments":true, "default_trusted_domain":false, "disqus_trends":false, "log_large_querysets":false, "phoenix":false, "exp_autoonboard":true, "lazy_embed":false, "explore_dashboard":true, "partitions:api:posts/list":true, "support_contact_with_frames":true, "use_rs_paginator_5m":true, "limit_textdigger":true, "embed_redirect":false, "logging":false, "exp_mn2_disable":true, "aggressive_embed_cache":true, "dashboard_client":false, "safety_levels_enabled":true, "partitions:api:categories/listPo":false, "next_show_new_media":true, "next_realtime_cap":false, "next_discard_low_rep":true, "next_streaming_realtime":false, "partitions:api:threads/listPosts":false, "textdigger_crawler":true }; context.urlMap = { 'signup': 'https://wallace-best.com/admin/signup/', 'dashboard': 'http://wallace-best.com/dashboard/', 'admin': 'http://wallace-best.com/admin/', 'logout': '//wallace-best.com/logout/', 'home': 'https://wallace-best.com', 'for_websites': 'http://wallace-best.com/websites/', 'login': 'https://wallace-best.com/profile/login/' }; context.navMap = { 'signup': '', 'dashboard': '', 'admin': '', 'addons': '' }; </script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/auth_context.js" type="text/javascript" charset="utf-8"></script> <link rel="stylesheet" href="//a.wallace-bestdn.com/1391808583/build/css/b31fb2fa3905.css" type="text/css" /> <script type="text/javascript" src="//a.wallace-bestcdn.com/1391808583/build/js/5ee01877d131.js"></script> <script> // // shared/foundation.js // // This file contains the absolute minimum code necessary in order // to create a new application in the WALLACE-BEST namespace. // // You should load this file *before* anything that modifies the WALLACE-BEST global. // /*jshint browser:true, undef:true, strict:true, expr:true, white:true */ /*global wallace-best:true */ var WALLACE-BEST = (function (window, undefined) { "use strict"; var wallace-best = window.wallace-best || {}; // Exception thrown from wallace-best.assert method on failure wallace-best.AssertionError = function (message) { this.message = message; }; wallace-best.AssertionError.prototype.toString = function () { return 'Assertion Error: ' + (this.message || '[no message]'); }; // Raises a wallace-best.AssertionError if value is falsy wallace-best.assert = function (value, message, soft) { if (value) return; if (soft) window.console && window.console.log("DISQUS assertion failed: " + message); else throw new wallace-best.AssertionError(message); }; // Functions to clean attached modules (used by define and cleanup) var cleanFuncs = []; // Attaches a new public interface (module) to the wallace-best namespace. // For example, if wallace-best object is { 'a': { 'b': {} } }: // // wallace-best.define('a.b.c', function () { return { 'd': 'hello' }; }); will transform it into // -> { 'a': { 'b': { 'c': { 'd' : hello' }}}} // // and wallace-best.define('a', function () { return { 'x': 'world' }; }); will transform it into // -> { 'a': { 'b': {}}, 'x': 'world' } // // Attach modules to wallace-best using only this function. wallace-best.define = function (name, fn) { /*jshint loopfunc:true */ if (typeof name === 'function') { fn = name; name = ''; } var parts = name.split('.'); var part = parts.shift(); var cur = wallace-best; var exports = (fn || function () { return {}; }).call({ overwrites: function (obj) { obj.__overwrites__ = true; return obj; } }, window); while (part) { cur = (cur[part] ? cur[part] : cur[part] = {}); part = parts.shift(); } for (var key in exports) { if (!exports.hasOwnProperty(key)) continue; /*jshint eqnull:true */ if (!exports.__overwrites__ && cur[key] !== null) { wallace-best.assert(!cur.hasOwnProperty(key), 'Unsafe attempt to redefine existing module: ' + key, true /* soft assertion */); } cur[key] = exports[key]; cleanFuncs.push(function (cur, key) { return function () { delete cur[key]; }; }(cur, key)); } return cur; }; // Alias for wallace-best.define for the sake of semantics. // You should use it when you need to get a reference to another // wallace-best module before that module is defined: // // var collections = wallace-best.use('lounge.collections'); // // wallace-best.use is a single argument function because we don't // want to encourage people to use it instead of wallace-best.define. wallace-best.use = function (name) { return wallace-best.define(name); }; wallace-best.cleanup = function () { for (var i = 0; i < cleanFuncs.length; i++) { cleanFuncs[i](); } }; return wallace-best; })(window); /*jshint expr:true, undef:true, strict:true, white:true, browser:true */ /*global wallace-best:false*/ // // shared/corefuncs.js // wallace-best.define(function (window, undefined) { "use strict"; var wallace-best = window.wallace-best; var document = window.document; var head = document.getElementsByTagName('head')[0] || document.body; var jobs = { running: false, timer: null, queue: [] }; var uid = 0; // Taken from _.uniqueId wallace-best.getUid = function (prefix) { var id = ++uid + ''; return prefix ? prefix + id : id; }; /* Defers func() execution until cond() is true */ wallace-best.defer = function (cond, func) { function beat() { /*jshint boss:true */ var queue = jobs.queue; if (queue.length === 0) { jobs.running = false; clearInterval(jobs.timer); } for (var i = 0, pair; pair = queue[i]; i++) { if (pair[0]()) { queue.splice(i--, 1); pair[1](); } } } jobs.queue.push([cond, func]); beat(); if (!jobs.running) { jobs.running = true; jobs.timer = setInterval(beat, 100); } }; wallace-best.isOwn = function (obj, key) { // The object.hasOwnProperty method fails when the // property under consideration is named 'hasOwnProperty'. return Object.prototype.hasOwnProperty.call(obj, key); }; wallace-best.isString = function (str) { return Object.prototype.toString.call(str) === "[object String]"; }; /* * Iterates over an object or a collection and calls a callback * function with each item as a parameter. */ wallace-best.each = function (collection, callback) { var length = collection.length, forEach = Array.prototype.forEach; if (!isNaN(length)) { // Treat collection as an array if (forEach) { forEach.call(collection, callback); } else { for (var i = 0; i < length; i++) { callback(collection[i], i, collection); } } } else { // Treat collection as an object for (var key in collection) { if (wallace-best.isOwn(collection, key)) { callback(collection[key], key, collection); } } } }; // Borrowed from underscore wallace-best.extend = function (obj) { wallace-best.each(Array.prototype.slice.call(arguments, 1), function (source) { for (var prop in source) { obj[prop] = source[prop]; } }); return obj; }; wallace-best.serializeArgs = function (params) { var pcs = []; wallace-best.each(params, function (val, key) { if (val !== undefined) { pcs.push(key + (val !== null ? '=' + encodeURIComponent(val) : '')); } }); return pcs.join('&'); }; wallace-best.serialize = function (url, params, nocache) { if (params) { url += (~url.indexOf('?') ? (url.charAt(url.length - 1) == '&' ? '': '&') : '?'); url += wallace-best.serializeArgs(params); } if (nocache) { var ncp = {}; ncp[(new Date()).getTime()] = null; return wallace-best.serialize(url, ncp); } var len = url.length; return (url.charAt(len - 1) == "&" ? url.slice(0, len - 1) : url); }; var TIMEOUT_DURATION = 2e4; // 20 seconds var addEvent, removeEvent; // select the correct event listener function. all of our supported // browsers will use one of these if ('addEventListener' in window) { addEvent = function (node, event, handler) { node.addEventListener(event, handler, false); }; removeEvent = function (node, event, handler) { node.removeEventListener(event, handler, false); }; } else { addEvent = function (node, event, handler) { node.attachEvent('on' + event, handler); }; removeEvent = function (node, event, handler) { node.detachEvent('on' + event, handler); }; } wallace-best.require = function (url, params, nocache, success, failure) { var script = document.createElement('script'); var evName = script.addEventListener ? 'load' : 'readystatechange'; var timeout = null; script.src = wallace-best.serialize(url, params, nocache); script.async = true; script.charset = 'UTF-8'; function handler(ev) { ev = ev || window.event; if (!ev.target) { ev.target = ev.srcElement; } if (ev.type != 'load' && !/^(complete|loaded)$/.test(ev.target.readyState)) { return; // Not ready yet } if (success) { success(); } if (timeout) { clearTimeout(timeout); } removeEvent(ev.target, evName, handler); } if (success || failure) { addEvent(script, evName, handler); } if (failure) { timeout = setTimeout(function () { failure(); }, TIMEOUT_DURATION); } head.appendChild(script); return wallace-best; }; wallace-best.requireStylesheet = function (url, params, nocache) { var link = document.createElement('link'); link.rel = 'stylesheet'; link.type = 'text/css'; link.href = wallace-best.serialize(url, params, nocache); head.appendChild(link); return wallace-best; }; wallace-best.requireSet = function (urls, nocache, callback) { var remaining = urls.length; wallace-best.each(urls, function (url) { wallace-best.require(url, {}, nocache, function () { if (--remaining === 0) { callback(); } }); }); }; wallace-best.injectCss = function (css) { var style = document.createElement('style'); style.setAttribute('type', 'text/css'); // Make inline CSS more readable by splitting each rule onto a separate line css = css.replace(/\}/g, "}\n"); if (window.location.href.match(/^https/)) css = css.replace(/http:\/\//g, 'https://'); if (style.styleSheet) { // Internet Explorer only style.styleSheet.cssText = css; } else { style.appendChild(document.createTextNode(css)); } head.appendChild(style); }; wallace-best.isString = function (val) { return Object.prototype.toString.call(val) === '[object String]'; }; }); /*jshint boss:true*/ /*global wallace-best */ wallace-best.define('Events', function (window, undefined) { "use strict"; // Returns a function that will be executed at most one time, no matter how // often you call it. Useful for lazy initialization. var once = function (func) { var ran = false, memo; return function () { if (ran) return memo; ran = true; memo = func.apply(this, arguments); func = null; return memo; }; }; var has = wallace-best.isOwn; var keys = Object.keys || function (obj) { if (obj !== Object(obj)) throw new TypeError('Invalid object'); var keys = []; for (var key in obj) if (has(obj, key)) keys[keys.length] = key; return keys; }; var slice = [].slice; // Backbone.Events // --------------- // A module that can be mixed in to *any object* in order to provide it with // custom events. You may bind with `on` or remove with `off` callback // functions to an event; `trigger`-ing an event fires all callbacks in // succession. // // var object = {}; // _.extend(object, Backbone.Events); // object.on('expand', function(){ alert('expanded'); }); // object.trigger('expand'); // var Events = { // Bind an event to a `callback` function. Passing `"all"` will bind // the callback to all events fired. on: function (name, callback, context) { if (!eventsApi(this, 'on', name, [callback, context]) || !callback) return this; this._events = this._events || {}; var events = this._events[name] || (this._events[name] = []); events.push({callback: callback, context: context, ctx: context || this}); return this; }, // Bind an event to only be triggered a single time. After the first time // the callback is invoked, it will be removed. once: function (name, callback, context) { if (!eventsApi(this, 'once', name, [callback, context]) || !callback) return this; var self = this; var onced = once(function () { self.off(name, onced); callback.apply(this, arguments); }); onced._callback = callback; return this.on(name, onced, context); }, // Remove one or many callbacks. If `context` is null, removes all // callbacks with that function. If `callback` is null, removes all // callbacks for the event. If `name` is null, removes all bound // callbacks for all events. off: function (name, callback, context) { var retain, ev, events, names, i, l, j, k; if (!this._events || !eventsApi(this, 'off', name, [callback, context])) return this; if (!name && !callback && !context) { this._events = {}; return this; } names = name ? [name] : keys(this._events); for (i = 0, l = names.length; i < l; i++) { name = names[i]; if (events = this._events[name]) { this._events[name] = retain = []; if (callback || context) { for (j = 0, k = events.length; j < k; j++) { ev = events[j]; if ((callback && callback !== ev.callback && callback !== ev.callback._callback) || (context && context !== ev.context)) { retain.push(ev); } } } if (!retain.length) delete this._events[name]; } } return this; }, // Trigger one or many events, firing all bound callbacks. Callbacks are // passed the same arguments as `trigger` is, apart from the event name // (unless you're listening on `"all"`, which will cause your callback to // receive the true name of the event as the first argument). trigger: function (name) { if (!this._events) return this; var args = slice.call(arguments, 1); if (!eventsApi(this, 'trigger', name, args)) return this; var events = this._events[name]; var allEvents = this._events.all; if (events) triggerEvents(events, args); if (allEvents) triggerEvents(allEvents, arguments); return this; }, // Tell this object to stop listening to either specific events ... or // to every object it's currently listening to. stopListening: function (obj, name, callback) { var listeners = this._listeners; if (!listeners) return this; var deleteListener = !name && !callback; if (typeof name === 'object') callback = this; if (obj) (listeners = {})[obj._listenerId] = obj; for (var id in listeners) { listeners[id].off(name, callback, this); if (deleteListener) delete this._listeners[id]; } return this; } }; // Regular expression used to split event strings. var eventSplitter = /\s+/; // Implement fancy features of the Events API such as multiple event // names `"change blur"` and jQuery-style event maps `{change: action}` // in terms of the existing API. var eventsApi = function (obj, action, name, rest) { if (!name) return true; // Handle event maps. if (typeof name === 'object') { for (var key in name) { obj[action].apply(obj, [key, name[key]].concat(rest)); } return false; } // Handle space separated event names. if (eventSplitter.test(name)) { var names = name.split(eventSplitter); for (var i = 0, l = names.length; i < l; i++) { obj[action].apply(obj, [names[i]].concat(rest)); } return false; } return true; }; // A difficult-to-believe, but optimized internal dispatch function for // triggering events. Tries to keep the usual cases speedy (most internal // Backbone events have 3 arguments). var triggerEvents = function (events, args) { var ev, i = -1, l = events.length, a1 = args[0], a2 = args[1], a3 = args[2]; switch (args.length) { case 0: while (++i < l) { (ev = events[i]).callback.call(ev.ctx); } return; case 1: while (++i < l) { (ev = events[i]).callback.call(ev.ctx, a1); } return; case 2: while (++i < l) { (ev = events[i]).callback.call(ev.ctx, a1, a2); } return; case 3: while (++i < l) { (ev = events[i]).callback.call(ev.ctx, a1, a2, a3); } return; default: while (++i < l) { (ev = events[i]).callback.apply(ev.ctx, args); } } }; var listenMethods = {listenTo: 'on', listenToOnce: 'once'}; // Inversion-of-control versions of `on` and `once`. Tell *this* object to // listen to an event in another object ... keeping track of what it's // listening to. wallace-best.each(listenMethods, function (implementation, method) { Events[method] = function (obj, name, callback) { var listeners = this._listeners || (this._listeners = {}); var id = obj._listenerId || (obj._listenerId = wallace-best.getUid('l')); listeners[id] = obj; if (typeof name === 'object') callback = this; obj[implementation](name, callback, this); return this; }; }); // Aliases for backwards compatibility. Events.bind = Events.on; Events.unbind = Events.off; return Events; }); // used for /follow/ /login/ /signup/ social oauth dialogs // faking the bus wallace-best.use('Bus'); _.extend(DISQUS.Bus, wallace-best.Events); </script> <script src="//a.disquscdn.com/1391808583/js/src/global.js" charset="utf-8"></script> <script src="//a.disquscdn.com/1391808583/js/src/ga_events.js" charset="utf-8"></script> <script src="//a.disquscdn.com/1391808583/js/src/messagesx.js"></script> <!-- start Mixpanel --><script type="text/javascript">(function(e,b){if(!b.__SV){var a,f,i,g;window.mixpanel=b;a=e.createElement("script");a.type="text/javascript";a.async=!0;a.src=("https:"===e.location.protocol?"https:":"http:")+'//cdn.mxpnl.com/libs/mixpanel-2.2.min.js';f=e.getElementsByTagName("script")[0];f.parentNode.insertBefore(a,f);b._i=[];b.init=function(a,e,d){function f(b,h){var a=h.split(".");2==a.length&&(b=b[a[0]],h=a[1]);b[h]=function(){b.push([h].concat(Array.prototype.slice.call(arguments,0)))}}var c=b;"undefined"!== typeof d?c=b[d]=[]:d="mixpanel";c.people=c.people||[];c.toString=function(b){var a="mixpanel";"mixpanel"!==d&&(a+="."+d);b||(a+=" (stub)");return a};c.people.toString=function(){return c.toString(1)+".people (stub)"};i="disable track track_pageview track_links track_forms register register_once alias unregister identify name_tag set_config people.set people.set_once people.increment people.append people.track_charge people.clear_charges people.delete_user".split(" ");for(g=0;g<i.length;g++)f(c,i[g]); b._i.push([a,e,d])};b.__SV=1.2}})(document,window.mixpanel||[]); mixpanel.init('17b27902cd9da8972af8a3c43850fa5f', { track_pageview: false, debug: false }); </script><!-- end Mixpanel --> <script src="//a.disquscdn.com/1391808583//js/src/funnelcake.js"></script> <script type="text/javascript"> if (window.AB_TESTS === undefined) { var AB_TESTS = {}; } $(function() { if (context.auth.username !== undefined) { disqus.messagesx.init(context.auth.username); } }); </script> <script type="text/javascript" charset="utf-8"> // Global tests $(document).ready(function() { $('a[rel*=facebox]').facebox(); }); </script> <script type="text/x-underscore-template" data-template-name="global-nav"> <% var has_custom_avatar = data.avatar_url && data.avatar_url.indexOf('noavatar') < 0; %> <% var has_custom_username = data.username && data.username.indexOf('disqus_') < 0; %> <% if (data.username) { %> <li class="<%= data.forWebsitesClasses || '' %>" data-analytics="header for websites"><a href="<%= data.urlMap.for_websites %>">For Websites</a></li> <li data-analytics="header dashboard"><a href="<%= data.urlMap.dashboard %>">Dashboard</a></li> <% if (data.has_forums) { %> <li class="admin<% if (has_custom_avatar || !has_custom_username) { %> avatar-menu-admin<% } %>" data-analytics="header admin"><a href="<%= data.urlMap.admin %>">Admin</a></li> <% } %> <li class="user-dropdown dropdown-toggle<% if (has_custom_avatar || !has_custom_username) { %> avatar-menu<% } else { %> username-menu<% } %>" data-analytics="header username dropdown" data-floater-marker="<% if (has_custom_avatar || !has_custom_username) { %>square<% } %>"> <a href="<%= data.urlMap.home %>/<%= data.username %>/"> <% if (has_custom_avatar) { %> <img src="<%= data.avatar_url %>" class="avatar"> <% } else if (has_custom_username) { %> <%= data.username %> <% } else { %> <img src="<%= data.avatar_url %>" class="avatar"> <% } %> <span class="caret"></span> </a> <ul class="clearfix dropdown"> <li data-analytics="header view profile"><a href="<%= data.urlMap.home %>/<%= data.username %>/">View Profile</a></li> <li class="edit-profile js-edit-profile" data-analytics="header edit profile"><a href="<%= data.urlMap.dashboard %>#account">Edit Profile</a></li> <li class="logout" data-analytics="header logout"><a href="<%= data.urlMap.logout %>">Logout</a></li> </ul> </li> <% } else { %> <li class="<%= data.forWebsitesClasses || '' %>" data-analytics="header for websites"><a href="<%= data.urlMap.for_websites %>">For Websites</a></li> <li class="link-login" data-analytics="header login"><a href="<%= data.urlMap.login %>?next=<%= encodeURIComponent(document.location.href) %>">Log in</a></li> <% } %> </script> <!--[if lte IE 7]> <script src="//a.wallace-bestdn.com/1391808583/js/src/border_box_model.js"></script> <![endif]--> <!--[if lte IE 8]> <script src="//cdnjs.cloudflare.com/ajax/libs/modernizr/2.5.3/modernizr.min.js"></script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/selectivizr.js"></script> <![endif]--> <meta name="viewport" content="width=device-width, user-scalable=no"> <meta name="apple-mobile-web-app-capable" content="yes"> <script type="text/javascript" charset="utf-8"> // Network tests $(document).ready(function() { $('a[rel*=facebox]').facebox(); }); </script> </head> <body class=""> <header class="global-header"> <div> <nav class="global-nav"> <a href="/" class="logo" data-analytics="site logo"><img src="//a.wallace-bestcdn.com/1391808583/img/disqus-logo-alt-hidpi.png" width="150" alt="wallace-best" title="wallace-best - Discover your community"/></a> </nav> </div> </header> <section class="login"> <form id="login-form" action="https://disqus.com/profile/login/?next=http://wallace-best.wallace-best.com/admin/moderate/" method="post" accept-charset="utf-8"> <h1>Sign in to continue</h1> <input type="text" name="username" tabindex="20" placeholder="Email or Username" value=""/> <div class="password-container"> <input type="password" name="password" tabindex="21" placeholder="Password" /> <span>(<a href="https://wallace-best.com/forgot/">forgot?</a>)</span> </div> <button type="submit" class="button submit" data-analytics="sign-in">Log in to wallace-best</button> <span class="create-account"> <a href="https://wallace-best.com/profile/signup/?next=http%3A//wallace-best.wallace-best.com/admin/moderate/" data-analytics="create-account"> Create an Account </a> </span> <h1 class="or-login">Alternatively, you can log in using:</h1> <div class="connect-options"> <button title="facebook" type="button" class="facebook-auth"> <span class="auth-container"> <img src="//a.wallace-bestdn.com/1391808583/img/icons/facebook.svg" alt="Facebook"> <!--[if lte IE 7]> <img src="//a.wallace-bestcdn.com/1391808583/img/icons/facebook.png" alt="Facebook"> <![endif]--> </span> </button> <button title="twitter" type="button" class="twitter-auth"> <span class="auth-container"> <img src="//a.wallace-bestdn.com/1391808583/img/icons/twitter.svg" alt="Twitter"> <!--[if lte IE 7]> <img src="//a.wallace-bestcdn.com/1391808583/img/icons/twitter.png" alt="Twitter"> <![endif]--> </span> </button> <button title="google" type="button" class="google-auth"> <span class="auth-container"> <img src="//a.wallace-bestdn.com/1391808583/img/icons/google.svg" alt="Google"> <!--[if lte IE 7]> <img src="//a.wallace-bestcdn.com/1391808583/img/icons/google.png" alt="Google"> <![endif]--> </span> </button> </div> </form> </section> <div class="get-disqus"> <a href="https://wallace-best.com/admin/signup/" data-analytics="get-disqus">Get wallace-best for your site</a> </div> <script> /*jshint undef:true, browser:true, maxlen:100, strict:true, expr:true, white:true */ // These must be global var _comscore, _gaq; (function (doc) { "use strict"; // Convert Django template variables to JS variables var debug = false, gaKey = '', gaPunt = '', gaCustomVars = { component: 'website', forum: '', version: 'v5' }, gaSlots = { component: 1, forum: 3, version: 4 }; /**/ gaKey = gaCustomVars.component == 'website' ? 'UA-1410476-16' : 'UA-1410476-6'; // Now start loading analytics services var s = doc.getElementsByTagName('script')[0], p = s.parentNode; var isSecure = doc.location.protocol == 'https:'; if (!debug) { _comscore = _comscore || []; // comScore // Load comScore _comscore.push({ c1: '7', c2: '10137436', c3: '1' }); var cs = document.createElement('script'); cs.async = true; cs.src = (isSecure ? 'https://sb' : 'http://b') + '.scorecardresearch.com/beacon.js'; p.insertBefore(cs, s); } // Set up Google Analytics _gaq = _gaq || []; if (!debug) { _gaq.push(['_setAccount', gaKey]); _gaq.push(['_setDomainName', '.wallace-best.com']); } if (!gaPunt) { for (var v in gaCustomVars) { if (!(gaCustomVars.hasOwnProperty(v) && gaCustomVars[v])) continue; _gaq.push(['_setCustomVar', gaSlots[v], gaCustomVars[v]]); } _gaq.push(['_trackPageview']); } // Load Google Analytics var ga = doc.createElement('script'); ga.type = 'text/javascript'; ga.async = true; var prefix = isSecure ? 'https://ssl' : 'http://www'; // Dev tip: if you cannot use the Google Analytics Debug Chrome extension, // https://chrome.google.com/webstore/detail/jnkmfdileelhofjcijamephohjechhna // you can replace /ga.js on the following line with /u/ga_debug.js // But if you do that, PLEASE DON'T COMMIT THE CHANGE! Kthxbai. ga.src = prefix + '.google-analytics.com/ga.js'; p.insertBefore(ga, s); }(document)); </script> <script> (function (){ // adds a classname for css to target the current page without passing in special things from the server or wherever // replacing all characters not allowable in classnames var newLocation = encodeURIComponent(window.location.pathname).replace(/[\.!~*'\(\)]/g, '_'); // cleaning up remaining url-encoded symbols for clarity sake newLocation = newLocation.replace(/%2F/g, '-').replace(/^-/, '').replace(/-$/, ''); if (newLocation === '') { newLocation = 'homepage'; } $('body').addClass('' + newLocation); }()); $(function ($) { // adds 'page-active' class to links matching the page url $('a[href="' + window.location.pathname + '"]').addClass('page-active'); }); $(document).delegate('[data-toggle-selector]', 'click', function (e) { var $this = $(this); $($this.attr('data-toggle-selector')).toggle(); e.preventDefault(); }); </script> <script type="text/javascript"> wallace-best.define('web.urls', function () { return { twitter: 'https://wallace-best.com/_ax/twitter/begin/', google: 'https://wallace-best.com/_ax/google/begin/', facebook: 'https://wallace-best.com/_ax/facebook/begin/', dashboard: 'http://wallace-best.com/dashboard/' } }); $(document).ready(function () { var usernameInput = $("input[name=username]"); if (usernameInput[0].value) { $("input[name=password]").focus(); } else { usernameInput.focus(); } }); </script> <script type="text/javascript" src="//a.wallace-bestcdn.com/1391808583/js/src/social_login.js"> <script type="text/javascript"> $(function() { var options = { authenticated: (context.auth.username !== undefined), moderated_forums: context.auth.moderated_forums, user_id: context.auth.user_id, track_clicks: !!context.switches.website_click_analytics, forum: context.forum }; wallace-best.funnelcake.init(options); }); </script> <!-- helper jQuery tmpl partials --> <script type="text/x-jquery-tmpl" id="profile-metadata-tmpl"> data-profile-username="${username}" data-profile-hash="${emailHash}" href="/${username}" </script> <script type="text/x-jquery-tmpl" id="profile-link-tmpl"> <a class="profile-launcher" {{tmpl "#profile-metadata-tmpl"}} href="/${username}">${name}</a> </script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/templates.js"></script> <script src="//a.wallace-bestcdn.com/1391808583/js/src/modals.js"></script> <script> wallace-best.ui.config({ disqusUrl: 'https://disqus.com', mediaUrl: '//a.wallace-bestcdn.com/1391808583/' }); </script> </body> </html>
andreaskontogiannis / TresOfficial code implementation of "Tree-based Focused Web Crawling with Reinforcement Learning" and the TRES framework
wx-chevalier / Sentinel CendertronCendertron = Crawler + cendertron, Crawl AJAX-heavy client-side Single Page Applications (SPAs), deploying with docker, focusing on scraping requests(page urls, apis, etc.), followed by pentest tools(Sqlmap, etc.). Cendertron can be used for extracting requests(page urls, apis, etc.) from your Web 2.0 page.
socialsensor / Storm Focused CrawlerCollects multimedia content shared through social networks.
jerrycshen / WebhungerWebHunger is an extensible, full-scale crawler framework that supports distributed crawling, aiming at getting users focused on web page parsing without concerning for the crawling process.
WING-NUS / KairosKairos, combines a focused crawler and an information extraction engine, to convert a list of conference websites into a index filled with fields of metadata that correspond to individual papers. Using event date metadata extracted from the conference website, Kairos proactively harvests metadata about the individual papers soon after they are made public. We use a Maximum Entropy classifier to classify uniform resource locators (URLs) as scientific conference websites and use Conditional Random Fields (CRF) to extract individual paper metadata from such websites. The crawler is built on top of the popular open-source crawler Nutch.
gospider007 / FingerproxyA crawler-focused forward proxy with fingerprint spoofing
refraction-ray / Wos StatisticsThe crawler for data on web of science, especially focus on the analysis of citation data
polosatyi / WebcrawlerA focused web crawler based on Playwright, RMQ, Kafka and Flink.
IlyasHabeeb / Machine Learning Focused CrawlerA focused web crawler that uses Machine Learning to fetch better relevant results.
PreferredAI / Venom TutorialA tutorial based on your preferred open source focused crawler for the deep web.
felipehummel / ContextGraphCrawlerA Focused Crawler based on the paper "Focused Crawling Using Context Graphs" by M. Diligenti, F. M. Coetzee, S. Lawrence, C. L. Giles and M. Gori