197 skills found · Page 1 of 7
mingrammer / Diagrams:art: Diagram as Code for prototyping cloud system architectures
likec4 / Likec4Visualize, collaborate, and evolve the software architecture with always actual and live diagrams from your code
ShisatoYano / AutonomousVehicleControlBeginnersGuidePython sample codes and documents about Autonomous vehicle control algorithm. This project can be used as a technical guide book to study the algorithms and the software architectures for beginners.
awslabs / Diagram As CodeDiagram-as-code for AWS architecture.
danilop / LambdAuthA sample authentication service implemented with a server-less architecture, using AWS Lambda to host and execute the code and Amazon DynamoDB as persistent storage. This provides a cost-efficient solution that is scalable and highly available and can be used with Amazon Cognito for Developer Authenticated Identities.
facebookresearch / Eb JepaAn open source library designed to provide community examples of Joint Embedding Predictive Architectures (JEPAs). It contains code and examples for learning representations from images, video, and action-conditioned video, as well as planning using JEPA-based models.
aws-samples / Well Architected Iac AnalyzerSample Generative AI tool for evaluating Infrastructure as Code and architecture diagrams against AWS Well-Architected best practices.
dmytrostriletskyi / Diagrams As CodeDiagrams as code: declarative configurations using YAML for drawing cloud system architectures.
DocHubTeam / DocHubУправление архитектурой как кодом
xPOURY4 / CodeCraft ArchitectAI-powered software architect and full-stack engineer prompt that elevates web code development by enforcing production-grade architecture, consistent coding standards, and automated quality practices. Designed to boost developer productivity and code quality when used as the primary prompt.
mehtadushy / SelecSLS PytorchReference ImageNet implementation of SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also includes code for pruning the model based on implicit sparsity emerging from adaptive gradient descent methods, as detailed in the CVPR 2019 paper "On implicit filter level sparsity in Convolutional Neural Networks".
perrystreetsoftware / HarmonizeHarmonize is a modern linter for Swift that allows you to write architectural lint rules as unit tests, helping your team to keep your codebase clean, maintainable, and consistent as it grows, without relying on manual code reviews.
finos / Architecture As Code"Architecture as Code" (AasC) aims to devise and manage software architecture via a machine readable and version-controlled codebase, fostering a robust understanding, efficient development, and seamless maintenance of complex software architectures
molyswu / Hand Detectionusing Neural Networks (SSD) on Tensorflow. This repo documents steps and scripts used to train a hand detector using Tensorflow (Object Detection API). As with any DNN based task, the most expensive (and riskiest) part of the process has to do with finding or creating the right (annotated) dataset. I was interested mainly in detecting hands on a table (egocentric view point). I experimented first with the [Oxford Hands Dataset](http://www.robots.ox.ac.uk/~vgg/data/hands/) (the results were not good). I then tried the [Egohands Dataset](http://vision.soic.indiana.edu/projects/egohands/) which was a much better fit to my requirements. The goal of this repo/post is to demonstrate how neural networks can be applied to the (hard) problem of tracking hands (egocentric and other views). Better still, provide code that can be adapted to other uses cases. If you use this tutorial or models in your research or project, please cite [this](#citing-this-tutorial). Here is the detector in action. <img src="images/hand1.gif" width="33.3%"><img src="images/hand2.gif" width="33.3%"><img src="images/hand3.gif" width="33.3%"> Realtime detection on video stream from a webcam . <img src="images/chess1.gif" width="33.3%"><img src="images/chess2.gif" width="33.3%"><img src="images/chess3.gif" width="33.3%"> Detection on a Youtube video. Both examples above were run on a macbook pro **CPU** (i7, 2.5GHz, 16GB). Some fps numbers are: | FPS | Image Size | Device| Comments| | ------------- | ------------- | ------------- | ------------- | | 21 | 320 * 240 | Macbook pro (i7, 2.5GHz, 16GB) | Run without visualizing results| | 16 | 320 * 240 | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) | | 11 | 640 * 480 | Macbook pro (i7, 2.5GHz, 16GB) | Run while visualizing results (image above) | > Note: The code in this repo is written and tested with Tensorflow `1.4.0-rc0`. Using a different version may result in [some errors](https://github.com/tensorflow/models/issues/1581). You may need to [generate your own frozen model](https://pythonprogramming.net/testing-custom-object-detector-tensorflow-object-detection-api-tutorial/?completed=/training-custom-objects-tensorflow-object-detection-api-tutorial/) graph using the [model checkpoints](model-checkpoint) in the repo to fit your TF version. **Content of this document** - Motivation - Why Track/Detect hands with Neural Networks - Data preparation and network training in Tensorflow (Dataset, Import, Training) - Training the hand detection Model - Using the Detector to Detect/Track hands - Thoughts on Optimizations. > P.S if you are using or have used the models provided here, feel free to reach out on twitter ([@vykthur](https://twitter.com/vykthur)) and share your work! ## Motivation - Why Track/Detect hands with Neural Networks? There are several existing approaches to tracking hands in the computer vision domain. Incidentally, many of these approaches are rule based (e.g extracting background based on texture and boundary features, distinguishing between hands and background using color histograms and HOG classifiers,) making them not very robust. For example, these algorithms might get confused if the background is unusual or in situations where sharp changes in lighting conditions cause sharp changes in skin color or the tracked object becomes occluded.(see [here for a review](https://www.cse.unr.edu/~bebis/handposerev.pdf) paper on hand pose estimation from the HCI perspective) With sufficiently large datasets, neural networks provide opportunity to train models that perform well and address challenges of existing object tracking/detection algorithms - varied/poor lighting, noisy environments, diverse viewpoints and even occlusion. The main drawbacks to usage for real-time tracking/detection is that they can be complex, are relatively slow compared to tracking-only algorithms and it can be quite expensive to assemble a good dataset. But things are changing with advances in fast neural networks. Furthermore, this entire area of work has been made more approachable by deep learning frameworks (such as the tensorflow object detection api) that simplify the process of training a model for custom object detection. More importantly, the advent of fast neural network models like ssd, faster r-cnn, rfcn (see [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models-coco-models) ) etc make neural networks an attractive candidate for real-time detection (and tracking) applications. Hopefully, this repo demonstrates this. > If you are not interested in the process of training the detector, you can skip straight to applying the [pretrained model I provide in detecting hands](#detecting-hands). Training a model is a multi-stage process (assembling dataset, cleaning, splitting into training/test partitions and generating an inference graph). While I lightly touch on the details of these parts, there are a few other tutorials cover training a custom object detector using the tensorflow object detection api in more detail[ see [here](https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/) and [here](https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9) ]. I recommend you walk through those if interested in training a custom object detector from scratch. ## Data preparation and network training in Tensorflow (Dataset, Import, Training) **The Egohands Dataset** The hand detector model is built using data from the [Egohands Dataset](http://vision.soic.indiana.edu/projects/egohands/) dataset. This dataset works well for several reasons. It contains high quality, pixel level annotations (>15000 ground truth labels) where hands are located across 4800 images. All images are captured from an egocentric view (Google glass) across 48 different environments (indoor, outdoor) and activities (playing cards, chess, jenga, solving puzzles etc). <img src="images/egohandstrain.jpg" width="100%"> If you will be using the Egohands dataset, you can cite them as follows: > Bambach, Sven, et al. "Lending a hand: Detecting hands and recognizing activities in complex egocentric interactions." Proceedings of the IEEE International Conference on Computer Vision. 2015. The Egohands dataset (zip file with labelled data) contains 48 folders of locations where video data was collected (100 images per folder). ``` -- LOCATION_X -- frame_1.jpg -- frame_2.jpg ... -- frame_100.jpg -- polygons.mat // contains annotations for all 100 images in current folder -- LOCATION_Y -- frame_1.jpg -- frame_2.jpg ... -- frame_100.jpg -- polygons.mat // contains annotations for all 100 images in current folder ``` **Converting data to Tensorflow Format** Some initial work needs to be done to the Egohands dataset to transform it into the format (`tfrecord`) which Tensorflow needs to train a model. This repo contains `egohands_dataset_clean.py` a script that will help you generate these csv files. - Downloads the egohands datasets - Renames all files to include their directory names to ensure each filename is unique - Splits the dataset into train (80%), test (10%) and eval (10%) folders. - Reads in `polygons.mat` for each folder, generates bounding boxes and visualizes them to ensure correctness (see image above). - Once the script is done running, you should have an images folder containing three folders - train, test and eval. Each of these folders should also contain a csv label document each - `train_labels.csv`, `test_labels.csv` that can be used to generate `tfrecords` Note: While the egohands dataset provides four separate labels for hands (own left, own right, other left, and other right), for my purpose, I am only interested in the general `hand` class and label all training data as `hand`. You can modify the data prep script to generate `tfrecords` that support 4 labels. Next: convert your dataset + csv files to tfrecords. A helpful guide on this can be found [here](https://pythonprogramming.net/creating-tfrecord-files-tensorflow-object-detection-api-tutorial/).For each folder, you should be able to generate `train.record`, `test.record` required in the training process. ## Training the hand detection Model Now that the dataset has been assembled (and your tfrecords), the next task is to train a model based on this. With neural networks, it is possible to use a process called [transfer learning](https://www.tensorflow.org/tutorials/image_retraining) to shorten the amount of time needed to train the entire model. This means we can take an existing model (that has been trained well on a related domain (here image classification) and retrain its final layer(s) to detect hands for us. Sweet!. Given that neural networks sometimes have thousands or millions of parameters that can take weeks or months to train, transfer learning helps shorten training time to possibly hours. Tensorflow does offer a few models (in the tensorflow [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models-coco-models)) and I chose to use the `ssd_mobilenet_v1_coco` model as my start point given it is currently (one of) the fastest models (read the SSD research [paper here](https://arxiv.org/pdf/1512.02325.pdf)). The training process can be done locally on your CPU machine which may take a while or better on a (cloud) GPU machine (which is what I did). For reference, training on my macbook pro (tensorflow compiled from source to take advantage of the mac's cpu architecture) the maximum speed I got was 5 seconds per step as opposed to the ~0.5 seconds per step I got with a GPU. For reference it would take about 12 days to run 200k steps on my mac (i7, 2.5GHz, 16GB) compared to ~5hrs on a GPU. > **Training on your own images**: Please use the [guide provided by Harrison from pythonprogramming](https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/) on how to generate tfrecords given your label csv files and your images. The guide also covers how to start the training process if training locally. [see [here] (https://pythonprogramming.net/training-custom-objects-tensorflow-object-detection-api-tutorial/)]. If training in the cloud using a service like GCP, see the [guide here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_cloud.md). As the training process progresses, the expectation is that total loss (errors) gets reduced to its possible minimum (about a value of 1 or thereabout). By observing the tensorboard graphs for total loss(see image below), it should be possible to get an idea of when the training process is complete (total loss does not decrease with further iterations/steps). I ran my training job for 200k steps (took about 5 hours) and stopped at a total Loss (errors) value of 2.575.(In retrospect, I could have stopped the training at about 50k steps and gotten a similar total loss value). With tensorflow, you can also run an evaluation concurrently that assesses your model to see how well it performs on the test data. A commonly used metric for performance is mean average precision (mAP) which is single number used to summarize the area under the precision-recall curve. mAP is a measure of how well the model generates a bounding box that has at least a 50% overlap with the ground truth bounding box in our test dataset. For the hand detector trained here, the mAP value was **0.9686@0.5IOU**. mAP values range from 0-1, the higher the better. <img src="images/accuracy.jpg" width="100%"> Once training is completed, the trained inference graph (`frozen_inference_graph.pb`) is then exported (see the earlier referenced guides for how to do this) and saved in the `hand_inference_graph` folder. Now its time to do some interesting detection. ## Using the Detector to Detect/Track hands If you have not done this yet, please following the guide on installing [Tensorflow and the Tensorflow object detection api](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md). This will walk you through setting up the tensorflow framework, cloning the tensorflow github repo and a guide on - Load the `frozen_inference_graph.pb` trained on the hands dataset as well as the corresponding label map. In this repo, this is done in the `utils/detector_utils.py` script by the `load_inference_graph` method. ```python detection_graph = tf.Graph() with detection_graph.as_default(): od_graph_def = tf.GraphDef() with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid: serialized_graph = fid.read() od_graph_def.ParseFromString(serialized_graph) tf.import_graph_def(od_graph_def, name='') sess = tf.Session(graph=detection_graph) print("> ====== Hand Inference graph loaded.") ``` - Detect hands. In this repo, this is done in the `utils/detector_utils.py` script by the `detect_objects` method. ```python (boxes, scores, classes, num) = sess.run( [detection_boxes, detection_scores, detection_classes, num_detections], feed_dict={image_tensor: image_np_expanded}) ``` - Visualize detected bounding detection_boxes. In this repo, this is done in the `utils/detector_utils.py` script by the `draw_box_on_image` method. This repo contains two scripts that tie all these steps together. - detect_multi_threaded.py : A threaded implementation for reading camera video input detection and detecting. Takes a set of command line flags to set parameters such as `--display` (visualize detections), image parameters `--width` and `--height`, videe `--source` (0 for camera) etc. - detect_single_threaded.py : Same as above, but single threaded. This script works for video files by setting the video source parameter videe `--source` (path to a video file). ```cmd # load and run detection on video at path "videos/chess.mov" python detect_single_threaded.py --source videos/chess.mov ``` > Update: If you do have errors loading the frozen inference graph in this repo, feel free to generate a new graph that fits your TF version from the model-checkpoint in this repo. Use the [export_inference_graph.py](https://github.com/tensorflow/models/blob/master/research/object_detection/export_inference_graph.py) script provided in the tensorflow object detection api repo. More guidance on this [here](https://pythonprogramming.net/testing-custom-object-detector-tensorflow-object-detection-api-tutorial/?completed=/training-custom-objects-tensorflow-object-detection-api-tutorial/). ## Thoughts on Optimization. A few things that led to noticeable performance increases. - Threading: Turns out that reading images from a webcam is a heavy I/O event and if run on the main application thread can slow down the program. I implemented some good ideas from [Adrian Rosebuck](https://www.pyimagesearch.com/2017/02/06/faster-video-file-fps-with-cv2-videocapture-and-opencv/) on parrallelizing image capture across multiple worker threads. This mostly led to an FPS increase of about 5 points. - For those new to Opencv, images from the `cv2.read()` method return images in [BGR format](https://www.learnopencv.com/why-does-opencv-use-bgr-color-format/). Ensure you convert to RGB before detection (accuracy will be much reduced if you dont). ```python cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB) ``` - Keeping your input image small will increase fps without any significant accuracy drop.(I used about 320 x 240 compared to the 1280 x 720 which my webcam provides). - Model Quantization. Moving from the current 32 bit to 8 bit can achieve up to 4x reduction in memory required to load and store models. One way to further speed up this model is to explore the use of [8-bit fixed point quantization](https://heartbeat.fritz.ai/8-bit-quantization-and-tensorflow-lite-speeding-up-mobile-inference-with-low-precision-a882dfcafbbd). Performance can also be increased by a clever combination of tracking algorithms with the already decent detection and this is something I am still experimenting with. Have ideas for optimizing better, please share! <img src="images/general.jpg" width="100%"> Note: The detector does reflect some limitations associated with the training set. This includes non-egocentric viewpoints, very noisy backgrounds (e.g in a sea of hands) and sometimes skin tone. There is opportunity to improve these with additional data. ## Integrating Multiple DNNs. One way to make things more interesting is to integrate our new knowledge of where "hands" are with other detectors trained to recognize other objects. Unfortunately, while our hand detector can in fact detect hands, it cannot detect other objects (a factor or how it is trained). To create a detector that classifies multiple different objects would mean a long involved process of assembling datasets for each class and a lengthy training process. > Given the above, a potential strategy is to explore structures that allow us **efficiently** interleave output form multiple pretrained models for various object classes and have them detect multiple objects on a single image. An example of this is with my primary use case where I am interested in understanding the position of objects on a table with respect to hands on same table. I am currently doing some work on a threaded application that loads multiple detectors and outputs bounding boxes on a single image. More on this soon.
HariSekhon / Diagrams As CodeCloud & DevOps Architecture Diagrams-as-Code in Python, D2 and MermaidJS languages
plantuml-stdlib / Archimate PlantUMLPlantUML macros and other includes for Archimate Diagrams
abusufyanvu / 6S191 MIT DeepLearningMIT Introduction to Deep Learning (6.S191) Instructors: Alexander Amini and Ava Soleimany Course Information Summary Prerequisites Schedule Lectures Labs, Final Projects, Grading, and Prizes Software labs Gather.Town lab + Office Hour sessions Final project Paper Review Project Proposal Presentation Project Proposal Grading Rubric Past Project Proposal Ideas Awards + Categories Important Links and Emails Course Information Summary MIT's introductory course on deep learning methods with applications to computer vision, natural language processing, biology, and more! Students will gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow. Course concludes with a project proposal competition with feedback from staff and a panel of industry sponsors. Prerequisites We expect basic knowledge of calculus (e.g., taking derivatives), linear algebra (e.g., matrix multiplication), and probability (e.g., Bayes theorem) -- we'll try to explain everything else along the way! Experience in Python is helpful but not necessary. This class is taught during MIT's IAP term by current MIT PhD researchers. Listeners are welcome! Schedule Monday Jan 18, 2021 Lecture: Introduction to Deep Learning and NNs Lab: Lab 1A Tensorflow and building NNs from scratch Tuesday Jan 19, 2021 Lecture: Deep Sequence Modelling Lab: Lab 1B Music Generation using RNNs Wednesday Jan 20, 2021 Lecture: Deep Computer Vision Lab: Lab 2A Image classification and detection Thursday Jan 21, 2021 Lecture: Deep Generative Modelling Lab: Lab 2B Debiasing facial recognition systems Friday Jan 22, 2021 Lecture: Deep Reinforcement Learning Lab: Lab 3 pixel-to-control planning Monday Jan 25, 2021 Lecture: Limitations and New Frontiers Lab: Lab 3 continued Tuesday Jan 26, 2021 Lecture (part 1): Evidential Deep Learning Lecture (part 2): Bias and Fairness Lab: Work on final assignments Lab competition entries due at 11:59pm ET on Canvas! Lab 1, Lab 2, and Lab 3 Wednesday Jan 27, 2021 Lecture (part 1): Nigel Duffy, Ernst & Young Lecture (part 2): Kate Saenko, Boston University and MIT-IBM Watson AI Lab Lab: Work on final assignments Assignments due: Sign up for Final Project Competition Thursday Jan 28, 2021 Lecture (part 1): Sanja Fidler, U. Toronto, Vector Institute, and NVIDIA Lecture (part 2): Katherine Chou, Google Lab: Work on final assignments Assignments due: 1 page paper review (if applicable) Friday Jan 29, 2021 Lecture: Student project pitch competition Lab: Awards ceremony and prize giveaway Assignments due: Project proposals (if applicable) Lectures Lectures will be held starting at 1:00pm ET from Jan 18 - Jan 29 2021, Monday through Friday, virtually through Zoom. Current MIT students, faculty, postdocs, researchers, staff, etc. will be able to access the lectures during this two week period, synchronously or asynchronously, via the MIT Canvas course webpage (MIT internal only). Lecture recordings will be uploaded to the Canvas as soon as possible; students are not required to attend any lectures synchronously. Please see the Canvas for details on Zoom links. The public edition of the course will only be made available after completion of the MIT course. Labs, Final Projects, Grading, and Prizes Course will be graded during MIT IAP for 6 units under P/D/F grading. Receiving a passing grade requires completion of each software lab project (through honor code, with submission required to enter lab competitions), a final project proposal/presentation or written review of a deep learning paper (submission required), and attendance/lecture viewing (through honor code). Submission of a written report or presentation of a project proposal will ensure a passing grade. MIT students will be eligible for prizes and awards as part of the class competitions. There will be two parts to the competitions: (1) software labs and (2) final projects. More information is provided below. Winners will be announced on the last day of class, with thousands of dollars of prizes being given away! Software labs There are three TensorFlow software lab exercises for the course, designed as iPython notebooks hosted in Google Colab. Software labs can be found on GitHub: https://github.com/aamini/introtodeeplearning. These are self-paced exercises and are designed to help you gain practical experience implementing neural networks in TensorFlow. For registered MIT students, submission of lab materials is not necessary to get credit for the course or to pass the course. At the end of each software lab there will be task-associated materials to submit (along with instructions) for entry into the competitions, open to MIT students and affiliates during the IAP offering. This includes MIT students/affiliates who are taking the class as listeners -- you are eligible! These instructions are provided at the end of each of the labs. Completing these tasks and submitting your materials to Canvas will enter you into a per-lab competition. MIT students and affiliates will be eligible for prizes during the IAP offering; at the end of the course, prize-winners will be awarded with their prizes. All competition submissions are due on January 26 at 11:59pm ET to Canvas. For the software lab competitions, submissions will be judged on the basis of the following criteria: Strength and quality of final results (lab dependent) Soundness of implementation and approach Thoroughness and quality of provided descriptions and figures Gather.Town lab + Office Hour sessions After each day’s lecture, there will be open Office Hours in the class GatherTown, up until 3pm ET. An MIT email is required to log in and join the GatherTown. During these sessions, there will not be a walk through or dictation of the labs; the labs are designed to be self-paced and to be worked on on your own time. The GatherTown sessions will be hosted by course staff and are held so you can: Ask questions on course lectures, labs, logistics, project, or anything else; Work on the labs in the presence of classmates/TAs/instructors; Meet classmates to find groups for the final project; Group work time for the final project; Bring the class community together. Final project To satisfy the final project requirement for this course, students will have two options: (1) write a 1 page paper review (single-spaced) on a recent deep learning paper of your choice or (2) participate and present in the project proposal pitch competition. The 1 page paper review option is straightforward, we propose some papers within this document to help you get started, and you can satisfy a passing grade with this option -- you will not be eligible for the grand prizes. On the other hand, participation in the project proposal pitch competition will equivalently satisfy your course requirements but additionally make you eligible for the grand prizes. See the section below for more details and requirements for each of these options. Paper Review Students may satisfy the final project requirement by reading and reviewing a recent deep learning paper of their choosing. In the written review, students should provide both: 1) a description of the problem, technical approach, and results of the paper; 2) critical analysis and exposition of the limitations of the work and opportunities for future work. Reviews should be submitted on Canvas by Thursday Jan 28, 2021, 11:59:59pm Eastern Time (ET). Just a few paper options to consider... https://papers.nips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf https://papers.nips.cc/paper/2018/file/69386f6bb1dfed68692a24c8686939b9-Paper.pdf https://papers.nips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf https://science.sciencemag.org/content/362/6419/1140 https://papers.nips.cc/paper/2018/file/0e64a7b00c83e3d22ce6b3acf2c582b6-Paper.pdf https://arxiv.org/pdf/1906.11829.pdf https://www.nature.com/articles/s42256-020-00237-3 https://pubmed.ncbi.nlm.nih.gov/32084340/ Project Proposal Presentation Keyword: proposal This is a 2 week course so we do not require results or working implementations! However, to win the top prizes, nice, clear results and implementations will demonstrate feasibility of your proposal which is something we look for! Logistics -- please read! You must sign up to present before 11:59:59pm Eastern Time (ET) on Wednesday Jan 27, 2021 Slides must be in a Google Slide before 11:59:59pm Eastern Time (ET) on Thursday Jan 28, 2021 Project groups can be between 1 and 5 people Listeners welcome To be eligible for a prize you must have at least 1 registered MIT student in your group Each participant will only be allowed to be in one group and present one project pitch Synchronous attendance on 1/29/21 is required to make the project pitch! 3 min presentation on your idea (we will be very strict with the time limits) Prizes! (see below) Sign up to Present here: by 11:59pm ET on Wednesday Jan 27 Once you sign up, make your slide in the following Google Slides; submit by midnight on Thursday Jan 28. Please specify the project group # on your slides!!! Things to Consider This doesn’t have to be a new deep learning method. It can just be an interesting application that you apply some existing deep learning method to. What problem are you solving? Are there use cases/applications? Why do you think deep learning methods might be suited to this task? How have people done it before? Is it a new task? If so, what are similar tasks that people have worked on? In what aspects have they succeeded or failed? What is your method of solving this problem? What type of model + architecture would you use? Why? What is the data for this task? Do you need to make a dataset or is there one publicly available? What are the characteristics of the data? Is it sparse, messy, imbalanced? How would you deal with that? Project Proposal Grading Rubric Project proposals will be evaluated by a panel of judges on the basis of the following three criteria: 1) novelty and impact; 2) technical soundness, feasibility, and organization, including quality of any presented results; 3) clarity and presentation. Each judge will award a score from 1 (lowest) to 5 (highest) for each of the criteria; the average score from each judge across these criteria will then be averaged with that of the other judges to provide the final score. The proposals with the highest final scores will be selected for prizes. Here are the guidelines for the criteria: Novelty and impact: encompasses the potential impact of the project idea, its novelty with respect to existing approaches. Why does the proposed work matter? What problem(s) does it solve? Why are these problems important? Technical soundness, feasibility, and organization: encompasses all technical aspects of the proposal. Do the proposed methodology and architecture make sense? Is the architecture the best suited for the proposed problem? Is deep learning the best approach for the problem? How realistic is it to implement the idea? Was there any implementation of the method? If results and data are presented, we will evaluate the strength of the results/data. Clarity and presentation: encompasses the delivery and quality of the presentation itself. Is the talk well organized? Are the slides aesthetically compelling? Is there a clear, well-delivered narrative? Are the problem and proposed method clearly presented? Past Project Proposal Ideas Recipe Generation with RNNs Can we compress videos with CNN + RNN? Music Generation with RNNs Style Transfer Applied to X GAN’s on a new modality Summarizing text/news articles Combining news articles about similar events Code or spec generation Multimodal speech → handwriting Generate handwriting based on keywords (i.e. cursive, slanted, neat) Predicting stock market trends Show language learners articles or videos at their level Transfer of writing style Chemical Synthesis with Recurrent Neural networks Transfer learning to learn something in a domain for which it’s hard or risky to gather data or do training RNNs to model some type of time series data Computer vision to coach sports players Computer vision system for safety brakes or warnings Use IBM Watson API to get the sentiment of your Facebook newsfeed Deep learning webcam to give wifi-access to friends or improve video chat in some way Domain-specific chatbot to help you perform a specific task Detect whether a signature is fraudulent Awards + Categories Final Project Awards: 1x NVIDIA RTX 3080 4x Google Home Max 3x Display Monitors Software Lab Awards: Bose headphones (Lab 1) Display monitor (Lab 2) Bebop drone (Lab 3) Important Links and Emails Course website: http://introtodeeplearning.com Course staff: introtodeeplearning-staff@mit.edu Piazza forum (MIT only): https://piazza.com/mit/spring2021/6s191 Canvas (MIT only): https://canvas.mit.edu/courses/8291 Software lab repository: https://github.com/aamini/introtodeeplearning Lab/office hour sessions (MIT only): https://gather.town/app/56toTnlBrsKCyFgj/MITDeepLearning
SlavaVedernikov / C4InterFlowArchitecture as Code (AaC) framework that lets you describe Architecture Model once and then generates many diagrams. Inspired by C4 Model
himanshub1007 / Alzhimers Disease Prediction Using Deep Learning# AD-Prediction Convolutional Neural Networks for Alzheimer's Disease Prediction Using Brain MRI Image ## Abstract Alzheimers disease (AD) is characterized by severe memory loss and cognitive impairment. It associates with significant brain structure changes, which can be measured by magnetic resonance imaging (MRI) scan. The observable preclinical structure changes provides an opportunity for AD early detection using image classification tools, like convolutional neural network (CNN). However, currently most AD related studies were limited by sample size. Finding an efficient way to train image classifier on limited data is critical. In our project, we explored different transfer-learning methods based on CNN for AD prediction brain structure MRI image. We find that both pretrained 2D AlexNet with 2D-representation method and simple neural network with pretrained 3D autoencoder improved the prediction performance comparing to a deep CNN trained from scratch. The pretrained 2D AlexNet performed even better (**86%**) than the 3D CNN with autoencoder (**77%**). ## Method #### 1. Data In this project, we used public brain MRI data from **Alzheimers Disease Neuroimaging Initiative (ADNI)** Study. ADNI is an ongoing, multicenter cohort study, started from 2004. It focuses on understanding the diagnostic and predictive value of Alzheimers disease specific biomarkers. The ADNI study has three phases: ADNI1, ADNI-GO, and ADNI2. Both ADNI1 and ADNI2 recruited new AD patients and normal control as research participants. Our data included a total of 686 structure MRI scans from both ADNI1 and ADNI2 phases, with 310 AD cases and 376 normal controls. We randomly derived the total sample into training dataset (n = 519), validation dataset (n = 100), and testing dataset (n = 67). #### 2. Image preprocessing Image preprocessing were conducted using Statistical Parametric Mapping (SPM) software, version 12. The original MRI scans were first skull-stripped and segmented using segmentation algorithm based on 6-tissue probability mapping and then normalized to the International Consortium for Brain Mapping template of European brains using affine registration. Other configuration includes: bias, noise, and global intensity normalization. The standard preprocessing process output 3D image files with an uniform size of 121x145x121. Skull-stripping and normalization ensured the comparability between images by transforming the original brain image into a standard image space, so that same brain substructures can be aligned at same image coordinates for different participants. Diluted or enhanced intensity was used to compensate the structure changes. the In our project, we used both whole brain (including both grey matter and white matter) and grey matter only. #### 3. AlexNet and Transfer Learning Convolutional Neural Networks (CNN) are very similar to ordinary Neural Networks. A CNN consists of an input and an output layer, as well as multiple hidden layers. The hidden layers are either convolutional, pooling or fully connected. ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These then make the forward function more efficient to implement and vastly reduce the amount of parameters in the network. #### 3.1. AlexNet The net contains eight layers with weights; the first five are convolutional and the remaining three are fully connected. The overall architecture is shown in Figure 1. The output of the last fully-connected layer is fed to a 1000-way softmax which produces a distribution over the 1000 class labels. AlexNet maximizes the multinomial logistic regression objective, which is equivalent to maximizing the average across training cases of the log-probability of the correct label under the prediction distribution. The kernels of the second, fourth, and fifth convolutional layers are connected only to those kernel maps in the previous layer which reside on the same GPU (as shown in Figure1). The kernels of the third convolutional layer are connected to all kernel maps in the second layer. The neurons in the fully connected layers are connected to all neurons in the previous layer. Response-normalization layers follow the first and second convolutional layers. Max-pooling layers follow both response-normalization layers as well as the fifth convolutional layer. The ReLU non-linearity is applied to the output of every convolutional and fully-connected layer.  The first convolutional layer filters the 224x224x3 input image with 96 kernels of size 11x11x3 with a stride of 4 pixels (this is the distance between the receptive field centers of neighboring neurons in a kernel map). The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5x5x48. The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers. The third convolutional layer has 384 kernels of size 3x3x256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth convolutional layer has 384 kernels of size 3x3x192 , and the fifth convolutional layer has 256 kernels of size 3x3x192. The fully-connected layers have 4096 neurons each. #### 3.2. Transfer Learning Training an entire Convolutional Network from scratch (with random initialization) is impractical[14] because it is relatively rare to have a dataset of sufficient size. An alternative is to pretrain a Conv-Net on a very large dataset (e.g. ImageNet), and then use the ConvNet either as an initialization or a fixed feature extractor for the task of interest. Typically, there are three major transfer learning scenarios: **ConvNet as fixed feature extractor:** We can take a ConvNet pretrained on ImageNet, and remove the last fully-connected layer, then treat the rest structure as a fixed feature extractor for the target dataset. In AlexNet, this would be a 4096-D vector. Usually, we call these features as CNN codes. Once we get these features, we can train a linear classifier (e.g. linear SVM or Softmax classifier) for our target dataset. **Fine-tuning the ConvNet:** Another idea is not only replace the last fully-connected layer in the classifier, but to also fine-tune the parameters of the pretrained network. Due to overfitting concerns, we can only fine-tune some higher-level part of the network. This suggestion is motivated by the observation that earlier features in a ConvNet contains more generic features (e.g. edge detectors or color blob detectors) that can be useful for many kind of tasks. But the later layer of the network becomes progressively more specific to the details of the classes contained in the original dataset. **Pretrained models:** The released pretrained model is usually the final ConvNet checkpoint. So it is common to see people use the network for fine-tuning. #### 4. 3D Autoencoder and Convolutional Neural Network We take a two-stage approach where we first train a 3D sparse autoencoder to learn filters for convolution operations, and then build a convolutional neural network whose first layer uses the filters learned with the autoencoder.  #### 4.1. Sparse Autoencoder An autoencoder is a 3-layer neural network that is used to extract features from an input such as an image. Sparse representations can provide a simple interpretation of the input data in terms of a small number of \parts by extracting the structure hidden in the data. The autoencoder has an input layer, a hidden layer and an output layer, and the input and output layers have same number of units, while the hidden layer contains more units for a sparse and overcomplete representation. The encoder function maps input x to representation h, and the decoder function maps the representation h to the output x. In our problem, we extract 3D patches from scans as the input to the network. The decoder function aims to reconstruct the input form the hidden representation h. #### 4.2. 3D Convolutional Neural Network Training the 3D convolutional neural network(CNN) is the second stage. The CNN we use in this project has one convolutional layer, one pooling layer, two linear layers, and finally a log softmax layer. After training the sparse autoencoder, we take the weights and biases of the encoder from trained model, and use them a 3D filter of a 3D convolutional layer of the 1-layer convolutional neural network. Figure 2 shows the architecture of the network. #### 5. Tools In this project, we used Nibabel for MRI image processing and PyTorch Neural Networks implementation.
jettbrains / L W3C Strategic Highlights September 2019 This report was prepared for the September 2019 W3C Advisory Committee Meeting (W3C Member link). See the accompanying W3C Fact Sheet — September 2019. For the previous edition, see the April 2019 W3C Strategic Highlights. For future editions of this report, please consult the latest version. A Chinese translation is available. ☰ Contents Introduction Future Web Standards Meeting Industry Needs Web Payments Digital Publishing Media and Entertainment Web & Telecommunications Real-Time Communications (WebRTC) Web & Networks Automotive Web of Things Strengthening the Core of the Web HTML CSS Fonts SVG Audio Performance Web Performance WebAssembly Testing Browser Testing and Tools WebPlatform Tests Web of Data Web for All Security, Privacy, Identity Internationalization (i18n) Web Accessibility Outreach to the world W3C Developer Relations W3C Training Translations W3C Liaisons Introduction This report highlights recent work of enhancement of the existing landscape of the Web platform and innovation for the growth and strength of the Web. 33 working groups and a dozen interest groups enable W3C to pursue its mission through the creation of Web standards, guidelines, and supporting materials. We track the tremendous work done across the Consortium through homogeneous work-spaces in Github which enables better monitoring and management. We are in the middle of a period where we are chartering numerous working groups which demonstrate the rapid degree of change for the Web platform: After 4 years, we are nearly ready to publish a Payment Request API Proposed Recommendation and we need to soon charter follow-on work. In the last year we chartered the Web Payment Security Interest Group. In the last year we chartered the Web Media Working Group with 7 specifications for next generation Media support on the Web. We have Accessibility Guidelines under W3C Member review which includes Silver, a new approach. We have just launched the Decentralized Identifier Working Group which has tremendous potential because Decentralized Identifier (DID) is an identifier that is globally unique, resolveable with high availability, and cryptographically verifiable. We have Privacy IG (PING) under W3C Member review which strengthens our focus on the tradeoff between privacy and function. We have a new CSS charter under W3C Member review which maps the group's work for the next three years. In this period, W3C and the WHATWG have succesfully completed the negotiation of a Memorandum of Understanding rooted in the mutual belief that that having two distinct specifications claiming to be normative is generally harmful for the Web community. The MOU, signed last May, describes how the two organizations are to collaborate on the development of a single authoritative version of the HTML and DOM specifications. W3C subsequently rechartered the HTML Working Group to assist the W3C community in raising issues and proposing solutions for the HTML and DOM specifications, and for the production of W3C Recommendations from WHATWG Review Drafts. As the Web evolves continuously, some groups are looking for ways for specifications to do so as well. So-called "evergreen recommendations" or "living standards" aim to track continuous development (and maintenance) of features, on a feature-by-feature basis, while getting review and patent commitments. We see the maturation and further development of an incredible number of new technologies coming to the Web. Continued progress in many areas demonstrates the vitality of the W3C and the Web community, as the rest of the report illustrates. Future Web Standards W3C has a variety of mechanisms for listening to what the community thinks could become good future Web standards. These include discussions with the Membership, discussions with other standards bodies, the activities of thousands of participants in over 300 community groups, and W3C Workshops. There are lots of good ideas. The W3C strategy team has been identifying promising topics and invites public participation. Future, recent and under consideration Workshops include: Inclusive XR (5-6 November 2019, Seattle, WA, USA) to explore existing and future approaches on making Virtual and Augmented Reality experiences more inclusive, including to people with disabilities; W3C Workshop on Data Models for Transportation (12-13 September 2019, Palo Alto, CA, USA) W3C Workshop on Web Games (27-28 June 2019, Redmond, WA, USA), view report Second W3C Workshop on the Web of Things (3-5 June 2019, Munich, Germany) W3C Workshop on Web Standardization for Graph Data; Creating Bridges: RDF, Property Graph and SQL (4-6 March 2019, Berlin, Germany), view report Web & Machine Learning. The Strategy Funnel documents the staff's exploration of potential new work at various phases: Exploration and Investigation, Incubation and Evaluation, and eventually to the chartering of a new standards group. The Funnel view is a GitHub Project where new area are issues represented by “cards” which move through the columns, usually from left to right. Most cards start in Exploration and move towards Chartering, or move out of the funnel. Public input is welcome at any stage but particularly once Incubation has begun. This helps W3C identify work that is sufficiently incubated to warrant standardization, to review the ecosystem around the work and indicate interest in participating in its standardization, and then to draft a charter that reflects an appropriate scope. Ongoing feedback can speed up the overall standardization process. Since the previous highlights document, W3C has chartered a number of groups, and started discussion on many more: Newly Chartered or Rechartered Web Application Security WG (03-Apr) Web Payment Security IG (17-Apr) Patent and Standards IG (24-Apr) Web Applications WG (14-May) Web & Networks IG (16-May) Media WG (23-May) Media and Entertainment IG (06-Jun) HTML WG (06-Jun) Decentralized Identifier WG (05-Sep) Extended Privacy IG (PING) (30-Sep) Verifiable Claims WG (30-Sep) Service Workers WG (31-Dec) Dataset Exchange WG (31-Dec) Web of Things Working Group (31-Dec) Web Audio Working Group (31-Dec) Proposed charters / Advance Notice Accessibility Guidelines WG Privacy IG (PING) RDF Literal Direction WG Timed Text WG CSS WG Web Authentication WG Closed Internationalization Tag Set IG Meeting Industry Needs Web Payments All Web Payments specifications W3C's payments standards enable a streamlined checkout experience, enabling a consistent user experience across the Web with lower front end development costs for merchants. Users can store and reuse information and more quickly and accurately complete online transactions. The Web Payments Working Group has republished Payment Request API as a Candidate Recommendation, aiming to publish a Proposed Recommendation in the Fall 2019, and is discussing use cases and features for Payment Request after publication of the 1.0 Recommendation. Browser vendors have been finalizing implementation of features added in the past year (view the implementation report). As work continues on the Payment Handler API and its implementation (currently in Chrome and Edge Canary), one focus in 2019 is to increase adoption in other browsers. Recently, Mastercard demonstrated the use of Payment Request API to carry out EMVCo's Secure Remote Commerce (SRC) protocol whose payment method definition is being developed with active participation by Visa, Mastercard, American Express, and Discover. Payment method availability is a key factor in merchant considerations about adopting Payment Request API. The ability to get uniform adoption of a new payment method such as Secure Remote Commerce (SRC) also depends on the availability of the Payment Handler API in browsers, or of proprietary alternatives. Web Monetization, which the Web Payments Working Group will discuss again at its face-to-face meeting in September, can be used to enable micropayments as an alternative revenue stream to advertising. Since the beginning of 2019, Amazon, Brave Software, JCB, Certus Cybersecurity Solutions and Netflix have joined the Web Payments Working Group. In April, W3C launched the Web Payment Security Group to enable W3C, EMVCo, and the FIDO Alliance to collaborate on a vision for Web payment security and interoperability. Participants will define areas of collaboration and identify gaps between existing technical specifications in order to increase compatibility among different technologies, such as: How do SRC, FIDO, and Payment Request relate? The Payment Services Directive 2 (PSD2) regulations in Europe are scheduled to take effect in September 2019. What is the role of EMVCo, W3C, and FIDO technologies, and what is the current state of readiness for the deadline? How can we improve privacy on the Web at the same time as we meet industry requirements regarding user identity? Digital Publishing All Digital Publishing specifications, Publication milestones The Web is the universal publishing platform. Publishing is increasingly impacted by the Web, and the Web increasingly impacts Publishing. Topic of particular interest to Publishing@W3C include typography and layout, accessibility, usability, portability, distribution, archiving, offline access, print on demand, and reliable cross referencing. And the diverse publishing community represented in the groups consist of the traditional "trade" publishers, ebook reading system manufacturers, but also publishers of audio book, scholarly journals or educational materials, library scientists or browser developers. The Publishing Working Group currently concentrates on Audiobooks which lack a comprehensive standard, thus incurring extra costs and time to publish in this booming market. Active development is ongoing on the future standard: Publication Manifest Audiobook profile for Web Publications Lightweight Packaging Format The BD Comics Manga Community Group, the Synchronized Multimedia for Publications Community Group, the Publishing Community Group and a future group on archival, are companions to the working group where specific work is developed and incubated. The Publishing Community Group is a recently launched incubation channel for Publishing@W3C. The goal of the group is to propose, document, and prototype features broadly related to: publications on the Web reading modes and systems and the user experience of publications The EPUB 3 Community Group has successfully completed the revision of EPUB 3.2. The Publishing Business Group fosters ongoing participation by members of the publishing industry and the overall ecosystem in the development of Web infrastructure to better support the needs of the industry. The Business Group serves as an additional conduit to the Publishing Working Group and several Community Groups for feedback between the publishing ecosystem and W3C. The Publishing BG has played a vital role in fostering and advancing the adoption and continued development of EPUB 3. In particular the BG provided critical support to the update of EPUBCheck to validate EPUB content to the new EPUB 3.2 specification. This resulted in the development, in conjunction with the EPUB3 Community Group, of a new generation of EPUBCheck, i.e., EPUBCheck 4.2 production-ready release. Media and Entertainment All Media specifications The Media and Entertainment vertical tracks media-related topics and features that create immersive experiences for end users. HTML5 brought standard audio and video elements to the Web. Standardization activities since then have aimed at turning the Web into a professional platform fully suitable for the delivery of media content and associated materials, enabling missing features to stream video content on the Web such as adaptive streaming and content protection. Together with Microsoft, Comcast, Netflix and Google, W3C received an Technology & Engineering Emmy Award in April 2019 for standardization of a full TV experience on the Web. Current goals are to: Reinforce core media technologies: Creation of the Media Working Group, to develop media-related specifications incubated in the WICG (e.g. Media Capabilities, Picture-in-picture, Media Session) and maintain maintain/evolve Media Source Extensions (MSE) and Encrypted Media Extensions (EME). Improve support for Media Timed Events: data cues incubation. Enhance color support (HDR, wide gamut), in scope of the CSS WG and in the Color on the Web CG. Reduce fragmentation: Continue annual releases of a common and testable baseline media devices, in scope of the Web Media APIs CG and in collaboration with the CTA WAVE Project. Maintain the Road-map of Media Technologies for the Web which highlights Web technologies that can be used to build media applications and services, as well as known gaps to enable additional use cases. Create the future: Discuss perspectives for Media and Entertainment for the Web. Bring the power of GPUs to the Web (graphics, machine learning, heavy processing), under incubation in the GPU for the Web CG. Transition to a Working Group is under discussion. Determine next steps after the successful W3C Workshop on Web Games of June 2019. View the report. Timed Text The Timed Text Working Group develops and maintains formats used for the representation of text synchronized with other timed media, like audio and video, and notably works on TTML, profiles of TTML, and WebVTT. Recent progress includes: A robust WebVTT implementation report poises the specification for publication as a proposed recommendation. Discussions around re-chartering, notably to add a TTML Profile for Audio Description deliverable to the scope of the group, and clarify that rendering of captions within XR content is also in scope. Immersive Web Hardware that enables Virtual Reality (VR) and Augmented Reality (AR) applications are now broadly available to consumers, offering an immersive computing platform with both new opportunities and challenges. The ability to interact directly with immersive hardware is critical to ensuring that the web is well equipped to operate as a first-class citizen in this environment. The Immersive Web Working Group has been stabilizing the WebXR Device API while the companion Immersive Web Community Group incubates the next series of features identified as key for the future of the Immersive Web. W3C plans a workshop focused on the needs and benefits at the intersection of VR & Accessibility (Inclusive XR), on 5-6 November 2019 in Seattle, WA, USA, to explore existing and future approaches on making Virtual and Augmented Reality experiences more inclusive. Web & Telecommunications The Web is the Open Platform for Mobile. Telecommunication service providers and network equipment providers have long been critical actors in the deployment of Web technologies. As the Web platform matures, it brings richer and richer capabilities to extend existing services to new users and devices, and propose new and innovative services. Real-Time Communications (WebRTC) All Real-Time Communications specifications WebRTC has reshaped the whole communication landscape by making any connected device a potential communication end-point, bringing audio and video communications anywhere, on any network, vastly expanding the ability of operators to reach their customers. WebRTC serves as the corner-stone of many online communication and collaboration services. The WebRTC Working Group aims to bringing WebRTC 1.0 (and companion specification Media Capture and Streams) to Recommendation by the end of 2019. Intense efforts are focused on testing (supported by a dedicated hackathon at IETF 104) and interoperability. The group is considering pushing features that have not gotten enough traction to separate modules or to a later minor revision of the spec. Beyond WebRTC 1.0, the WebRTC Working Group will focus its efforts on WebRTC NV which the group has started documenting by identifying use cases. Web & Networks Recently launched, in the wake of the May 2018 Web5G workshop, the Web & Networks Interest Group is chaired by representatives from AT&T, China Mobile and Intel, with a goal to explore solutions for web applications to achieve better performance and resource allocation, both on the device and network. The group's first efforts are around use cases, privacy & security requirements and liaisons. Automotive All Automotive specifications To create a rich application ecosystem for vehicles and other devices allowed to connect to the vehicle, the W3C Automotive Working Group is delivering a service specification to expose all common vehicle signals (engine temperature, fuel/charge level, range, tire pressure, speed, etc.) The Vehicle Information Service Specification (VISS), which is a Candidate Recommendation, is seeing more implementations across the industry. It provides the access method to a common data model for all the vehicle signals –presently encapsulating a thousand or so different data elements– and will be growing to accommodate the advances in automotive such as autonomous and driver assist technologies and electrification. The group is already working on a successor to VISS, leveraging the underlying data model and the VIWI submission from Volkswagen, for a more robust means of accessing vehicle signals information and the same paradigm for other automotive needs including location-based services, media, notifications and caching content. The Automotive and Web Platform Business Group acts as an incubator for prospective standards work. One of its task forces is using W3C VISS in performing data sampling and off-boarding the information to the cloud. Access to the wealth of information that W3C's auto signals standard exposes is of interest to regulators, urban planners, insurance companies, auto manufacturers, fleet managers and owners, service providers and others. In addition to components needed for data sampling and edge computing, capturing user and owner consent, information collection methods and handling of data are in scope. The upcoming W3C Workshop on Data Models for Transportation (September 2019) is expected to focus on the need of additional ontologies around transportation space. Web of Things All Web of Things specifications W3C's Web of Things work is designed to bridge disparate technology stacks to allow devices to work together and achieve scale, thus enabling the potential of the Internet of Things by eliminating fragmentation and fostering interoperability. Thing descriptions expressed in JSON-LD cover the behavior, interaction affordances, data schema, security configuration, and protocol bindings. The Web of Things complements existing IoT ecosystems to reduce the cost and risk for suppliers and consumers of applications that create value by combining multiple devices and information services. There are many sectors that will benefit, e.g. smart homes, smart cities, smart industry, smart agriculture, smart healthcare and many more. The Web of Things Working Group is finishing the initial Web of Things standards, with support from the Web of Things Interest Group: Web of Things Architecture Thing Descriptions Strengthening the Core of the Web HTML The HTML Working Group was chartered early June to assist the W3C community in raising issues and proposing solutions for the HTML and DOM specifications, and to produce W3C Recommendations from WHATWG Review Drafts. A few days before, W3C and the WHATWG signed a Memorandum of Understanding outlining the agreement to collaborate on the development of a single version of the HTML and DOM specifications. Issues and proposed solutions for HTML and DOM done via the newly rechartered HTML Working Group in the WHATWG repositories The HTML Working Group is targetting November 2019 to bring HTML and DOM to Candidate Recommendations. CSS All CSS specifications CSS is a critical part of the Open Web Platform. The CSS Working Group gathers requirements from two large groups of CSS users: the publishing industry and application developers. Within W3C, those groups are exemplified by the Publishing groups and the Web Platform Working Group. The former requires things like better pagination support and advanced font handling, the latter needs intelligent (and fast!) scrolling and animations. What we know as CSS is actually a collection of almost a hundred specifications, referred to as ‘modules’. The current state of CSS is defined by a snapshot, updated once a year. The group also publishes an index defining every term defined by CSS specifications. Fonts All Fonts specifications The Web Fonts Working Group develops specifications that allow the interoperable deployment of downloadable fonts on the Web, with a focus on Progressive Font Enrichment as well as maintenance of WOFF Recommendations. Recent and ongoing work includes: Early API experiments by Adobe and Monotype have demonstrated the feasibility of a font enrichment API, where a server delivers a font with minimal glyph repertoire and the client can query the full repertoire and request additional subsets on-the-fly. In other experiments, the Brotli compression used in WOFF 2 was extended to support shared dictionaries and patch update. Metrics to quantify improvement are a current hot discussion topic. The group will meet at ATypi 2019 in Japan, to gather requirements from the international typography community. The group will first produce a report summarizing the strengths and weaknesses of each prototype solution by Q2 2020. SVG All SVG specifications SVG is an important and widely-used part of the Open Web Platform. The SVG Working Group focuses on aligning the SVG 2.0 specification with browser implementations, having split the specification into a currently-implemented 2.0 and a forward-looking 2.1. Current activity is on stabilization, increased integration with the Open Web Platform, and test coverage analysis. The Working Group was rechartered in March 2019. A new work item concerns native (non-Web-browser) uses of SVG as a non-interactive, vector graphics format. Audio The Web Audio Working Group was extended to finish its work on the Web Audio API, expecting to publish it as a Recommendation by year end. The specification enables synthesizing audio in the browser. Audio operations are performed with audio nodes, which are linked together to form a modular audio routing graph. Multiple sources — with different types of channel layout — are supported. This modular design provides the flexibility to create complex audio functions with dynamic effects. The first version of Web Audio API is now feature complete and is implemented in all modern browsers. Work has started on the next version, and new features are being incubated in the Audio Community Group. Performance Web Performance All Web Performance specifications There are currently 18 specifications in development in the Web Performance Working Group aiming to provide methods to observe and improve aspects of application performance of user agent features and APIs. The W3C team is looking at related work incubated in the W3C GPU for the Web (WebGPU) Community Group which is poised to transition to a W3C Working Group. A preliminary draft charter is available. WebAssembly All WebAssembly specifications WebAssembly improves Web performance and power by being a virtual machine and execution environment enabling loaded pages to run native (compiled) code. It is deployed in Firefox, Edge, Safari and Chrome. The specification will soon reach Candidate Recommendation. WebAssembly enables near-native performance, optimized load time, and perhaps most importantly, a compilation target for existing code bases. While it has a small number of native types, much of the performance increase relative to Javascript derives from its use of consistent typing. WebAssembly leverages decades of optimization for compiled languages and the byte code is optimized for compactness and streaming (the web page starts executing while the rest of the code downloads). Network and API access all occurs through accompanying Javascript libraries -- the security model is identical to that of Javascript. Requirements gathering and language development occur in the Community Group while the Working Group manages test development, community review and progression of specifications on the Recommendation Track. Testing Browser testing plays a critical role in the growth of the Web by: Improving the reliability of Web technology definitions; Improving the quality of implementations of these technologies by helping vendors to detect bugs in their products; Improving the data available to Web developers on known bugs and deficiencies of Web technologies by publishing results of these tests. Browser Testing and Tools The Browser Testing and Tools Working Group is developing WebDriver version 2, having published last year the W3C Recommendation of WebDriver. WebDriver acts as a remote control interface that enables introspection and control of user agents, provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of Web, and emulates the actions of a real person using the browser. WebPlatform Tests The WebPlatform Tests project now provides a mechanism which allows to fully automate tests that previously needed to be run manually: TestDriver. TestDriver enables sending trusted key and mouse events, sending complex series of trusted pointer and key interactions for things like in-content drag-and-drop or pinch zoom, and even file upload. Since 2014 W3C began work on this coordinated open-source effort to build a cross-browser test suite for the Web Platform, which WHATWG, and all major browsers adopted. Web of Data All Data specifications There have been several great success stories around the standardization of data on the web over the past year. Verifiable Claims seems to have significant uptake. It is also significant that the Distributed Identifier WG charter has received numerous favorable reviews, and was just recently launched. JSON-LD has been a major success with the large deployment on Web sites via schema.org. JSON-LD 1.1 completed technical work, about to transition to CR More than 25% of websites today include schema.org data in JSON-LD The Web of Things description is in CR since May, making use of JSON-LD Verifiable Credentials data model is in CR since July, also making use of JSON-LD Continued strong interest in decentralized identifiers Engagement from the TAG with reframing core documents, such as Ethical Web Principles, to include data on the web within their scope Data is increasingly important for all organizations, especially with the rise of IoT and Big Data. W3C has a mature and extensive suite of standards relating to data that were developed over two decades of experience, with plans for further work on making it easier for developers to work with graph data and knowledge graphs. Linked Data is about the use of URIs as names for things, the ability to dereference these URIs to get further information and to include links to other data. There are ever-increasing sources of open Linked Data on the Web, as well as data services that are restricted to the suppliers and consumers of those services. The digital transformation of industry is seeking to exploit advanced digital technologies. This will facilitate businesses to integrate horizontally along the supply and value chains, and vertically from the factory floor to the office floor. W3C is seeking to make it easier to support enterprise-wide data management and governance, reflecting the strategic importance of data to modern businesses. Traditional approaches to data have focused on tabular databases (SQL/RDBMS), Comma Separated Value (CSV) files, and data embedded in PDF documents and spreadsheets. We're now in midst of a major shift to graph data with nodes and labeled directed links between them. Graph data is: Faster than using SQL and associated JOIN operations More favorable to integrating data from heterogeneous sources Better suited to situations where the data model is evolving In the wake of the recent W3C Workshop on Graph Data we are in the process of launching a Graph Standardization Business Group to provide a business perspective with use cases and requirements, to coordinate technical standards work and liaisons with external organizations. Web for All Security, Privacy, Identity All Security specifications, all Privacy specifications Authentication on the Web As the WebAuthn Level 1 W3C Recommendation published last March is seeing wide implementation and adoption of strong cryptographic authentication, work is proceeding on Level 2. The open standard Web API gives native authentication technology built into native platforms, browsers, operating systems (including mobile) and hardware, offering protection against hacking, credential theft, phishing attacks, thus aiming to end the era of passwords as a security construct. You may read more in our March press release. Privacy An increasing number of W3C specifications are benefitting from Privacy and Security review; there are security and privacy aspects to every specification. Early review is essential. Working with the TAG, the Privacy Interest Group has updated the Self-Review Questionnaire: Security and Privacy. Other recent work of the group includes public blogging further to the exploration of anti-patterns in standards and permission prompts. Security The Web Application Security Working Group adopted Feature Policy, aiming to allow developers to selectively enable, disable, or modify the behavior of some of these browser features and APIs within their application; and Fetch Metadata, aiming to provide servers with enough information to make a priori decisions about whether or not to service a request based on the way it was made, and the context in which it will be used. The Web Payment Security Interest Group, launched last April, convenes members from W3C, EMVCo, and the FIDO Alliance to discuss cooperative work to enhance the security and interoperability of Web payments (read more about payments). Internationalization (i18n) All Internationalization specifications, educational articles related to Internationalization, spec developers checklist Only a quarter or so current Web users use English online and that proportion will continue to decrease as the Web reaches more and more communities of limited English proficiency. If the Web is to live up to the "World Wide" portion of its name, and for the Web to truly work for stakeholders all around the world engaging with content in various languages, it must support the needs of worldwide users as they engage with content in the various languages. The growth of epublishing also brings requirements for new features and improved typography on the Web. It is important to ensure the needs of local communities are captured. The W3C Internationalization Initiative was set up to increase in-house resources dedicated to accelerating progress in making the World Wide Web "worldwide" by gathering user requirements, supporting developers, and education & outreach. For an overview of current projects see the i18n radar. W3C's Internationalization efforts progressed on a number of fronts recently: Requirements: New African and European language groups will work on the gap analysis, errata and layout requirements. Gap analysis: Japanese, Devanagari, Bengali, Tamil, Lao, Khmer, Javanese, and Ethiopic updated in the gap-analysis documents. Layout requirements document: notable progress tracked in the Southeast Asian Task Force while work continues on Chinese layout requirements. Developer support: Spec reviews: the i18n WG continues active review of specifications of the WHATWG and other W3C Working Groups. Short review checklist: easy way to begin a self-review to help spec developers understand what aspects of their spec are likely to need attention for internationalization, and points them to more detailed checklists for the relevant topics. It also helps those reviewing specs for i18n issues. Strings on the Web: Language and Direction Metadata lays out issues and discusses potential solutions for passing information about language and direction with strings in JSON or other data formats. The document was rewritten for clarity, and expanded. The group is collaborating with the JSON-LD and Web Publishing groups to develop a plan for updating RDF, JSON-LD and related specifications to handle metadata for base direction of text (bidi). User-friendly test format: a new format was developed for Internationalization Test Suite tests, which displays helpful information about how the test works. This particularly useful because those tests are pointed to by educational materials and gap-analysis documents. Web Platform Tests: a large number of tests in the i18n test suite have been ported to the WPT repository, including: css-counter-styles, css-ruby, css-syntax, css-test, css-text-decor, css-writing-modes, and css-pseudo. Education & outreach: (for all educational materials, see the HTML & CSS Authoring Techniques) Web Accessibility All Accessibility specifications, WAI resources The Web Accessibility Initiative supports W3C's Web for All mission. Recent achievements include: Education and training: Inaccessibility of CAPTCHA updated to bring our analysis and recommendations up to date with CAPTCHA practice today, concluding two years of extensive work and invaluable input from the public (read more on the W3C Blog Learn why your web content and applications should be accessible. The Education and Outreach Working Group has completed revision and updating of the Business Case for Digital Accessibility. Accessibility guidelines: The Accessibility Guidelines Working Group has continued to update WCAG Techniques and Understanding WCAG 2.1; and published a Candidate Recommendation of Accessibility Conformance Testing Rules Format 1.0 to improve inter-rater reliability when evaluating conformance of web content to WCAG An updated charter is being developed to host work on "Silver", the next generation accessibility guidelines (WCAG 2.2) There are accessibility aspects to most specifications. Check your work with the FAST checklist. Outreach to the world W3C Developer Relations To foster the excellent feedback loop between Web Standards development and Web developers, and to grow participation from that diverse community, recent W3C Developer Relations activities include: @w3cdevs tracks the enormous amount of work happening across W3C W3C Track during the Web Conference 2019 in San Francisco Tech videos: W3C published the 2019 Web Games Workshop videos The 16 September 2019 Developer Meetup in Fukuoka, Japan, is open to all and will combine a set of technical demos prepared by W3C groups, and a series of talks on a selected set of W3C technologies and projects W3C is involved with Mozilla, Google, Samsung, Microsoft and Bocoup in the organization of ViewSource 2019 in Amsterdam (read more on the W3C Blog) W3C Training In partnership with EdX, W3C's MOOC training program, W3Cx offers a complete "Front-End Web Developer" (FEWD) professional certificate program that consists of a suite of five courses on the foundational languages that power the Web: HTML5, CSS and JavaScript. We count nearly 900K students from all over the world. Translations Many Web users rely on translations of documents developed at W3C whose official language is English. W3C is extremely grateful to the continuous efforts of its community in ensuring our various deliverables in general, and in our specifications in particular, are made available in other languages, for free, ensuring their exposure to a much more diverse set of readers. Last Spring we developed a more robust system, a new listing of translations of W3C specifications and updated the instructions on how to contribute to our translation efforts. W3C Liaisons Liaisons and coordination with numerous organizations and Standards Development Organizations (SDOs) is crucial for W3C to: make sure standards are interoperable coordinate our respective agenda in Internet governance: W3C participates in ICANN, GIPO, IGF, the I* organizations (ICANN, IETF, ISOC, IAB). ensure at the government liaison level that our standards work is officially recognized when important to our membership so that products based on them (often done by our members) are part of procurement orders. W3C has ARO/PAS status with ISO. W3C participates in the EU MSP and Rolling Plan on Standardization ensure the global set of Web and Internet standards form a compatible stack of technologies, at the technical and policy level (patent regime, fragmentation, use in policy making) promote Standards adoption equally by the industry, the public sector, and the public at large Coralie Mercier, Editor, W3C Marketing & Communications $Id: Overview.html,v 1.60 2019/10/15 12:05:52 coralie Exp $ Copyright © 2019 W3C ® (MIT, ERCIM, Keio, Beihang) Usage policies apply.