Links and Resources

Deep learning resources

  1. CS231n: Convolutional Neural Networks for Visual Recognition/Video
  2. Neural Networks for Machine Learning: Learn about artificial neural networks and how they’re being used for machine learning, as applied to speech and object recognition, image segmentation, modeling language and human motion, etc. We’ll emphasize both the basic algorithms and the practical tricks needed to get them to work well. Geoffrey Hinton, Professor.
  3. Nando de Freitas's class on Machine/Deep Learning
  4. CS224d: Deep Learning for Natural Language Processing: The course provides a deep excursion into cutting-edge research in deep learning applied to NLP. The final project will involve training a complex recurrent neural network and applying it to a large scale NLP problem.
  5. “Deep Learning”, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
  6. Neural Networks and Deep Learning, a free online book, by Michael Nielsen
  7. Deep Learning Summer School, Montreal 2015/Deep Learning Summer School, Montreal 2016
  8. Caffe
  9. Theano
  10. TensorFlow
  11. MXNet
  12. Torch: Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.

Databases and Challenges

  1. ImageNet/ILSVRC: ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. Currently we have an average of over five hundred images per node. Large Scale Visual Recognition Challenge.
  2. Visual Genome: Visual Genome is a dataset, a knowledge base, an ongoing effort to connect structured image concepts to language.
  3. MS-Celeb-1M: Challenge of Recognizing One Million Celebrities in the Real World.
  4. Person Re-Identification: We list some datasets often used by researchers.
    • VIPeR: The VIPeR dataset contains 632 pedestrian image pairs taken from arbitrary viewpoints under varying illumination conditions. Each image is scaled to 128x48 pixels.
    • iLID-119: The dataset was captured in an airport arrival hall during busy hours by a multi-camera CCTV network. It contains 476 images of 119 pedestrians, most of whom have 4 images each. All images were normalized to 64×128 pixels.
    • CAVIAR4REID: Images of 72 pedestrians captured in a shopping centre in Lisbon. 50 of them appear in both camera views and the remaining 22 in only one. The minimum and maximum image sizes are 17×39 and 72×144 pixels, respectively. Each person has a set of 5 or 10 images.
    • PRID 2011: The dataset consists of images extracted from multiple person trajectories recorded from two different, static surveillance cameras. Camera view A shows 385 persons, camera view B shows 749 persons. The first 200 persons appear in both camera views. Two versions of the dataset are provided, one representing the single-shot scenario and one representing the multi-shot scenario.
    • 3DPES: 1,012 snapshots of 200 persons.
    • CUHK01/CUHK02/CUHK03: The CUHK01 dataset contains 971 identities from two disjoint camera views, with two samples per identity per camera view. CUHK02 contains 1,816 persons and five pairs of camera views (P1-P5, ten camera views). CUHK03 includes 13,164 images of 1,360 pedestrians.
    • Market-1501: The Market-1501 dataset was collected in front of a supermarket at Tsinghua University. Six cameras were used: five high-resolution and one low-resolution, with overlap among the camera views. Overall, the dataset contains 32,668 annotated bounding boxes of 1,501 identities.
    • Partial-ReID: The dataset includes 600 images of 60 people, with 5 full-body images and 5 partial images per person.
    • iLIDS-VID: The iLIDS-VID dataset comprises 600 image sequences of 300 distinct individuals, with one pair of image sequences from two camera views for each person.
    • MARS: MARS consists of 1,261 different pedestrians, each captured by at least 2 cameras. MARS is an extension of the Market-1501 dataset.
  5. Face Recognition Database: This website lists face datasets often used by researchers.
  6. Image Annotation:

Toolboxes & Open Source Libraries

  1. SPArse Modeling Software: SPAMS (SPArse Modeling Software) is an optimization toolbox for solving various sparse estimation problems.
  2. VLFeat: The VLFeat open source library implements popular computer vision algorithms specializing in image understanding and local feature extraction and matching.
  3. OpenCV: OpenCV (Open Source Computer Vision Library) is an open-source BSD-licensed library that includes several hundred computer vision algorithms.
  4. Matlab Toolbox for Dimensionality Reduction: The Matlab Toolbox for Dimensionality Reduction contains Matlab implementations of 34 techniques for dimensionality reduction and metric learning.
  5. CVX: Matlab Software for Disciplined Convex Programming: CVX is a Matlab-based modeling system for convex optimization. CVX turns Matlab into a modeling language, allowing constraints and objectives to be specified using standard Matlab expression syntax.
  6. LIBSVM – A Library for Support Vector Machines: LIBSVM is integrated software for support vector classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR), and distribution estimation (one-class SVM). It supports multi-class classification.
  7. Dlib C++ library: Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments.
  8. Shogun - A Large Scale Machine Learning Toolbox: The Shogun Machine Learning Toolbox provides a wide range of unified and efficient Machine Learning (ML) methods. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general purpose tools. This enables both rapid prototyping of data pipelines and extensibility in terms of new algorithms. We combine modern software architecture in C++ with both efficient low-level computing backends and cutting edge algorithm implementations to solve large-scale Machine Learning problems (yet) on single machines.
  9. The 28 Most Popular Open-Source Machine Learning Projects on GitHub

Learning Online

  1. MLSS Machine Learning Summer Schools: The machine learning summer school series was started in 2002 with the motivation to promulgate modern methods of statistical machine learning and inference.
  2. TechTalks.tv: Founded in 2011, TechTalks.tv allows thousands of people to publish, search and learn from slide-based videos, for free! Learn from the most respected and noteworthy experts on technology topics.
  3. VideoLectures.NET: VideoLectures.NET is an award-winning free and open access educational video lectures repository. The lectures are given by distinguished scholars and scientists at the most important and prominent events like conferences, summer schools, workshops and science promotional events from many fields of Science.
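  4. ValseWebinar: The main goal of the Vision And Learning SEminar (VALSE) is to provide a platform for in-depth academic exchange among young Chinese scholars working in computer vision, image processing, pattern recognition, machine learning, and related research fields.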
  5. MIT 18.06 Linear Algebra: This course features a complete set of video lectures by Professor Gilbert Strang. This is a basic subject on matrix theory and linear algebra. Emphasis is given to topics that will be useful in other disciplines, including systems of equations, vector spaces, determinants, eigenvalues, similarity, and positive definite matrices.
  6. EE364a: Convex Optimization I/EE364b: Convex Optimization II
  7. Stanford University Open Course: Machine Learning
  8. ICML 2016 Videos Online: Tutorials/Orals/Plenary
  9. CVPR2016

Open Bibliographic Information

  1. JMLR: The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online.
  2. TPAMI: IEEE Transactions on Pattern Analysis & Machine Intelligence
  3. CVPapers - Computer Vision Resource
  4. Computer Vision Foundation open access
  5. DBLP: TPAMI, TIP, AI, IJCV, JMLR, UAI, AAAI, IJCAI, COLT, ICML, NIPS, CVPR, ICCV, ECCV, SIGIR, KDD, ACMM
  6. others: BMVC, ACCV, ICME, WACV
  7. ICLR2013, ICLR2014, ICLR2015, ICLR2016, ICLR2017
  8. arXiv.org: Artificial Intelligence, Computer Vision and Pattern Recognition, Learning, Multimedia, Neural and Evolutionary Computing, Numerical Analysis, Systems and Control, Information Retrieval, Information Theory, Mathematics, Statistics
  9. Library Genesis: Library Genesis is a scientific community that aims to collect books on natural science disciplines and engineering.