Author: torontoai

What’s the Difference Between Developing AI on Premises and in the Cloud?

Choosing between an on-premises GPU system and the cloud is a bit like deciding between buying or renting a home.

Renting takes less capital up front. It’s pay as you go, and features like the washer-dryer unit or leaky roof repair might be handled by the property owner. If their millennial children finally move out and it’s time to move to a different-sized home, a renter is only obligated to stick around for as long as contract terms dictate.

Those are the key benefits of renting GPUs in the cloud: a low financial barrier to entry, support from cloud service providers and the ability to quickly scale up or down to a different-sized computing cluster.

Buying, on the other hand, is a one-time, fixed cost — once you purchase a property, stay there as long as you’d like. Unless they’re living with teenagers, the owner has full sovereignty over what goes on inside. There’s no lease agreement, so as long as everyone fits in the house, it’s okay to invite over a few friends and relatives for an extended stay.

And that’s the same reasoning for investing in GPUs on premises. An on-prem system can be used for as much time and as many projects as the hardware can handle, making it easier to iterate and try different methods without considering cost. For sensitive data like financial information or healthcare records, it might be essential to keep everything behind an organization’s firewall.

Depending on the use case at hand and the kind of data involved, developers may choose to build their AI tools on a deskside system, on-prem data center or in the cloud. More likely than not, they’ll move from one environment to another at different points in the journey from initial experimentation to large-scale deployment.

Using GPUs in the Cloud

Cloud-based GPUs can be used for tasks as diverse as training multilingual AI speech engines, detecting early signs of diabetes-induced blindness and developing media-compression technology. Startups, academics and creators can quickly get started, explore new ideas and experiment without a long-term commitment to a specific size or configuration of GPUs.

NVIDIA data center GPUs can be accessed through all major cloud platforms, including Alibaba Cloud, Amazon Web Services, Google Cloud, IBM Cloud, Microsoft Azure and Oracle Cloud Infrastructure.

Cloud service providers aid users with setup and troubleshooting by offering helpful resources such as development tools, pre-trained neural networks and technical support for developers. When a flood of training data comes in, a pilot program launches or a ton of new users arrive, the cloud lets companies easily scale their infrastructure to cope with fluctuating demand for computing resources.

Adding to cost-effectiveness, developers using the cloud for research, containerized applications, experiments or other projects that aren’t time-sensitive can get discounts of up to 90 percent by using excess capacity. This usage, known as “spot instances,” effectively subleases space on cloud GPUs not in use by other customers.

Users working on the cloud long term can also upgrade to the latest, most powerful data center GPUs as cloud providers update their offerings — and can often take advantage of discounts for their continued use of the platform.

Using GPUs On Prem 

When building complex AI models with huge datasets, operating costs for a long-term project can sometimes escalate. That might cause developers to be mindful of each iteration or training run they undertake, leaving less freedom to experiment. An on-prem GPU system gives developers unlimited iteration and testing time for a one-time, fixed cost.

Data scientists, students and enterprises using on-prem GPUs don’t have to count how many hours of system use they’re racking up or budget how many runs they can afford over a particular timespan.

If a new methodology fails at first, there’s no added investment required to try a different variation of code, encouraging developer creativity. The more an on-prem system is used, the greater the developer’s return on investment.
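
The rent-versus-buy trade-off above comes down to simple arithmetic. Here is a minimal sketch of it, assuming a flat hourly cloud rate and a single up-front purchase price. The function names and the dollar figures are hypothetical placeholders, not real pricing, and the model deliberately ignores power, cooling, staffing and hardware depreciation.

```python
def effective_rate(hourly_rate: float, discount: float = 0.0) -> float:
    """Hourly cloud price after an optional fractional discount (0.9 means 90% off)."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return hourly_rate * (1.0 - discount)


def cloud_cost(gpu_hours: float, hourly_rate: float, discount: float = 0.0) -> float:
    """Pay-as-you-go cost: hours used times the effective hourly rate."""
    return gpu_hours * effective_rate(hourly_rate, discount)


def break_even_hours(purchase_price: float, hourly_rate: float, discount: float = 0.0) -> float:
    """GPU-hours of use at which renting has cost as much as buying outright."""
    return purchase_price / effective_rate(hourly_rate, discount)


# Hypothetical figures, for illustration only.
PURCHASE_PRICE = 30_000.0  # one-time cost of an on-prem system
HOURLY_RATE = 3.0          # on-demand cloud price per GPU-hour

on_demand = break_even_hours(PURCHASE_PRICE, HOURLY_RATE)
spot = break_even_hours(PURCHASE_PRICE, HOURLY_RATE, discount=0.9)
print(f"break-even at on-demand pricing: {on_demand:,.0f} GPU-hours")
print(f"break-even at a 90% spot discount: {spot:,.0f} GPU-hours")
```

With these made-up numbers, buying pays for itself after roughly 10,000 GPU-hours at on-demand pricing. At a 90 percent spot discount it takes roughly 100,000. That gap is why heavy, sustained use favors on-prem hardware, while occasional or interruptible work favors the cloud.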

From powerful desktop GPUs to workstations and enterprise systems, on-prem AI machines come in a broad spectrum of choices. Depending on the price and performance needs, developers might start off with a single NVIDIA GPU or workstation and eventually ramp up to a cluster of AI supercomputers.

NVIDIA and VMware support modern, virtualized data centers with vComputeServer software and the NVIDIA NGC container registry. These help organizations streamline the deployment and management of AI workloads on virtual environments using GPU servers.

Healthcare companies, human rights organizations and the financial services industry all have strict standards for data sovereignty and privacy. On-prem deep learning systems can make it easier to adopt AI while following regulations and minimizing cybersecurity risks.

Using a Hybrid Cloud Architecture

For many enterprises, it’s not enough to pick just one method. Hybrid cloud computing combines both, taking advantage of the security and manageability of on-prem systems while also leveraging public cloud resources from a service provider.

The hybrid cloud can be used when demand is high and on-prem resources are maxed out, a tactic known as cloud bursting. Or a business could rely on its on-prem data center for processing its most sensitive data, while running dynamic, computationally intensive tasks in the hybrid cloud.
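
As a rough illustration of cloud bursting, the placement rule described above can be written as a few lines of Python. This is a toy sketch, not how any particular scheduler or VMware product works. The `Job` class, the `place_jobs` function and the job names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    gpus: int
    sensitive: bool = False  # sensitive data must stay behind the firewall


def place_jobs(jobs, on_prem_gpus):
    """Assign each job to 'on_prem', 'cloud' or 'queued'.

    Sensitive jobs are placed first and never leave the on-prem system.
    Other jobs burst to the cloud once on-prem capacity is used up.
    """
    free = on_prem_gpus
    placement = {}
    # sorted() is stable: sensitive jobs move to the front, and the
    # original order is otherwise preserved.
    for job in sorted(jobs, key=lambda j: not j.sensitive):
        if job.gpus <= free:
            placement[job.name] = "on_prem"
            free -= job.gpus
        elif job.sensitive:
            placement[job.name] = "queued"  # wait for on-prem capacity
        else:
            placement[job.name] = "cloud"   # cloud bursting
    return placement


jobs = [
    Job("train-vision", gpus=4),
    Job("patient-records", gpus=4, sensitive=True),
    Job("hyperparam-sweep", gpus=8),
]
print(place_jobs(jobs, on_prem_gpus=8))
```

In this example the two four-GPU jobs fill the eight on-prem GPUs, so the eight-GPU sweep bursts to the cloud. A sensitive job that did not fit would wait in the queue instead.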

Many enterprise data centers are already virtualized and looking to deploy a hybrid cloud that’s consistent with the business’s existing computing resources. NVIDIA partners with VMware Cloud on AWS to deliver accelerated GPU services for modern enterprise applications, including AI, machine learning and data analytics workflows.

The service will allow hybrid cloud users to seamlessly orchestrate and live-migrate AI workloads between GPU-accelerated virtual servers in data centers and the VMware Cloud.

Get the Best of Both Worlds: A Developer’s AI Roadmap

Making a choice between cloud and on-prem GPUs isn’t a one-time decision taken by a company or research team before starting an AI project. It’s a question developers can ask themselves at multiple stages during the lifespan of their projects.

A startup might do some early prototyping in the cloud, then switch to a desktop system or GPU workstation to develop and train its deep learning models. It could move back to the cloud when scaling up for production, adjusting the number of clusters it uses as customer demand changes. As the company builds up its global infrastructure, it may invest in a GPU-powered data center on premises.

Some organizations, such as ones building AI models to handle highly classified information, may stick to on-prem machines from start to finish. Others may build a cloud-first company that never builds out an on-prem data center.

One key tenet for organizations is to train where their data lands. If a business’s data lives in a cloud server, it may be most cost-effective to develop AI models in the cloud to avoid shuttling the data to an on-prem system for training. If training datasets are in a server onsite, investing in a cluster of on-prem GPUs might be the way to go.

Whichever route a team takes to accelerate their AI development with GPUs, NVIDIA developer resources are available to support engineers with SDKs, containers and open-source projects. Additionally, the NVIDIA Deep Learning Institute offers hands-on training for developers, data scientists, researchers and students learning how to use accelerated computing tools.

Visit the NVIDIA Deep Learning and AI page for more.

Main image by MyGuysMoving.com, licensed from Flickr under CC BY-SA 2.0.

The post What’s the Difference Between Developing AI on Premises and in the Cloud? appeared first on The Official NVIDIA Blog.

Recursive Sketches for Modular Deep Learning

Much of classical machine learning (ML) focuses on utilizing available data to make more accurate predictions. More recently, researchers have considered other important objectives, such as how to design algorithms to be small, efficient, and robust. With these goals in mind, a natural research objective is the design of a system on top of neural networks that efficiently stores information encoded within—in other words, a mechanism to compute a succinct summary (a “sketch”) of how a complex deep network processes its inputs. Sketching is a rich field of study that dates back to the foundational work of Alon, Matias, and Szegedy, which can enable neural networks to efficiently summarize information about their inputs.

For example: Imagine stepping into a room and briefly viewing the objects within. Modern machine learning is excellent at answering immediate questions, known at training time, about this scene: “Is there a cat? How big is said cat?” Now, suppose we view this room every day over the course of a year. People can reminisce about the times they saw the room: “How often did the room contain a cat? Was it usually morning or night when we saw the room?” However, can one design systems that are also capable of efficiently answering such memory-based questions even if they are unknown at training time?

In “Recursive Sketches for Modular Deep Learning”, recently presented at ICML 2019, we explore how to succinctly summarize how a machine learning model understands its input. We do this by augmenting an existing (already trained) machine learning model with “sketches” of its computation, using them to efficiently answer memory-based questions—for example, image-to-image-similarity and summary statistics—despite the fact that they take up much less memory than storing the entire original computation.

Basic Sketching Algorithms
In general, sketching algorithms take a vector x and produce an output sketch vector that behaves like x but whose storage cost is much smaller. This smaller storage cost makes it possible to succinctly store information about the network, which is critical for efficiently answering memory-based questions. In the simplest case, a linear sketch of x is given by the matrix-vector product Ax, where A is a wide matrix, i.e., its number of columns equals the original dimension of x and its number of rows equals the new, reduced dimension. Such methods have led to a variety of efficient algorithms for basic tasks on massive datasets, such as estimating fundamental statistics (e.g., histogram, quantiles and interquartile range), finding popular items (known as frequent elements), as well as estimating the number of distinct elements (known as support size) and the related tasks of norms and entropy estimation.

A simple method to sketch the vector x is to multiply it by a wide matrix A to produce a lower-dimensional vector y.
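
To make the matrix-vector picture concrete, here is a minimal sketch in plain Python. It assumes A is filled with independent Gaussian entries, which is one common choice for a random projection. The text above only requires A to be wide, so the Gaussian entries, the function names and the dimensions (1000 reduced to 250) are illustrative assumptions.

```python
import math
import random


def random_sketch_matrix(rows, cols, seed=0):
    """Wide matrix A (rows < cols) with i.i.d. Gaussian entries of variance 1/rows."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def sketch(A, x):
    """Linear sketch y = A x, computed one row at a time."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(t * t for t in v))


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(1000)]   # original vector
A = random_sketch_matrix(rows=250, cols=1000)    # 250 rows, 1000 columns
y = sketch(A, x)                                 # lower-dimensional sketch

print(len(x), "->", len(y))
print(f"norm ratio |y| / |x| = {norm(y) / norm(x):.2f}")
```

The sketch y needs a quarter of the storage of x, yet with this scaling its length stays close to that of x. That is the sense in which the sketch "behaves like" the original vector.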

This basic approach works well in the relatively simple case of linear regression, where it is possible to identify important data dimensions simply by the magnitude of weights (under the common assumption that they have uniform variance). However, many modern machine learning models are actually deep neural networks and are based on high-dimensional embeddings (such as Word2Vec, Image Embeddings, Glove, DeepWalk and BERT), which makes the task of summarizing the operation of the model on the input much more difficult. Even so, a large subset of these more complex networks is modular, which allows us to generate accurate sketches of their behavior in spite of their complexity.

Neural Network Modularity
A modular deep network consists of several independent neural networks (modules) that only communicate via one’s output serving as another’s input. This concept has inspired several practical architectures, including Neural Modular Networks, Capsule Neural Networks and PathNet. It is also possible to split other canonical architectures to view them as modular networks and apply our approach. For example, convolutional neural networks (CNNs) are traditionally understood to behave in a modular fashion; they detect basic concepts and attributes in their lower layers and build up to detecting more complex objects in their higher layers. In this view, the convolution kernels correspond to modules. A cartoon depiction of a modular network is given below.

This is a cartoon depiction of a modular network for image processing. Data flows from the bottom of the figure to the top through the modules represented with blue boxes. Note that modules in the lower layers correspond to basic objects, such as edges in an image, while modules in upper layers correspond to more complex objects, like humans or cats. Also notice that in this imaginary modular network, the output of the face module is generic enough to be used by both the human and cat modules.

Sketch Requirements
To optimize our approach for these modular networks, we identified several desired properties that a network sketch should satisfy:

  • Sketch-to-Sketch Similarity: The sketches of two unrelated network operations (either in terms of the present modules or in terms of the attribute vectors) should be very different; on the other hand, the sketches of two similar network operations should be very close.
  • Attribute Recovery: The attribute vector, e.g., the activations of any node of the graph, can be approximately recovered from the top-level sketch.
  • Summary Statistics: If there are multiple similar objects, we can recover summary statistics about them. For example, if an image has multiple cats, we can count how many there are. Note that we want to do this without knowing the questions ahead of time.
  • Graceful Erasure: Erasing a suffix of the top-level sketch maintains the above properties (but would smoothly increase the error).
  • Network Recovery: Given sufficiently many (input, sketch) pairs, the wiring of the edges of the network as well as the sketch function can be approximately recovered.

This is a 2D cartoon depiction of the sketch-to-sketch similarity property. Each vector represents a sketch and related sketches are more likely to cluster together.
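
Two of the properties listed above, sketch-to-sketch similarity and graceful erasure, can be illustrated with an ordinary random projection. The code below is a simplified stand-in and not the recursive mechanism from the paper. The Gaussian matrix, the dimensions and the noise level of 0.2 are all assumptions chosen for the demonstration.

```python
import math
import random


def gaussian_matrix(rows, cols, rng):
    """Random projection matrix with i.i.d. Gaussian entries of variance 1/rows."""
    scale = 1.0 / math.sqrt(rows)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def sketch(A, x):
    """Linear sketch A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


rng = random.Random(0)
DIM, SKETCH_DIM = 1000, 256

base = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
similar = [b + 0.2 * rng.gauss(0.0, 1.0) for b in base]  # small perturbation of base
unrelated = [rng.gauss(0.0, 1.0) for _ in range(DIM)]    # independent of base

A = gaussian_matrix(SKETCH_DIM, DIM, rng)
s_base, s_similar, s_unrelated = (sketch(A, v) for v in (base, similar, unrelated))

for keep in (SKETCH_DIM, 64):  # the full sketch, then the sketch with a suffix erased
    print(
        f"first {keep:3d} coords: "
        f"similar={cosine(s_base[:keep], s_similar[:keep]):.2f}  "
        f"unrelated={cosine(s_base[:keep], s_unrelated[:keep]):+.2f}"
    )
```

Sketches of similar inputs stay close together and sketches of unrelated inputs stay far apart. Keeping only the first 64 coordinates preserves that separation, with more noise in the estimate.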

The Sketching Mechanism
The sketching mechanism we propose can be applied to a pre-trained modular network. It produces a single top-level sketch summarizing the operation of this network, simultaneously satisfying all of the desired properties above. To understand how it does this, it helps to first consider a one-layer network. In this case, we ensure that all the information pertaining to a specific node is “packed” into two separate subspaces, one corresponding to the node itself and one corresponding to its associated module. Using suitable projections, the first subspace lets us recover the attributes of the node whereas the second subspace facilitates quick estimates of summary statistics. Both subspaces help enforce the aforementioned sketch-to-sketch similarity property. We demonstrate that these properties hold if all the involved subspaces are chosen independently at random.
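
A toy version of the one-layer idea can be sketched as follows. Each node's attribute vector is embedded into its own random subspace, the embeddings are summed into one top-level sketch, and each attribute vector is read back by projecting onto its subspace. This is a simplified illustration under stated assumptions, not the paper's construction. It models only the per-node subspace and leaves out the per-module subspace used for summary statistics. The Gaussian embeddings, the dimensions, the node names and the attribute values are hypothetical.

```python
import math
import random


def random_embedding(d, k, rng):
    """d x k matrix with i.i.d. N(0, 1/d) entries; its columns are nearly orthonormal for large d."""
    scale = 1.0 / math.sqrt(d)
    return [[rng.gauss(0.0, scale) for _ in range(k)] for _ in range(d)]


def embed(P, x):
    """P x: place a k-dimensional attribute vector into the d-dimensional sketch space."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def project(P, s):
    """P^T s: read back the coordinates of s along the columns of P."""
    return [sum(P[r][c] * s[r] for r in range(len(P))) for c in range(len(P[0]))]


rng = random.Random(0)
D, K = 4096, 4  # sketch dimension and attribute dimension (assumed values)

# Hypothetical attribute vectors for three nodes of a one-layer network.
attributes = {
    "edge": [1.0, 0.0, -1.0, 0.5],
    "face": [0.0, 2.0, 0.5, -1.0],
    "cat": [1.5, -0.5, 0.0, 1.0],
}
# Subspaces are chosen independently at random, one per node.
subspaces = {name: random_embedding(D, K, rng) for name in attributes}

# The top-level sketch is the sum of every node's embedded attribute vector.
top_sketch = [0.0] * D
for name, x in attributes.items():
    for r, value in enumerate(embed(subspaces[name], x)):
        top_sketch[r] += value

# Attribute recovery: project the single sketch onto each node's subspace.
recovered = {name: project(subspaces[name], top_sketch) for name in attributes}
for name in attributes:
    print(name, [round(v, 2) for v in recovered[name]])
```

Because independently chosen random subspaces in a high-dimensional space are nearly orthogonal, each projection returns its own node's attributes plus a small amount of cross-talk from the other nodes. The cross-talk shrinks as the sketch dimension D grows.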

Of course, extra care has to be taken when extending this idea to networks with more than one layer—which leads to our recursive sketching mechanism. Due to their recursive nature, these sketches can be “unrolled” to identify sub-components, capturing even complicated network structures. Finally, we utilize a dictionary learning algorithm tailored to our setup to prove that the random subspaces making up the sketching mechanism together with the network architecture can be recovered from a sufficiently large number of (input, sketch) pairs.

Future Directions
The question of succinctly summarizing the operation of a network seems to be closely related to that of model interpretability. It would be interesting to investigate whether ideas from the sketching literature can be applied to this domain. Our sketches could also be organized in a repository to implicitly form a “knowledge graph”, allowing patterns to be identified and quickly retrieved. Moreover, our sketching mechanism allows for seamlessly adding new modules to the sketch repository—it would be interesting to explore whether this feature can have applications to architecture search and evolving network topologies. Finally, our sketches can be viewed as a way of organizing previously encountered information in memory, e.g., images that share the same modules or attributes would share subcomponents of their sketches. This, on a very high level, is similar to the way humans use prior knowledge to recognize objects and generalize to unencountered situations.

Acknowledgements
This work was the joint effort of Badih Ghazi, Rina Panigrahy and Joshua R. Wang.

[Discussion] Personal framework preference.

Hello everyone. I started a job in ML half a year ago and found that I like PyTorch the most. I had 7 years of Python experience beforehand, and I suppose that influenced my preference. The documentation is great, and the functionality is clean and simple. I just like the full control over everything and neat features like automatic registration of layers in a Module. I also very much love the way CUDA is detected at runtime; it’s just pleasant to use. What are your favorites, and why?

submitted by /u/PanTheRiceMan

[D] Advice on domain adaptation / transfer?

I am training a 2D image landmark estimation network for which I have lots of synthetic data but limited realistic labelled data. I can gather lots of unlabelled real data, but labelling is difficult. I have come across a few techniques out there, like DANN. As a non-expert, I’m not sure where exactly to start. Are there any common techniques that are worth trying first? Thanks!

submitted by /u/gecko39

[P] SpeechBrain: A PyTorch-based Speech Toolkit.

Hi there!

We are happy to announce the SpeechBrain project, which aims to develop an open-source, all-in-one toolkit based on PyTorch. The goal is to develop a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech systems for speech recognition (both end-to-end and HMM-DNN), speaker recognition, speech separation, multi-microphone signal processing (e.g., beamforming), self-supervised learning, and many others.

The project will be led by Mila (Montréal) and is sponsored by Samsung, Nvidia, and Dolby.

SpeechBrain will also benefit from the collaboration and expertise of other partners such as Avignon Université, Facebook/PyTorch, IBM Research, and Fluent.ai.

Check out https://speechbrain.github.io!

(Also, we are looking for interns 😉 check the website!)

Reddit is an awesome place to discuss, so please let us know what you would like to see implemented for the speech community! This is a great opportunity to start building a user-friendly, all-in-one toolkit 😀!

submitted by /u/TParcollet

[P] In search of the perfect Video Annotation Tool

I have tried a bunch of these types of tools, but can never find one that fits all my criteria. The tool should be:

  • Open source
  • Specifically made for video annotation
  • GUI-based, scrolling through frames and box-annotating objects with ease
  • Able to export data to TensorFlow for training a neural network to recognize objects in similar videos

Does anyone have a video annotation tool that fulfills all these requirements?

I am trying to make a program where one can input a video, and out comes a list of frames where different pre-made objects were detected.

submitted by /u/kramerkee

[D] Version Control for Data Science — Tracking Machine Learning Models and Datasets with DVC

Unlike usual software dev projects, ML projects have additional huge files such as datasets, trained models and label encodings, which can easily reach a few GB in size and therefore cannot practically be tracked using Git.

The article explains how the DVC (Data Version Control) tool helps us version large data files, much as we version source code files with Git, and how we can track all of a project’s artifacts with DVC. This makes the workflow a lot more productive: we don’t have to manually keep track of what we did to reach a given state, and we don’t lose time reprocessing data and rebuilding models to reproduce it. The article: Version Control for Data Science — Tracking Machine Learning Models and Datasets

submitted by /u/thumbsdrivesmecrazy