Optimizing the cost of training AWS DeepRacer reinforcement learning models

AWS DeepRacer is a cloud-based 3D racing simulator, an autonomous 1/18th scale race car driven by reinforcement learning, and a global racing league. Reinforcement learning (RL), an advanced machine learning (ML) technique, enables models to learn complex behaviors without labeled training data and make short-term decisions while optimizing for longer-term goals. But as we humans can attest, learning something well takes time—and time is money. You can build and train a simple “all-wheels-on-track” model in the AWS DeepRacer console in just a couple of hours. However, if you’re building complex models involving multiple parameters, a reward function using trigonometry, or generally diving deep into RL, there are steps you can take to optimize the cost of training.

As a Senior Solutions Architect and an AWS DeepRacer PitCrew member, I rack up a lot of training time. Recently we shared tips for keeping it frugal with Blaine Sundrud, host of DeepRacer TV News. This post discusses that advice in more detail. To see the interview, check out the August 2020 Qualifiers edition of DRTV.

Also, look out for the cost-optimization article coming soon to the AWS DeepRacer Developer Guide for step-by-step procedures on these topics.

The AWS DeepRacer console provides you with many tools to help you get the most out of training and evaluating your RL models. After you build a model based on a reward function, which is the incentive plan you create for the agent (your AWS DeepRacer vehicle), you need to train it. This means you enable the agent to explore various actions in its environment, which, for your vehicle, is its track. There it attempts to take actions that result in rewards. Over time it learns the behaviors that lead to a maximum reward, but that training takes machine time, and machine time costs money. My goal is to share how avoiding overtraining, validating your model, analyzing logs, using transfer learning, and creating a budget can help keep the focus on fun, not cost.

Overview

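In this post, we walk you through some strategies for training better performing and more cost-effective AWS DeepRacer models: avoiding overtraining, validating your model, analyzing logs, trying transfer learning, creating a budget, and cleaning up when done.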

Avoid overtraining

When training an RL model, more isn’t always better. Training longer than necessary can lead to overfitting, which means a model doesn’t adapt, or generalize well, from the environment it’s trained in to a novel environment, real or online. For AWS DeepRacer, a model that is overfit may perform well on a virtual track, but conditions like gravity, shadows on the track, the friction of the wheels on the track, wear in the gears, degradation of the battery, and even smudges on the camera lens can lead to the car running slowly or veering off a replica of that track in the real world. When training and racing exclusively in the AWS DeepRacer console, a model overfitted to an oval track will not do as well on a track with s-curves. In practical terms, you can think of an email spam filter that has been overtrained on messages about window replacements, credit card programs, and rich relatives in foreign lands. It might do an excellent job detecting spam related to those topics, but a terrible job finding spam related to scam insurance plans, gutters, home food delivery, and more original get-rich-quick schemes. To learn more about overfitting, watch AWS DeepRacer League – Overfitting.

We now know overtraining that leads to overfitting isn’t a good thing, but one of the first lessons an ML practitioner learns is that undertraining isn’t good either. So how much training is enough? The key is to stop training at the point when performance begins to degrade. With AWS DeepRacer, the Training Reward graph shows the cumulative reward received per training episode. You can expect this graph to be volatile initially, but over time the graph should trend upwards and to the right, and, as your model starts converging, the average should flatten out. As you watch the reward graph, also keep an eye on the agent’s driving behavior during training. You should stop training when the percentage of the track the car completes is no longer improving. In the following image, you can see a sample reward graph with the “best model” indicated. When the model’s track completion progress per episode continuously reaches 100% and the reward levels out, more training will lead to overfitting, a poorly generalized model, and wasted training time.

Figure: When to stop training
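To make the stopping rule concrete, here is a minimal sketch of a plateau check. It assumes you have copied the average track completion per training iteration into a list yourself. The function name, the window size, and the 1-point threshold are all assumptions chosen for illustration, not anything the console provides.

```python
from statistics import mean

def should_stop(progress_history, window=5, min_gain=1.0):
    """Return True when the recent average track completion has plateaued.

    progress_history: average track completion per iteration, in percent.
    window: number of iterations in each averaged block (assumed value).
    min_gain: smallest gain, in percentage points, still counted as progress.
    """
    if len(progress_history) < 2 * window:
        return False  # not enough data to compare two windows
    previous = mean(progress_history[-2 * window:-window])
    recent = mean(progress_history[-window:])
    return recent - previous < min_gain

# Hypothetical numbers: rapid early gains, then a plateau near 100%.
history = [10, 18, 30, 45, 60, 72, 85, 93, 97, 99,
           100, 99, 100, 100, 100, 100, 100, 100, 100, 100]
print(should_stop(history[:10]))  # False: still improving
print(should_stop(history))       # True: the last two windows are nearly equal
```

A check like this only flags a candidate stopping point. It is still worth watching the driving behavior before you end the job.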

Validate your model

A reward function describes the immediate feedback, as a reward or penalty score, your model receives when your AWS DeepRacer vehicle moves from one position on the track to a new one. The function’s purpose is to encourage the vehicle to make moves along the track that reach a destination quickly, without incident or accident. A desirable move earns a higher score for the action, or target state, and an illegal or wasteful move earns a lower score. It may seem simple, but it’s easy to overlook errors in your code or find that your reward function unintentionally incentivizes undesirable moves. Validating your reward function both in theory and practice helps you avoid wasting time and money training a model that doesn’t do what you want it to do.

The validate function is similar to a Python lint tool. Choosing Validate checks the syntax of the reward function, and if successful, results in a “passed validation” message.
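Before pasting a reward function into the console, you can run a similar check locally. The sketch below is only a rough stand-in for the console's Validate button, whose internals are not described here. It parses the source, calls the function once with sample input, and confirms a float comes back. The parameter names (`all_wheels_on_track`, `distance_from_center`, `track_width`) are DeepRacer reward function inputs. The helper `check_reward_source` and the sample values are hypothetical.

```python
import ast

REWARD_SOURCE = '''
def reward_function(params):
    """Reward staying on the track and close to the center line."""
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward when the car leaves the track
    # Fraction of the half-width the car is away from the center (0 = centered).
    offset = params["distance_from_center"] / (0.5 * params["track_width"])
    return float(max(1e-3, 1.0 - offset))
'''

def check_reward_source(source, sample_params):
    """Parse the source, run it once, and confirm it returns a float."""
    ast.parse(source)              # raises SyntaxError on malformed code
    namespace = {}
    exec(source, namespace)        # defines reward_function in namespace
    reward = namespace["reward_function"](sample_params)
    if not isinstance(reward, float):
        raise TypeError("reward_function must return a float")
    return reward

sample = {"all_wheels_on_track": True, "distance_from_center": 0.1, "track_width": 0.8}
print(check_reward_source(REWARD_SOURCE, sample))
```

Calling the function with a few hand-picked situations (centered, near the edge, off the track) is a cheap way to catch a reward that incentivizes the wrong move before any training time is spent.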

After checking the code, validate the performance of your reward function early and often. When first experimenting with a new reward function, train for a short period of time, such as 15 minutes, and observe the results to determine whether or not the reward function is performing as expected. Look at the reward results and percentage of track completion on the reward graph to see that they’re increasing (see the following example graph). If it looks like a well performing model, you can clone that model and train for additional time or start over with the same reward function. If the reward doesn’t improve, you can investigate and make adjustments without wasting training time and putting a dent in your pocketbook.

Analyze logs to improve efficiency

Focusing on the training graph alone does not give you a complete picture. Fortunately, AWS DeepRacer produces logs of actions taken during training. Log analysis involves a detailed look at the outputs produced by the AWS DeepRacer training job. Log analysis might involve an aggregation of the model’s performance at various locations on the track or at different speeds. Analysis often includes various kinds of visualization, such as plotting the agent’s behavior on the track, the reward values at various times or locations, or even plotting the racing line around the track to make sure you’re not oversteering and that your agent is taking the most efficient path. You can also include Python print() statements in your reward function to output interim results to the logs for each iteration of the reward function.

Without studying the logs, you’re likely only making guesses about where to improve. It’s better to rely on data to make these adjustments. You usually get a better model sooner by studying the logs and tweaking the reward function. When you get a decent model, try conducting log analysis before investing in further training time.
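As a small illustration of the print() approach, the sketch below has a reward function write one tagged line per call, and a helper that averages the reward per waypoint from the resulting text. The `MY_TRACE` prefix and the line layout are made up for this example and are not the format of the logs AWS DeepRacer produces on its own. `closest_waypoints`, `speed`, `distance_from_center`, and `track_width` are reward function inputs.

```python
from collections import defaultdict

TAG = "MY_TRACE"  # hypothetical prefix chosen for this sketch

def reward_function(params):
    """Center-line reward that also prints one tagged line per call."""
    half_width = 0.5 * params["track_width"]
    reward = max(1e-3, 1.0 - params["distance_from_center"] / half_width)
    waypoint = params["closest_waypoints"][1]
    print(f"{TAG},{waypoint},{params['speed']:.2f},{reward:.3f}")
    return float(reward)

def mean_reward_by_waypoint(log_lines):
    """Average the printed reward for each waypoint found in the log text."""
    totals = defaultdict(lambda: [0.0, 0])
    for line in log_lines:
        if not line.startswith(TAG + ","):
            continue  # skip everything the reward function did not print
        _, waypoint, _speed, reward = line.strip().split(",")
        totals[int(waypoint)][0] += float(reward)
        totals[int(waypoint)][1] += 1
    return {wp: total / count for wp, (total, count) in sorted(totals.items())}

sample_log = [
    "some unrelated simulator output",
    "MY_TRACE,4,1.50,0.900",
    "MY_TRACE,4,1.40,0.700",
    "MY_TRACE,5,2.00,0.200",
]
print(mean_reward_by_waypoint(sample_log))
```

A per-waypoint average like this points at the parts of the track where the reward collapses, which is usually where the reward function needs attention.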

The following graph is an example of plotting the racing line around a track.

For more information about log analysis, see Using Jupyter Notebook for analysing DeepRacer’s logs.

Try transfer learning

In ML, as in life, there is no point in reinventing the wheel. Transfer learning involves relying on knowledge gained while solving one problem and applying it to a different, but related, problem. The shape of the AWS DeepRacer Convolutional Neural Network (CNN) is determined by the number of inputs (such as the cameras or LIDAR) and the outputs (such as the action space). A new model has weights set to random values, and a certain amount of training is required to converge to get a working model.

Instead of starting with random weights, you can copy an existing trained model. In the AWS DeepRacer environment, this is called cloning. Cloning works by making a deep copy of the neural network—the AWS DeepRacer CNN—including all the nodes and their weights. This can save training time and money.

The learning rate is one of the hyperparameters that controls the RL training. During each update, a portion of the new weight for each node results from the gradient-descent (or ascent) contribution, and the rest comes from the existing node weight. The learning rate controls how much a gradient-descent (or ascent) update contributes to the network weights. If you are interested in learning more about gradient descent, check out this post on optimizing deep learning.

You can use a higher learning rate to include more gradient-descent contributions for faster training, but the expected reward may not converge if the learning rate is too large. Try setting the learning rate reasonably high for the initial training. When it’s complete, clone and train the network for additional time with a reduced learning rate. This can save a significant amount of training time by allowing you to train quickly at first and then explore more slowly when you’re nearing an optimal solution.
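The mechanics of the learning rate can be seen in a toy example. The sketch below runs plain gradient descent on a one-dimensional quadratic loss, which is an assumption made purely for illustration and is far simpler than the RL objective AWS DeepRacer optimizes. It runs a short pass with a higher rate, continues from that result with a lower rate (loosely mirroring clone-and-retrain), and shows what happens when the rate is too large.

```python
def train(weight, learning_rate, steps):
    """Run plain gradient descent on the toy loss (weight - 3) ** 2."""
    for _ in range(steps):
        gradient = 2.0 * (weight - 3.0)
        # New weight = existing weight minus the learning-rate-scaled gradient.
        weight = weight - learning_rate * gradient
    return weight

start = 0.0
coarse = train(start, learning_rate=0.4, steps=5)     # fast initial pass
fine = train(coarse, learning_rate=0.05, steps=20)    # continue with a reduced rate
diverged = train(start, learning_rate=1.1, steps=20)  # rate too large

# Distance from the optimum (3.0) after each run.
print(abs(coarse - 3.0), abs(fine - 3.0), abs(diverged - 3.0))
```

In this toy case the first two distances shrink toward zero while the third grows with every step, which is the "may not converge" behavior described above.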

Developers often ask why they can’t modify the action space during or after cloning. It’s because cloning a model results in a duplicate of the original network, and both the inputs and the action space are fixed. If you increase the action space, the behavior of a network with additional output nodes that had no connections to the other layers and no weights is unpredictable, and could lead to a lot more training or even a model that can’t converge at all. CNNs with node weights equal to zero are unpredictable. The nodes might even be deactivated (recall that 0 times anything is 0). Likewise, pruning one or more nodes from the output layer also drives unknown outcomes. Both situations require additional training to ensure the model works as expected, and there is no guarantee it will ever converge. Radically changing the reward function may result in a cloned model that doesn’t converge quickly or at all, which is a waste of time and money.

To try transfer learning, follow the steps in Clone a Trained Model to Start a New Training Pass in the AWS DeepRacer Developer Guide.

Create a budget

So far, we’ve looked at things you can do within the RL training process to save money. Aside from those I’ve discussed in the AWS DeepRacer console, there is another tool in the AWS Management Console that can help you keep your spend where you want it: AWS Budgets. You can set monthly, quarterly, and annual budgets for cost, usage, reservations, and Savings Plans.

On the Cost Management page, choose Budgets and create a budget for AWS DeepRacer.

To set a budget, sign in to the console and navigate to AWS Budgets. Then select a period, effective dates, and a budget amount. Next, configure an alert so that you receive an email notification when usage exceeds a stated percentage of that budget.

You can also configure an Amazon Simple Notification Service (Amazon SNS) topic to have chatbot alerts sent to Amazon Chime or Slack.
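If you prefer to script the budget rather than click through the console, the sketch below builds the request for the AWS Budgets CreateBudget operation as a plain dictionary. The helper name, account ID, budget name, amount, threshold, and email address are placeholders. The budget as written covers all costs in the account, because the exact cost filter value for AWS DeepRacer is not given in this post and is not guessed at here.

```python
def build_budget_request(account_id, name, monthly_limit_usd, alert_percent, email):
    """Build keyword arguments for the AWS Budgets CreateBudget operation."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": f"{monthly_limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                # Email once actual spend passes the given percentage of the limit.
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": float(alert_percent),
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }

# Placeholder values for illustration only.
request = build_budget_request("123456789012", "deepracer-training", 50, 80, "me@example.com")
print(request["Budget"]["BudgetLimit"])
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("budgets").create_budget(**request)
```

Keeping the request as data makes it easy to review the limit and threshold before anything is created in your account.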

Clean up when done

When you’re done training, evaluating, and racing, it’s good practice to shut down unneeded resources and perform cleanup actions. Storage costs are minimal, but delete any models or log files that aren’t needed. If you used Amazon SageMaker or AWS RoboMaker, save and stop your notebooks, and delete them if they are no longer needed. Make sure you end any running training jobs in both services.

Conclusion

In this post, we covered several tips for optimizing spend for AWS DeepRacer, which you can apply to many other ML projects. Try any or all of these tips to minimize your expenses while having fun learning ML, by getting started in the AWS DeepRacer Console today!


About the Authors

Tim O’Brien brings over 30 years of experience in information technology, security, and accounting to his customers. Tim has worked as a Senior Solutions Architect at AWS since 2018 and is focused on Machine Learning and Artificial Intelligence.
Previously, as a CTO and VP of Engineering, he led product design and technical delivery for three startups. Tim has served numerous businesses in the Pacific Northwest conducting security-related activities, including data center reviews, lottery security reviews, and disaster planning.

A wordsmith, futurist, and relatively fresh recruit to the position of technical writer – AI/ML at AWS, Heather Johnston-Robinson is excited to leverage her background as a maker and educator to help people of all ages and backgrounds find and foster their spark of ingenuity with AWS DeepRacer. She recently migrated from adventures in the maker world with Foxbot Industries, Makerologist, MyOpen3D, and LEGO robotics to take on her current role at AWS.

Source: https://aws.amazon.com/blogs/machine-learning/optimizing-the-cost-of-training-aws-deepracer-reinforcement-learning-models/

How does it know?! Some beginner chatbot tech for newbies.

Wouter S. Sligter

Most people will know by now what a chatbot or conversational AI is. But how does one design and build an intelligent chatbot? Let’s investigate some essential concepts in bot design: intents, context, flows and pages.

I like using Google’s Dialogflow platform for my intelligent assistants. Dialogflow has a very accurate NLP engine at a cost structure that is extremely competitive. In Dialogflow there are roughly two ways to build the bot tech. One is through intents and context, the other is by means of flows and pages. Both of these design approaches have their own version of Dialogflow: “ES” and “CX”.

Dialogflow ES is the older version of the Dialogflow platform which works with intents, context and entities. Slot filling and fulfillment also help manage the conversation flow. Here are Google’s docs on these concepts: https://cloud.google.com/dialogflow/es/docs/concepts

Context is what distinguishes ES from CX. It’s a way to understand where the conversation is headed. Here’s a diagram that may help understand how context works. Each phrase that you type triggers an intent in Dialogflow. Each response by the bot happens after your message has triggered the most likely intent. It’s Dialogflow’s NLP engine that decides which intent best matches your message.

Wouter Sligter, 2020

What’s funny is that even though you typed ‘yes’ in exactly the same way twice, the bot gave you different answers. There are two intents that have been programmed to respond to ‘yes’, but only one of them is selected. This is how we control the flow of a conversation by using context in Dialogflow ES.
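The idea can be sketched in a few lines of Python. This is a toy model of context-based matching, not the Dialogflow API: real matching is done by the NLP engine on many training phrases, and contexts have lifespans. The intent names, phrases, and context names below are invented for illustration.

```python
INTENTS = [
    # (intent name, training phrase, required input context, output context)
    ("order.start", "i want pizza", None, "awaiting_size_confirm"),
    ("order.confirm_size", "yes", "awaiting_size_confirm", "awaiting_payment_confirm"),
    ("order.confirm_payment", "yes", "awaiting_payment_confirm", None),
]

def match_intent(message, active_context):
    """Pick the first intent whose phrase and input context both match."""
    for name, phrase, input_context, output_context in INTENTS:
        if phrase == message.lower() and input_context == active_context:
            return name, output_context
    return "fallback", active_context

context = None
matched = []
for user_message in ["I want pizza", "yes", "yes"]:
    intent, context = match_intent(user_message, context)
    matched.append(intent)
print(matched)  # ['order.start', 'order.confirm_size', 'order.confirm_payment']
```

The two "yes" messages are identical, but the active context differs, so a different intent is selected each time.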

Unfortunately the way we program context into a bot on Dialogflow ES is not supported by any visual tools like the diagram above. Instead we need to type this context in each intent without seeing the connection to other intents. This makes the creation of complex bots quite tedious and that’s why we map out the design of our bots in other tools before we start building in ES.

The newer Dialogflow CX allows for a more advanced way of managing the conversation. By adding flows and pages as additional control tools we can now visualize and control conversations easily within the CX platform.

source: https://cloud.google.com/dialogflow/cx/docs/basics

This entire diagram is a ‘flow’ and the blue blocks are ‘pages’. This visualization shows how we create bots in Dialogflow CX. It’s immediately clear how the different pages are related and how the user will move between parts of the conversation. Visuals like this are completely absent in Dialogflow ES.

It then makes sense to use different flows for different conversation paths. A possible distinction in flows might be “ordering” (as seen here), “FAQs” and “promotions”. Structuring bots through flows and pages is a great way to handle complex bots and the visual UI in CX makes it even better.
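A flow with pages behaves much like a small state machine. The sketch below is a toy illustration of that structure only. The page names, intent names, and the `walk` helper are hypothetical and do not come from the Dialogflow CX API.

```python
FLOW_ORDERING = {
    # page name: {recognized intent: next page}
    "Start": {"order.pizza": "ChooseSize"},
    "ChooseSize": {"size.given": "ConfirmOrder"},
    "ConfirmOrder": {"confirm.yes": "End", "confirm.no": "ChooseSize"},
}

def walk(flow, intents, page="Start"):
    """Follow page transitions for a sequence of intents; stay put if unknown."""
    visited = [page]
    for intent in intents:
        page = flow.get(page, {}).get(intent, page)
        visited.append(page)
    return visited

path = walk(FLOW_ORDERING, ["order.pizza", "size.given", "confirm.no", "size.given", "confirm.yes"])
print(path)  # ['Start', 'ChooseSize', 'ConfirmOrder', 'ChooseSize', 'ConfirmOrder', 'End']
```

A separate dictionary per flow ("ordering", "FAQs", "promotions") would keep each conversation path readable on its own.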

At the time of writing (October 2020), Dialogflow CX only supports English NLP, and its pricing model is surprisingly steep compared to ES. But bots are becoming critical tech for an increasing number of companies, and the cost reductions and gains in conversation quality they bring are enormous. Building and managing bots is in many cases an ongoing task rather than a single, rounded-off project. For these reasons it makes total sense to invest in a tool that can handle increasing complexity in an easy-to-use UI such as Dialogflow CX.

This article aims to give insight into the tech behind bot creation and Dialogflow is used merely as an example. To understand how I can help you build or manage your conversational assistant on the platform of your choice, please contact me on LinkedIn.

Source: https://chatbotslife.com/how-does-it-know-some-beginner-chatbot-tech-for-newbies-fa75ff59651f?source=rss----a49517e4c30b---4


Who is chatbot Eliza?

Frédéric Pierron

Between 1964 and 1966 Eliza was born, one of the very first conversational agents. Its creator, Joseph Weizenbaum, was a researcher at the famous Artificial Intelligence Laboratory at MIT (Massachusetts Institute of Technology). His goal was to enable a conversation between a computer and a human user. More precisely, the program simulates a conversation with a Rogerian psychotherapist, whose method consists of reformulating the patient’s words so that patients explore their thoughts themselves.

Joseph Weizenbaum (Professor emeritus of computer science at MIT). Location: Balcony of his apartment in Berlin, Germany. By Ulrich Hansen, Germany (Journalist) / Wikipedia.

The program was rather rudimentary. It works by recognizing key words or expressions and replying with questions constructed from them. When the program has no answer available, it displays an “I understand” that is quite effective, albeit laconic.
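The mechanism is simple enough to sketch in a few lines. The rules below are invented for illustration and are not Weizenbaum's original script. They only show the pattern of spotting a keyword, building a question from it, and falling back to a stock reply.

```python
import re

# Hypothetical rules: a keyword pattern and a question template built from it.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "I understand."

def reply(message):
    """Return a question built from the first matching keyword rule."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            return template.format(match.group(1).rstrip(".!?").lower())
    return FALLBACK

print(reply("I feel tired today."))    # Why do you feel tired today?
print(reply("It is about my mother"))  # Tell me more about your mother.
print(reply("The weather is nice"))    # I understand.
```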

Weizenbaum explains that his primary intention was to show the superficiality of communication between a human and a machine. He was very surprised when he realized that many users were getting caught up in the game, completely forgetting that the program was without real intelligence and devoid of any feelings and emotions. He even said that his secretary would discreetly consult Eliza to solve her personal problems, forcing the researcher to unplug the program.

Conversing with a computer while believing it to be a human being is one of the criteria of Turing’s famous test. Artificial intelligence is said to exist when a human cannot discern whether or not the interlocutor is human. Eliza, in this sense, passes the test brilliantly according to its users.
Eliza thus opened the way (or the voice!) to what are now called chatbots, an abbreviation of chatterbot, itself an abbreviation of chatter robot, literally “talking robot”.

Source: https://chatbotslife.com/who-is-chatbot-eliza-bfeef79df804?source=rss----a49517e4c30b---4


FermiNet: Quantum Physics and Chemistry from First Principles

We’ve developed a new neural network architecture, the Fermionic Neural Network or FermiNet, which is well-suited to modeling the quantum state of large collections of electrons, the fundamental building blocks of chemical bonds.

Unfortunately, 0.5% error still isn’t enough to be useful to the working chemist. The energy in molecular bonds is just a tiny fraction of the total energy of a system, and correctly predicting whether a molecule is stable can often depend on just 0.001% of the total energy of a system, or about 0.2% of the remaining “correlation” energy. For instance, while the total energy of the electrons in a butadiene molecule is almost 100,000 kilocalories per mole, the difference in energy between different possible shapes of the molecule is just 1 kilocalorie per mole. That means that if you want to correctly predict butadiene’s natural shape, then the same level of precision is needed as measuring the width of a football field down to the millimeter.
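The butadiene figures can be checked with one line of arithmetic. The snippet below uses only the two rounded numbers quoted above, so the result is an order-of-magnitude check rather than a precise value.

```python
total_energy = 100_000   # kcal/mol, approximate figure quoted in the text
shape_difference = 1     # kcal/mol, figure quoted in the text

relative_precision = shape_difference / total_energy
print(f"{relative_precision:.3%}")  # 0.001%
```

One part in 100,000 is the same order of magnitude as a millimeter measured against a field several tens of meters wide, which is where the football field comparison comes from.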

With the advent of digital computing after World War II, scientists developed a whole menagerie of computational methods that went beyond this mean field description of electrons. While these methods come in a bewildering alphabet soup of abbreviations, they all generally fall somewhere on an axis that trades off accuracy with efficiency. At one extreme, there are methods that are essentially exact, but scale worse than exponentially with the number of electrons, making them impractical for all but the smallest molecules. At the other extreme are methods that scale linearly, but are not very accurate. These computational methods have had an enormous impact on the practice of chemistry – the 1998 Nobel Prize in chemistry was awarded to the originators of many of these algorithms.

Fermionic Neural Networks

Despite the breadth of existing computational quantum mechanical tools, we felt a new method was needed to address the problem of efficient representation. There’s a reason that the largest quantum chemical calculations only run into the tens of thousands of electrons for even the most approximate methods, while classical chemical calculation techniques like molecular dynamics can handle millions of atoms. The state of a classical system can be described easily – we just have to track the position and momentum of each particle. Representing the state of a quantum system is far more challenging. A probability has to be assigned to every possible configuration of electron positions. This is encoded in the wavefunction, which assigns a positive or negative number to every configuration of electrons, and the wavefunction squared gives the probability of finding the system in that configuration. The space of all possible configurations is enormous – if you tried to represent it as a grid with 100 points along each dimension, then the number of possible electron configurations for the silicon atom would be larger than the number of atoms in the universe!
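The size of that grid is easy to verify. The sketch below assumes three spatial coordinates for each of silicon's 14 electrons and uses 10^80 as a commonly cited order-of-magnitude estimate for the number of atoms in the observable universe. That estimate is an assumption added here, not a figure from the text.

```python
electrons = 14              # silicon has 14 electrons
dimensions = 3 * electrons  # x, y, z for each electron
points_per_dimension = 100

grid_points = points_per_dimension ** dimensions
atoms_in_universe = 10 ** 80  # assumed order-of-magnitude estimate

print(len(str(grid_points)) - 1)        # 84, so the grid has 10**84 points
print(grid_points > atoms_in_universe)  # True
```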

This is exactly where we thought deep neural networks could help. In the last several years, there have been huge advances in representing complex, high-dimensional probability distributions with neural networks. We now know how to train these networks efficiently and scalably. We surmised that, given these networks have already proven their mettle at fitting high-dimensional functions in artificial intelligence problems, maybe they could be used to represent quantum wavefunctions as well. We were not the first people to think of this – researchers such as Giuseppe Carleo and Matthias Troyer and others have shown how modern deep learning could be used for solving idealised quantum problems. We wanted to use deep neural networks to tackle more realistic problems in chemistry and condensed matter physics, and that meant including electrons in our calculations.

There is just one wrinkle when dealing with electrons. Electrons must obey the Pauli exclusion principle, which means that they can’t be in the same space at the same time. This is because electrons are a type of particle known as fermions, which include the building blocks of most matter – protons, neutrons, quarks, neutrinos, etc. Their wavefunction must be antisymmetric – if you swap the position of two electrons, the wavefunction gets multiplied by -1. That means that if two electrons are on top of each other, the wavefunction (and the probability of that configuration) will be zero.

This meant we had to develop a new type of neural network that was antisymmetric with respect to its inputs, which we have dubbed the Fermionic Neural Network, or FermiNet. In most quantum chemistry methods, antisymmetry is introduced using a function called the determinant. The determinant of a matrix has the property that if you swap two rows, the output gets multiplied by -1, just like a wavefunction for fermions. So you can take a bunch of single-electron functions, evaluate them for every electron in your system, and pack all of the results into one matrix. The determinant of that matrix is then a properly antisymmetric wavefunction. The major limitation of this approach is that the resulting function – known as a Slater determinant – is not very general. Wavefunctions of real systems are usually far more complicated. The typical way to improve on this is to take a large linear combination of Slater determinants – sometimes millions or more – and add some simple corrections based on pairs of electrons. Even then, this may not be enough to accurately compute energies.
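The determinant construction can be demonstrated with a tiny example. The sketch below is a plain Slater determinant, not FermiNet itself. It uses three made-up single-electron functions of a one-dimensional position and a naive determinant routine that is only suitable for very small matrices. All function names and numbers are assumptions for illustration.

```python
import math

def determinant(matrix):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, value in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * value * determinant(minor)
    return total

# Hypothetical single-electron functions of a 1D position x.
ORBITALS = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2 * x * x - 1) * math.exp(-x * x),
]

def slater_wavefunction(positions):
    """One row per electron, one column per orbital, then the determinant."""
    matrix = [[orbital(x) for orbital in ORBITALS] for x in positions]
    return determinant(matrix)

psi = slater_wavefunction([0.1, 0.5, -0.7])
psi_swapped = slater_wavefunction([0.5, 0.1, -0.7])  # first two electrons exchanged
psi_overlap = slater_wavefunction([0.5, 0.5, -0.7])  # two electrons coincide
print(psi, psi_swapped, psi_overlap)
```

Up to floating-point rounding, the swapped value is the negative of the original, and the value with two coinciding electrons is zero, which are the two properties the paragraph above describes.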

Source: https://deepmind.com/blog/article/FermiNet
