Connect with us

AI

Predicting Defender Trajectories in NFL’s Next Gen Stats

NFL’s Next Gen Stats (NGS) powered by AWS accurately captures player and ball data in real time for every play and every NFL game—over 300 million data points per season—through the extensive use of sensors in players’ pads and the ball. With this rich set of tracking data, NGS uses AWS machine learning (ML) technology […]

Published

on

NFL’s Next Gen Stats (NGS) powered by AWS accurately captures player and ball data in real time for every play and every NFL game—over 300 million data points per season—through the extensive use of sensors in players’ pads and the ball. With this rich set of tracking data, NGS uses AWS machine learning (ML) technology to uncover deeper insights and develop a better understanding of various aspects and trends of the game. To date, NGS metrics have focused on helping fans better appreciate and understand the offense and defense in gameplay through the application of advanced analytics, particularly in the passing game. Thanks to tracking data, it’s possible to quantify the difficulty of passes, model expected yards after catch, and determine the value of various play outcomes. A logical next step with this analytical information is to evaluate quarterback decision-making, such as whether the quarterback has considered all eligible receivers and evaluated tradeoffs accurately.

To effectively model quarterback decision-making, we considered a few key metrics—mainly the probability of different events occurring on a pass, and the value of said events. A pass can result in three outcomes: completion, incompletion, or interception. NGS has already created models that provide probabilities of these outcomes, but these events rely on information that’s available at only two points during the play: when the ball is thrown (termed as pass-forward), and when the ball arrives to a receiver (pass-arrived). Because of this, creating accurate probabilities requires modeling the trajectory of players between those two points in time.

For these probabilities, the quarterback’s decision is heavily influenced by the quality of defensive coverage on various receivers, because a receiver with a closely covered defender has a lower likelihood of pass completion compared to a receiver who is wide open due to blown coverage. Furthermore, defenders are inherently reactive to how the play progresses. Defenses move in completely different ways depending on which receiver is targeted on the pass. This means that a trajectory model for defenders has to similarly be reactive to the specified targeted receiver in a believable manner.

The following diagram is a top-down view of a play, with the blue circles representing offensive players and red representing the defensive players. The dotted red lines are examples of projected player trajectories. For the highlighted defender, their trajectory depends on who the targeted receiver is (13 to the left or 81 to the right).

With the help of Amazon ML Solutions Lab, we have jointly developed a model that successfully uses this tracking data to provide league-average predictions of defender trajectories. Specifically, we predict the trajectories of defensive backs from when the pass is thrown to when the pass should arrive to the receiver. Our methodology for this is a deep-learning sequence model, which we call our Defender Ghosting model. In this post, we share how we developed an ML model to predict defender trajectories (first describing the data preprocessing and feature engineering, followed by a description of the model architecture), and metrics to evaluate the quality of these trajectory predictions.

Data and feature engineering

We primarily use data from recent seasons of 2018 and 2019 to train and test the ML models that predict the defender position (x, y) and speed (s). The sensors in the players’ shoulder pads provide information on every player on the field in increments of 0.1 second; tracking devices in the football provide additional information. This provides a relatively large feature set over multiple time steps compared to the number of observations, and we decided to also evaluate feature importance to guide modeling decisions. We didn’t consider any team-specific or player-specific features, in order to have a player-agnostic model. We evaluated information such as down number, yards to first down, and touchdown during the feature selection phase, but they weren’t particularly useful for our analysis.

The models predict location and speed up to 15 time steps ahead (t + 15 steps), or 1.5 seconds after the quarterback releases the ball, also known as pass-forward. For passes longer than 1.5 seconds, we use the same model to predict beyond (t + 15) location and speed with the starting time shifted forward and resultant predictions concatenated together. The input data contains player and ball information up to five-time steps prior (t, t-1, …, t-5). We randomly segmented the train-test split by plays to prevent information leak within a single play.

We used an XGBoost model to explore and sub-select a variety of raw and engineered features, such as acceleration, personnel on the field for each play, location of the player a few time steps prior, direction and orientation of the players in motion, and ball trajectory. Useful feature engineering steps include differencing (which stationarize the time series) and directional decomposition (which decomposes a player’s rotational direction into x and y, respectively).

We trained the XGBoost model using Amazon SageMaker, which allows developers to quickly build, train, and deploy ML models. You can quickly and easily achieve model training by uploading the training data to an Amazon Simple Storage Service (Amazon S3) bucket and launching an Amazon SageMaker notebook. See the following code:

# format dataframe, target then features
output_label = target + str(ts)
all_columns = [output_label]
all_columns.extend(feature_lst) # write training data to file
prefix = main_foldername + '/' + output_label
train_df_tos3 = train_df.loc[:, all_columns]
print(train_df_tos3.head()) if not os.path.isdir('./tmp'): os.makedirs('./tmp') train_df_tos3.to_csv('./tmp/cur_train_df.csv', index=False, header=False)
s3.upload_file('./tmp/cur_train_df.csv', bucketname, f'{prefix}/train/train.csv') # get pointer to file
s3_input_train = sagemaker.s3_input( s3_data='s3://{}/{}/train'.format(bucketname, prefix), content_type='csv') start_time = time.time() # setup training
xgb = sagemaker.estimator.Estimator( container, role, train_instance_count=1, train_instance_type='ml.m5.12xlarge', output_path='s3://{}/{}/output'.format(bucketname, prefix), sagemaker_session=sess) xgb.set_hyperparameters(max_depth=5, num_round=20, objective='reg:linear')
xgb.fit({'train': s3_input_train}) # find model name
model_name = xgb.latest_training_job.name
print(f'model_name:{model_name}')
model_path = 's3://{}/{}/output/{}/output/model.tar.gz'.format( bucketname, prefix, model_name)

You can easily achieve inferencing by deploying this model to an endpoint:

from sagemaker.predictor import csv_serializer
xgb_predictor = xgb.deploy(initial_instance_count = 1, instance_type = 'ml.m4.xlarge')
xgb_predictor.content_type = 'text/csv'
xgb_predictor.serializer = csv_serializer
xgb_predictor.deserializer = None ## Function to chunk down test set into smaller increments
def predict(data, model, rows=500): split_array = np.array_split(data, int(data.shape[0] / float(rows) + 1)) predictions = '' for array in split_array: predictions = ','.join([predictions, model.predict(array).decode('utf-8')]) return np.fromstring(predictions[1:], sep=',') ## Generate predictions on the test set for the difference models
predictions = predict(test_df[feature_lst].astype(float).values, xgb_predictor) xgb_predictor.delete_endpoint() xgb.fit({'train': s3_input_train})

You can easily extract feature importance from the trained XGBoost model, which is by default saved in a tar.gz format, using the following code:

tar = tarfile.open(local_model_path)
tar.extractall(local_model_dir)
tar.close() print(local_model_dir)
with open(local_model_dir + '/xgboost-model', 'rb') as f: model = pkl.load(f) model.feature_names = all_columns[1:] #map names correctly fig, ax = plt.subplots(figsize=(12,12))
xgboost.plot_importance(model, importance_type='gain', max_num_features=10, height=0.8, ax=ax, show_values = False)
plt.title(f'Feature Importance: {target}')
plt.show() 

The following graph shows an example of the resultant feature importance plot.

 

Deep learning model for predicting defender trajectory

We used a multi-output XGBoost model as the baseline or benchmark model for comparison, with each target (x, y, speed) considered individually. In all three targets, we trained the models using Amazon SageMaker over 20–25 epochs with batch sizes of 256, using the Adam optimizer and mean squared error (MSE) loss, and achieved about two times better root mean squared error (RMSE) values compared to the baseline models.

The model architecture consists of a one-dimensional convolutional neural network (1D-CNN) and a long short-term memory (LSTM), as shown in the following diagram. The 1D-CNN block sizes extract time-dependent information from the features over different time scales, and dimensionality is subsequently reduced by max pooling. The concatenated vectors are then passed to an LSTM with a fully connected output layer to generate the output sequence.

The following diagram is a schematic of the Defender Ghosting deep learning model architecture. We evaluated models independently predicting each of the targets (x, y, speed) as well as jointly, and the model with independent targets slightly outperformed the joint model.

The code defining the model in Keras is as follows:

# define the model
def create_cnn_lstm_model_functional(n_filter=32, kw=1): """ :param n_filter: number of filters to use in convolution layer :param kw: filter kernel size :return: compiled model """ input_player = Input(shape=(4, 25)) input_receiver = Input(shape=(19, 25)) input_ball = Input(shape=(19, 13)) submodel_player = Conv1D(filters=n_filter, kernel_size=kw, activation='relu')(input_player) submodel_player = GlobalMaxPooling1D()(submodel_player) submodel_receiver = Conv1D(filters=n_filter, kernel_size=kw, activation='relu')(input_receiver) submodel_receiver = GlobalMaxPooling1D()(submodel_receiver) submodel_ball = Conv1D(filters=n_filter, kernel_size=kw, activation='relu')(input_ball) submodel_ball = GlobalMaxPooling1D()(submodel_ball) x = Concatenate()([submodel_player, submodel_receiver, submodel_ball]) x = RepeatVector(15)(x) x = LSTM(50, activation='relu', return_sequences=True)(x) x = TimeDistributed(Dense(10, activation='relu'))(x) x = TimeDistributed(Dense(1))(x) model = Model(inputs=[input_player, input_receiver, input_ball], outputs=x) model.compile(optimizer='adam', loss='mse') return model

Evaluating defender trajectory

We developed custom metrics to quantify performance of a defender’s trajectory relative to the targeted receiver. The typical ideal behavior of a defender, from the moment the ball leaves the quarterback’s hands, is to rush towards the targeted receiver and ball. With that knowledge, we define the positional convergence (PS) metric as the weighted average of the rate of change of distance between the two players. When equally weighted across all time steps, the PS metric indicates that the two players are:

  • Spatially converging when negative
  • Zero when running in parallel
  • Spatially diverging (moving away from each other) when positive

The following schematic shows the position of a targeted receiver and predicted defender trajectory at four time steps. The distance at each time step is denoted in arrows, and we use the average rate of change of this distance to compute the PS metric.

The PS metric alone is insufficient to evaluate the quality of a play, because a defender could be running too slowly towards the targeted receiver. The PS metric is thus modulated by another metric, termed the distance ratio (DR). The DR approximates the optimal distance that a defender should cover and rewards trajectories that indicate that the defender has covered close to optimal or humanly possible distances. This is approximated by calculating the distance between the defender’s location pass-forward and the position of the receiver at pass-arrived.

Putting this together, we can score every defender trajectory as a combination of PS and DR, and we apply a constraint for any predictions that exceed the maximum humanly possible distance, speed, and acceleration. The quality of a defensive play, called defensive play score, is a weighted average of every defender trajectory within the play. Defenders close to the targeted receiver are weighted higher than defenders positioned far away from the targeted receiver, because the close defenders’ actions have the most ability to influence the outcome of the play. Aggregating the scores of all the defensive plays provides a quantitative measure of how well models perform relative to each other, as well as compared to real plays. In the case of the deep learning model, the overall score was similar to the score computed from real plays and indicative that the model had captured realistic and desired defensive characteristics.

Evaluating a model’s performance after changing the targeted receiver from the actual events in the play proved to be more challenging, because there was no actual data to help determine the quality of our predictions. We shared the modified trajectories with football experts within NGS to determine the validity of the trajectory change; they deemed the trajectories reasonable. Features that were important to reasonable trajectory changes include ball information, the targeted receiver’s location relative to the defender, and the direction of the receiver. For both baseline and deep learning models, increasing the number of previous time steps in the inputs to the model beyond three time steps increased the model’s dependency on previous trajectories and made trajectory changes much harder.

Summary

The quarterback must very quickly scan the field during a play and determine the optimal receiver to target. The defensive backs are also observing and moving in response to the receivers’ and quarterback’s actions to put an end to the offensive play. Our Defender Ghosting model, which Amazon ML Solutions Lab and NFL NGS jointly developed, successfully uses tracking data from both players and the ball to provide league-wide predictions based on prior trajectory and the hypothetical receiver on the play.

You can find full, end-to-end examples of creating custom training jobs, training state-of-the-art object detection and tracking models, implementing hyperparameter optimization (HPO), and deploying models on Amazon SageMaker at the AWSLabs GitHub repo. If you’d like help accelerating your use of ML, please contact the Amazon ML Solutions Lab program.


About the Authors

Lin Lee Cheong is a Senior Scientist and Manager with the Amazon ML Solutions Lab team at Amazon Web Services. She works with strategic AWS customers to explore and apply artificial intelligence and machine learning to discover new insights and solve complex problems.  

  

Ankit Tyagi is a Senior Software Engineer with the NFL’s Next Gen Stats team. He focuses on backend data pipelines and machine learning for delivering stats to fans. Outside of work, you can find him playing tennis, experimenting with brewing beer, or playing guitar.

 

 

Xiangyu Zeng is an Applied Scientist with the Amazon ML Solution Lab team at Amazon Web Services. He leverages Machine Learning and Deep Learning to solve critical real-word problems for AWS customers. He loves sports, especially basketball and football in his spare time.

Michael Schaefer is the Director of Product and Analytics for NFL’s Next Gen Stats. His work focuses on the design and execution of statistics, applications, and content delivered to NFL Media, NFL Broadcaster Partners, and fans.

Michael Chi is the Director of Technology for NFL’s Next Gen Stats. He is responsible for all technical aspects of the platform which is used by all 32 clubs, NFL Media and Broadcast Partners. In his free time, he enjoys being outdoors and spending time with his family

Mehdi Noori is a Data Scientist at the Amazon ML Solutions Lab, where he works with customers across various verticals, and helps them to accelerate their cloud migration journey, and to solve their ML problems using state-of-the-art solutions and technologies.

Source: https://aws.amazon.com/blogs/machine-learning/predicting-defender-trajectories-in-nfls-next-gen-stats/

AI

How does it know?! Some beginner chatbot tech for newbies.

Published

on

Wouter S. Sligter

Most people will know by now what a chatbot or conversational AI is. But how does one design and build an intelligent chatbot? Let’s investigate some essential concepts in bot design: intents, context, flows and pages.

I like using Google’s Dialogflow platform for my intelligent assistants. Dialogflow has a very accurate NLP engine at a cost structure that is extremely competitive. In Dialogflow there are roughly two ways to build the bot tech. One is through intents and context, the other is by means of flows and pages. Both of these design approaches have their own version of Dialogflow: “ES” and “CX”.

Dialogflow ES is the older version of the Dialogflow platform which works with intents, context and entities. Slot filling and fulfillment also help manage the conversation flow. Here are Google’s docs on these concepts: https://cloud.google.com/dialogflow/es/docs/concepts

Context is what distinguishes ES from CX. It’s a way to understand where the conversation is headed. Here’s a diagram that may help understand how context works. Each phrase that you type triggers an intent in Dialogflow. Each response by the bot happens after your message has triggered the most likely intent. It’s Dialogflow’s NLP engine that decides which intent best matches your message.

Wouter Sligter, 2020

What’s funny is that even though you typed ‘yes’ in exactly the same way twice, the bot gave you different answers. There are two intents that have been programmed to respond to ‘yes’, but only one of them is selected. This is how we control the flow of a conversation by using context in Dialogflow ES.

Unfortunately the way we program context into a bot on Dialogflow ES is not supported by any visual tools like the diagram above. Instead we need to type this context in each intent without seeing the connection to other intents. This makes the creation of complex bots quite tedious and that’s why we map out the design of our bots in other tools before we start building in ES.

The newer Dialogflow CX allows for a more advanced way of managing the conversation. By adding flows and pages as additional control tools we can now visualize and control conversations easily within the CX platform.

source: https://cloud.google.com/dialogflow/cx/docs/basics

This entire diagram is a ‘flow’ and the blue blocks are ‘pages’. This visualization shows how we create bots in Dialogflow CX. It’s immediately clear how the different pages are related and how the user will move between parts of the conversation. Visuals like this are completely absent in Dialogflow ES.

It then makes sense to use different flows for different conversation paths. A possible distinction in flows might be “ordering” (as seen here), “FAQs” and “promotions”. Structuring bots through flows and pages is a great way to handle complex bots and the visual UI in CX makes it even better.

At the time of writing (October 2020) Dialogflow CX only supports English NLP and its pricing model is surprisingly steep compared to ES. But bots are becoming critical tech for an increasing number of companies and the cost reductions and quality of conversations are enormous. Building and managing bots is in many cases an ongoing task rather than a single, rounded-off project. For these reasons it makes total sense to invest in a tool that can handle increasing complexity in an easy-to-use UI such as Dialogflow CX.

This article aims to give insight into the tech behind bot creation and Dialogflow is used merely as an example. To understand how I can help you build or manage your conversational assistant on the platform of your choice, please contact me on LinkedIn.

Source: https://chatbotslife.com/how-does-it-know-some-beginner-chatbot-tech-for-newbies-fa75ff59651f?source=rss—-a49517e4c30b—4

Continue Reading

AI

Who is chatbot Eliza?

Between 1964 and 1966 Eliza was born, one of the very first conversational agents. Discover the whole story.

Published

on


Frédéric Pierron

Between 1964 and 1966 Eliza was born, one of the very first conversational agents. Its creator, Joseph Weizenbaum was a researcher at the famous Artificial Intelligence Laboratory of the MIT (Massachusetts Institute of Technology). His goal was to enable a conversation between a computer and a human user. More precisely, the program simulates a conversation with a Rogérian psychoanalyst, whose method consists in reformulating the patient’s words to let him explore his thoughts himself.

Joseph Weizenbaum (Professor emeritus of computer science at MIT). Location: Balcony of his apartment in Berlin, Germany. By Ulrich Hansen, Germany (Journalist) / Wikipedia.

The program was rather rudimentary at the time. It consists in recognizing key words or expressions and displaying in return questions constructed from these key words. When the program does not have an answer available, it displays a “I understand” that is quite effective, albeit laconic.

Weizenbaum explains that his primary intention was to show the superficiality of communication between a human and a machine. He was very surprised when he realized that many users were getting caught up in the game, completely forgetting that the program was without real intelligence and devoid of any feelings and emotions. He even said that his secretary would discreetly consult Eliza to solve his personal problems, forcing the researcher to unplug the program.

Conversing with a computer thinking it is a human being is one of the criteria of Turing’s famous test. Artificial intelligence is said to exist when a human cannot discern whether or not the interlocutor is human. Eliza, in this sense, passes the test brilliantly according to its users.
Eliza thus opened the way (or the voice!) to what has been called chatbots, an abbreviation of chatterbot, itself an abbreviation of chatter robot, literally “talking robot”.

Source: https://chatbotslife.com/who-is-chatbot-eliza-bfeef79df804?source=rss—-a49517e4c30b—4

Continue Reading

AI

FermiNet: Quantum Physics and Chemistry from First Principles

Weve developed a new neural network architecture, the Fermionic Neural Network or FermiNet, which is well-suited to modeling the quantum state of large collections of electrons, the fundamental building blocks of chemical bonds.

Published

on

Unfortunately, 0.5% error still isn’t enough to be useful to the working chemist. The energy in molecular bonds is just a tiny fraction of the total energy of a system, and correctly predicting whether a molecule is stable can often depend on just 0.001% of the total energy of a system, or about 0.2% of the remaining “correlation” energy. For instance, while the total energy of the electrons in a butadiene molecule is almost 100,000 kilocalories per mole, the difference in energy between different possible shapes of the molecule is just 1 kilocalorie per mole. That means that if you want to correctly predict butadiene’s natural shape, then the same level of precision is needed as measuring the width of a football field down to the millimeter.

With the advent of digital computing after World War II, scientists developed a whole menagerie of computational methods that went beyond this mean field description of electrons. While these methods come in a bewildering alphabet soup of abbreviations, they all generally fall somewhere on an axis that trades off accuracy with efficiency. At one extreme, there are methods that are essentially exact, but scale worse than exponentially with the number of electrons, making them impractical for all but the smallest molecules. At the other extreme are methods that scale linearly, but are not very accurate. These computational methods have had an enormous impact on the practice of chemistry – the 1998 Nobel Prize in chemistry was awarded to the originators of many of these algorithms.

Fermionic Neural Networks

Despite the breadth of existing computational quantum mechanical tools, we felt a new method was needed to address the problem of efficient representation. There’s a reason that the largest quantum chemical calculations only run into the tens of thousands of electrons for even the most approximate methods, while classical chemical calculation techniques like molecular dynamics can handle millions of atoms. The state of a classical system can be described easily – we just have to track the position and momentum of each particle. Representing the state of a quantum system is far more challenging. A probability has to be assigned to every possible configuration of electron positions. This is encoded in the wavefunction, which assigns a positive or negative number to every configuration of electrons, and the wavefunction squared gives the probability of finding the system in that configuration. The space of all possible configurations is enormous – if you tried to represent it as a grid with 100 points along each dimension, then the number of possible electron configurations for the silicon atom would be larger than the number of atoms in the universe!

This is exactly where we thought deep neural networks could help. In the last several years, there have been huge advances in representing complex, high-dimensional probability distributions with neural networks. We now know how to train these networks efficiently and scalably. We surmised that, given these networks have already proven their mettle at fitting high-dimensional functions in artificial intelligence problems, maybe they could be used to represent quantum wavefunctions as well. We were not the first people to think of this – researchers such as Giuseppe Carleo and Matthias Troyer and others have shown how modern deep learning could be used for solving idealised quantum problems. We wanted to use deep neural networks to tackle more realistic problems in chemistry and condensed matter physics, and that meant including electrons in our calculations.

There is just one wrinkle when dealing with electrons. Electrons must obey the Pauli exclusion principle, which means that they can’t be in the same space at the same time. This is because electrons are a type of particle known as fermions, which include the building blocks of most matter – protons, neutrons, quarks, neutrinos, etc. Their wavefunction must be antisymmetric – if you swap the position of two electrons, the wavefunction gets multiplied by -1. That means that if two electrons are on top of each other, the wavefunction (and the probability of that configuration) will be zero.

This meant we had to develop a new type of neural network that was antisymmetric with respect to its inputs, which we have dubbed the Fermionic Neural Network, or FermiNet. In most quantum chemistry methods, antisymmetry is introduced using a function called the determinant. The determinant of a matrix has the property that if you swap two rows, the output gets multiplied by -1, just like a wavefunction for fermions. So you can take a bunch of single-electron functions, evaluate them for every electron in your system, and pack all of the results into one matrix. The determinant of that matrix is then a properly antisymmetric wavefunction. The major limitation of this approach is that the resulting function – known as a Slater determinant – is not very general. Wavefunctions of real systems are usually far more complicated. The typical way to improve on this is to take a large linear combination of Slater determinants – sometimes millions or more – and add some simple corrections based on pairs of electrons. Even then, this may not be enough to accurately compute energies.

Source: https://deepmind.com/blog/article/FermiNet

Continue Reading
AI14 hours ago

How does it know?! Some beginner chatbot tech for newbies.

AI14 hours ago

Who is chatbot Eliza?

AI1 day ago

FermiNet: Quantum Physics and Chemistry from First Principles

AI1 day ago

How to take S3 backups with DejaDup on Ubuntu 20.10

AI3 days ago

How banks and finance enterprises can strengthen their support with AI-powered customer service…

AI3 days ago

GBoard Introducing Voice — Smooth Texting and Typing

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

AI3 days ago

Automatically detecting personal protective equipment on persons in images using Amazon Rekognition

Trending