<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Soccer Analytics | Tyler Richardett</title><link>https://tylerrichardett.com/tag/soccer-analytics/</link><atom:link href="https://tylerrichardett.com/tag/soccer-analytics/index.xml" rel="self" type="application/rss+xml"/><description>Soccer Analytics</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>© 2022 Tyler Richardett</copyright><lastBuildDate>Thu, 26 Aug 2021 00:00:00 +0000</lastBuildDate><image><url>https://tylerrichardett.com/media/logo_hu63450d64904b2806a1378e907e93ac54_121699_300x300_fit_lanczos_2.png</url><title>Soccer Analytics</title><link>https://tylerrichardett.com/tag/soccer-analytics/</link></image><item><title>Using Optical Tracking Data from Second Spectrum to Visualize Long Goal-scoring Build-ups</title><link>https://tylerrichardett.com/blog/2021/08/26/using-optical-tracking-data-from-second-spectrum-to-visualize-long-goal-scoring-build-ups/</link><pubDate>Thu, 26 Aug 2021 00:00:00 +0000</pubDate><guid>https://tylerrichardett.com/blog/2021/08/26/using-optical-tracking-data-from-second-spectrum-to-visualize-long-goal-scoring-build-ups/</guid><description>&lt;p>Two weeks ago, Nashville SC scored a lovely team goal. All 10 field players contributed to a 15-pass sequence, putting Nashville up 2-1 over D.C. United.&lt;/p>
&lt;p>Thanks to &lt;a href="https://www.mlssoccer.com/" target="_blank" rel="noopener">Major League Soccer&lt;/a> and &lt;a href="https://www.secondspectrum.com/index.html" target="_blank" rel="noopener">Second Spectrum&lt;/a>, some of us lucky folks at &lt;a href="https://www.americansocceranalysis.com/" target="_blank" rel="noopener">American Soccer Analysis&lt;/a> have access to 2D tracking data for every regular season and playoff MLS game. Just by plotting each frame of these special goal-scoring sequences, it&amp;rsquo;s enlightening to see how these attacking teams move and exploit space to ultimately unlock the opposing defense.&lt;/p>
&lt;p>Because I used &lt;code>{gridExtra}&lt;/code> to arrange each component of the visualizations, I&amp;rsquo;m constructing and exporting each individual frame as a still image. To speed things up, I took advantage of &lt;code>{doParallel}&lt;/code> and &lt;code>{foreach}&lt;/code> to distribute the workload. And so that each process&amp;rsquo;s &lt;code>ggsave()&lt;/code> call could write to the same location, I had to set &lt;code>type = &amp;quot;cairo-png&amp;quot;&lt;/code>.&lt;/p>
&lt;p>Finally, to stitch all the frames together into an MP4 file, I&amp;rsquo;m leveraging the &lt;code>{mapmate}&lt;/code> &lt;a href="https://leonawicz.github.io/mapmate/articles/ffmpeg.html" target="_blank" rel="noopener">wrapper&lt;/a> around the &lt;a href="https://www.ffmpeg.org/" target="_blank" rel="noopener">FFmpeg command line tool&lt;/a>:&lt;/p>
&lt;pre>&lt;code class="language-r">mapmate::ffmpeg(
dir = tile_save_path,
pattern = glue::glue(&amp;quot;{save_file_prefix}_%06d.png&amp;quot;),
output_dir = gif_mp4_save_path,
output = glue::glue(&amp;quot;{save_file_prefix}.mp4&amp;quot;),
rate = export_fps,
start = start_frame,
overwrite = TRUE
)
&lt;/code>&lt;/pre>
&lt;p>This content was originally developed as part of a Twitter thread, and each original tweet is embedded below. The videos are best viewed on desktop.&lt;/p>
&lt;h4 id="august-15-2021--nbspnbsp-nashville-sc-2---1-dc-united">August 15, 2021  |  Nashville SC (2) - 1 D.C. United&lt;/h4>
&lt;blockquote class="twitter-tweet">&lt;p lang="en" dir="ltr">Two weeks ago, &lt;a href="https://twitter.com/hashtag/EveryoneN?src=hash&amp;amp;ref_src=twsrc%5Etfw">#EveryoneN&lt;/a> scored a lovely team goal. All 10 field players contributed to a 15-pass sequence, putting Nashville up 2-1.&lt;br>&lt;br>Thanks to &lt;a href="https://twitter.com/MLS?ref_src=twsrc%5Etfw">@MLS&lt;/a> and &lt;a href="https://twitter.com/SecondSpectrum?ref_src=twsrc%5Etfw">@SecondSpectrum&lt;/a>, we &lt;a href="https://twitter.com/AnalysisEvolved?ref_src=twsrc%5Etfw">@AnalysisEvolved&lt;/a> get an aerial view of these special build-ups. They’re more common than you might think! &lt;a href="https://t.co/gqEUDqrLt1">pic.twitter.com/gqEUDqrLt1&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430980050220994561?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="august-21-2021--nbspnbsp-columbus-crew-1---2-seattle-sounders-fc">August 21, 2021  |  Columbus Crew 1 - (2) Seattle Sounders FC&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">More recently, a late-game winner from &lt;a href="https://twitter.com/hashtag/Sounders?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Sounders&lt;/a> was the product of 19 consecutive passes, a forced turnover immediately after the &lt;a href="https://twitter.com/hashtag/Crew96?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Crew96&lt;/a> restart, and an effective long ball from &lt;a href="https://twitter.com/hashtag/MLSAllStar?src=hash&amp;amp;ref_src=twsrc%5Etfw">#MLSAllStar&lt;/a> captain Cristian Roldan. &lt;a href="https://t.co/muW7nIhU3x">pic.twitter.com/muW7nIhU3x&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430980520452710400?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="july-25-2021--nbspnbsp-inter-miami-cf-1---1-philadelphia-union">July 25, 2021  |  Inter Miami CF 1 - (1) Philadelphia Union&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">This 27-pass sequence from &lt;a href="https://twitter.com/hashtag/DOOP?src=hash&amp;amp;ref_src=twsrc%5Etfw">#DOOP&lt;/a> had a few quick switches in the &lt;a href="https://twitter.com/hashtag/InterMiamiCF?src=hash&amp;amp;ref_src=twsrc%5Etfw">#InterMiamiCF&lt;/a> half and an impressive exchange between Sullivan and Gazdag at the top of the 18-yard box.&lt;br>&lt;br>But it was a poor pressing attempt from Gonzalo Higuaín that allowed Philadelphia to strike quickly. &lt;a href="https://t.co/e0POv6RV0S">pic.twitter.com/e0POv6RV0S&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430980932845154313?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="july-17-2021--nbspnbsp-toronto-fc-1---0-orlando-city-sc">July 17, 2021  |  Toronto FC (1) - 0 Orlando City SC&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">&lt;a href="https://twitter.com/hashtag/TFCLive?src=hash&amp;amp;ref_src=twsrc%5Etfw">#TFCLive&lt;/a> scored their opener in July’s game against &lt;a href="https://twitter.com/hashtag/VamosOrlando?src=hash&amp;amp;ref_src=twsrc%5Etfw">#VamosOrlando&lt;/a> after stringing together 19 consecutive passes in under 50 seconds.&lt;br>&lt;br>Altidore’s link-up play also pulled Jansson well out of position and ultimately forced an aerial matchup against Moutinho. &lt;a href="https://t.co/wsqKem6QeM">pic.twitter.com/wsqKem6QeM&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981203553886217?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="july-3-2021--nbspnbsp-cf-montréal-1---0-inter-miami-cf">July 3, 2021  |  CF Montréal (1) - 0 Inter Miami CF&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">Side-to-side movement from &lt;a href="https://twitter.com/hashtag/CFMTL?src=hash&amp;amp;ref_src=twsrc%5Etfw">#CFMTL&lt;/a> seemed to lull &lt;a href="https://twitter.com/hashtag/InterMiamiCF?src=hash&amp;amp;ref_src=twsrc%5Etfw">#InterMiamiCF&lt;/a> to sleep in this 22-pass sequence.&lt;br>&lt;br>And Toye’s run at 40:29 enables Quioto, Mihailović, and Choinière to turn and run hard at what’s left of Miami’s defense. &lt;a href="https://t.co/LQ5AcP1vOb">pic.twitter.com/LQ5AcP1vOb&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981269597343746?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="june-19-2021--nbspnbsp-fc-cincinnati-0---2-colorado-rapids">June 19, 2021  |  FC Cincinnati 0 - (2) Colorado Rapids&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">&lt;a href="https://twitter.com/hashtag/Rapids96?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Rapids96&lt;/a> scored their goal on the road against &lt;a href="https://twitter.com/hashtag/FCCincy?src=hash&amp;amp;ref_src=twsrc%5Etfw">#FCCincy&lt;/a> after 16 consecutive passes.&lt;br>&lt;br>Cruz and Kubo simultaneously apply pressure to the ball, leaving Price with plenty of space to drive at the Cincinnati defense and play an incisive pass toward Lewis. &lt;a href="https://t.co/MkFrhm5D9A">pic.twitter.com/MkFrhm5D9A&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981323443822601?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="may-2-2021--nbspnbsp-seattle-sounders-fc-2---0-la-galaxy">May 2, 2021  |  Seattle Sounders FC (2) - 0 LA Galaxy&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">During week 3, &lt;a href="https://twitter.com/hashtag/Sounders?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Sounders&lt;/a> strung together 27 consecutive passes en route to Brad Smith’s second goal in two games.&lt;br>&lt;br>Seattle took advantage of the space out wide in their build-up, pulling the &lt;a href="https://twitter.com/hashtag/LAGalaxy?src=hash&amp;amp;ref_src=twsrc%5Etfw">#LAGalaxy&lt;/a> defense to one side while Smith floated in at the back post. &lt;a href="https://t.co/yqX6mbXVTk">pic.twitter.com/yqX6mbXVTk&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981390254977024?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="november-22-2020--nbspnbsp-minnesota-united-fc-3---0-colorado-rapids">November 22, 2020  |  Minnesota United FC (3) - 0 Colorado Rapids&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">Leading up to their third goal in the opening round of last year’s playoffs, &lt;a href="https://twitter.com/hashtag/MNUFC?src=hash&amp;amp;ref_src=twsrc%5Etfw">#MNUFC&lt;/a> completed 24 consecutive passes.&lt;br>&lt;br>A smart run from Molino ultimately draws out a &lt;a href="https://twitter.com/hashtag/Rapids96?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Rapids96&lt;/a> center back, leaving Reynoso with plenty of space to drive forward as other Minnesota attackers join. &lt;a href="https://t.co/Z4PJLJzN3B">pic.twitter.com/Z4PJLJzN3B&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981455371456518?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="november-1-2020--nbspnbsp-colorado-rapids-1---0-seattle-sounders-fc">November 1, 2020  |  Colorado Rapids (1) - 0 Seattle Sounders FC&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">This 17-pass sequence from &lt;a href="https://twitter.com/hashtag/Rapids96?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Rapids96&lt;/a> against &lt;a href="https://twitter.com/hashtag/Sounders?src=hash&amp;amp;ref_src=twsrc%5Etfw">#Sounders&lt;/a> is a great illustration of absorbing the opponent’s press and striking quickly when space opens up.&lt;br>&lt;br>Price is fortunate not to turn the ball over to Ruidíaz in a very dangerous area, but Colorado makes no mistake afterward. &lt;a href="https://t.co/iU1mfssJ2e">pic.twitter.com/iU1mfssJ2e&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981533452615684?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="october-14-2020--nbspnbsp-vancouver-whitecaps-fc-1---0-los-angeles-fc">October 14, 2020  |  Vancouver Whitecaps FC (1) - 0 Los Angeles FC&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">Another example comes courtesy of &lt;a href="https://twitter.com/hashtag/VWFC?src=hash&amp;amp;ref_src=twsrc%5Etfw">#VWFC&lt;/a> and their 19-pass sequence against &lt;a href="https://twitter.com/hashtag/LAFC?src=hash&amp;amp;ref_src=twsrc%5Etfw">#LAFC&lt;/a>.&lt;br>&lt;br>With Vancouver inviting pressure from LAFC&amp;#39;s forwards, Rossi and BWP commit, Bush breaks the first line with a pass to Bikel, Bikel finds Dájome with a long ball, and Vancouver is off to the races. &lt;a href="https://t.co/GLlMagCIeo">pic.twitter.com/GLlMagCIeo&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981598380498956?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="september-23-2020--nbspnbsp-real-salt-lake-2---0-la-galaxy">September 23, 2020  |  Real Salt Lake (2) - 0 LA Galaxy&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">Against &lt;a href="https://twitter.com/hashtag/LAGalaxy?src=hash&amp;amp;ref_src=twsrc%5Etfw">#LAGalaxy&lt;/a> last September, &lt;a href="https://twitter.com/hashtag/RSL?src=hash&amp;amp;ref_src=twsrc%5Etfw">#RSL&lt;/a> put 17 consecutive passes together before doubling their lead.&lt;br>&lt;br>After a switch from Glad, smart running from Baird and Rusnák draws out LA’s right back — leaving Rusnák with ample space and time to find Kreilach alone at the penalty spot. &lt;a href="https://t.co/yS35khkgCZ">pic.twitter.com/yS35khkgCZ&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981663123779589?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;h4 id="august-26-2020--nbspnbsp-orlando-city-sc-1---1-nashville-sc">August 26, 2020  |  Orlando City SC (1) - 1 Nashville SC&lt;/h4>
&lt;blockquote class="twitter-tweet" data-conversation="none">&lt;p lang="en" dir="ltr">And here, effective counter-pressing from &lt;a href="https://twitter.com/hashtag/VamosOrlando?src=hash&amp;amp;ref_src=twsrc%5Etfw">#VamosOrlando&lt;/a> deep in the &lt;a href="https://twitter.com/hashtag/EveryoneN?src=hash&amp;amp;ref_src=twsrc%5Etfw">#EveryoneN&lt;/a> half yields 17 consecutive passes and an equalizer from Mueller.&lt;br>&lt;br>Controlled passing in tight quarters — particularly from Méndez — and a killer first-time ball from Pereyra make the difference. &lt;a href="https://t.co/CN1LqblkuY">pic.twitter.com/CN1LqblkuY&lt;/a>&lt;/p>&amp;mdash; Tyler Richardett (@TylerRichardett) &lt;a href="https://twitter.com/TylerRichardett/status/1430981724054360066?ref_src=twsrc%5Etfw">August 26, 2021&lt;/a>&lt;/blockquote> &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8">&lt;/script>
&lt;p>&lt;em>Data: Second Spectrum&lt;/em>&lt;/p></description></item><item><title>Building a New In-Game Win Probability Model for American Soccer Analysis</title><link>https://tylerrichardett.com/blog/2021/07/19/building-a-new-in-game-win-probability-model-for-american-soccer-analysis/</link><pubDate>Mon, 19 Jul 2021 00:00:00 +0000</pubDate><guid>https://tylerrichardett.com/blog/2021/07/19/building-a-new-in-game-win-probability-model-for-american-soccer-analysis/</guid><description>&lt;p>Across recent weeks, we’ve set out to improve the performance of our in-game win probability model, while:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>starting to take each team’s strength into account, based on its performance in prior games; and&lt;/p>
&lt;/li>
&lt;li>
&lt;p>introducing more fluctuation between goal-scoring events, to better reflect teams’ chance creation throughout the game.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>In this article, we’ll cover our methods for accomplishing those goals, how we plan to use this new and improved model, and how deconstructing that model can teach us more about the conditions under which goal-scoring events occur.&lt;/p>
&lt;p>&lt;em>Read the remainder of this post on &lt;a href="https://www.americansocceranalysis.com/home/2021/7/16/we-have-a-new-win-probability-model" target="_blank" rel="noopener">American Soccer Analysis&lt;/a>.&lt;/em>&lt;/p></description></item><item><title>Proposing a Decision Support System for Scouting Talent and Weighing Competing Objectives</title><link>https://tylerrichardett.com/blog/2019/03/31/proposing-a-decision-support-system-for-scouting-talent-and-weighing-competing-objectives/</link><pubDate>Sun, 31 Mar 2019 00:00:00 +0000</pubDate><guid>https://tylerrichardett.com/blog/2019/03/31/proposing-a-decision-support-system-for-scouting-talent-and-weighing-competing-objectives/</guid><description>&lt;p>As part of a semester-long assignment for my graduate studies, I&amp;rsquo;ve been working with a group of fellow students to envision and construct a prototype &lt;a href="https://en.wikipedia.org/wiki/Decision_support_system" target="_blank" rel="noopener">decision support system (DSS)&lt;/a>. Still in development, our DSS intends to pair professional soccer players' event-level data with multi-criteria decision analysis methods&amp;mdash;resulting in a system which &amp;ldquo;assists clubs with the identification, scouting, and acquisition of world-class talent,&amp;rdquo; as we state in our initial proposal.&lt;/p>
&lt;p>Even just building out a small-scale version of this, I&amp;rsquo;ve come to learn, is an incredibly tall task.&lt;/p>
&lt;h2 id="the-data-component-and-model-design">The Data Component and Model Design&lt;/h2>
&lt;p>In order to power our system with a sample of real players' demographic information and event- and match-level statistics, I turned to the series of JSON files powering Wyscout&amp;rsquo;s platform. Our system will be primarily targeted at clubs with mid-sized budgets from the top leagues in the world, so I decided our sample of players should accurately reflect the environments in which their typical low-risk, high-reward prospects are found.&lt;/p>
&lt;p>Thus, I limited our search to the (10) European countries whose top-level competitions are just below the level of the top leagues in the world, according to &lt;a href="https://www.uefa.com/memberassociations/uefarankings/country/#/yr/2019" target="_blank" rel="noopener">UEFA country coefficients&lt;/a>: Russia, Portugal, Belgium, Ukraine, Turkey, Netherlands, Austria, Greece, Denmark, and Switzerland (\(n\) = 28 competitions). From there, I identified the players currently contracted with the clubs competing in those 28 leagues (as of February 17, 2019; \(n\) = 11,458 players). And lastly, I retrieved match-level statistics for each player, dating back to the 2015&amp;ndash;16 season (\(n\) = 486,789 cumulative matches played).&lt;/p>
&lt;p>While other attributes are sure to be considered by scouts and other decision makers, the efficacy of our model hinges on two key components: talent maximization (in the short-, medium-, or long-term) and cost minimization.&lt;/p>
&lt;p>As for the latter piece, for now, we&amp;rsquo;re relying solely on the market value provided for each player in the data set, derived from Transfermarkt&amp;rsquo;s model. (It&amp;rsquo;s worth noting that actual transfer values &lt;a href="https://nytimes.com/2017/06/22/sports/soccer/premier-league-transfers.html" target="_blank" rel="noopener">often do not reflect&lt;/a> these ones, for any number of explicable or inexplicable reasons.)&lt;/p>
&lt;p>Whether we deliver a viable proof of concept, then, falls heavily on our ability to quantify players' skill level in the most objective manner possible. Our approach, which I&amp;rsquo;ll touch on in greater detail in the next section, is sketched out in the left-hand side of the model diagram below:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://tylerrichardett.com/img/dss-functional-diagram.png" alt="Functional diagram of the decision support system and its underlying data." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Adapted from &lt;a href="https://arxiv.org/pdf/1802.04987.pdf" target="_blank" rel="noopener">an analysis carried out by Italian researchers&lt;/a>, our process for developing current skill ratings for each player in our data set is as follows:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Aggregate players' event-level data, \(p\), to the match level, \(m\), for each team, \(T\) (\(p_T^m\))&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Assign a binary outcome variable to each tuple, where \(o_T^m = 1\) represents a win for the corresponding team and \(o_T^m = 0\) represents a draw or loss for the corresponding team&lt;/p>
&lt;/li>
&lt;li>
&lt;p>On that matrix, train a classification model, such as linear support vector machines (LSVM), and extract the coefficients for each type of action&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Apply those coefficients as feature weights to the matrix containing each player&amp;rsquo;s (\(u\)) event-level data, \(p\), for each match, \(m\) (\(p_u^m\))&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The result, \(r(u,m)\), is used to determine a player&amp;rsquo;s current skill level, after undergoing the following transformations:&lt;/p>
&lt;ol>
&lt;li>Using Bayesian statistics, relative competition strength is determined and weighted into each rating value&lt;/li>
&lt;li>Each performance is normalized by the number of minutes played&lt;/li>
&lt;li>For smoothing purposes, the value used by the decision model will be a rolling average of some to-be-determined timeframe (e.g., 5 or 10 weeks)&lt;/li>
&lt;/ol>
&lt;/li>
&lt;/ol>
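&lt;p>As a rough illustration of step 4, the sketch below applies a vector of action weights to a small player-by-match matrix to produce the raw rating, \(r(u,m)\). The action names, weights, and event counts are invented for the example; they are not values from our model.&lt;/p>
&lt;pre>&lt;code class="language-r"># Hypothetical action weights, standing in for the classifier coefficients
weights &amp;lt;- c(
  shot_danger_zone = 0.8,
  key_pass = 0.5,
  loss_own_half = -0.6,
  red_card = -1.2
)

# Hypothetical event counts: one row per player-match, one column per action type
player_matches &amp;lt;- matrix(
  c(2, 3, 1, 0,
    0, 1, 4, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c(&amp;quot;player_a_match_1&amp;quot;, &amp;quot;player_b_match_1&amp;quot;), names(weights))
)

# Raw rating r(u, m): the weighted sum of a player's actions in a match
raw_rating &amp;lt;- drop(player_matches %*% weights)
&lt;/code>&lt;/pre>
&lt;p>Because the weights are signed, a match dominated by negatively weighted actions produces a negative raw rating, before any of the transformations in step 5 are applied.&lt;/p>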
&lt;p>Overall, the logic behind this approach checks out. Ultimately, a player&amp;rsquo;s inherent value lies in the degree to which he or she improves the team&amp;rsquo;s likelihood of winning any given match. So, it makes sense to first determine which types of actions help or hinder the team, and then to evaluate a player&amp;rsquo;s performance based on those findings.&lt;/p>
&lt;p>In practice, however, applying that logic proved to be far more difficult than it had seemed.&lt;/p>
&lt;h2 id="training-the-classification-model">Training the Classification Model&lt;/h2>
&lt;p>At first, I pulled match-level data from nearly 5,000 unique games&amp;mdash;yielding nearly 10,000 observations&amp;mdash;across the 28 competitions for the 2018/19 season. As best I could, I attempted to replicate the model constructed by the Italian researchers, given the variables I had available. After several futile hours, however, all I had was a dozen or more classification models, all of which predicted the same outcome for every observation.&lt;/p>
&lt;p>As such, I was forced to examine two key differences between my approach and theirs. The first was obvious, yet manageable. Their set of inputs was far more granular and had higher dimensionality; I could solve for that by aggregating the player-level data&amp;mdash;which contains 300+ types of actions&amp;mdash;up to the match level.&lt;/p>
&lt;p>The second, I believe, is a structural flaw in their approach. In their research, they explicitly lay out a sound argument for the exclusion of goals as features: The difference in the number of goals scored singularly determines the outcome of any given match. Why, then, were assists not also excluded? Goals don&amp;rsquo;t always come with assists, but assists are inherently tied to goals.&lt;/p>
&lt;p>This, I felt, boosted their model performance, and that assertion was evident in their results. The top seven positively weighted features were &lt;em>all&lt;/em> of the seven types of assists used as inputs.&lt;/p>
&lt;p>With a new set of training data, and with that added knowledge of our performance benchmark for the model, I found much greater success in my next attempt. In the feature selection process, one additional piece of insight I gathered from the previous, unsuccessful attempts was to include &lt;em>only&lt;/em> on-the-ball actions.&lt;/p>
&lt;p>As you may well imagine, this makes it particularly difficult to evaluate defensive abilities&amp;mdash;a gripe with event-level data &lt;a href="https://tylerrichardett.com/blog/2019/03/06/valuing-defensive-actions-using-offensive-possession-models">I&amp;rsquo;ve shared in the past&lt;/a>. In this particular context, including defensive actions such as blocks and clearances in the model tends to backfire for defensive players. This is because, at the match level, a greater number of these types of events typically means that a team is under siege, and thus that they are more likely to lose the match, yielding a negative coefficient. When applying that coefficient to a player&amp;rsquo;s performance, it dramatically decreases the ratings of players actively involved in thwarting opponents' attacks, which runs counter to how their overall performance impacts the outcome of the match. Defenders typically don&amp;rsquo;t have much impact on whether they are placed in those types of situations, and conversely, much of their most impactful play happens off the ball&amp;mdash;closing down valuable spaces on the field.&lt;/p>
&lt;p>I acknowledge, however, that the role of modern fullbacks and center backs is continually evolving. In this era, a greater share of the responsibility is placed on them to contribute to offensive actions&amp;mdash;igniting the start of an attacking move and generally positioning themselves higher up the pitch. And for those reasons, the model may well capture enough of what today&amp;rsquo;s scouts are seeking in emerging talents who will fill those positions.&lt;/p>
&lt;p>The new training set consisted of more than 5,000 observations of 45 input variables each, and the new LSVM model was tuned using 10-fold cross validation and a cost parameter of 1. Its performance was considerably better than previous efforts (0.71 AUC and 74% accuracy), and the resulting coefficients more closely resembled a perceived reality. Actions which most negatively impacted a match&amp;rsquo;s outcome included red cards, yellow cards, and losses in dangerous areas of the player&amp;rsquo;s own half, while some of the actions with the greatest positive values included shots from dangerous areas, opportunity creations, and actions during a counter attacking sequence.&lt;/p>
&lt;p>Content with those results, I weighted each player&amp;rsquo;s performance using the coefficients, resulting in their raw score for each match.&lt;/p>
&lt;h2 id="weighting-by-competition-strength">Weighting by Competition Strength&lt;/h2>
&lt;p>One of a scout&amp;rsquo;s greatest challenges must be projecting how a player&amp;rsquo;s performance will translate across leagues and/or national boundaries. As a result, this was another key consideration in constructing our model.&lt;/p>
&lt;p>Recently, I&amp;rsquo;ve seen an analysis such as this &lt;a href="https://www.optasportspro.com/news-analysis/blog-counting-across-borders/" target="_blank" rel="noopener">conducted by Ben Torvaney&lt;/a> for OptaPro&amp;rsquo;s 2018 Analytics Forum, and the same approach would have transferred well to our application. However, given some functional constraints&amp;mdash;namely, time and computational power&amp;mdash;we determined that it would be beyond the scope of our assignment to carry out an analysis to such an extent. That is, while our data set was large enough to have a considerable number of players transferring among different leagues, our approach would need to produce a similar outcome, while not holding constant the performance of each individual player, nor differences in age.&lt;/p>
&lt;p>Instead, for the domestic competitions for which there was enough data, I calculated each player&amp;rsquo;s average performance rating, standardized per 96 minutes on the field (the average length of the matches in our data set), and excluded instances of those who played fewer than 10 matches (i.e., 960 minutes) in each competition. Using a single competition as an index, I applied the &lt;a href="http://www.sumsar.net/best_online/" target="_blank" rel="noopener">Bayesian estimation supersedes the t-test (BEST) model&lt;/a> to determine the expected difference of means between each pair of competitions. The outputs of this model informed the competition weights later applied to each raw player performance rating.&lt;/p>
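&lt;p>The standardization and exclusion steps can be sketched in base R as follows. The data frame and its column names are hypothetical, and treating the cutoff as 960 total minutes per competition is one reading of the rule above. The BEST comparison itself is not shown.&lt;/p>
&lt;pre>&lt;code class="language-r"># Hypothetical match-level records: raw rating and minutes for each player-match
matches &amp;lt;- data.frame(
  player = rep(c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;), times = c(12, 4)),
  competition = &amp;quot;league_x&amp;quot;,
  raw_rating = c(rep(2, 12), rep(5, 4)),
  minutes = c(rep(90, 12), rep(96, 4))
)

# Total rating and total minutes per player and competition
totals &amp;lt;- aggregate(
  cbind(raw_rating, minutes) ~ player + competition,
  data = matches, FUN = sum
)

# Keep players with at least 960 minutes, then express the rating per 96 minutes
eligible &amp;lt;- totals[totals$minutes >= 960, ]
eligible$rating_per_96 &amp;lt;- eligible$raw_rating / eligible$minutes * 96
&lt;/code>&lt;/pre>
&lt;p>In this toy data, player b is dropped for lack of minutes, even though his per-match ratings are higher; that is the trade-off of requiring a minimum sample before comparing competitions.&lt;/p>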
&lt;p>Once again, the results of this analysis mostly held up against a perceived reality. My index, the Russian Premier League, was weaker in strength than Italy&amp;rsquo;s Serie A, the German Bundesliga, and the top divisions in both Argentina and Brazil, while it was far greater in strength than all of the regional and developmental leagues included in the data set.&lt;/p>
&lt;h2 id="player-positions-and-versatility">Player Positions and Versatility&lt;/h2>
&lt;p>In their analysis, the Italian researchers clustered outfield players into eight different position types, based on the locations of their on-the-ball actions. A depiction of the resulting clusters is below.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://tylerrichardett.com/img/dss-position-clusters.png" alt="Clusters of the eight different position types." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>We determined that these clusters will inform the position variable applied to our final decision model. However, given that Wyscout&amp;rsquo;s data already includes positional information for each match and player, itself informed by cluster analysis, there was no demonstrated need to draw our own positional clusters.&lt;/p>
&lt;p>Though we&amp;rsquo;ve yet to apply it, another key attribute of the Italian researchers' findings is the concept of versatility. In short, by evaluating how well and/or consistently a given player performs when assigned to different types of positions, we could feasibly quantify how versatile he or she is&amp;mdash;often a sought-after trait, particularly for younger players.&lt;/p>
&lt;h2 id="future-considerations">Future Considerations&lt;/h2>
&lt;p>Our next major task will be using the weighted performance values, derived using the steps above, to project the future skill levels of young talents. In a perfect world, we would have a data set of players robust enough to establish projections using an approach similar to the &lt;a href="https://drive.google.com/file/d/13o9yf8A6SXmMPEZWpM-_v0MS3K0aFSEo/view?usp=sharing" target="_blank" rel="noopener">one taken by Dutch company SciSports&lt;/a>. They are able to use a clustering algorithm to associate young players with established players who exhibited similar traits when they were of a similar age. An average trajectory, therefore, of that set of established players is used to quantify the young players' likely progression and ceiling.&lt;/p>
&lt;p>However, given that we have not developed such a robust data set for our prototype, we&amp;rsquo;ll likely apply a forecasting model to serve the same purpose. One major wrinkle, for which we&amp;rsquo;ve yet to come up with a solution, is sufficiently accounting for gaps in performances caused by 1) injuries or 2) off-seasons.&lt;/p>
&lt;p>Once our projections are established, that will put us in a strong position to deliver a prototype DSS which accurately weighs all factors, as defined by the user. Again, the efficacy of our model hinges on two key components: talent maximization (in the short-, medium-, or long-term) and cost minimization. By default, these two features will already be the top two when the user is prompted to rank-order the factors which matter most to them. (The user will also be prompted to state whether they are building for the present or future with each transfer, which will adjust the emphasis that the decision model places on current versus future skill levels.)&lt;/p>
&lt;p>In order to generate the set of feature weights for the decision model, the &lt;a href="https://taragonmd.github.io/2018/08/11/criteria-weights-for-decision-making-the-easy-way/" target="_blank" rel="noopener">ratio ordinal method&lt;/a> will be applied, with the decision maker&amp;rsquo;s rank-ordered selections serving as the inputs. The resulting weights will inform the &lt;a href="https://www.sciencedirect.com/science/article/pii/S0305048315001243" target="_blank" rel="noopener">Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) model&lt;/a> and help generate the final output: a rank-ordered list of optimal transfer targets. The TOPSIS model is based on the assumption that &amp;ldquo;the optimal alternative should have the shortest distance from the ideal solution and the farthest distance from the negative-ideal solution.&amp;rdquo; It was selected for its ability to handle both benefit criteria, which are maximized, and cost criteria, which are minimized.&lt;/p>
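&lt;p>As a rough illustration of the TOPSIS step, the sketch below ranks three hypothetical transfer targets on one benefit criterion (projected skill) and one cost criterion (transfer fee). The player labels, scores, fees, and weights are invented for illustration and are not outputs of our model; in practice, the weights would come from the decision maker&amp;rsquo;s rank-ordered selections.&lt;/p>

```python
import math


def topsis(matrix, weights, is_benefit):
    """Return a closeness score between 0 and 1 for each alternative (row)."""
    n_cols = len(weights)
    # Vector-normalize each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix
    ]
    columns = list(zip(*weighted))
    # Ideal: best value per criterion; negative-ideal: worst value per criterion.
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]
    scores = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)
        d_worst = math.dist(row, worst)
        # Closer to the ideal and farther from the negative-ideal scores higher.
        scores.append(d_worst / (d_ideal + d_worst))
    return scores


# Hypothetical targets: (projected skill, transfer fee in millions).
targets = {"Player A": (0.80, 12.0), "Player B": (0.70, 4.0), "Player C": (0.50, 3.0)}
scores = topsis(list(targets.values()), weights=[0.6, 0.4], is_benefit=[True, False])
ranking = sorted(zip(targets, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

&lt;p>Marking the fee column as a cost criterion is what lets a cheaper, slightly less skilled player outrank a more expensive one.&lt;/p>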
&lt;h2 id="top-performing-players-as-things-stand">Top-Performing Players, As Things Stand&lt;/h2>
&lt;p>Taking into account only each player&amp;rsquo;s current skill level&amp;mdash;based on performances between July 1, 2018 and February 17, 2019, weighted by competition strength, standardized by minutes played, and segmented by the positional clusters referenced above&amp;mdash;below are the top performers who are currently contracted with clubs in one of the 10 aforementioned countries:&lt;/p>
&lt;h4 id="left-sided-attacker">Left-sided Attacker&lt;/h4>
&lt;ol>
&lt;li>O. Sparre Klitten, AaB U19&lt;/li>
&lt;li>A. Erokhin, Zenit&lt;/li>
&lt;li>J. Okkels, Silkeborg&lt;/li>
&lt;li>G. Masouras, Olympiakos Piraeus&lt;/li>
&lt;li>Rafa Silva, Benfica&lt;/li>
&lt;/ol>
&lt;h4 id="central-forward">Central Forward&lt;/h4>
&lt;ol>
&lt;li>P. Onuachu, Midtjylland&lt;/li>
&lt;li>L. de Jong, PSV&lt;/li>
&lt;li>S. Kulish, Dnipro-1&lt;/li>
&lt;li>A. Razborov, Irtysh&lt;/li>
&lt;li>I. de Camargo, Mechelen&lt;/li>
&lt;/ol>
&lt;h4 id="right-sided-attacker">Right-sided Attacker&lt;/h4>
&lt;ol>
&lt;li>O. Tkachuk, Oleksandria U21&lt;/li>
&lt;li>E. Smyrnyi, Dynamo Kyiv U21&lt;/li>
&lt;li>Marcos Junior, UD Oliveirense&lt;/li>
&lt;li>R. van Wolfswinkel, Basel&lt;/li>
&lt;li>Z. Aboukhlal, PSV II&lt;/li>
&lt;/ol>
&lt;h4 id="left-sided-midfielderfullback">Left-sided Midfielder/Fullback&lt;/h4>
&lt;ol>
&lt;li>T. Damgaard, Vejle U19&lt;/li>
&lt;li>J. Christjansen, Lyngby&lt;/li>
&lt;li>X. Schlager, Salzburg&lt;/li>
&lt;li>D. Mironov, Leningradets&lt;/li>
&lt;li>R. van der Venne, Go Ahead Eagles&lt;/li>
&lt;/ol>
&lt;h4 id="central-midfielder">Central Midfielder&lt;/h4>
&lt;ol>
&lt;li>Alejandro Pozuelo, Genk&lt;/li>
&lt;li>K. Aliev, Khimki&lt;/li>
&lt;li>Taison, Shakhtar Donetsk&lt;/li>
&lt;li>I. Kogut, Dnipro-1&lt;/li>
&lt;li>N. Holm Pedersen, Silkeborg U17&lt;/li>
&lt;/ol>
&lt;h4 id="right-sided-midfielderfullback">Right-sided Midfielder/Fullback&lt;/h4>
&lt;ol>
&lt;li>D. de Wit, Ajax&lt;/li>
&lt;li>V. Grubeck, FC Juniors OÖ&lt;/li>
&lt;li>A. Grünwald, Austria Wien&lt;/li>
&lt;li>J. Ekkelenkamp, Ajax II&lt;/li>
&lt;li>K. Vermeulen, RKC Waalwijk&lt;/li>
&lt;/ol>
&lt;h4 id="left-sided-central-defender">Left-sided Central Defender&lt;/h4>
&lt;ol>
&lt;li>A. Plotnikov, Zenit St. Petersburg II&lt;/li>
&lt;li>M. Shorkin, Torpedo Moskva&lt;/li>
&lt;li>S. Reese, Horsens&lt;/li>
&lt;li>V. Demin, Ufa II&lt;/li>
&lt;li>Ü. Kurt, Boluspor&lt;/li>
&lt;/ol>
&lt;h4 id="right-sided-central-defender">Right-sided Central Defender&lt;/h4>
&lt;ol>
&lt;li>D. Chistyakov, Tambov&lt;/li>
&lt;li>M. Goropevšek, Volyn&lt;/li>
&lt;li>E. Mbende, Cambuur&lt;/li>
&lt;li>E. Dudikov, Sokol Saratov&lt;/li>
&lt;li>N. Havenaar, Wil&lt;/li>
&lt;/ol>
&lt;p>Seeing the inclusion of names such as that of PSV forward Luuk de Jong and recent Toronto FC debutant Alejandro Pozuelo is reassuring, and seeing a smattering of reserve and academy players is intriguing. It is either the case that these players are wildly outperforming their peers and the competition weights aren&amp;rsquo;t sensitive enough to these outliers, or that they are actually ready to make the jump up to top-tier play.&lt;/p>
&lt;h2 id="closing-thoughts">Closing Thoughts&lt;/h2>
&lt;p>In no particular order, here are some parting thoughts on the topic:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>From a scouting perspective, it is near impossible to boil down a player&amp;rsquo;s performance to a single metric&amp;mdash;especially if that metric only takes into account event-level data. Even just going through those motions on a relatively small sample of data proved to be immensely difficult.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Given that the volume of on-the-ball defensive actions has a negative correlation with match outcomes, this would not be an optimal method for evaluating defensive abilities. Incorporating some other method, as well as some tracking data, would be necessary.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>As performance data becomes more readily integrated into the day-to-day operations of clubs, one or more of the leading providers of scouting data should consider adding a decision support system to their suite of products. As with any other type of DSS, the need for internal culture shifts and system buy-in will represent the largest obstacles. But considering just how vast the talent pool is, I could foresee a well-built, enterprise-level tool such as this having abundant positive impacts on the efficiency and potency of scouting departments.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;em>Data: Wyscout&lt;/em>&lt;/p></description></item><item><title>Valuing Defensive Actions Using Offensive Possession Models</title><link>https://tylerrichardett.com/blog/2019/03/06/valuing-defensive-actions-using-offensive-possession-models/</link><pubDate>Wed, 06 Mar 2019 00:00:00 +0000</pubDate><guid>https://tylerrichardett.com/blog/2019/03/06/valuing-defensive-actions-using-offensive-possession-models/</guid><description>&lt;p>What&amp;rsquo;s in a pass? What&amp;rsquo;s in a dribble? In isolation, there&amp;rsquo;s not much.&lt;/p>
&lt;p>However, when we chain those passes and dribbles together into sequences and possessions, we can learn a lot about buildup play through event-level data. It may be that the pass which came &lt;em>four&lt;/em> actions prior to the shot was more consequential than the key pass which led &lt;em>directly&lt;/em> to the shot&amp;mdash;and new models are beginning to draw out these subtle nuances.&lt;/p>
&lt;p>&lt;strong>Flip that concept on its head, and we have a serviceable metric for defensive ability.&lt;/strong> Among other examples, we can determine when pressure scenarios are forcing the ball into less dangerous areas, which players are consistently turning the ball over at inopportune moments, and which players are shutting plays down in the most crucial moments.&lt;/p>
&lt;p>If you haven&amp;rsquo;t already, take a moment to read Karun Singh&amp;rsquo;s post on what he calls &lt;a href="https://karun.in/blog/expected-threat.html" target="_blank" rel="noopener">expected threat (xT)&lt;/a> and Derrick Yam&amp;rsquo;s post on StatsBomb&amp;rsquo;s &lt;a href="https://statsbomb.com/2019/02/attacking-contributions-markov-models-for-football/" target="_blank" rel="noopener">ball progression model&lt;/a>. Both center on offensive production and chance &lt;em>creation&lt;/em>, whereas this post will focus primarily on defensive ability and chance &lt;em>elimination&lt;/em>. Karun&amp;rsquo;s model in particular is integral to the analysis that follows, so I&amp;rsquo;ll do my best to reprise that information in the next section.&lt;/p>
&lt;h2 id="karuns-xt-model">Karun&amp;rsquo;s xT Model&lt;/h2>
&lt;p>Traditional approaches to measuring value are extensions of &lt;a href="https://www.americansocceranalysis.com/explanation" target="_blank" rel="noopener">expected goal (xG) models&lt;/a>&amp;mdash;they only recognize the actions which lead directly to optimal shooting opportunities. Motivated by a desire to credit the incisive pass that came from the visionary creative midfielder three moves prior to the goal, analyst Karun Singh &amp;ldquo;looked beyond checkmate.&amp;rdquo;&lt;/p>
&lt;p>First, the field is divided into a number of zones&amp;mdash;192, in this case&amp;mdash;and each zone holds an inherent value. That value is determined by historical event data, and it signifies the probability of a goal being scored in the current position, given that the attacking team holds possession in that zone.&lt;/p>
&lt;p>When in possession, an attacking player has two choices: move the ball or shoot. So, the value of a zone is the weighted sum of the probability of scoring after passing or dribbling the ball and the probability of scoring directly from a shot.&lt;/p>
&lt;p>For example, if an attacking player chooses to shoot from a particular zone 15 percent of the time, and shots from that position result in goals just 10 percent of the time, the weighted value of a shot is 1.5 percent, or 0.015. The remaining 85 percent of the time, an attacking player chooses to move the ball into another zone, and sequences originating from the current position lead to goals just 8 percent of the time&amp;mdash;resulting in a weighted value of 6.8 percent, or 0.068. The zone, then, holds a value of 0.083 (0.015 + 0.068).&lt;/p>
&lt;p>The original formula is detailed below, where:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>\(xT_{x,y}\) is the value of a given zone;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>\(s_{x,y}\) is the probability that an attacking player will shoot from a given zone;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>\(g_{x,y}\) is the probability that a shot from a given zone will result in a goal;&lt;/p>
&lt;/li>
&lt;li>
&lt;p>\(m_{x,y}\) is the probability that an attacking player will pass or dribble from a given zone; and&lt;/p>
&lt;/li>
&lt;li>
&lt;p>\(\sum_{z=1}^{16} \sum_{w=1}^{12} T_{(x,y) \rightarrow (z,w)} xT_{z,w}\) is the probability that a pass or dribble from a given zone will lead to a goal within the same sequence: a sum over all 192 zones of the value of each destination zone, \(xT_{z,w}\), weighted by \(T_{(x,y) \rightarrow (z,w)}\), the probability that a move from zone \((x,y)\) ends in zone \((z,w)\).&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>$$xT_{x,y}=(s_{x,y} \times g_{x,y})+(m_{x,y} \times \sum_{z=1}^{16} \sum_{w=1}^{12} T_{(x,y) \rightarrow (z,w)} xT_{z,w})$$&lt;/p>
&lt;p>Attacking players are then scored on each pass and dribble, where the value added or deducted is equal to the value of the zone in which the move ends minus the value of the zone in which the move began (i.e., \(xT_{z,w}-xT_{x,y}\)).&lt;/p>
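&lt;p>Because each zone&amp;rsquo;s value depends on the values of the zones it feeds, the formula has to be solved for every zone at once. A minimal sketch of one way to do that, by repeated substitution starting from zero, is below. It uses a toy pitch with only three zones, and every probability in it is invented for illustration; &lt;code>solve_xt&lt;/code> is a hypothetical helper name, not part of any library.&lt;/p>

```python
def solve_xt(shoot, goal, move, transition, iterations=200):
    """Iterate xT = s * g + m * sum(T * xT), starting from all zeros."""
    n = len(shoot)
    xt = [0.0] * n
    for _ in range(iterations):
        # Each pass through the loop looks one more action ahead.
        xt = [
            shoot[i] * goal[i]
            + move[i] * sum(transition[i][j] * xt[j] for j in range(n))
            for i in range(n)
        ]
    return xt


# Toy zones: 0 = midfield, 1 = wide attacking area, 2 = penalty area.
# All probabilities below are invented for illustration.
shoot = [0.00, 0.05, 0.40]   # s: chance the player shoots
goal = [0.00, 0.03, 0.15]    # g: chance a shot from the zone scores
move = [1.00, 0.95, 0.60]    # m: chance the player passes or dribbles
transition = [               # T: where a move from each zone ends up
    [0.60, 0.35, 0.05],
    [0.30, 0.50, 0.20],
    [0.10, 0.40, 0.50],
]

xt = solve_xt(shoot, goal, move, transition)
# Value credited to a pass from midfield (zone 0) into the penalty area (zone 2).
pass_value = xt[2] - xt[0]
print([round(v, 4) for v in xt], round(pass_value, 4))
```

&lt;p>The last two lines apply the scoring rule described above: a move is worth the value of the zone where it ends minus the value of the zone where it began.&lt;/p>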
&lt;h2 id="xt-adaptation">xT Adaptation&lt;/h2>
&lt;p>For the purposes of this analysis, I made two adjustments to Karun’s approach. First, I included incomplete moves as a third potential outcome. I set their value at zero, and this deflated the xT values across the board. The formula itself doesn’t change, but \(s_{x,y}+m_{x,y}=1-i_{x,y}\), where \(i_{x,y}\) is the probability that an incomplete move will occur from a given zone (rather than \(s_{x,y}+m_{x,y}=1\) from the original xT formula).&lt;/p>
&lt;p>Second, I had a burning desire for all my zones to be perfect squares, so I divided the pitch 18 ways from endline to endline and 12 ways from touchline to touchline, resulting in 216 zones. Each zone measures roughly 6.67 yards per side, or about 44.4 square yards.&lt;/p>
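&lt;p>For illustration, here is a small sketch of how an event location could be mapped to one of the 216 zones. It assumes a 120-by-80-yard coordinate system, which is consistent with an 18-by-12 grid of square zones; the function name and constants are hypothetical.&lt;/p>

```python
PITCH_LENGTH = 120.0  # yards, endline to endline (assumed coordinate system)
PITCH_WIDTH = 80.0    # yards, touchline to touchline (assumed)
N_COLS, N_ROWS = 18, 12


def zone_of(x, y):
    """Map a pitch coordinate to a zero-based (column, row) zone index."""
    # min() keeps locations exactly on the far boundary inside the last zone.
    col = min(int(x / PITCH_LENGTH * N_COLS), N_COLS - 1)
    row = min(int(y / PITCH_WIDTH * N_ROWS), N_ROWS - 1)
    return col, row


side = PITCH_LENGTH / N_COLS  # matches PITCH_WIDTH / N_ROWS, so zones are square
print(N_COLS * N_ROWS, round(side, 2), round(side ** 2, 1))
print(zone_of(60.0, 40.0), zone_of(120.0, 80.0))
```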
&lt;h2 id="applying-the-revised-xt-model-to-the-2018--19-fa-womens-super-league-season">Applying the Revised xT Model to the 2018&amp;ndash;19 FA Women&amp;rsquo;s Super League Season&lt;/h2>
&lt;p>Taking advantage of &lt;a href="https://github.com/statsbomb/open-data" target="_blank" rel="noopener">event-level data&lt;/a> generously made available for free by StatsBomb, I applied my adapted xT model to the latest FA Women&amp;rsquo;s Super League season, through mid-February. The results, below, are consistent with expectations: Possessing the ball in and around the goal area yields a higher chance of scoring. Of note, the relative probability of scoring from corners is greater in the FA WSL than it was in the &lt;a href="https://karun.in/blog/expected-threat.html" target="_blank" rel="noopener">2017&amp;ndash;18 Premier League season&lt;/a>.&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://tylerrichardett.com/img/fawsl-xt-heatmap.png" alt="Heatmap showing the zones on the field from which the most xT is generated in the 2018-19 FA Women&amp;rsquo;s Super League season (through February 19)." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;h2 id="valuing-defensive-actions-xte">Valuing Defensive Actions (xTE)&lt;/h2>
&lt;p>With values assigned to each of the 216 zones, I created value functions for each of the following actions with defensive implications:&lt;/p>
&lt;ul>
&lt;li>Blocked shots&lt;/li>
&lt;li>Clearances&lt;/li>
&lt;li>Fouls committed&lt;/li>
&lt;li>Interceptions&lt;/li>
&lt;li>Pressure situations&lt;/li>
&lt;li>Shots allowed&lt;/li>
&lt;li>Turnovers&lt;/li>
&lt;/ul>
&lt;p>Missing from this list are own goals and offside decisions. From this point forward, bemoaning a lack of tracking data will become a common theme: Without it, accurately assigning blame or credit to the relevant player(s) on own goals and offside decisions would be far too difficult.&lt;/p>
&lt;p>To keep with the trend of affixing an &lt;em>x&lt;/em> to the beginning of each advanced metric in the sport, we&amp;rsquo;ll refer to these defensive values as &lt;em>xTE&lt;/em>, or &lt;em>expected threat eliminated&lt;/em>.&lt;/p>
&lt;h3 id="blocked-shots">Blocked Shots&lt;/h3>
&lt;p>The expected threat (xT) of a shot could be derived from its expected goals (xG) value, which has already been modeled out and included in the StatsBomb data set. By blocking a goal-bound shot, a defender is effectively eliminating that threat altogether&amp;mdash;unless, of course, the ball is ricocheted toward an attacker in another position on the field, or out for a corner.&lt;/p>
&lt;p>As such, the xTE function for blocked shots is as follows, where \(xG_{Shot}\) is the expected goals value of the shot that is blocked, and \(xT_{z,w}\) is the expected threat value for the zone in which the ball is retrieved by an attacker, following the blocked shot. When the defense regains possession immediately following a blocked shot, \(xT_{z,w}\) is equal to zero. If the offense gains a corner kick as the direct result of a blocked shot, \(xT_{z,w}\) is equal to the probability of scoring in the sequence initiated by a corner kick.&lt;/p>
&lt;p>$$xTE_{\text{Blocked Shot}}=xG_{Shot}-xT_{z,w}$$&lt;/p>
&lt;h3 id="clearances">Clearances&lt;/h3>
&lt;p>The threat eliminated by way of clearances can be thought of as the reverse of the threat gained via passing or dribbling. In this context, I subtract the value of the zone to which the ball is cleared (\(xT_{z,w}\)) from the value of the zone from which the ball is cleared (\(xT_{x,y}\)).&lt;/p>
&lt;p>Positive values are the result of clearing the ball from a dangerous area to a less dangerous one, whereas negative values are the result of clearing the ball into a more dangerous area. The xTE function for clearances is as follows. When the defense regains possession immediately following a clearance, \(xT_{z,w}\) is equal to zero. If the offense gains a corner kick as the direct result of a clearance, \(xT_{z,w}\) is equal to the probability of scoring in the sequence initiated by a corner kick.&lt;/p>
&lt;p>$$xTE_{Clearance}=xT_{x,y}-xT_{z,w}$$&lt;/p>
&lt;h3 id="fouls-committed">Fouls Committed&lt;/h3>
&lt;p>Due to a small sample size of direct free kicks, I was only able to apply the following function in order to calculate the value of conceding a penalty kick. (In an ideal world, with enough data, we could use this same approach to quantitatively define what a “good foul” looks like, outside of the penalty area.)&lt;/p>
&lt;p>The xTE function for fouls committed is as follows, where \(xT_{x,y}\) is the expected threat of the zone in which the foul occurred, and \(P(G \mid DK_{x,y} \text{ or } PK)\) is the probability of scoring in the sequence initiated by a direct free kick taken in the same zone, or of scoring directly from a penalty kick.&lt;/p>
&lt;p>$$xTE_{\text{Foul Committed}}=xT_{x,y}-P(G \mid DK_{x,y} \text{ or } PK)$$&lt;/p>
&lt;h3 id="interceptions">Interceptions&lt;/h3>
&lt;p>An interception reverses possession, thus effectively eliminating all attacking threats. Therefore, the xTE function for interceptions is simply equal to the value of the zone in which the intended recipient of the pass is located (\(xT_{z,w}\)).&lt;/p>
&lt;p>$$xTE_{Interception}=xT_{z,w}$$&lt;/p>
&lt;h3 id="pressure-situations">Pressure Situations&lt;/h3>
&lt;p>A defensive player’s decision to pressure the ball is evaluated by looking at the corresponding pass or dribble attempt. As with clearances, I subtract the value of the zone in which the ball ends up (\(xT_{z,w}\)) from the value of the zone from which the movement began (\(xT_{x,y}\)).&lt;/p>
&lt;p>Positive values imply that the defensive player has forced the attacker to move the ball away from goal—either in a backward direction, or toward one of the two touchlines—while negative values imply that the defensive player’s pressure was ineffective, directly resulting in a more dangerous attacking scenario. The xTE function for pressure situations is as follows. When the defense regains possession immediately following a pressure situation, \(xT_{z,w}\) is equal to zero. If the offense gains a corner kick as the direct result of a pressure situation, \(xT_{z,w}\) is equal to the probability of scoring in the sequence initiated by a corner kick.&lt;/p>
&lt;p>$$xTE_{Pressure}=xT_{x,y}-xT_{z,w}$$&lt;/p>
&lt;h3 id="shots-allowed">Shots Allowed&lt;/h3>
&lt;p>The primary objective of this component was to identify instances in which a defender did not take an optimal position relative to the attacker they were marking, thus leading to a shot. To borrow again from Karun&amp;rsquo;s work, observe the following scenario:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://tylerrichardett.com/img/ozil-second-assist.gif" alt="An incisive ball from Arsenal&amp;rsquo;s Mesut Özil indirectly leads to a goal for Pierre-Emerick Aubameyang against Burnley." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>Guilty of ball-watching, Burnley defender Kevin Long loses sight of Arsenal striker Pierre-Emerick Aubameyang, allowing Aubameyang just enough space to receive the ball and score. As a result, Long ought to shoulder the blame.&lt;/p>
&lt;p>The xTE function for shots allowed is as follows, where \(xG_{Shot}\) is the expected goals value of the corresponding shot, and \(n\) is the number of defenders within a certain distance of the shooter. In my execution, I chose to share the blame among all defenders within 2 yards of the shooter.&lt;/p>
&lt;p>$$xTE_{\text{Shot Allowed}}= \frac{-xG_{Shot}}{n}$$&lt;/p>
&lt;p>I also chose not to place any blame on defenders caught in breakaway scenarios (i.e., when there is no defender other than the goalkeeper vertically within 10 yards of the ball). Many times, it isn&amp;rsquo;t &lt;em>that&lt;/em> defender&amp;rsquo;s turnover which leads to such a dangerous opportunity; though if it is, they will be penalized by the turnovers function. This exception isn&amp;rsquo;t captured by the function; rather, these situations were simply filtered out.&lt;/p>
&lt;p>Either of the two aforementioned parameters (i.e., distance to the shot and breakaway distance) could be adapted as needed. Regardless, this isn&amp;rsquo;t a perfect solution for identifying marking responsibilities, and it&amp;rsquo;s another situation in which the application of tracking data (to &lt;a href="http://www.lukebornn.com/papers/franks_ssac_2015.pdf" target="_blank" rel="noopener">capture defensive responsibilities&lt;/a>) would be particularly welcome.&lt;/p>
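&lt;p>To make the blame-sharing rule concrete, here is a minimal sketch of the shots-allowed function with the distance threshold exposed as a parameter. It assumes that defender locations at the moment of the shot are available as coordinates in yards; the player labels, positions, and function name are hypothetical, and the breakaway filter is left out.&lt;/p>

```python
import math


def shot_allowed_xte(xg_shot, shooter_xy, defenders, radius=2.0):
    """Split -xG equally among defenders within `radius` yards of the shooter."""
    nearby = [
        name
        for name, xy in defenders.items()
        if radius >= math.dist(xy, shooter_xy)
    ]
    if not nearby:
        return {}  # nobody close enough to share the blame
    return {name: -xg_shot / len(nearby) for name in nearby}


# Hypothetical positions (in yards) at the moment of a shot worth 0.30 xG.
defenders = {
    "Defender A": (107.0, 41.0),
    "Defender B": (106.5, 39.0),
    "Defender C": (100.0, 30.0),
}
blame = shot_allowed_xte(0.30, (108.0, 40.0), defenders)
print(blame)
```

&lt;p>Widening or narrowing &lt;code>radius&lt;/code> changes how many defenders share the penalty, which is the kind of adjustment described above.&lt;/p>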
&lt;h3 id="turnovers">Turnovers&lt;/h3>
&lt;p>Finally, turnovers were important to account for from a defensive standpoint, as they gift an attacking threat to the opposing team. Each of the three possible types—incomplete passes, being dispossessed, and miscontrols—can be applied to the following xTE function, where \(xT_{z,w}\) is the value of the zone in which the pass or dribble immediately following the turnover ends. As with shots allowed, all values are negative.&lt;/p>
&lt;p>$$xTE_{Turnover}= -xT_{z,w}$$&lt;/p>
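&lt;p>Taken together, most of the value functions above reduce to a few one-line calculations. The sketch below restates them in code and sums a handful of made-up events for a single hypothetical defender. The function names and every number are illustrative only; fouls committed and shots allowed are omitted here for brevity.&lt;/p>

```python
def xte_blocked_shot(xg_shot, xt_recovered=0.0):
    """xG of the blocked shot minus the xT of the zone where the attack recovers."""
    return xg_shot - xt_recovered


def xte_move(xt_start, xt_end=0.0):
    """Clearances and pressure situations: start-zone value minus end-zone value."""
    return xt_start - xt_end


def xte_interception(xt_target):
    """Value of the zone in which the intended recipient was located."""
    return xt_target


def xte_turnover(xt_next_end):
    """Negative value of the zone where the opponent's next pass or dribble ends."""
    return -xt_next_end


# Hypothetical events for one defender; zone values are invented.
events = [
    xte_blocked_shot(0.12),   # block, and the defense regains possession
    xte_move(0.09, 0.02),     # clearance into a less dangerous zone
    xte_move(0.03, 0.07),     # pressure that failed to stop the ball advancing
    xte_interception(0.05),
    xte_turnover(0.04),
]
total = sum(events)
print(round(total, 3))
```

&lt;p>The default of zero for &lt;code>xt_recovered&lt;/code> and &lt;code>xt_end&lt;/code> mirrors the rule that the attacking threat drops to zero when the defense regains possession.&lt;/p>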
&lt;h2 id="rating-defensive-players-in-the-2018--19-fa-wsl-season">Rating Defensive Players in the 2018&amp;ndash;19 FA WSL Season&lt;/h2>
&lt;p>The table below shows the top 20 players with the highest cumulative xTE values. Ten of the 11 teams in the league have at least one representative, with Bristol City, Liverpool, and Yeovil Town having three. Based on a quick glance at the &lt;a href="http://web.archive.org/web/20190306234431/https://www.bbc.com/sport/football/womens-super-league/table" target="_blank" rel="noopener">league table&lt;/a>, the first two check out. Bristol City and Liverpool offenses are relatively ineffective, and it has been their defensive performances that have kept them out of the relegation spot. Last-place Yeovil Town is a different story&amp;mdash;they have conceded 42 goals over 14 matches&amp;mdash;and the inclusion of these players is likely due to the opposing offenses' constant bombardment more than anything else.&lt;/p>
&lt;p>Finally, seeing the name of England and league-leading Manchester City captain, Steph Houghton, tells me I&amp;rsquo;ve done something right here.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Rank&lt;/th>
&lt;th>Player&lt;/th>
&lt;th>Team&lt;/th>
&lt;th>xTE&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>1&lt;/td>
&lt;td>Frankie Brown&lt;/td>
&lt;td>Bristol City WFC&lt;/td>
&lt;td>5.062&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2&lt;/td>
&lt;td>Sophie Elizabeth Bradley-Auckland&lt;/td>
&lt;td>Liverpool WFC&lt;/td>
&lt;td>5.01&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>3&lt;/td>
&lt;td>Gemma Evans&lt;/td>
&lt;td>Bristol City WFC&lt;/td>
&lt;td>4.46&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>4&lt;/td>
&lt;td>Stephanie Houghton&lt;/td>
&lt;td>Manchester City WFC&lt;/td>
&lt;td>4.019&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5&lt;/td>
&lt;td>Hannah Short&lt;/td>
&lt;td>Yeovil Town LFC&lt;/td>
&lt;td>3.986&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>6&lt;/td>
&lt;td>Gilly Louise Scarlett Flaherty&lt;/td>
&lt;td>West Ham United LFC&lt;/td>
&lt;td>3.341&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>7&lt;/td>
&lt;td>Kirstyn Pearce&lt;/td>
&lt;td>Reading WFC&lt;/td>
&lt;td>3.263&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>8&lt;/td>
&lt;td>Ellie Mason&lt;/td>
&lt;td>Yeovil Town LFC&lt;/td>
&lt;td>2.749&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>9&lt;/td>
&lt;td>Victoria Williams&lt;/td>
&lt;td>Brighton &amp;amp; Hove Albion WFC&lt;/td>
&lt;td>2.692&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>10&lt;/td>
&lt;td>Danique Kerkdijk&lt;/td>
&lt;td>Bristol City WFC&lt;/td>
&lt;td>2.618&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>11&lt;/td>
&lt;td>Georgia Brougham&lt;/td>
&lt;td>Everton LFC&lt;/td>
&lt;td>2.617&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>12&lt;/td>
&lt;td>Brooke Hendrix&lt;/td>
&lt;td>West Ham United LFC&lt;/td>
&lt;td>2.593&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>13&lt;/td>
&lt;td>Gabrielle George&lt;/td>
&lt;td>Everton LFC&lt;/td>
&lt;td>2.474&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>14&lt;/td>
&lt;td>Meaghan Sargeant&lt;/td>
&lt;td>Birmingham City WFC&lt;/td>
&lt;td>2.421&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>15&lt;/td>
&lt;td>Nicola Cousins&lt;/td>
&lt;td>Yeovil Town LFC&lt;/td>
&lt;td>2.256&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>16&lt;/td>
&lt;td>Millie Bright&lt;/td>
&lt;td>Chelsea LFC&lt;/td>
&lt;td>2.195&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>17&lt;/td>
&lt;td>Rhiannon Roberts&lt;/td>
&lt;td>Liverpool WFC&lt;/td>
&lt;td>2.128&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>18&lt;/td>
&lt;td>Kerys Harrop&lt;/td>
&lt;td>Birmingham City WFC&lt;/td>
&lt;td>1.783&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>19&lt;/td>
&lt;td>Magdalena Eriksson&lt;/td>
&lt;td>Chelsea LFC&lt;/td>
&lt;td>1.58&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>20&lt;/td>
&lt;td>Niamh Fahey&lt;/td>
&lt;td>Liverpool WFC&lt;/td>
&lt;td>1.565&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="accounting-for-opponents-offensive-strength">Accounting for Opponents' Offensive Strength&lt;/h2>
&lt;p>In an attempt to take these rankings one step further, I weighted the xTE value of each defensive action by the opposing team&amp;rsquo;s offensive potency. So, when a defender squares up against a consistently dangerous attack, her favorable defensive actions (such as clearances) will earn a greater positive value, but her unfavorable defensive actions (such as turnovers) will earn a greater negative value.&lt;/p>
&lt;p>First, I attempted to create an xT matrix for each team in the league, but I was unsuccessful, as the sample sizes were too small&amp;mdash;particularly for the weakest teams:&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="d-flex justify-content-center">
&lt;div class="w-100" >&lt;img src="https://tylerrichardett.com/img/fawsl-xt-team-heatmap.png" alt="Heatmap showing the zones on the field from which the most xT is generated in the 2018-19 FA Women&amp;rsquo;s Super League season (through February 19), broken out by attacking team." loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;p>To weight the opposing team&amp;rsquo;s attack, I instead used the offensive coefficients generated for each team by the &lt;a href="http://web.math.ku.dk/~rolf/teaching/thesis/DixonColes.pdf" target="_blank" rel="noopener">Dixon-Coles model&lt;/a>, the inputs for which are the current season&amp;rsquo;s results. (The &lt;code>dixoncoles&lt;/code> function in Ben Torvaney&amp;rsquo;s &lt;a href="https://github.com/Torvaney/regista" target="_blank" rel="noopener">&lt;code>regista&lt;/code> package&lt;/a> proved helpful in doing so.)&lt;/p>
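&lt;p>One simple way to express this kind of weighting is sketched below. It is an illustrative assumption, not the exact calculation behind the weighted table: each action&amp;rsquo;s xTE is multiplied by a positive multiplier for the opponent it was made against, with the multiplier standing in for that team&amp;rsquo;s offensive coefficient. The team labels, multipliers, and xTE values are all hypothetical.&lt;/p>

```python
def weighted_xte(actions, attack_strength):
    """Sum xTE values, each scaled by the strength of the attack it was made against."""
    return sum(xte * attack_strength[opponent] for opponent, xte in actions)


# Hypothetical multipliers; 1.0 would represent a league-average attack.
attack_strength = {"Team A": 1.5, "Team B": 0.8}

# One defender's actions as (opponent, xTE) pairs, also hypothetical.
actions = [
    ("Team A", 0.10),   # a clearance against a strong attack counts for more
    ("Team A", -0.04),  # and a turnover against that same attack costs more
    ("Team B", 0.10),   # the same clearance against a weak attack counts for less
]

print(round(weighted_xte(actions, attack_strength), 3))
```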
&lt;p>The revised table of top 20 players, based on weighted xTE values, is below. It includes the same 20 players, though their order has shuffled. Upward movement could be an indicator of which defenders &amp;ldquo;show up [or make fewer mistakes] in the big games.&amp;rdquo;&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Rank&lt;/th>
&lt;th>Change&lt;/th>
&lt;th>Player&lt;/th>
&lt;th>Team&lt;/th>
&lt;th>Weighted xTE&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>1&lt;/td>
&lt;td>&amp;ndash;&lt;/td>
&lt;td>Frankie Brown&lt;/td>
&lt;td>Bristol City WFC&lt;/td>
&lt;td>7.959&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>2&lt;/td>
&lt;td>▲ 1&lt;/td>
&lt;td>Gemma Evans&lt;/td>
&lt;td>Bristol City WFC&lt;/td>
&lt;td>5.991&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>3&lt;/td>
&lt;td>▼ 1&lt;/td>
&lt;td>Sophie Elizabeth Bradley-Auckland&lt;/td>
&lt;td>Liverpool WFC&lt;/td>
&lt;td>5.692&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>4&lt;/td>
&lt;td>&amp;ndash;&lt;/td>
&lt;td>Stephanie Houghton&lt;/td>
&lt;td>Manchester City WFC&lt;/td>
&lt;td>4.936&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>5&lt;/td>
&lt;td>▲ 5&lt;/td>
&lt;td>Danique Kerkdijk&lt;/td>
&lt;td>Bristol City WFC&lt;/td>
&lt;td>4.118&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>6&lt;/td>
&lt;td>▲ 8&lt;/td>
&lt;td>Meaghan Sargeant&lt;/td>
&lt;td>Birmingham City WFC&lt;/td>
&lt;td>3.928&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>7&lt;/td>
&lt;td>▲ 2&lt;/td>
&lt;td>Victoria Williams&lt;/td>
&lt;td>Brighton &amp;amp; Hove Albion WFC&lt;/td>
&lt;td>3.856&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>8&lt;/td>
&lt;td>▲ 5&lt;/td>
&lt;td>Gabrielle George&lt;/td>
&lt;td>Everton LFC&lt;/td>
&lt;td>3.84&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>9&lt;/td>
&lt;td>▼ 3&lt;/td>
&lt;td>Gilly Louise Scarlett Flaherty&lt;/td>
&lt;td>West Ham United LFC&lt;/td>
&lt;td>3.819&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>10&lt;/td>
&lt;td>▼ 3&lt;/td>
&lt;td>Kirstyn Pearce&lt;/td>
&lt;td>Reading WFC&lt;/td>
&lt;td>3.808&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>11&lt;/td>
&lt;td>▼ 6&lt;/td>
&lt;td>Hannah Short&lt;/td>
&lt;td>Yeovil Town LFC&lt;/td>
&lt;td>3.699&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>12&lt;/td>
&lt;td>&amp;ndash;&lt;/td>
&lt;td>Brooke Hendrix&lt;/td>
&lt;td>West Ham United LFC&lt;/td>
&lt;td>3.624&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>13&lt;/td>
&lt;td>▼ 2&lt;/td>
&lt;td>Georgia Brougham&lt;/td>
&lt;td>Everton LFC&lt;/td>
&lt;td>3.532&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>14&lt;/td>
&lt;td>▼ 6&lt;/td>
&lt;td>Ellie Mason&lt;/td>
&lt;td>Yeovil Town LFC&lt;/td>
&lt;td>3.474&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>15&lt;/td>
&lt;td>▲ 4&lt;/td>
&lt;td>Magdalena Eriksson&lt;/td>
&lt;td>Chelsea LFC&lt;/td>
&lt;td>2.96&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>16&lt;/td>
&lt;td>▲ 1&lt;/td>
&lt;td>Rhiannon Roberts&lt;/td>
&lt;td>Liverpool WFC&lt;/td>
&lt;td>2.791&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>17&lt;/td>
&lt;td>▲ 1&lt;/td>
&lt;td>Kerys Harrop&lt;/td>
&lt;td>Birmingham City WFC&lt;/td>
&lt;td>2.458&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>18&lt;/td>
&lt;td>▼ 3&lt;/td>
&lt;td>Nicola Cousins&lt;/td>
&lt;td>Yeovil Town LFC&lt;/td>
&lt;td>2.455&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>19&lt;/td>
&lt;td>▼ 3&lt;/td>
&lt;td>Millie Bright&lt;/td>
&lt;td>Chelsea LFC&lt;/td>
&lt;td>2.303&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>20&lt;/td>
&lt;td>&amp;ndash;&lt;/td>
&lt;td>Niamh Fahey&lt;/td>
&lt;td>Liverpool WFC&lt;/td>
&lt;td>2.288&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="closing-thoughts">Closing Thoughts&lt;/h2>
&lt;p>In no particular order, here are some parting thoughts on the topic:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Defensive statistics are difficult, and it&amp;rsquo;s no wonder that this aspect of the game continues to be under-analyzed. The type and volume of a defender&amp;rsquo;s work varies game-to-game, and it&amp;rsquo;s entirely dependent on the opponent&amp;rsquo;s actions. A defender can&amp;rsquo;t make work for herself; she can only respond to an onrushing attack. And it could be argued that the most valuable work is done entirely off the ball&amp;mdash;by occupying key spaces on the field and forcing the opponent to attack via a less-optimal channel.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For these reasons, and in the interest of time, I did not regularize the xTE metrics in the table above. Arsenal&amp;rsquo;s defensive players must be good to play for Arsenal (the only team of the 11 that is missing), of course, but this transformation will likely not ensure their appearance on these lists, either. They may rarely be put in positions to show off their talents in any of the manners listed above&amp;mdash;because their offense retains a high share of possession each match, or their defense plays a high line and those defensive actions take place in less valuable areas of the pitch, or a mixture of both.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Incorporating tracking data into models such as these will be monumental, and &lt;a href="http://www.lukebornn.com/papers/fernandez_ssac_2019.pdf" target="_blank" rel="noopener">some of that work is already underway&lt;/a>&amp;mdash;primarily on the offensive side of the ball. Again, a defender&amp;rsquo;s most valuable contributions may be instances in which her mere existence convinces offenses to try another route toward goal. That nuance is lost with only event-level data.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>xTE and xT are both one-dimensional metrics. But combining these values with some metric representing duel outcomes may be a nifty way of calculating players' overall contributions. (Think &lt;a href="http://www.82games.com/barzilai2.htm" target="_blank" rel="noopener">plus-minus stats&lt;/a> in the NBA, minus the regression.)&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>&lt;em>Data: StatsBomb&lt;/em>&lt;/p></description></item></channel></rss>