The night before the 2008 MLS SuperDraft, the San Jose Earthquakes’ then-head coach Frank Yallop and general manager John Doyle were holed up in an old brick warehouse in Oakland, hidden beneath the shadow of an overpass. This is the unassuming headquarters of Match Analysis, a leading company in the field of data analysis in soccer.
Although the Earthquakes are not commonly lauded as a tactically progressive MLS side, there was once a time when they were at the forefront of statistical analysis in soccer. Even to this day, the club remains among the growing wave of organizations using data analytics to inform their decisions on and off the pitch.
In fact, one could argue that the team only exists in its current form because of the Oakland A’s Billy Beane, the subject of the seminal book “Moneyball,” who sought to apply baseball’s analytics to other team sports, specifically soccer. In 2006, Beane convinced the owners of the A’s to purchase the Quakes and resurrect the defunct franchise.
Beane partnered with Bill Gerrard, one of the leading figures in data analysis in sports, to develop a proprietary system of evaluating soccer games and soccer players, a task they called “Statistical Performance Analysis.” The Quakes were to be their test subject.
Over three years, between 2007 and 2010, Beane and Gerrard provided statistical oversight to the team and used analytics to inform its player recruitment strategy. A major part of that effort was the MLS SuperDraft.
And that, in a roundabout way, is how Yallop and Doyle ended up in Match Analysis’ headquarters ahead of the 2008 SuperDraft. Match Analysis serves clients worldwide, including the USMNT and the German national team, and its platforms give coaches the ability to break down every segment of the game, from sprints to passes, shots, and touches.
Come draft day, the final outcome of their work was Shea Salinas, their first-round draft pick, 15th overall, from Furman University.
Seven years later, when I happened into a conversation with Bill Gerrard at a Sporting Analytics Conference in London, he still remembered Salinas. Although the details of Gerrard’s work with the Quakes remain under wraps, Salinas was the one name mentioned in our short conversation.
The winger spent only two years in his first stint with the Quakes and never managed to hold down a starting position, even though the statistics showed he was one of the team’s most productive performers per minute. Perhaps coach Yallop didn’t favor the Beane and Gerrard analysis over old-school soccer instincts; indeed, the organization opted to move in a different direction in 2010.
But the old story piqued my interest again last month, when students from Columbia University and Harvard University, working with analytics firm OptaPro, used data science techniques to assemble the ultimate MLS roster as a part of Columbia’s Masters in Sports Management program. They split into four teams, and the winning roster was featured on MLSSoccer.com. It was chosen by a panel of five judges that included David Lee, Director of Player Recruitment for NYCFC, and Lucy Rushton, Head of Technical Recruitment and Analysis for Atlanta United FC.
Intriguingly, four Quakes players made the winning twenty-man squad — Quincy Amarikwa, Marvell Wynne, David Bingham and Salinas — more than any other team.
So what makes Salinas stand out when he isn’t even a consistent starter for the Quakes? I put the question to the winning team of Columbia students.
“Shea Salinas was the 21st ranked midfielder in the league in our overall performance ratings, which were built around an expected goals model,” the students responded. “This was good enough to earn him a role on our roster as a backup attacking midfielder in our 4-3-3, a position we strongly believe he will fit nicely.
“Shea stands out for his ability to create chances at a high rate per minute played, which by a large margin was the biggest performance factor that led to his inclusion on our team, in addition to his pace and low risk of missing games. We believe his skill set will fit in well with the players that we built our team around – a foundation of accurate passers and strong finishers.”
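To see why a rate "per minute played" can flatter a frequent substitute, here is a minimal sketch of the usual per-90 normalization. The function name `per_90` and all of the numbers are hypothetical placeholders for illustration; they are not Salinas's statistics and not the students' actual model.

```python
def per_90(count: float, minutes: float) -> float:
    """Scale a raw count (e.g. chances created) to a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return count * 90.0 / minutes


# Hypothetical players, invented purely to illustrate the arithmetic.
# A regular starter: more chances in total, but over many more minutes.
starter_rate = per_90(count=40, minutes=2700)

# A frequent second-half substitute: fewer chances, far fewer minutes.
substitute_rate = per_90(count=20, minutes=900)

print(f"starter:    {starter_rate:.2f} chances per 90")
print(f"substitute: {substitute_rate:.2f} chances per 90")
```

In this made-up case the substitute creates half as many chances in total but does so at a higher rate, which is the kind of signal a per-minute rating would pick up.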
Although frequently appearing as a second-half substitute, Salinas has played in sixty-three consecutive MLS matches, almost unheard of for an outfield player. This season, the winger has supplied the pace the Quakes otherwise lack in midfield, and his crossing style of play is tailor-made for a talented poacher like Chris Wondolowski.
Practical considerations also “weighed heavily” in the students’ analysis. “When we combined the factors of his relatively low salary ($148,333 in guaranteed compensation in 2015) and the fact that his citizenship status wouldn’t cost our team one of only 8 international spots we felt he was a great value as a backup on our roster,” the students said.
“Unfortunately, Shea is twenty-nine (29) years old, which is on the wrong side of our aging curve for midfielders, but since we are not intending to ask him to play 90 minutes per game we believe he can be a real asset. For context, to us [Quincy] Amarikwa is an even better value by a large margin (and a no-brainer for inclusion on our roster). Amarikwa is our 9th ranked forward and only received $100,000 in 2015.”
Perhaps, then, the takeaway has much to do with the Quakes’ recruitment system and how they have successfully rebuilt since a catastrophic 2014 season. Recently, the team started using the widely used program Wyscout for player recruitment and Catapult, a GPS tracking system, for on-the-pitch fitness insights.
Midfielder Matias Perez-Garcia, whom the Quakes signed in 2014, is actually ranked higher than Salinas in the students’ expected goals model, but practical considerations kept him off their final roster.
The analysis also confirmed in statistical terms that the Quakes have struck gold in goalkeeper Bingham. Although the decision to let go of veteran keeper Jon Busch at the end of 2014 was controversial, the Bingham move has paid handsome dividends.
The students’ study found that Bingham was the fourth-best goalkeeper in the league, behind Jesse Gonzalez, Bill Hamid, and Nick Rimando. However, Hamid and Rimando had prohibitive salaries and Gonzalez’s sample size was too small to warrant the starting position on their final roster.
It is difficult to separate a goalkeeper’s individual performance from the rest of his team simply based on statistics, but the winning team’s model for doing so involved finding “the repeatable performance factors which would predict the fewest actual goals allowed relative to expected goals allowed.”
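The quantity the students describe, actual goals allowed relative to expected goals allowed, can be sketched as a simple difference. This is only an illustration of that target metric, not the students' predictive model, which is not published here. The class `KeeperSeason`, the function `goals_prevented`, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KeeperSeason:
    name: str
    goals_allowed: int
    # Assumed to be the summed expected-goals value of the shots faced.
    expected_goals_allowed: float


def goals_prevented(keeper: KeeperSeason) -> float:
    """Positive means the keeper conceded fewer goals than the shots implied."""
    return keeper.expected_goals_allowed - keeper.goals_allowed


# Hypothetical seasons, invented for illustration only.
keepers = [
    KeeperSeason("Keeper A", goals_allowed=30, expected_goals_allowed=34.5),
    KeeperSeason("Keeper B", goals_allowed=28, expected_goals_allowed=27.0),
    KeeperSeason("Keeper C", goals_allowed=40, expected_goals_allowed=38.0),
]

# Rank from most to fewest goals prevented.
ranked = sorted(keepers, key=goals_prevented, reverse=True)
for keeper in ranked:
    print(f"{keeper.name}: {goals_prevented(keeper):+.1f}")
```

Note that Keeper B concedes the fewest goals in this made-up data yet ranks below Keeper A, because the comparison is against the quality of shots each keeper faced rather than the raw total.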
Although some were skeptical about the lack of big-name players on the winning roster, the team have a logical and rather compelling explanation: they simply found no correlation between wage bill and performance in MLS.
They said: “To us, this was an indicator that we did not want to build our team around high-priced Designated Players (DP’s). It seems that most teams use the DP spots more as a means to sell tickets than to win games, whether they do so intentionally or not.
“That said, a quick look at our performance ratings for forwards, based on a combination of stats that predict expected goals plus expected assists per 90 minutes, shows that most of the top performers are in the DP salary range. We were able to add the top striker and the top winger on our risk-adjusted final rankings to our roster by using two of our three DP slots, and even better, we didn’t break the bank like we would have with Villa and Giovinco (who each individually make more than our entire team).”
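The claim of no correlation between wage bill and performance is the kind of thing that can be checked with a Pearson correlation coefficient. The sketch below shows the calculation under stated assumptions: the wage bills and points totals are hypothetical placeholders, not real MLS figures, and the students may well have used a different performance measure or method.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical team data, invented for illustration only.
wage_bills_musd = [3.5, 4.0, 5.5, 7.0, 12.0, 20.0]  # total wages, $ millions
season_points = [50, 42, 55, 44, 51, 47]            # league points

r = pearson(wage_bills_musd, season_points)
print(f"correlation between wage bill and points: {r:.3f}")
```

A value of `r` close to zero would be consistent with the students' finding, while a value near 1 would suggest that spending and results move together. With only a handful of teams, any such figure should be read cautiously.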
Much like the Quakes. Their wage bill is in the lower half of the league and they take a tentative approach to Designated Players. Instead of splashing out on big-name DPs, the Quakes clearly target Central and South American markets to find hidden gems like Alberto Quintero, Anibal Godoy, and nineteen-year-old defensive midfielder Matheus Silva.
The Quakes, following in the footsteps of the ever-prudent Oakland A’s, tend to steer clear of the saturated European markets where the MLS salary cap doesn’t take you very far. They have just two European-born players, full-back Jordan Stewart and midfielder Simon Dawkins, and only Dallas have fewer in MLS. Bar the occasional wayward signing, the Quakes have succeeded in picking up talented players on the cheap. It’s not as galvanizing an approach for the fan base or the media spotlight, but it is certainly more stable in the long term.
However, this approach is not without its drawbacks. Its flaws could become painfully exposed this summer during the Copa América Centenario, given that Dawkins, Quintero, Godoy, Bingham and potentially even Wondolowski will all be missing due to their international commitments. In MLS, only the Seattle Sounders will be dealt a bigger blow by international tournaments this summer.
That doesn’t mean the Quakes organization are sitting on their hands. They are looking to move ahead with plans for an academy complex in San Jose and are also reportedly in talks with USL side Reno 1868 regarding a potential affiliation. Development has begun on the land adjacent to Avaya Stadium and the 2016 MLS All-Star Game is right around the corner. Over the offseason, they also hired a scouting and performance analyst, Vassili Cremanzidis, formerly of the Montreal Impact.
Although the Quakes have been reluctant to spend big in the transfer market, maybe the investment is simply being made in more subtle ways.