Team # 30341 Page 1 of 21
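This document is the main body of the paper submitted by team #30341 of
Renmin University of China for Problem B of the 2014 Mathematical Contest
in Modeling (MCM/ICM).
The paper was named a Finalist that year.
In the spirit of exchange and learning, we are sharing the result of many days of
hard work with fellow mathematical modeling enthusiasts and MCM participants.
Redistribution in any form is welcome, but please keep this page and credit the source.
For any questions about this paper, please write to echozito@sina.com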
Behind the Numbers: How to Find College Coach Legends
I. Introduction
1.1 Definition of Our Coach
1.2 Criteria for Coaching Greatness
1.3 Criterion Indicators
II. Data
III. Evaluation Methods
3.1 Method Selection
3.2 Assumptions
3.3 The Analytic Hierarchy Process (AHP) Method
3.3.1 Criteria Hierarchy
3.3.2 Pairwise Comparison Matrix
3.3.3 Consistency Check
3.3.4 Model Results
3.4 The Linear Programming (LP) Method
3.4.1 Mathematical Formulation
3.4.2 Model Results
3.5 The Multiple Regression Analysis (MRA) Method
3.5.1 Qualitative Response Model
3.5.2 The Model
3.5.3 Model Results
3.6 Summary
IV. Sensitivity Analysis
4.1 Discussion of Assumption 1: time does not matter
4.1.1 Big-Game Opportunities
4.1.2 Media Attention
4.1.3 Coaching Skills
4.1.4 Change of Rules
4.2 Discussion of Weights
4.3 Summary
V. The Extended Model
5.1 Gender
5.2 Sports
VI. Conclusions
I. Introduction
No matter what kind of sport we are talking about, coaches are important. Sports
competitions are as much about coaches as anything else, because no tactic is both
completely instinctive and suitable for the game. Competitive sports pursue
perfection, and nothing perfect can be achieved without scientific training,
which is where coaches matter and where good coaches distinguish themselves.
This makes it worthwhile to identify the best college coaches, which is the
aim of this paper. The task is complicated, however, because coaches’ achievements are
inseparable from their players and teams, making it hard to isolate and value coaches’
personal contributions.
One way to deal with this is to compare both absolute and relative outcomes for
each coach and evaluate them based on their long-term or even life-time average
achievements.
Before we move on to technical details, it is crucial to first define our
“coach”.
1.1 Definition of Our Coach
Lyle (2002) separates performance coaches from participation coaches. Compared
to participation coaches, performance coaches tend to have longer and more stable
relationships with athletes and are held much more responsible for the results. This
paper focuses only on such performance coaches and, more specifically, on head coaches.
Apart from training players for the college, our coaches also make hiring and
transfer decisions, determine starting line-ups and design competition tactics. They
function as teachers, advisers and leaders of the team, but they are not managers,
nor are they influenced by profitability. They are responsible only for their players and
competitions. This simplifies our model without losing much information, since their
income is highly related to these competition records.
We narrow our scope to basketball, football and hockey in the United States in the
main part and discuss how it could be extended to other sports later.
1.2 Criteria for Coaching Greatness
Next, we define coaching greatness based on the coach definition. Since coaches
have many different tasks, their greatness could be defined from many aspects. If people
are concerned only with competition results, great coaches could be defined as those
who have the best life-time game records. If their ability to bring out players’
potential, i.e. their coaching skills, is of interest, then they should be evaluated against
their predecessors. If people think great coaches should not only coach well but also
recruit well, their recruiting ranks should be considered too.
These are all about effectiveness in their area, but Becker (2009) has pointed
out that evaluating coaches based on their effectiveness only is “narrow” and “limiting”.
Since great coaches are also public figures and role models, it is also important
to assess their goodness. We interpret goodness as their reputation.
However, Lyle (2002) also questioned to what extent coaches could be held
responsible for competition results and called for more attention to formative factors,
such as the coach-athlete relationship, measured using questionnaires.
We agree with the literature on the need for a more systematic framework that
includes coach-athlete interactions, but we cannot access an adequate amount of
data of such quality. As a compromise, we consider only the following three criteria
in the main part and add a fourth criterion later.
Our great coaches are thus those who have
• impressive competition records
• excellent coaching skills
• indisputable social prestige
• healthy coach-athlete (C-A) relationships (added later)
For each criterion, we need the corresponding real-life measures.
1.3 Criterion Indicators
Both journalists and academics have done much empirical work to measure coaching
greatness. Most magazines and websites use win/loss records and
appearances in important competitions as indicators of coaches’
competition records. They also organize weekly and monthly polls to
determine the best, which can be used as the indicator for social prestige. The problem
is that we can only find poll data for football. As for basketball and hockey, the only
related data we can find are Google Search results. These represent the media attention
given to each coach for good as well as bad reasons, and thus differ from social prestige.
A notoriously bad coach could get the same amount of media attention as a famously
good coach. Our justification is that these coaches are mostly from the last century,
when information technology was still immature. Most of the search results
come from the 21st century, and bad coaches tend to be forgotten faster as time goes by.
Only truly great ones are remembered and constantly mentioned nowadays. So this
measure is not as good as polls, but it can still, to some extent, reflect coaches’ social
prestige. Since polls are also one kind of media attention, we will call this indicator
media attention hereafter.
Coaching skills should be measured relatively, so we need to compare coaches with
their predecessors and with their teams’ long-term average level using competition records.
As for the C-A relationship, we choose the popular Coach Evaluation
Questionnaire (CEQ) (Mallett, 2006) as the questionnaire used in our hypothetical
survey, and generate data based on the questions designed in the CEQ. The Leadership
Scale for Sports (LSS) (Mallett, 2006) is also widely used, but the CEQ’s criteria are
“more sport relevant” (Becker, 2009).
II. Data
Our data for Men’s College Basketball and College Football are drawn from Sports
Reference (www.sports-reference.com), the AP Poll and Google Search. The coach
variables we extract for College Basketball from Sports Reference are their names,
their average winning percentages (or win/loss ratios), their specific coaching
experience (i.e. teams, predecessors, match data, performance in NCAA games and
championships won) and the different basketball teams’ historic winning percentages.
Similar variables from Sports Reference are used for College Football, except that
participation frequencies and winning percentages in bowl games are used instead
of appearances in NCAA games and championships won.
The Associated Press (AP) has run a college football poll on both teams and coaches
since 1936, and it is the longest-running poll that awards national titles. We cannot find
the historical records of the AP Coach Poll, so we match AP poll rankings of teams to
coaches and use this variable to represent coaches’ social prestige. The AP poll consists
of both weekly results and final/yearly results, and we use the final/yearly AP poll. For
college basketball coaches, it is difficult to find a similar media index, so we use Google
Search results instead. Because this proxy is less reliable, we lower its weight in our
basic models for basketball.
Our raw data consist of 3513 basketball coaches and 2071 football coaches. We
then roughly filter the candidates. First, we delete those whose careers start after
2000 or end before 1900. Second, we remove coaches whose careers last no more than
5 years and whose number of games coached is below the mean for all coaches,
which is 164 for basketball and 60 for football. We
further raise the threshold for basketball to 500. The reason for these deletions is
that we think a coach must have participated in enough games to qualify
as a legendary coach. Third, we remove coaches whose life-time winning percentage is
below the median of all coaches and who have not taken part in any important matches
(i.e. bowl games, NCAA games). These procedures help control the amount of data we
need to collect for our models, and thus save time. In the end, we obtain a dataset of 111
basketball coaches and 109 football coaches.
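The screening steps above can be sketched in Python as follows. This is only an illustrative sketch: the `Coach` record, its field names and the sample rows are hypothetical, career length is assumed to be the last year minus the first year, and the "and" conditions in the second and third steps are read literally (a coach is removed only when both conditions hold).

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class Coach:
    # Hypothetical record layout; the field names are assumptions.
    name: str
    first_year: int
    last_year: int
    games: int
    win_pct: float
    big_games: int  # bowl-game or NCAA-game appearances


def filter_candidates(coaches, min_games=None):
    """Apply the three screening steps to a list of Coach records."""
    # Thresholds are computed over all raw coaches before any deletion.
    game_floor = min_games if min_games is not None else mean(c.games for c in coaches)
    wp_median = median(c.win_pct for c in coaches)

    kept = []
    for c in coaches:
        # Step 1: drop careers that start after 2000 or end before 1900.
        if c.first_year > 2000 or c.last_year < 1900:
            continue
        # Step 2: drop short careers with too few games.
        if (c.last_year - c.first_year) <= 5 and c.games < game_floor:
            continue
        # Step 3: drop below-median records with no important matches.
        if c.win_pct < wp_median and c.big_games == 0:
            continue
        kept.append(c)
    return kept


# Made-up sample rows, for illustration only.
sample = [
    Coach("Coach A", 1950, 1980, 900, 0.70, 12),
    Coach("Coach B", 2005, 2012, 200, 0.65, 2),
    Coach("Coach C", 1960, 1963, 80, 0.60, 1),
    Coach("Coach D", 1970, 1995, 700, 0.40, 0),
    Coach("Coach E", 1930, 1955, 600, 0.55, 3),
]
kept_names = [c.name for c in filter_candidates(sample)]
print(kept_names)  # ['Coach A', 'Coach E']
```

Passing `min_games=500` would mimic the raised basketball threshold instead of the sample mean.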
Our data for Men’s Hockey are drawn from USCHO Statistics (www.uscho.com).
The coach variables we use are their average winning percentage, their performance in
NCAA games and the championships won in regular season conferences and
postseason tournaments. We have difficulty finding coaches’ specific coaching
experience and their social prestige. Thus, we can only use the incomplete data at hand
to find the best coaches. Our final dataset consists of 91 hockey coaches who have
taken part in at least 500 games and have good records (i.e. a winning percentage over
0.4 and good performance in big games). We use these competition records to construct
eight key variables, as shown in Table 1.
Table 1. Variables and Meanings (i for individual, j for team)
Variable             Meaning
Competition Records  coach i’s life-time winning percentage (WP)
                     coach i’s participation frequency in bowl games / NCAA games
                     coach i’s winning percentage in bowl games / NCAA games
Coaching Skills      coach i’s life-time relative performance against his/her team
                     coach i’s life-time relative performance against his/her predecessor
Social Prestige      coach i’s social prestige
Other                team j’s long-term winning percentage
                     team j’s winning percentage before coach i
Except for the two coaching-skills variables, whose formulas are given in Equations (1)
and (2) respectively, the other variables can be derived directly from Sports Reference.
(1)
(2)
III. Evaluation Methods
3.1 Method Selection
Since we think coaches should be assessed on multiple criteria, including their
competition records, coaching skills and social prestige, our problem resembles a
multi-criteria decision making (MCDM) problem, which is extensively discussed in the
context of supplier evaluation and selection. According to the widely cited literature
review by Ho, Xu & Dey (2010), popular methods in this area are data
envelopment analysis (DEA), mathematical programming (MP), the analytic hierarchy
process (AHP), the analytic network process (ANP) and genetic algorithms (GA). DEA is
not suitable in our case because inputs are not specified; GA and ANP are not suitable
either, because both require match-level data that we do not have. Therefore, we choose
the other two methods, MP and AHP, for coach evaluation.
We use two methods instead of picking one because each has
its own strengths and weaknesses. For AHP, the problem is how to determine weights
for the different criteria, which often proves to be the most challenging part of MCDM
problems. Although AHP simplifies the comparison to be pairwise, that is,
between two elements at a time, it still requires us to assign numeric values for relative
importance. These values depend heavily on subjective judgments and may invite
manipulation of the results. MP avoids the need for accurate pre-estimation, but it also
offers very little room to rank the criteria. A linear programming (LP) method developed
by Ng (2008) enables us to introduce some control over the importance of our criteria
without losing the advantage of the MP method, so we will use this method.
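To make the point about subjective pairwise judgments concrete, the following Python sketch derives criterion weights from a pairwise comparison matrix and computes a consistency ratio. It is an illustration under stated assumptions, not the matrix used in Section 3.3: the entries of `pairwise` are hypothetical judgments for competition records, coaching skills and social prestige, the weights are approximated by power iteration on the principal eigenvector, and the random index values are the commonly tabulated Saaty values.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration, normalised so the weights sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [x / total for x in product]
    return weights


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    # Estimate lambda_max as the average of (A w)_i / w_i.
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # commonly tabulated Saaty values
    return consistency_index / random_index[n]


# Hypothetical judgments: records vs. skills = 2, records vs. prestige = 5,
# skills vs. prestige = 3; the lower triangle holds the reciprocals.
pairwise = [
    [1.0, 2.0, 5.0],
    [1 / 2, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
print([round(w, 3) for w in weights], round(cr, 4))
```

Changing a single judgment, for example the 5 to a 3, shifts all three weights, which is exactly the subjectivity discussed above. A consistency ratio below 0.1 is conventionally taken as acceptable.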
Both AHP and LP are nonparametric methods. The advantage of such
methods is their ability to deal with low-quality data, but the problem is that they cannot
take stochastic factors into account. Parametric methods, like multiple regression
analysis (MRA), require high-quality data, but they provide a richer set of specifications
and enable us to test our hypotheses. We will thus also use the MRA method for coach
assessment.
3.2 Assumptions
Before we build models to assess coaches, there are four assumptions that we have to
specify first.
Assumption 1: time does not matter.
We need to make a ceteris paribus assumption to ensure a stable outside
environment so that we can exclude time from our basic model. It assumes that rules,
and any other external factors that might change over time and influence coaches’
performance, remained the same over the last century. It is a strong assumption that
will be tested in the Sensitivity Analysis.
Assumption 2: coaches’ abilities do not change much during their careers, while teams’
performance is stochastic in the short term (decades) because of white noise, but
deterministic in the long run (over half a century), where it is shaped by coaches and
teams’ idiosyncrasies.
Assumption 2 is not restrictive: compared with coaches’ abilities, sports
competitions carry much more uncertainty, but by the law of large numbers
(LLN) this uncertainty tends to become unimportant in the long run. Therefore, these
uncertain variables can be viewed as white noise. Coaches are still what matters most
to team performance in the long run. The comparison between coaches’ life-time
average records and their teams’ long-term average records can shed some light on
coaches’ coaching skills.
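The law-of-large-numbers argument can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that every game is an independent win or loss with a fixed probability equal to the coach's true skill; the skill value of 0.6, the game counts and the function names are hypothetical. It measures how widely the observed winning percentage scatters around that skill for short and long careers.

```python
import random
from statistics import pstdev


def simulate_win_pct(true_skill, n_games, rng):
    """Observed winning percentage over n_games independent games."""
    wins = sum(1 for _ in range(n_games) if rng.random() < true_skill)
    return wins / n_games


def spread_of_records(true_skill, n_games, n_careers=500, seed=0):
    """Standard deviation of the observed winning percentage across
    many simulated careers of the same length."""
    rng = random.Random(seed)
    records = [simulate_win_pct(true_skill, n_games, rng) for _ in range(n_careers)]
    return pstdev(records)


short_run = spread_of_records(0.6, n_games=30)    # roughly one season
long_run = spread_of_records(0.6, n_games=3000)   # a very long horizon
print(short_run, long_run)
```

Under these assumptions the spread shrinks roughly with the square root of the number of games, so a life-time or team-level average is far less noisy than any single season.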
Assumption 3: coaches have imperishable influence on the teams they have directed
within one century.
Assumption 3 is a little strong because of the word “imperishable”, considering the
fact that in college teams most players change on a four-year basis. Our justification is
that although the players trained by coaches and the techniques used by coaches could
all be gone after their departure from the teams, the fame and traditions brought about
by the coaches have continuous feedback effects that might persist. Also, one
century is not such a long time for sports teams, since many coaches have stayed with one
team for decades. Assumption 3 suggests that even if we compare coaches’ average
records with their teams’ average records, the result does not necessarily represent these
coaches’ coaching skills, because other coaches’ contributions are also included. Instead,
we also have to compare coaches’ average records with their predecessors’ average
records. Note that, since we have assumed team performance to be unpredictable in the
short run, we cannot use a predecessor’s record from any single season directly. In order
to compare them, we have to calculate average records for the predecessors too. This
makes our already difficult data-cleaning process harder, but it is worth the effort
because we value the “coaching skills” criterion.
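The comparison described above can be sketched in Python. This is a hedged illustration and not necessarily the form of Equations (1) and (2): we assume here that an average record is a game-weighted winning percentage pooled over seasons, and that relative performance is the simple difference between two such averages. The function names and the season records are hypothetical.

```python
def pooled_win_pct(seasons):
    """Game-weighted winning percentage over a list of (wins, losses) seasons."""
    wins = sum(w for w, _ in seasons)
    games = sum(w + l for w, l in seasons)
    return wins / games


def relative_performance(coach_seasons, reference_seasons):
    """Assumed measure: the coach's pooled winning percentage minus that of
    a reference group (the team's long-run record or the predecessors)."""
    return pooled_win_pct(coach_seasons) - pooled_win_pct(reference_seasons)


# Made-up (wins, losses) records, for illustration only.
coach = [(25, 5), (28, 4), (20, 10)]
predecessors = [(12, 16), (15, 13), (10, 18)]

vs_average = relative_performance(coach, predecessors)
vs_last_season = relative_performance(coach, predecessors[-1:])
print(round(vs_average, 3), round(vs_last_season, 3))
```

In this made-up case, comparing against only the predecessor's final season overstates the improvement relative to comparing against the pooled average, which is why the averages are computed for predecessors as well.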
Assumption 4: all the good basketball coaches are in the National Collegiate
Athletic Association (NCAA).
This assumption enables us to consider only NCAA teams and reduces the amount
of data we need to collect. It is acceptable because NCAA is the most influential
national association in the United States, dedicated to safeguarding the wel