CRS Pilot, DUPR and KPR - 6/17/23

Updated: Aug 9, 2023

We completed a practice week (June 6, June 8) and the first official week of our trial. We learned a lot about how Computerized Ratings work. There are positives, and of course there are negatives. We do not anticipate there will be a perfect solution.


We are testing the DUPR ratings system. We learned this week that DUPR is changing their formula, and with any change there are positives and negatives. Here are some of the changes.

  • DUPR is essentially using the UTPR-style calculation used by USA Pickleball. A maximum of 0.10 ratings points is available in each match. If a heavy favorite wins, expect < 0.01 ratings points to be exchanged. If a dramatic underdog wins, expect > 0.09 ratings points to be exchanged.

  • DUPR is assigning an introductory rating of 3.50 for new players. This will cause significant challenges for a period of time … a true 5.00 player will need 30ish matches to move up toward 5.00 … a true 2.50 player will need to lose against better players 10-20 times to settle into a 3.00 rating.

  • Matches entered via phone, on-site, will count 50% toward a ratings adjustment. This means that a maximum of 0.05 ratings points can be exchanged when scores are entered by players via phone.

  • Matches entered by a club administrator will count 100% toward a ratings adjustment.

  • PebbleCreek is now a Digital Club within DUPR. Go to mydupr.com and search for PebbleCreek Pball Club – Competitive. You can request to join the Club and one of our approved administrators will accept your request.

  • All DUPR players who submitted a DUPR-ID to me will be uploaded to DUPR for Club Membership in the next 1-2 days.

  • I now have permission to enter scores on your behalf. No more entering scores via phone.

  • We have a handful of groups playing Open Play games and submitting DUPR scores. Please be honest about the scores/results you choose to enter. Our Club Members have had the ability to enter scores via DUPR for the past 2-3 years. We just have more players testing the system on their own now.

  • We had an instance where somebody was attempting to manipulate the system. Having an administrator enter scores should eliminate attempts at cheating.

  • Court Reserve and DUPR have entered into a partnership. In the future, you should be able to see your DUPR within Court Reserve. You will see your color, your DUPR, and potentially other fields as well.
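
As a quick check on the entry-method bullets above, here is a minimal sketch of the cap on points exchanged per match. The 0.10 maximum and the 50% / 100% weights come from this post. The function name and the dictionary are my own, for illustration only, and are not part of DUPR.

```python
# Maximum rating points available in one match, as reported in this post.
MAX_POINTS_PER_MATCH = 0.10

# Share of a match that counts, by how the score was entered (from the bullets above).
ENTRY_WEIGHT = {"phone": 0.50, "administrator": 1.00}


def max_points_exchanged(entry_method: str) -> float:
    """Largest rating change one match can produce for this entry method."""
    return MAX_POINTS_PER_MATCH * ENTRY_WEIGHT[entry_method]


print(max_points_exchanged("phone"))          # 0.05
print(max_points_exchanged("administrator"))  # 0.1
```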

We will enter all scores into the Court Reserve ratings system beta test when their system becomes available.


DUPR Participation

The original intent of our pilot was to test DUPR. Roughly half of our participants provided me with a DUPR ID. About 75% of that audience is willing to have their scores submitted to DUPR. About 4% of the 75% acted questionably, seemingly submitting winning scores but not submitting losing scores. As of today, I will attribute the latter to mistakes made entering game results via phone.


Going forward, an administrator will enter scores into DUPR … no more entry of scores by our players during the pilot. Players are welcome to submit scores from their own open play games provided all four players agree to submit the score before the game begins.


We made accommodations for players who wanted to test a Computerized Rating System but did not wish to put their DUPR at risk for a pilot program. We created a rule that all four players must agree to DUPR for the game to be submitted to DUPR. The result of the compromise is that between 15% and 25% of games count toward DUPR, with almost no matches among players with DUPRs > 3.75 counting. It’s hard to learn from a pilot program if most games do not count.


We will replace our “partner-centric” events on June 27/29 with a DUPR-only event. To play on June 27/29, the player must be willing to have the administrator enter scores for all games. I fully expect reduced participation. I also expect that we will capture better data. We will evaluate where our pilot is at during the off-week of July 4.


Weekly Matchups

I received many comments about court assignments for our matches. Here is a sampling of the feedback. All of the bullet points below are actual comments.

  • “This club put me in the wrong color level and now you are putting me on bad courts. The club does not respect me.”

  • “My partner tanked me and now I have to play on a bad court because my partner played poorly. This system doesn’t work.”

  • “You assign me bad partners. Why don’t you like me?”

  • “I moved up from Court 7 to Court 4. Please adjust my color level.”

  • “You put the better players on Courts 1/2/3 and now I can’t play against them. That is unfair.”

  • “You are just replicating ladders, and ladders are stupid.”

  • “These matchups are a waste of my time.”

  • “I won’t play in this format in the winter if this is where our Club is headed.”

  • “USA Pickleball tried using an ELO-based UTPR formula and it failed, so you are going to fail too.”

Allow me to remind you what we are trying to accomplish. We want to see if a computerized ratings system is appropriate for our Club. To do that, I have to create strange matchups, so that I can see how a computerized rating system handles strange matchups. The “King of the Hill” format creates very odd matchups, as you have now experienced. That is the purpose of the format … odd matchups.


On purpose, I am placing players on strange courts against strange opponents, so that I can see how a computerized ratings system works.


We are not evaluating YOUR performance. It does not matter if you lose three straight games and end up on a court you don’t think you should be playing on. On that court, do your best to win, and then the computer has a new and interesting data point to evaluate. That is the point.

  • We are not evaluating your ability.

  • We are evaluating the ability of a computer to evaluate your results.

Every week, I will create odd court assignments, in an effort to create odd situations. The assignment is not a reflection of your ability, and it does not reflect my perception of your skills. The court assignment is designed to allow us to learn the most we can about a Computerized Rating System.


A Third Rating Calculation to Give Us Immediate Feedback

Because DUPR is now using the UTPR-style of ratings calculation, I can simulate how DUPR (or Court Reserve) ratings are likely to change over time as matches are entered. To do this, I created a third system, called “KPR” (you can pronounce it as “Caper”). I am starting with your initial color rating (Aqua = 4.50, Burgundy = 4.25, Green = 4.00, Indigo = 3.75, Maroon = 3.50, Orange = 3.25, Purple = 3.00, Red = 2.75, Teal = 2.50). From here, each match applies the Elo-based calculation that DUPR is using (probably not applied perfectly, but more than good enough to demonstrate whether computerized ratings work).


Let’s say you are a 3.00 player, paired with a 3.00 player. You are playing against a 3.00 player and a 3.50 player. Your average is 3.00. Your opposition is 3.25. You are supposed to lose this match. Based on the table below, each game has a certain number of ratings points available.


The team with an advantage of 0.25 (the 3.50 player paired with a 3.00 player) should have a 90% chance of winning. If they win, each player adds 0.01 points to their rating (the 3.50 player becomes a 3.51 player, the 3.00 player becomes a 3.01 player). Each player on the losing team would lose just 0.01 points (going from 3.00 to 2.99).


The team with a disadvantage earns many more ratings points if they win. If they win, they add 0.09 ratings points (going from 3.00 to 3.09). If they win, the “better” team loses 0.09 ratings points (going from 3.50 to 3.41 and going from 3.00 to 2.91).


This is how most computerized ratings systems work (this is the exact formula used by USA Pickleball for UTPR in tournaments – no difference at all).
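
For those who like to see the arithmetic, here is a minimal sketch of the example above. It assumes the usual Elo-style rule where the winners gain (and the losers give up) the maximum points multiplied by the chance the winners were supposed to lose. That rule reproduces the 0.01 and 0.09 figures in this post, but it is my simplification, not the exact DUPR or UTPR code. The 90% win chance is taken from the post rather than calculated, and the function names are made up for illustration.

```python
K = 0.10  # maximum rating points available in one match (from this post)


def team_average(rating_1: float, rating_2: float) -> float:
    """Average rating of a doubles team."""
    return (rating_1 + rating_2) / 2


def points_exchanged(winner_win_probability: float, k: float = K) -> float:
    """Points each winner gains and each loser gives up (assumed Elo-style rule)."""
    return k * (1.0 - winner_win_probability)


your_team = team_average(3.00, 3.00)   # 3.00
opponents = team_average(3.00, 3.50)   # 3.25
p_favorite = 0.90  # taken from the post for a 0.25 gap, not derived here

favorite_wins = points_exchanged(p_favorite)        # favorites win as expected
upset = points_exchanged(1.0 - p_favorite)          # underdogs win

print(round(favorite_wins, 3), round(upset, 3))  # 0.01 0.09
```

The closer the match is expected to be, the closer both outcomes get to 0.05 points.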



How Did “KPR” Evaluate Players?

Each player is assigned an initial court, with the intent of creating a handful of scenarios that allow us to see how “KPR” (which now mirrors DUPR) handles difficult circumstances. Some of our Pioneers were placed in challenging situations.


One player was required to play in a round robin and was scheduled to play against better competition.

Game 1: Opponent = 4.125. Petri Dish Experiment = 3.750. Petri Dish Experiment loses, subtracting 0.003. New KPR = 3.747.

Game 2: Opponent = 3.750. Petri Dish Experiment = 4.130. Petri Dish Experiment wins, adding 0.003. New KPR = 3.750.

Game 3: Opponent = 4.410. Petri Dish Experiment = 4.000. Petri Dish Experiment loses, subtracting 0.010. New KPR = 3.740.

Game 4: Opponent = 4.030. Petri Dish Experiment = 3.686. Petri Dish Experiment loses, subtracting 0.004. New KPR = 3.736.


The player in our “Petri Dish Experiment” didn’t have a lot of fun, but the rating dropped by only 0.014 … essentially no change. We demonstrated that when you play people way better than you, your rating is not harmed. Play the games, battle, and if you win in an upset your rating will increase nicely. If you lose? No worries.
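
If you want to follow the four games step by step, here is a small sketch that replays the changes in order. The changes for Games 1, 2 and 4 are stated above. The Game 3 change is inferred from the KPR moving from 3.750 to 3.740. The function name is mine, for illustration only.

```python
def replay(start: float, deltas: list[float]) -> list[float]:
    """Apply per-game rating changes in order and return the running ratings."""
    ratings = [start]
    for delta in deltas:
        ratings.append(round(ratings[-1] + delta, 3))
    return ratings


# Per-game KPR changes for the round robin player.
# Game 3 (-0.010) is inferred from 3.750 -> 3.740.
deltas = [-0.003, +0.003, -0.010, -0.004]

history = replay(3.750, deltas)
print(history)                              # [3.75, 3.747, 3.75, 3.74, 3.736]
print(round(history[0] - history[-1], 3))   # 0.014
```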


What Else Have We Learned About Computerized Ratings?

One of the fears about Computerized Ratings is that “better players won’t participate, they don’t want to risk their stature, and that means my computerized rating won’t increase.” Anybody watching the better players play during the first two weeks can see this happening – there is a dearth of Burgundy/Green players participating. My response is consistent … “we don’t need a specific distribution of players to prove that computerized ratings work”.


One of the players in my petri dish was assigned a court where the player could not break into our group of best players – I put up a firewall so the player could not advance and play better players. This was done on purpose.


This player went on to win all four games, all against players this person was theoretically better than. Let’s see how the KPR (pronounced “Caper”) performed.

  • 3.750 beginning KPR

  • 3.858 ending KPR

The experiment WORKED! The player won four (4) games out of 4, and the KPR increased nicely … nearly half-way to the next color level.


There is a phrase used by players who felt cheated by prior ratings systems … “tick tock, the game is locked”, meaning that players are being locked out of higher color levels. Our computerized rating experiment demonstrated that even when the player is locked out, the rating increased and the player would earn a spot at a higher level over time. The player only earned 0.027 per game (because the player was playing individuals with lower KPRs), but the computer helped this player regardless. The player would eventually play against better competition.
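
Here is the arithmetic behind the “locked out” player, as a short sketch. The starting and ending KPR come from this post, and the 0.25 gap between color levels comes from the starting table of color ratings. The function name is my own.

```python
COLOR_STEP = 0.25  # gap between adjacent color levels in the KPR starting table


def summarize(start: float, end: float, games: int) -> tuple[float, float, float]:
    """Return (total gain, gain per game, share of one color level)."""
    gain = end - start
    return round(gain, 3), round(gain / games, 3), round(gain / COLOR_STEP, 2)


total, per_game, share_of_level = summarize(3.750, 3.858, games=4)
print(total)           # 0.108
print(per_game)        # 0.027
print(share_of_level)  # 0.43
```

Four wins against lower-rated opponents covered a little over 40% of the distance to the next color level.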


Did The Computer Fail?

I purposely placed a player on a court several colors above the pay grade of the player. What happened?

  • 1 win, 4 losses

  • KPR changed from 3.250 to 3.126

This isn’t technically a failure, but it does reveal something about computerized ratings … ratings systems struggle to properly deal with “context.” This person lost 0.124 ratings points because I assigned the player (on purpose) to the wrong court. The player, in essence, lost a half-color-level because of a “mistake.”


In other words, it is important to have a good “starting point.” A computer will fix a poor starting point over time, but that won’t be fun for the player. This will be a challenge with the DUPR system now automatically starting everybody at 3.50. We have tournament players who have worked hard to earn, say, a 3.30 rating … it will not be fun to have matchups where a new (and inferior) player starts at 3.50 against a player who earned a 3.30 rating via tournaments. Over time – this will self-correct. But it won’t be an easy/fast process.


The Color Levels Are Horrible, Correct?

Incorrect. Color levels are more accurate than you have been led to believe. Yes … we learned that players have been incorrectly placed and we have already learned that a computerized ratings system can correct injustices.


We had a Teal player play five games, starting among other Teal players. The results?

  • 4 Wins, 1 Loss

  • KPR improved from 2.500 to 2.805

This person would have moved up to Red level.

This person was competitive against Orange players at the end of the day.

It would likely take one month +/- for this player to progress through color levels before landing where this player would be playing with equal players.
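
To show how a KPR could translate back into a color level, here is a minimal sketch. The starting values per color are the ones listed earlier in this post. The rule itself (you hold the highest color whose starting value is at or below your KPR) is my assumption for illustration, not an official Club policy.

```python
# Starting KPR for each color level, as listed earlier in this post.
COLOR_BASE = {
    "Aqua": 4.50, "Burgundy": 4.25, "Green": 4.00, "Indigo": 3.75, "Maroon": 3.50,
    "Orange": 3.25, "Purple": 3.00, "Red": 2.75, "Teal": 2.50,
}


def color_for(kpr: float) -> str:
    """Highest color whose starting value is at or below kpr (assumed rule)."""
    eligible = [(base, color) for color, base in COLOR_BASE.items() if base <= kpr]
    if not eligible:
        return "Teal"  # nothing below the lowest level
    return max(eligible)[1]


print(color_for(2.500))  # Teal
print(color_for(2.805))  # Red
```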


We were able to demonstrate that a Computerized Rating System can correct injustices. If you feel like raters have not been fair to you, a Computerized Rating System can help correct an injustice. But you must do your part … you must win your games.


But The Color Levels Are Horrible!

Incorrect. Overall, the color levels represent skill differences. I can prove that there are actual skill differences between color levels (i.e. an average Maroon player will beat an average Orange player) by comparing the difference in KPR values across the first two days of our Pilot. If color levels are inaccurate, players should have equal chances of winning regardless of color level. A RED player should win half of games against an ORANGE player if the color levels are incorrect.


Here are the results from the first two days of play. I calculated how often an UPSET happened, meaning a lower-rated team beat a higher-rated team. In other words, if two ORANGE players are playing against two MAROON players, the ORANGE players have a KPR that is 0.25 less than the MAROON players … and therefore the ORANGE players “should” lose (unless color levels are inappropriate).

  • Equal KPRs = 50% Chance of Winning

  • 0.01 to 0.10 Worse KPR = 46% Chance of Winning (26 games)

  • 0.10 to 0.25 Worse KPR = 38% Chance of Winning (32 games)

  • 0.25 to 0.50 Worse KPR = 21% Chance of Winning (24 games)

  • 0.50 to 0.75 Worse KPR = 20% Chance of Winning (5 games)

In other words, the color levels did a good job of separating players based on quality. From there, the computer (the “KPR” in this instance) was able to move people up who were the victim of an injustice or had a positive day. The KPR moved people down who were in a color level that was too high for the player, or moved players down who had an unlucky day. But make no mistake – the color levels predict game outcomes on a reliable basis. Purples beat Reds, Maroons beat Oranges, Greens beat Indigos.
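
Here is a short sketch that rolls the upset figures above into one overall number. The win rates and game counts are copied from the list. The “Equal KPRs” line is left out because no game count is given for it. The percentages in the list are rounded, so treat the result as approximate.

```python
# (KPR gap, lower-rated team's win rate, games played), copied from the list above.
buckets = [
    ("0.01 to 0.10", 0.46, 26),
    ("0.10 to 0.25", 0.38, 32),
    ("0.25 to 0.50", 0.21, 24),
    ("0.50 to 0.75", 0.20, 5),
]


def overall_upset_rate(rows) -> float:
    """Game-weighted share of games won by the lower-rated team."""
    total_games = sum(games for _, _, games in rows)
    upsets = sum(rate * games for _, rate, games in rows)
    return upsets / total_games


total_games = sum(games for _, _, games in buckets)
overall = overall_upset_rate(buckets)

print(total_games)        # 87
print(round(overall, 3))  # 0.347
```

Across these 87 games the lower-rated team won roughly one game in three, and the rate falls as the KPR gap widens.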


We had two Aqua players help us … I'm so appreciative that they supported us. They won 8 games and lost just one (1) game. Combined, their KPRs increased by an average of 0.004 … they went 8-1 against inferior competition and their KPRs did not change. This is a win for Computerized Ratings.


How Much Does Winning Help?

Here is what happened to KPR based on number of wins.

  • 0/1 wins = -0.11 (i.e. KPR drops from 3.50 to 3.39)

  • 2 wins = -0.04 (i.e. KPR drops from 3.50 to 3.46)

  • 3 wins = +0.04 (KPR increases from 3.50 to 3.54)

  • 4/5 wins = +0.14 (KPR increases from 3.50 to 3.64)


If you have a horrible day, your KPR drops by 0.11. If you have a great day, your KPR increases by 0.14. If you have a normal day (win 2 lose 3 … or win 3 lose 2) your KPR essentially does not change.


This helps us understand that computerized ratings, once we have enough data, should be “consistent.” A computer will not over-react to normal results. A computer will reward good performance (even if caused by luck), but luck will even out over time.


If you feel like your game is improving and you play in a computerized ratings event for four weeks, winning three and losing two each week, you should see an improvement in your KPR of about 0.16 points … more than half a color level.
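
Here is the four-week projection as a small sketch. The average change per day comes from the list above, and the 0.25 gap between color levels comes from the starting table of color ratings. It assumes every week produces the average change for that number of wins, which is a simplification because the real change depends on who you play. The names are mine, for illustration only.

```python
# Average one-day KPR change by number of wins, from the list above.
CHANGE_BY_WINS = {"0/1": -0.11, "2": -0.04, "3": +0.04, "4/5": +0.14}

COLOR_STEP = 0.25  # gap between adjacent color levels


def project(weekly_results: list[str]) -> float:
    """Add up the average KPR change for a series of weekly results."""
    return sum(CHANGE_BY_WINS[result] for result in weekly_results)


total = project(["3"] * 4)  # win 3, lose 2, four weeks in a row
print(round(total, 2))               # 0.16
print(round(total / COLOR_STEP, 2))  # 0.64
```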
