The Caitlin Clark Effect, Visualized
Visualizing key aspects of Caitlin Clark's rise in women's basketball, particularly during the 2023-24 basketball calendars.
The following report was derived from data visualization coursework completed throughout June and July 2024. Read the full 38-page report here, or continue on to the abbreviated report intended for Bite-Sized Bison subscribers.
We’ve all heard plenty about Caitlin Clark, and as mentioned in a previous BSB, some have become fatigued by the narrative that created and surrounds The Caitlin Clark Effect. That’s entirely understandable; however, I find the phenomenon fascinating, so when I was tasked with creating data visualizations on a topic, the direction was obvious. I hope the following visualizations are a thoughtful, refreshing take on a topic that feels like it’s been picked clean.
You’d be surprised how few visualizations have been created in an attempt to contextualize what Clark and women’s basketball have been doing. I found a few and included them in the 38-page report linked above, but most of the visualizations below seem to be entirely unique.
If you’re interested, these were all created using the Seaborn and Matplotlib Python packages in a Jupyter environment. I constructed the datasets using data from sportsdataverse-py (the same open source sports data organization that provides data for the most popular sports analytics efforts you often see used in BSB, like CFB Data, GameOnPaper.com, and BCF Toys), as well as other common databases used in BSB, like HerHoopsStats.com and SportsReference.com.
Objective
The premise of the report is centered on three questions, created to package a narrative on The Caitlin Clark Effect. While it’s impossible to fully encompass Clark’s effect with a handful of visuals, they do seem to provide some insight previously unknown (to me anyway).
Question 1: How did Caitlin Clark affect the attendance and viewership numbers in college, and has that translated to the WNBA?
Question 2: What makes Caitlin Clark such an exciting player?
Question 3: Does the attention Clark receives mirror her performance in the WNBA?
Question 1: How did Caitlin Clark affect the attendance and viewership numbers in college, and has that translated to the WNBA?
The casual sports fan understands Caitlin Clark’s teams have been obliterating attendance and viewership records for a couple of years, both in the NCAA and the WNBA, but there aren’t many visualizations that put that into focus in an efficient way. The three below convey some of the effect, but certainly not all of it.
Source: sportsdataverse-py (June 20, 2024)
What Am I Looking At? This data was collected June 20, 2024, so not all road WNBA opponents are included. This is also regular season only data. Blue bars show each of Iowa’s opponents’ average home attendance figures in 2023-24, and the orange bars do so for the Fever’s opponents in 2024. The gray bars indicate the actual attendance number when Clark’s team visited, showing the explosion of attendance when she visits a market. The red dashed lines show opponents’ attendance records.
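For readers curious how a chart with this layout could be assembled, here is a minimal Matplotlib sketch. It is not the code behind the chart above: the opponent names and numbers are placeholders rather than real attendance figures, and the helper names `build_attendance_chart` and `record_matched_or_broken` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder rows, NOT real attendance figures:
# (opponent, average home attendance, attendance when Clark's team visited, venue record)
rows = [
    ("Opponent A", 4000, 9000, 9000),
    ("Opponent B", 6500, 8200, 10500),
    ("Opponent C", 3000, 7400, 7400),
]


def build_attendance_chart(rows, avg_color="tab:blue"):
    """Draw paired bars per opponent plus a dashed marker for the record."""
    fig, ax = plt.subplots(figsize=(8, 4))
    width = 0.4
    for i, (_, avg, visit, record) in enumerate(rows):
        ax.bar(i - width / 2, avg, width, color=avg_color)  # season average
        ax.bar(i + width / 2, visit, width, color="gray")   # game vs. Clark's team
        # Short red dashed segment marking this opponent's attendance record
        ax.hlines(record, i - width, i + width, colors="red", linestyles="dashed")
    ax.set_xticks(range(len(rows)))
    ax.set_xticklabels([name for name, *_ in rows])
    ax.set_ylabel("Attendance")
    return fig, ax


def record_matched_or_broken(rows):
    """Names of opponents whose record was matched or broken by the visit."""
    return [name for name, _, visit, record in rows if visit >= record]


fig, ax = build_attendance_chart(rows)
print(record_matched_or_broken(rows))
```

Passing a different `avg_color` (orange, for instance) would let the same function draw the WNBA half of the chart.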
Some observations:
Iowa broke (or matched) every opponent’s attendance record except Rutgers’ during Clark’s senior season, but the Fever haven’t broken as many (according to teams’ records). One likely reason is that the WNBA drew large crowds during its first handful of years after it began play in 1997.
It’s become routine for Clark’s teams to outpace average home crowds when on the road, as opponents have mentioned repeatedly, dating back to Clark’s college days.
Indiana had the largest average home crowds of any Iowa opponent last season. That mark jumped above 10,000 when counting the postseason games.
A certain class of NCAA programs, at least judging by Iowa’s opponents here, seems to draw larger crowds on average than the WNBA.
Future Work
This visualization could be updated once this WNBA season is complete, which would provide a more complete scope of how The Caitlin Clark Effect took shape at its peak in 2024. It already could be updated to show the Fever’s road opponents since June 20.
Source: sportsdataverse-py (June 20, 2024)
What Am I Looking At? This data was collected June 20, 2024. This is also regular season only data. There are four entities (NCAA, WNBA, Iowa, Fever) shown in this time series stacked line graph, and each line uses the line below it as 0, so the numbers along the y-axis are primarily used for cumulative totals, not individual totals. This chart is primarily about trends. The shaded areas are when Caitlin Clark played for each entity, and since the WNBA and NCAA work on alternating seasons (NCAA in winter, WNBA in summer), Clark technically played the 2024 seasons for both leagues. That is why the line showing Clark’s entry into the WNBA is on 2023, even though she was drafted in 2024, and the line showing her entry into the NCAA is on 2020, even though her first season was 2021.
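The stacking idea (each line using the line below it as 0) can be sketched with Matplotlib's `stackplot`, with `axvspan` for a shaded window. This is an illustration only: the entity names and values are placeholders, not the real attendance series, and the shaded years are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

years = [2019, 2020, 2021, 2022, 2023, 2024]

# Placeholder series, NOT real attendance figures
series = {
    "Entity A": [5, 5, 6, 6, 7, 8],
    "Entity B": [2, 2, 2, 3, 3, 4],
    "Entity C": [1, 1, 2, 2, 4, 6],
}
values = np.array(list(series.values()), dtype=float)

# The top edge of each band is the running total of the bands beneath it,
# which is why the y-axis reads as cumulative rather than individual totals.
tops = np.cumsum(values, axis=0)

fig, ax = plt.subplots(figsize=(8, 4))
ax.stackplot(years, values, labels=list(series.keys()), alpha=0.8)
ax.axvspan(2023, 2024, color="gold", alpha=0.2)  # shaded span for a period of interest
ax.set_xlabel("Season")
ax.set_ylabel("Cumulative average attendance (placeholder units)")
ax.legend(loc="upper left")
```

Because the bands are cumulative, the top edge of the chart equals the sum of all the series in each year, so it is best read for trends rather than for any single entity's exact value.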
Some observations:
The change in Iowa and the Fever’s average home attendance since Clark’s rise is astounding. It takes quite a bit of influence to do the same to league-wide totals, but there was an uptick in both leagues compared to the handful of seasons before the COVID-19 pandemic.
Cumulatively, the attendance for all four entities has nearly doubled since 2015.
This isn’t shown in the chart, but the Fever had the lowest average home attendance in 2022 (second-lowest in 2023) and now own the largest average home attendance in the WNBA, by quite a margin too. You can see this reflected in the Fever’s shaded area. You can see something similar in Iowa’s shaded area over time.
Future Work
I’m very curious what this chart looks like a few years from now. I imagine the Iowa region will decrease, naturally, given all the changes happening there, but I’m not sure about the others. Will this trend continue past 2024 with the same fervor?
By expanding out and including all years of the WNBA since 1997, the shaded regions wouldn’t be as legible, but what would be shown is that the WNBA once drew larger crowds, which have tailed off, as can be seen in the chart above leading up to 2020. This is shown in the image below, created by @magdelena.kala on Threads just last week. You can see that the peak of WNBA attendance was nearly 11,000 in 1998. Referring back to my chart above, the WNBA and NCAA combined didn’t reach that level in 2015. Collectively in 2024, though, they seem to have, finally. This is crucial to note: while Clark certainly is the catalyst for this increased interest, she’s not solely responsible.
Source: Statista, Nielsen (June 20, 2024)
What Am I Looking At? Because TV rights can skew viewership numbers, national championships are a good way of comparing viewer interest in sports, even if the sample size is small, because they’re usually shown on national TV. This line graph simply compares the viewership of the men’s and women’s national championships since 2013, with the last two highlighted, since those featured Clark.
Some observations:
Obviously, the women’s national championship surpassed the men’s game in viewers this year.
Again, it’s pretty clear that the two national championship games featuring Clark account for the spike in the women’s game. There are other factors outside of Clark, of course, such as how exciting the South Carolina team was in 2024 and Angel Reese on LSU in 2023.
The men’s game seems to have an interest issue in this context, separate from the women's game.
Even with the peak of the women’s game in this chart, it’s still not near the peaks of the men’s viewership just last decade. Expanding out would likely show even higher peaks.
Future Work
Like the other two visualizations responding to this question, I’m curious how this looks a few years from now. Does the men’s game continue downward? Does the women’s game continue upward? Can the women’s game reach the same peaks as the men’s game experienced? How does star power affect these numbers (probably a different project for that question)?
Question 2: What makes Caitlin Clark such an exciting player?
These charts were the most fun to build because they were so exploratory. I wasn’t sure what exactly would come out of the data, at least not as sure as I was regarding attendance and viewership.
Source: SportsReference.com, sportsdataverse-py (June 20, 2024)
What Am I Looking At? The NBA defines “clutch time” as the last 5 minutes of a deciding period (fourth quarter or overtime) when the two teams are within 5 points of each other. The above chart shows how many points the top 20 scorers in the NCAA in 2023-24 scored in these scenarios. It also shows the total points scored by each of them throughout the season.
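As a rough illustration of that definition, the filter below tallies clutch points from a play-by-play log. The field names (`period`, `seconds_left`, `home_score`, `away_score`, `points`) are hypothetical and do not reflect the sportsdataverse-py schema, the plays are made up, and the sketch assumes the score is recorded as it stood before each scoring play.

```python
from collections import defaultdict


def is_clutch(period, seconds_left, home_score, away_score):
    """True in the last 5 minutes of the 4th quarter or any overtime
    when the margin is 5 points or fewer."""
    return (
        period >= 4
        and seconds_left <= 300
        and abs(home_score - away_score) <= 5
    )


def clutch_points(plays):
    """Sum each player's points scored during clutch time."""
    totals = defaultdict(int)
    for play in plays:
        if play["points"] > 0 and is_clutch(
            play["period"], play["seconds_left"],
            play["home_score"], play["away_score"],
        ):
            totals[play["player"]] += play["points"]
    return dict(totals)


# Made-up plays for illustration only
plays = [
    {"player": "Player A", "period": 4, "seconds_left": 120,
     "home_score": 70, "away_score": 68, "points": 3},  # clutch
    {"player": "Player A", "period": 2, "seconds_left": 60,
     "home_score": 30, "away_score": 29, "points": 2},  # too early in the game
    {"player": "Player B", "period": 4, "seconds_left": 200,
     "home_score": 80, "away_score": 60, "points": 2},  # margin too large
    {"player": "Player B", "period": 5, "seconds_left": 290,
     "home_score": 85, "away_score": 85, "points": 2},  # overtime, clutch
]

print(clutch_points(plays))  # {'Player A': 3, 'Player B': 2}
```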
Some observations:
Caitlin Clark finished third in this metric, which is very good.
Uber-dominant teams and forwards don’t typically score many points in these scenarios. Mackenzie Holmes is a bit of a victim of that reality, as are Angel Reese and Aaliyah Edwards. Audi Crooks scored the most among forwards.
Remember Lucy Olsen’s name if you aren’t familiar. She is second here and transferred to Iowa from Villanova. She will be a star in Clark’s absence.
Some other names to watch for in the NCAA this upcoming season: Georgia Amoore (Kentucky), JuJu Watkins (USC), Ta’Niya Latson (Florida State), Audi Crooks (Iowa State), Hannah Hidalgo (Notre Dame), Kiki Iriafen (USC), and, of course, Paige Bueckers (UConn), among others.
Lastly, just look at how Clark outpaced the other top-20 scorers in the nation last season!
Future Work
I am curious how this translates to the 2024 WNBA season, but the Fever spent so much of this season uncompetitive that it might not speak to Clark’s ability as much as her time at Iowa did. It would, though, show who the most clutch players are in the WNBA right now.
Source: sportsdataverse-py (July 16, 2024)
What Am I Looking At? This is the most complex of the visualizations, but it’s intended to show something very simple: Who is attempting the most long-distance shots in the WNBA this season (as of July 16, 2024)? The answer is, emphatically, Caitlin Clark, as can be seen by the boxplot distributions of shots taken 25+ feet away from the basket this season. The density chart underneath shows the density of made shots (not just attempted shots) of 25+ feet when compared to Sabrina Ionescu.
Some observations:
Caitlin Clark is in her own tier here. Among the top-9 players in shot volume from 25+ feet, her boxplot above shows a much wider distribution of shot distances. She has no outliers (dots) shown, and only two other players’ outliers exceed her maximum distance. Her median distance is also two feet longer than the median distance of the next four shooters.
Reading the density chart, Ionescu hits more shots than Clark when closer to 25 feet (still very long shots!), but Clark converts on more shots than Ionescu the farther away they get.
It’s not shown in the chart, but Diana Taurasi actually converts at a higher rate than anybody at these distances, though she attempts the fourth-most.
Future Work
I’d like to find a way to work a conversion rate into this chart. I kept it simple by only doing a distribution chart for the boxplots, but it would be interesting to include two different distributions side-by-side for each shooter (one for attempted shots and one for converted shots). I might do that when the season ends.
Question 3: Does the attention Clark receives mirror her performance in the WNBA?
Source: HerHoopsStats.com (July 16, 2024)
What Am I Looking At? The above is a scatterplot with 121 WNBA players all-time (min. 25 minutes per game). PER is across the x-axis, and Defensive Rating is along the y-axis (inverted because a lower Defensive Rating is better). The histograms along the margins help with comparing where Clark is in comparison to the other players’ career statistics. Active players are also highlighted orange to compare the current WNBA to the past WNBA.
I recognize that comparing career numbers to half of a rookie season is unfair to Clark, but the influx and diversity of new audiences watching the WNBA warranted the exploration, as the question asks if the attention mirrors her performance. Since this spike in interest is historic, comparing Clark to historic players was necessary.
Some observations:
Clark is, clearly, not very effective defensively, but she’s not asked to be. What is somewhat surprising is that her PER is only average. This is most likely due to the number of turnovers she has, which should go down as she progresses in her career. That PER figure should move to the right in the next few years.
There seems to be a trend moving away from defense and more toward offensive efficiency right now, which we’ve seen in the last decade in other levels of basketball too. You can almost draw an imaginary trend line for the active and inactive dots on this scatterplot. Even though Clark isn’t highly favorable in these value metrics at the moment, this trend suggests why fans are attracted to her style of play and why she draws so many eyes.
Future Work
I didn’t include the trend lines for active and inactive players because I didn’t want to crowd the chart, but it might be worth including in a future version of this visualization. The same could be said for pointing at other players’ dots, such as A’ja Wilson, Breanna Stewart, Tamika Catchings, Angel Reese, etc. I’d also like to see this chart a few years from now, both to see where Clark is but also to see if the trend between active and inactive players continues.
Source: HerHoopsStats.com, sportsdataverse-py (July 16, 2024)
What Am I Looking At? The above chart might look familiar to folks who play sports video games, but it’s called a radar chart (because it looks like a radar!). Each measurement is expressed as a percentile within the WNBA (for example, Clark is in the bottom 1% for turnovers, while Reese is in the 37th percentile). Percentiles are necessary because a radar chart needs every metric on the same scale (here, 1-100), which is one of the chart type’s limitations. WNBA 2024 Rookie of the Year is a two-horse race right now, between Clark and Reese, but the attendance figures don’t show that (as seen in the bar chart above). The radar chart uses 9 fundamental metrics (chosen with the intent of eliminating bias), and it conveys each player’s profile (Clark with assists, Reese with rebounds) as well as her value (PER, Win Shares).
The intent of this chart is not to show who is the better player because 1.) this chart doesn’t prove one or the other, and 2.) it will take more than a radar chart to make a good case for either. Its purpose is to show the proximity in value each player maintains and juxtapose that point with the average home attendance for their respective teams, which are each competing for similar opportunities in the league right now.
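To make the percentile idea concrete, here is a small sketch that converts raw stats to percentile ranks and draws them on a polar axis. Everything in it is illustrative: the league values are placeholders, `percentile_rank` and `draw_radar` are hypothetical helpers, and this tie-inclusive percentile formula is one of several reasonable definitions, not necessarily the one used for the chart above.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def percentile_rank(value, population, higher_is_better=True):
    """Percent of the population that `value` beats or ties (0-100)."""
    if higher_is_better:
        beaten = sum(1 for v in population if v <= value)
    else:
        # For stats like turnovers, a LOWER raw number should rank higher
        beaten = sum(1 for v in population if v >= value)
    return 100.0 * beaten / len(population)


def draw_radar(ax, labels, scores, **plot_kwargs):
    """Plot one closed polygon of percentile scores on a polar axis."""
    angles = [2 * math.pi * i / len(labels) for i in range(len(labels))]
    # Repeat the first point so the outline closes
    closed_angles = angles + angles[:1]
    closed_scores = list(scores) + list(scores[:1])
    ax.plot(closed_angles, closed_scores, **plot_kwargs)
    ax.fill(closed_angles, closed_scores, alpha=0.2)
    ax.set_xticks(angles)
    ax.set_xticklabels(labels)
    ax.set_ylim(0, 100)  # every metric shares the same percentile scale


# Placeholder league values, NOT real WNBA stats
league_turnovers = [1.0, 2.0, 3.0, 4.0, 5.0]
league_assists = [1.0, 2.0, 4.0, 8.0]

labels = ["Assists", "Turnovers", "Rebounds"]
scores = [
    percentile_rank(8.0, league_assists),                            # 100.0
    percentile_rank(5.0, league_turnovers, higher_is_better=False),  # 20.0
    50.0,  # placeholder percentile
]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
draw_radar(ax, labels, scores, label="Player A")
```

Ranking turnovers with `higher_is_better=False` is what lets a league-high turnover total land near the bottom of the scale, so that "farther from the center" consistently means "better" on every spoke.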
Some observations:
Obviously, assists and rebounds are needed to show each player’s profile: Clark (already a historic distributor) has the edge in assists, and Reese (already a historic rebounder) has the edge in rebounds.
Clark leads in points per game and usage. Reese leads in Defensive Rating, which typically favors forwards over guards. However, for anyone watching these games, Reese does offer more defensively than Clark.
Clark turns the ball over more than anybody in the league, but Reese being in the 37th percentile isn’t exactly great for a forward either.
Reese leads Clark in the value metrics (PER and Win Shares/40). This doesn’t mean she’s more valuable (it takes more than a stat to show that), but it does show that she’s closer than some folks realize.
The attendance difference is insane!
Future Work
This chart could be expanded to include more players from this famed 2024 draft class, though you have to be careful with radar charts because they can get busy very quickly. It could also compare other more established players, such as Brittney Griner vs. A’ja Wilson or something. But ultimately, I could see this chart rounding into better shape with a larger sample size than just half a rookie season for both Clark and Reese.