Network Graphs – Visual-Baseball

Category: Network Graphs

Bad Trades, Red Sox Edition

Posted on December 4, 2022November 14, 2024 by kc2519

This is the first in a series of posts where I take a look at notoriously one-sided baseball trades, using the baseball trade networks published on this site earlier in 2022. I won’t necessarily rank these deals in any sort of order; rather I will pick out a few from the network trade graphs and provide some analysis and context for some of the most notorious transactions.

If you haven’t seen the trade networks previously, here’s a link.

The networks were built using data from Retrosheet and Neil Paine, loaded into Gephi, a network analysis and visualization tool, and ultimately pushed to the web where I could finish styling the graphs. Graph nodes (the circles in the networks) are sized based on the total future WAR (Wins Above Replacement) accrued by the teams involved in the trade. All values must occur at the major league level (MLB), so players involved in the deal who don’t reach the MLB level with their new team will have a zero value. Only the cumulative WAR value while playing for the new team is included; we are not calculating WAR once a player leaves one of the teams involved in the transaction.

Finding a bad trade by scanning the networks is more an art than a science; the key is to look for large nodes (indicating a lot of future WAR value), and then dissecting the trade to see how much value each team received. The other alternative is if we already know the player(s) we are looking for; in these cases we can perform a simple search to find the trade. Here’s a classic example that Red Sox fans would love to forget – trading future Hall of Famer Jeff Bagwell for journeyman reliever Larry Anderson. Let’s go to the Red Sox trade network and search for Jeff Bagwell.

Typing in Jeff Bagwell locates him quickly within the trade network. Note that even if a player is involved in multiple trades to or from the same team (rare but possible) the search will locate each transaction. Here’s the Bagwell transaction, showing his player node and future WAR value connected to the transaction node; every player involved in that transaction will be connected to the trade node, as long as there is some future WAR value. If a player in the trade did not play in the majors for the receiving team, they will not be reflected in the graph. Here’s a view of Jeff Bagwell relative to the trade:

We can also click on the transaction node to see the value provided to each team by all of the players involved in the trade, again assuming they spent time with the team and were not limited to the minor leagues. Clicking on that node will display the respective WAR values in the sidebar on the left of the screen:

Here’s where we get to the details of the trade, and specifically the direct benefits accrued to each team. The Red Sox received 1.1 future WAR from Larry Anderson; to put this in perspective, we might expect this sort of value for an average player for a single season. The Astros, on the other hand received an incredible 93.8 WAR from Jeff Bagwell, or close to 6 WAR per season for 16 years! That is a Hall of Fame level performance, and it eventually led to his selection to Cooperstown in 2017. Here’s a profile that mentions the one-sided trade.

While we have the Red Sox network open, let’s see if there are any other disastrous transactions (other than the cash sale of Babe Ruth to the Yankees, technically not a trade). After scanning the network, we find this one from 1928:

This one is clearly not a Bagwell-level disaster, but was still quite negative for the Red Sox, with a WAR differential of 30 points. The primary villain here is Buddy Myer, a solid infielder who hit .300 or better seven times for the Senators. Not a major star, but the owner of a very nice career, including leading the American League in batting average in 1935.

Let’s try to find one more before closing this piece, this time favoring the Red Sox. We zero in on this deal:

The Red Sox netted nearly 45 future WAR value while surrendering just 0.1; most of the benefit was generated by slugging future Hall of Famer Jimmie Foxx, but they also received a nice three season contribution from pitcher Johnny Marcum. Note how we also removed nodes not involved in the trade by clicking on the edges icon on the bottom left of the display area; this makes it easier to focus on the details.

Feel free to try your hand at finding more of these one-sided deals in the Red Sox or any other trade networks. I’ll be back with some other teams before long. Thanks for reading!

Final WAR Trade Networks Published

Posted on April 22, 2022November 14, 2024 by kc2519

The final 10 MLB WAR Trade Networks have now been published, bringing the total number of graphs to 31 – 30 teams and one overall network with all teams and transactions. For more information on the trade networks, click here. Here are the remaining networks:

Find your favorite teams and enjoy!

More WAR Trade Networks Published

Posted on April 20, 2022November 14, 2024 by kc2519

I’ve added 11 new MLB WAR Trade Networks to the site, bringing us to 21 in all. One more round of updates should get us to the full complement of graphs. For more information on using the graphs, see here.

Here are the new adds:

All the networks can be found here. Enjoy!

First 10 WAR Trade Networks Published!

Posted on April 20, 2022November 14, 2024 by kc2519

The first 10 WAR (Wins Above Replacement) Trade Networks are now available for exploring! This initial group includes nine team networks and one overall graph with all teams included. Here’s a list of the 10 graphs:

Each of these and any upcoming WAR trade networks can be found on this page.

Let’s walk through how the graphs work, using the Detroit Tigers network as an example. We’ll begin with an anatomy of the graph display:

As the image shows, the primary focus will be the main graph area in the center of the window. This is where all nodes (transactions, teams, and players) will reside, connected by edges based on common relationships. Transaction nodes will vary in size based on the total value of a trade with the largest nodes indicating a trade that created significant future WAR for one or both teams. Team and player nodes are set to constant sizes so that the initial visual focus will be on the transaction nodes. The size differences become more noticeable when we zoom in to the network. More on that shortly.

Edges are also sized based on WAR value; this is where we see the value provided to a team and by specific players. Edge sizes (weights) will be more easily seen when we zoom in to the network.

On the left are some graph controls to assist in navigating the graph. We can zoom in using the slider control or the plus/minus buttons adjacent to the slider. Zooming can also be done with a mouse scroll if you prefer that option. The fisheye lens can be toggled on or off and can be used to highlight certain areas of the graph by hovering over a selected region. Finally, the edges button will enable showing or hiding edges and connected nodes. This is useful when you wish to reduce surrounding nodes and focus on specific transactions. We can also pan the graph by dragging it using a mouse – this is helpful in centering a network or viewing specific regions of the graph.

At the upper left of the window is a color legend for each node type, and hidden on the left (not shown in our image) is an information pane that will show specifics about the network. More on that in a bit.

Now let’s examine the information window – this is what makes the network truly powerful. When the network is first displayed or the browser window is refreshed the information pane displays information about the graph (open it by clicking on the arrows icon at the top left):

You can see the simple overview of the graph, the source data, and what it aims to accomplish. Here’s an enlarged version for easier reading:

If we zoom in and select a specific transaction the pane displays the relevant details for that selection:

Now we have the details for the transaction – the season, teams, and players involved. Here’s the enlarged view:

You can do this for any transaction in a graph, or you could choose to select a team or player to see how they fit into the network. The possibilities are nearly endless and it’s a fun way to understand the relationships between teams, players, and trades.

We’ll do more exploring of the networks in upcoming posts; I’ll also be adding more teams until we have a complete set of trade networks. In the meantime, feel free to explore the graphs to learn more about the best (and worst) trades your favorite team has made over the last 120 years. Enjoy, and thanks for reading!

MLB Trade Networks Part 3: Edges Code

Posted on April 18, 2022November 14, 2024 by kc2519

In our previous post I shared the SQL code I created to pull data for our upcoming set of trade networks based on WAR (Wins Above Replacement) numbers from the Neil Paine 538 MLB data set. The prior post dealt with creating nodes for a network graph; this post will share code for edge creation. In simple terms, a graph needs edges that connect related nodes; for our case we need to connect transaction (trade) nodes to the teams and players involved in each transaction.

Part of what makes this case interesting is my desire to show edge weights based on the future WAR value each team received. Showing edges with varying weights will quickly help users to identify the relative importance of a trade. Wider edges will indicate a trade that involved high future value for one or both teams. In seeing the individual players involved in a common trade we can pinpoint where the future value (or lack thereof) comes from. This will become much clearer when the graphs are posted; I’ll do one or more posts on how to use and interpret each graph.

For now let’s examine the code. Gephi requires users to identify Sourcenodes and Targetnodes whether the edges are Undirected(i.e.- it doesn’t matter which node leads to the other) or Directed. Our initial code is for transactions to teams:

SELECT CONCAT(tr.TransactionID, ‘-‘, tr.PrimaryDate) AS Source, t.franchID AS Target, CONCAT(‘The ‘, t.name, ‘ received ‘, ROUND(SUM(h.WAR162),1),
‘ wins in future WAR value’) AS Label,
IF(ROUND(SUM(h.WAR162),1) = tr.season and tr.Type = ‘T’ AND tr.Season >= 1901 and LENGTH(tr.TeamTo) = 3 AND LENGTH(tr.TeamFrom) = 3
AND tr.Season = t.yearID
GROUP BY tr.TransactionID, tr.PrimaryDate, t.franchID, t.name;

With this code we are linking every transaction to the teams receiving one or more players in a trade. Note that we are summing the WAR value to create an edge weight based on the total value received by each team. If four players were involved (two to each team) these edge weights will reflect the combined values of these players. Note that we are setting edge weight = 1.0 if the future WAR is less than 1 (some will actually be negative so we need a minimal edge to show). Here’s a sample of results:

In contrast, the edges linking a transaction to individual players are based solely on that one player’s value. In the case cited above we will wind up with four lines of varying weights. Otherwise the code is quite similar:

SELECT CONCAT(tr.TransactionID, ‘-‘, tr.PrimaryDate) AS Source, p.playerID AS Target, CONCAT(p.nameFirst,’ ‘, p.nameLast, ‘ provided ‘, ROUND(SUM(h.WAR162),1),
‘ wins in future WAR value for the ‘, t.name) AS Label,
IF(ROUND(SUM(h.WAR162),1) = tr.season and tr.Type = ‘T’ AND tr.Season >= 1901 and LENGTH(tr.TeamTo) = 3 AND LENGTH(tr.TeamFrom) = 3
AND tr.Season = t.yearID AND t.franchID = h.franch_ID
GROUP BY tr.TransactionID, tr.PrimaryDate, p.nameFirst, p.nameLast, p.playerID, t.name;

The same logic on edge weights applies but now at the player level. Here are a few results:

I hope this makes sense – it will all become much more clear when the network graphs are produced. The good news is that I already have three graphs created and many more to come shortly. I’ll have some of them available on the site later this week. As always, thanks for reading.

Trade Network Updates, Part 2 (node code)

Posted on April 13, 2022November 14, 2024 by kc2519

Over the last 10 days I’ve been playing around with code that will enable some new versions of the MLB trade networks I premiered way back in 2015. The goal this time around is to factor in the future value of a trade to each of the participating teams. There are multiple measures that could be used for this assessment but I’m going with the WAR162 value from Neil Paine’s 538 github data source. Here’s how the site describes the War162 measure: JEFFBAGWELL WAR per 162 team games. Now you may ask why Jeff Bagwell? While he was a talented hitter for many years, his name is used as an acronym for this:

“The file “jeffbagwell_war_historical.csv” contains wins above replacement (WAR) data — according to JEFFBAGWELL (the Joint Estimate Featuring FanGraphs and B-R Aggregated to Generate WAR, Equally Leveling Lists), which averages together WAR from Baseball-Reference.com and FanGraphs — plus various other metrics for MLB since 1901.” Fun stuff, right?

The bottom line from my perspective is that this measure provides a robust way of assessing the value of a trade based on performance after the trade date. Did one team benefit while another team received a player who added no future value? Or did both teams make out equally well? Or was the trade of minimal value for both sides? These are the questions I’m attempting to visually address using network analysis and visualization.

Now comes the technical aspect for all you database and code lovers. First step is to create the network nodes; in this case we need to display the individual trade transactions, teams, and players. Let’s look first at the transactions using the Visual-Baseball MySQL source data:

SELECT CONCAT(a.Id, ‘-‘, a.PrimaryDate) as Id, CONCAT(‘Transaction ‘,a.Id,’ is from the ‘, a.Season, ‘ season’) AS Label,
‘Trade’ AS Type, SUM(a.Size) AS Size
FROM
(SELECT tr.TransactionID AS Id, tr.Season, tr.PrimaryDate, ROUND(SUM(h.WAR162),1) as Size
FROM historical_WAR_and_more h
INNER JOIN People p
ON h.key_bbref = p.bbrefID
INNER JOIN trades2021 tr
ON p.retroID = tr.Player
INNER JOIN Teams t
ON tr.TeamTo = t.teamID

WHERE tr.Season >= 1901 and h.year_ID >= tr.season and tr.Type = ‘T’ and tr.TeamTo = t.teamID and LENGTH(tr.TeamFrom) = 3
AND tr.Season = t.yearID and t.franchID = h.franch_ID

GROUP BY tr.TransactionID, tr.season, tr.TeamFrom, tr.TeamTo, tr.PrimaryDate) a

GROUP BY a.Id, a.PrimaryDate, a.Season;

Here we are simply creating a node for each trade transaction, a label showing the teams involved and the trade date, and summing up the previously mentioned WAR162 to size the nodes. This will be an important part of the graph – trades that created large future values (for one or both teams) will be more prominent in the graph. Small value trades will be represented by very small nodes indicating their relative lack of importance. This one was a challenge, but finally got the code to deliver the expected results.

The next step is to create team nodes; in this case we’ll provide a constant size:

SELECT t.franchID AS Id, tf.franchName AS Label, 15 AS Size

FROM historical_WAR_and_more h
INNER JOIN People p
ON h.key_bbref = p.bbrefID
INNER JOIN trades2021 tr
ON p.retroID = tr.Player
INNER JOIN Teams t
ON tr.TeamFrom = t.teamID
INNER JOIN TeamsFranchises tf
ON t.franchID = tf.franchID

WHERE tr.season >= 1901 and h.year_ID >= tr.season and tr.Type = ‘T’ and h.team_ID = tr.TeamTo and LENGTH(tr.TeamFrom) = 3

GROUP BY t.franchID, tf.franchName;

By applying a constant node size of 15, each team will have a similar appearance in the graph which will not distract us from the trade values (some will be much larger than 15).

Our third and final node step is to provide information on all players involved in one or more trades:

SELECT Id, Label, ‘Player’ AS Type, 5 AS Size
FROM
(SELECT p.playerID AS Id,
CONCAT(h.player_name, ‘ (‘, p.birthYear,’-‘,p.deathYear,’)’,’ played from ‘,LEFT(p.debut,4),’ to ‘,LEFT(p.finalGame,4)) AS Label

FROM historical_WAR_and_more h
INNER JOIN People p
ON h.key_bbref = p.bbrefID
INNER JOIN trades2021 tr
ON p.retroID = tr.Player

WHERE tr.season >= 1901 and h.year_ID >= tr.season and tr.Type = ‘T’ and LENGTH(tr.TeamFrom) = 3
and LENGTH(tr.TeamTo) = 3
AND p.deathYear > 1900

GROUP BY h.player_name, p.playerID, p.birthYear, p.deathYear, p.debut, p.finalGame

UNION ALL

SELECT p.playerID AS Id,
CONCAT(h.player_name, ‘ (‘, p.birthYear,’-‘,’ )’,’ played from ‘,LEFT(p.debut,4),’ to ‘,LEFT(p.finalGame,4)) AS Label

FROM historical_WAR_and_more h
INNER JOIN People p
ON h.key_bbref = p.bbrefID
INNER JOIN trades2021 tr
ON p.retroID = tr.Player

WHERE tr.season >= 1901 and h.year_ID >= tr.season and tr.Type = ‘T’ and LENGTH(tr.TeamFrom) = 3
and LENGTH(tr.TeamTo) = 3
AND ISNULL(p.deathYear)

GROUP BY h.player_name, p.playerID, p.birthYear, p.deathYear, p.debut, p.finalGame) a
GROUP BY Id, Label;

Here we are running a UNION query so we can gather information about the players moving in each direction of a trade (from one team to another). We then combine that information and apply a fixed size of 5 since there are far more players than teams. We’ll have the ability in the finished networks to zoom in and see more about each player.

Each of these 3 outputs (trades, teams, and players) is combined into a single input file that will feed Gephi. We should wind up with between 10k and 20k nodes which we’ll be able to filter
and zoom on in the network graph. I have high hopes for this set of networks (there may be one for each team as well as a comprehensive one) as it should really help display the most important trades in MLB history.

That’s it for our node creation process; the next post will share how we create the edges that will connect trades to teams and teams to players. Thanks for reading!

Trade Network Updates, Part 1

Posted on April 4, 2022November 14, 2024 by kc2519

A few years back (2016 o be specific) I created network graphs displaying the history of trades made for each MLB franchise, using transactions data from the wonderful Retrosheet project. These graphs presented more than a few challenges in how to present the data but I wound up with what I consider to be a very interesting set of results, which you can find here. I also created some posts on the process at that time, found here and here.

Here’s a snapshot within a graph:

Six seasons have elapsed since I created those graphs, so I thought it was beyond time to update them, but this time with a twist. Last fall I came across a great dataset that captures an array of advanced sabermetric statistics which I hope to use on a regular basis. These statistics can be used to assess a player’s true value relative to his peers each season. What if I could incorporate those into the trade network updates to show the post-trade value of each player to their new team? Ideally, this will help to show the value of each trade and which team wound up getting the better part of the deal.

Of course this would involve adding a degree of complexity to the MySQL code for pulling the data and shaping it for use in creating network graphs. However, the end result could be very revealing and worthwhile. Today I’m at the start of the process, tinkering with SQL code to extract the data in a proper format. Here’s an example:

SELECT h.player_name, p.playerID, tr.season, tr.TransactionID, tr.TeamFrom, tr.TeamTo, ROUND(SUM(h.WAR162),1) as WAR

FROM historical_WAR_and_more h
INNER JOIN People p
ON h.key_bbref = p.bbrefID
INNER JOIN trades2021 tr
ON p.retroID = tr.Player

WHERE tr.season >= 1901 and h.year_ID > tr.season and h.team_ID = tr.TeamTo AND tr.Type = ‘T’

GROUP BY h.player_name, p.playerID, tr.season, tr.TransactionID, tr.TeamFrom, tr.TeamTo

In this case, I’m looking at the cumulative WAR (Wins Above Replacement) for each traded player with their new team. This could be a single season total or the sum of many years in some cases. Here are some results:

We now have post-trade results (starting if the season following the trade) as measured by WAR for each traded player. We see one fairly substantial figure – the second Aaron Harang trade which netted 16.9 WAR points for his new team, the Cincinnati Reds (CIN in the results). Given that a single season WAR above 3 or 4 is considered substantial, it’s clear that his new team probably benefited from a few of those high-value seasons. What we can’t see yet is what they gave away in their half of the trade.

Fortunately, we can access this using the TransactionID field, which provides all the information for each party within the trade. But we’ll save that for another day as I figure out the next progression of the code. As always, thanks for reading!

Radial Franchise Networks for all MLB Franchises

Posted on April 22, 2020November 14, 2024 by kc2519

All 30 MLB Radial Networks are now complete, and available for you to explore. One thing to notice is that each network will have a slightly different (or radically different) shape, depending on how many (or few) players started in a single season. If the team was in the midst of a successful run, the radians will tend to be short, as fewer rookies or acquired players will debut. On the other hand, teams that are retooling will tend to have long radians, as there are many new players making the team. This could also be reflected in the number of players getting a September call-up from the minors.

While these networks are pretty attractive to view as static images (IMHO), the real fun comes from the interactivity, where you can click, zoom, pan, and see all the details for who played with whom over the course of a franchise’s history. Note that this is based on seasonal rosters, so not all connections actually played together at the same time of the season.

Anyhow, check out a handful of examples, and then try them out yourself at the Franchise Radial Networks 1901-2019 page.

[foogallery id=”1560″]

Team Franchise Radial Axis Network Anatomy

Posted on April 7, 2020November 14, 2024 by kc2519

A few years back, I created network graphs for many MLB franchises, using data from the Lahman Baseball Database. These graphs displayed the connections between teammates throughout the history of a given franchise from 1901-2013. Any players who were on a roster within the same season (or seasons) were connected to one another, with each node in the network representing a single player. These were then sized to reflect the number of seasons played with that franchise. Every graph was customized by using one or more of their official team colors, resulting in a visualization like this:

A full roster of the live versions can be found here.

Now that 2019 data is available, I thought it was about time to update the graphs, but this time with a new look that might make the graphs a bit more intuitive and perhaps even more visually attractive. Out of this was born the radial axis franchise graph, as shown for the 1901-2019 Detroit Tigers:

I’m pretty excited about the look the Radial Axis layout provides for this sort of data, and I think you’ll see why it is an effective method for visualizing all the players over the course of 119 seasons. Let’s have a look at the anatomy of this graph.

The graph runs in a counter-clockwise manner, starting with 1901 at the bottom of the graph, and working all the way around until we get to 2019, also at the bottom of the graph. Each set of nodes along the way represents the collection of players with their first Tigers season in that radian. We can see some years where there were many new players (1912 has an exceptional number), while other years had very few new players, 1915 being an example. Here’s a general diagram to help with this concept:

We can also identify a handful of players with especially large nodes, which indicate the number of seasons with the Tigers. These are sorted to make the graph clean and easy to interpret; players with the most seasons will all be closest to the center of the graph, with teammates from the same starting year sorted from longest to shortest tenure. The perimeter of the graph will be populated almost exclusively with one-season players.

For context, let’s examine a few of the large nodes, and identify who they are:

I hope this is starting to make sense. Each radian represents a season, and each node on a radian depicts a player, sized by their longevity with the team. The third critical aspect of the graph is the connectivity between players, represented by the thin gray lines running between them. These are called edges, and are at the heart of a network graph. Let’s have a closer look at the edges for Alan Trammell, as one example.

If we click on the Alan Trammell node, the graph is reduced to him and the players he played with over his career – or at least those who were on the roster in those seasons. This is the fun part of the graphs, as it facilitates exploration and pattern discovery. Here is a portion of the Trammell network, zoomed in so we can see the connections:

Now the edges are a bit more visible, and the graph detail begins to reveal itself. Notice the multiple large nodes in line behind Trammell; it turns out that this is the celebrated 1977 class, many of whom would ultimately be members of the great 1984 World Series champs. So while the 1977 Tigers were not a good team, they were beginning to see the fruits of a strong minor league pipeline. In order, here are the players in that group, and their connection to the 1984 team:

Alan Trammell (20 seasons, WS Champ)
Lou Whitaker (19 seasons, WS Champ)
Jack Morris (14 seasons, WS Champ)
Lance Parrish (10 seasons, WS Champ)
Milt Wilcox (9 seasons, WS Champ)
Dave Rozema (8 seasons, WS Champ)

Trammell is obviously connected to many other players who started in different seasons, given his 20 years with the team. In fact, he has a degree measure of 333, representing the number of players with a connection to Trammell. Thus, we can also see many connections to players who started their Tigers journey in other seasons. The three large nodes to the upper right of Trammell are interesting – who are they?

Closest to Trammell is John Hiller (1965-80)
Next is Mickey Stanley (1964-78)
Followed by Willie Horton (1963-77)

What we have here are the three remaining holdovers from the 1968 World Series champs, finishing up their careers just as Trammell was starting his long tenure with the Tigers. Discovering patterns like this make network graphs very interesting for me, and I hope you will also find them interesting. I’m currently in the process of refining the Tigers graph, which will be followed by graphs for each MLB franchise, once again using their team colors to provide some visual context. Hope you found this informative, and we’ll see you soon.

Updating Player Networks – Part 3

Posted on July 10, 2018November 14, 2024 by kc2519

In Part 1 of this series, we looked at how to generate node and edge data for all players within a single franchise’s history. Part 2 examined how we could take that data and create a network using Gephi, adding graph statistical measures along the way. In this, the final part of the series, our focus is on moving the graph beyond Gephi and on to the web, where users can interact with the data and interrogate the player network using sigma.js software. So let’s pick up with the process of moving the network from Gephi to sigma.js.

Recall our basic network structure in Gephi, which looks like this:

One of our goals when we export the graph to the web is to enable user interaction, so the above graph becomes a bit less intimidating. As a reminder, this is at most a moderate sized network; the need to provide interactive capabilities becomes even greater for large networks.

There are a few ways we can create files suitable for web deployment using Gephi. In this case, the choice is to use the simple sigma.js export plugin located at File > Export > Sigma.js template. Selecting this option will provide a set of options similar to this:

This template allows for a modest level of customization, including network descriptions, titles, author info, and other attributes relevant to the network. When all fields are filled to your satisfaction, click on the OK button to save the template. Your network will be saved to the location specified in the blank space at the top of the template window (grayed out in this case). A word of caution is in order here – if you make some custom entries to the template, and then make adjustments to your network, be sure to specify a new location to save the generated files. Otherwise, the initial set will be overwritten. This is especially critical if you have gone behind the scenes to customize colors, fonts, and other display attributes. More on that capability in a moment.

Once the template is complete and the OK button is clicked, a set of folders and files is generated that can then easily be copied to the web. Here’s a view of the created file structure:

These files and sub-folders are all housed within a single folder named ‘network’. If you wish to tinker with your graph in Gephi, rename the network folder to something else prior to exporting a second (or 3rd or 4th time). This will help keep you sane. 🙂

Without going into great detail here, let’s talk about the key files:

data.json stores all of your graph data, including positioning attributes, statistics created in Gephi, plus node and edge details
config.json contains many of the primary graph settings that can be easily edited for optimal web display. It’s quite easy to go through a trial and error process, since the file is so small. Simply make changes, then refresh your browser to see the result.
index.html has a few basic settings relevant to web display, most notably the title information that the browser will use

Within the css folder are .CSS files where you can make changes to many display attributes. This is typically where you will adjust fonts and font sizes, as well as some colors. The js folder has javascript files that can be edited to a certain degree, although caution is recommended if you’re not a javascript guru. Finally, the images folder contains any relevant image files to be used for web display, such as logos.

Alright, now that we have had a brief view of the technical details, let’s have a look at the network graph in the browser. Note that this is still a bit experimental at this stage; I’m attempting to customize each graph based on the official team colors or close variations in the color family.

To see some of the interactive functionality, let’s select a specific player. Simply type Ted Williams (the greatest Red Sox batter of all time) in the search box, and view the results:

Now we see only the direct connections (a 1st degree ego network) for Ted Williams (270 degrees in this case), as well as a wealth of statistical information previously calculated in Gephi, seen in the right panel. At the bottom of the panel are hyperlinks where any one of the 270 connections may be clicked, allowing us to view their network. As you can see, sigma.js quickly provides great interactivity for graph viewers.

Even better, we can scroll in to the network at any time:

Hovering on a node generates a pop-up title for that node, as seen for Ted Williams in this instance. We also begin to see the names of other prominent players at this zoom level. Additional zooming will reveal more player titles – a great way to embed information without making the original graph visually chaotic by displaying all titles at every level.

For the current web version of this graph, click here. I’ll try to keep this version active, even if I make improvements to the final network. Once again, thanks for reading!