Detroit Tigers Network Graph 1901-1949

Currently I’m way into using Gephi with the Sigma.js plugin, which takes the cool graph output from Gephi and makes it interactive and absolutely fun to play with for anyone interested in either network graphs or baseball history – or both.

Just recently, I created a piece featuring the career connections of Octavio Dotel, who has played for a record 13 MLB franchises, and may wind up with number 14 this season. Now I’m moving to the team level, and have just created a 1901-1949 Detroit Tigers network as a test. Ultimately, I may wind up getting everything from 1901 through 2013 in the graph, but I need to test for usability first.

Tigers Network map

Have a go at it here or visit the Network Graphs portfolio page to view this graph and others.


Octavio Dotel Network Draft 2

A few days ago I shared a post about creating a network graph detailing the many travels of former Tigers pitcher Octavio Dotel in his Major League Baseball career – 13 franchises over a 15 season span. I now have a live graph on my website, complete with a search box, hover capabilities, and easy clicking on links to narrow the graph to a manageable number of nodes.

Gephi provides the base functionality for the network creation, and the excellent Sigma.js plugin converts the original network to a highly interactive web-based one. All I have to do is get the data into Gephi, choose a suitable layout algorithm, and maybe tinker a bit with the style settings in Sigma, and presto! A slick graph is created. Here’s a static look, but to really appreciate the beauty of the interaction, navigate to the Dotel visualization. The live version lets you mouse scroll to resize the image so you can zoom in for greater detail, in addition to the other navigation functions already mentioned. I’m not done yet, as some more data elements are coming, but the basic look and feel should remain unchanged. Enjoy!


The Amazing Octavio Dotel – Draft 1

For whatever reason, I recently had an epiphany about creating a data viz that would track the career of Octavio Dotel, the veteran pitcher who has managed to pitch for nearly half the franchises in Major League Baseball. Between 1999 and 2013, Dotel pitched for no fewer than 13 teams, including multiple seasons where he pitched for two or even three teams in the same season. No shortage of potential angles for this one – number of teams, number of other pitchers he pitched with, how many of his former teammates are still active, etc.

After a few days of thinking about this, and making sure my data was up to date, I finally began creating a graphic using Gephi, the open source network graphing tool I recently authored a book on. Over the course of several posts, I’m going to share what could be considered (if I were an artist) sketches leading to the final work.

So here comes the first ‘sketch’. Given the desire to create a graphic that is easy to view digitally (as opposed to a print version), I quickly determined that an algorithm that created some sort of circular graph would be best. After some experimentation, I chose the ARF (Attractive and Repulsive Forces) method using Gephi. Like many network algorithms, ARF draws similar nodes together while pushing unrelated nodes farther apart, resulting in a graph that is not only visually striking but also quite intuitive. I’ll talk more about that as the graph evolves toward a finished state.
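For anyone curious how a layout can pull connected nodes together and push the rest apart, here is a toy sketch in Python. To be clear, this is not Gephi's ARF code; the force formulas, the `force_layout` and `distance` names, the step size, and the node labels are all my own assumptions, chosen only to show the general idea.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, step=0.02, seed=1):
    """Toy attract/repel layout. Illustrative only, not Gephi's ARF code."""
    rng = random.Random(seed)
    # Start every node at a random spot in a small square.
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    linked = {frozenset(edge) for edge in edges}

    for _ in range(iterations):
        shift = {n: [0.0, 0.0] for n in nodes}
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                dist = math.hypot(dx, dy) or 1e-9
                # Positive force pulls the pair together, negative pushes apart.
                force = -1.0 / dist  # every pair repels
                if frozenset((a, b)) in linked:
                    force += dist * dist  # connected pairs also attract
                force = max(-10.0, min(10.0, force))  # clamp to keep steps small
                fx = force * dx / dist
                fy = force * dy / dist
                shift[a][0] += fx
                shift[a][1] += fy
                shift[b][0] -= fx
                shift[b][1] -= fy
        for n in nodes:
            pos[n][0] += step * shift[n][0]
            pos[n][1] += step * shift[n][1]
    return pos


def distance(pos, a, b):
    """Straight-line distance between two laid-out nodes."""
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


# Hypothetical four-node graph: a hub, two connected nodes, one unconnected.
layout = force_layout(
    ["hub", "team A", "team B", "unrelated"],
    [("hub", "team A"), ("hub", "team B")],
)
```

In this toy version the two connected nodes settle near the hub, while the unconnected one keeps drifting outward. Real layout algorithms add damping and other refinements so the picture stabilizes.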

With that said, here’s the initial take:

Octavio Dotel Draft 1

Now, for some explanation of what you’re seeing. In the center (the largest blue node) is Octavio Dotel; since the graphic is about his connections, it’s only fair he gets top billing. The next level of nodes is depicted by slightly smaller circles. These represent each of the teams Dotel has performed for, with a single node for each season. The teams are color coded to resemble the actual team colors. If you look at the top center of the graphic, you will notice five identically colored circles, covering the five different seasons Dotel toiled for the Houston Astros, from 2000 to 2004. Most of the teams will have just a single season, while a few others have two.

Beyond the team level, you will see a few hundred smaller nodes, each one representing a single pitcher Dotel crossed paths with over the course of his career. Some of these nodes will be positioned between teams, indicating that a pitcher was part of multiple teams with Dotel.
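The three levels described above (Dotel, team-seasons, pitchers) map naturally onto a node table and an edge table. Here is a minimal Python sketch of one way that data could be shaped before loading it into Gephi. The roster rows, the `build_tables` and `edges_csv` helpers, and the "Pitcher A" and "Team X" labels are made up for illustration; the `Source`, `Target`, and `Type` column headers follow the convention Gephi's spreadsheet import expects for edge tables.

```python
import csv
import io

HUB = "Octavio Dotel"

# Hypothetical sample rows: (team-season, pitcher who shared that staff).
ROSTERS = [
    ("HOU 2000", "Pitcher A"),
    ("HOU 2000", "Pitcher B"),
    ("HOU 2001", "Pitcher A"),
    ("Team X 2005", "Pitcher C"),
]


def build_tables(hub, rosters):
    """Return (nodes, edges) for a hub -> team-season -> pitcher graph."""
    nodes = {hub: "hub"}
    edges = []
    for team_season, pitcher in rosters:
        if team_season not in nodes:
            # First time we see this team-season: link it to the hub.
            nodes[team_season] = "team-season"
            edges.append((hub, team_season))
        nodes.setdefault(pitcher, "pitcher")
        edges.append((team_season, pitcher))
    return nodes, edges


def edges_csv(edges):
    """Render the edge list as CSV text with Source/Target/Type columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type"])
    for source, target in edges:
        writer.writerow([source, target, "Undirected"])
    return buf.getvalue()


nodes, edges = build_tables(HUB, ROSTERS)
print(edges_csv(edges))
```

Note that "Pitcher A" ends up with two edges, one to each Houston season, which is exactly why some pitcher nodes land between team nodes in the layout.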

As I continue to work on this, I’ll add some notation, reference salient features in the graph, and do some additional color coding at the pitcher level. Stay tuned.


Gephi Book Now Available!

I’m pleased to announce that my first book has been published (thanks to all at Packt Publishing!) and is now available online.

Network Graph Analysis and Visualization with Gephi provides a gentle introduction to the world of network graph visualization using Gephi, a powerful open source tool. In this post, I’ll walk you through a few examples from the book to illustrate how you can begin creating your own network graphs with Gephi.

Before diving into any specific examples, I want to give you an idea of what the book covers, so here’s the Table of Contents:

  • Preface
  • Chapter 1: Installing Gephi
  • Chapter 2: Creating Simple Network Graphs
  • Chapter 3: Exploring Additional Layout Options
  • Chapter 4: Creating a Gephi Dataset
  • Chapter 5: Exploring Plugins
  • Chapter 6: Advanced Features
  • Chapter 7: Deploying Gephi Visualizations
  • Appendix: Network Visualization Resources

While this book makes no claim to covering everything you can do with Gephi (not even close!), it does provide the reader with a broad and accessible overview, while also addressing some of the basic concepts and terminology of network graph analysis.

Here are a few excerpts from a companion article for the book; you can also download a sample chapter from the book page at Packt.

“Gephi is a versatile and powerful tool that will help you create simple network visualizations quickly, while also providing the capabilities to build complex graphs based on large datasets. In this article, you will learn some of the fundamentals of Gephi and network visualization, which will rapidly empower you to create your own graphs…”

“Network graphs are essentially based on the construct of nodes and edges. Nodes represent points or entities within the data, while edges refer to the connections or lines between nodes. Individual nodes might be students in a school, or schools within an educational system, or perhaps agencies within a government structure…”

“Network graphs are drawn through positioning nodes and their respective connections relative to one another. In the case of a graph with 8 or 10 nodes, this is a rather simple exercise, and could probably be drawn rather accurately without the help of complex methodologies. However, in the typical case where we have hundreds of nodes with thousands of edges, the task becomes far more complex…”

“Gephi is an ideal tool for users new to network graph analysis and visualization, as it provides a rich set of tools to create and customize network graphs. The user interface makes it easy to understand basic concepts such as nodes and edges, as well as descriptive terminology like neighbors, degrees, repulsion, and attraction. New users can move as slowly or as rapidly as they wish, given Gephi’s gentle learning curve…”

So if you or anyone you know is interested, navigate to the book’s page, where you’ll find more information, including a sample chapter, as well as links to a number of book sellers. Thanks, and happy visualizing!


Gephi Book is Coming Soon!

It’s been more than a month since I posted, but lest any of you suspect me of getting lazy, I’ve been busy with two book projects, plus the usual summer assortment of activities. Blogging, tweeting, and Facebook posting have taken a backseat for a stretch as I tweak formulas and layouts for one book (baseball pennant races), and submit rewrites for chapters on the other book (Gephi and network visualization).

The past week is a good case in point as I submitted eight chapter rewrites in less than a week, as the publisher is pushing (nicely) to have the book available in September. For anyone interested in the topic (I am personally fascinated with networks and what they reveal about a variety of subjects), here’s a link to the book’s page at Packt.

It’s pretty exciting to be part of this whole publishing process, and to be implementing suggestions from a group of reviewers whom I’ve never met, but who are obviously passionate about both Gephi and the broader subject of network visualization. Their constructive criticism and honest feedback are making this book many times better than it would have been if I were working through the process alone. Once the book is complete, I’ll offer more detail and insight into the people behind the book.

We’re now entering the layout and design phase of the publishing cycle, which can be challenging for books such as this that combine text with a lot of images. Given the hundreds of books of this type that Packt has produced, I’m confident the final layout will look great, and we’ll have produced a book that helps introduce new users to the exciting world of Gephi and network graphs.

Meanwhile, I’m back on the pennant race book, and still holding to a 2013 publishing date (albeit later in the year than originally intended). If the 2013 season data is available in time, I may be able to include it in the book while still publishing before the December holiday season. Who knows, it might make a nice Christmas gift for that baseball fan on your list!


A New Post! And a New Book!

Realize I haven’t posted in over a month, but think I have a legit excuse, beyond the usual spring busy-ness with soccer, baseball, kendo, etc. I’ve been spending lots of time on a book project (not the baseball book) about Gephi and network visualization. The book should be out later this summer, giving me two books to be released in 2013.

Gephi is a terrific open source project focused on creating network graphs, the sort you often see with social network data. Basically, you have nodes that represent a person or other entity, coupled with lots of connections (edges) to other nodes. Sounds simple, but the possibilities are almost endless.

Network graphs have become one of the top visualization categories over the last 5-10 years, and Gephi enables users to create them without having to do a lot of coding or other customization. However, it does provide loads of options that allow for very complex and fascinating graphs that can be deployed as either static or interactive web projects.

Currently working on chapter 3 (of 7, plus an appendix), and can’t wait to see the final book. Stay tuned.


MLB Trades Network Map

Took a little break from the book to create a new infographic on trades between major league baseball teams over the last 100+ seasons, from 1901-2012. Network maps are often put to use analyzing and depicting connections within a social network like Twitter or Facebook, but can be used in many other instances, as this example shows.

The key concepts in network maps are nodes and edges; nodes are the connection points in a network (the teams in this case), while edges are the connections between nodes, showing a level of activity or connectedness (number of trades in this case). Have a look:

Network maps are often among the most elegant data visualizations, bordering on the artistic while still providing insight into the underlying data. In some cases, the maps are interactive, making them even more useful. At some point, I’ll offer some of those up, but in the meantime, take a look at the work of Jan-Willem Tulp and Jerome Cukier. These guys are creating some of the best work I’ve seen, highlighted by Tulp’s voting display from the 2012 Dutch elections, and Cukier’s interactive Paris Metro display. Fantastic work!

To view or download a PDF of the MLB trades graphic, click here.
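For the curious, the teams-as-nodes, trades-as-edges idea boils down to counting. Below is a small Python sketch, assuming each trade is recorded as a pair of team codes. The sample trades and the `trade_edges` helper are invented for illustration and are not taken from the real dataset.

```python
from collections import Counter

# Hypothetical sample: one tuple per trade, in whatever order it was recorded.
TRADES = [
    ("DET", "NYY"),
    ("NYY", "DET"),
    ("BOS", "DET"),
    ("DET", "NYY"),
]


def trade_edges(trades):
    """Collapse individual trades into weighted, undirected team-to-team edges."""
    # Sorting each pair makes DET-NYY and NYY-DET count as the same edge.
    weights = Counter(tuple(sorted(pair)) for pair in trades)
    return [(a, b, count) for (a, b), count in sorted(weights.items())]


for team_a, team_b, count in trade_edges(TRADES):
    print(team_a, team_b, count)
```

The count on each edge is what would drive line thickness in a map like this one: the more trades between two teams, the heavier the connection.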
