Thursday 17 July 2014

The Revisionist NBA Draft - Tableau Storyboarding


Welcome to the disclaimer. So yes, this is a slight cheat as I have already posted the basics of this before, but now there is more data and a lot more story / charting. I feel like I only scraped the surface with the last dashboard, so here is a deeper dive into the data.
_______________________________________________________________________________

As a history student, I know there are normally two standard views on any event in history - the traditional and the revisionist. Well, here is the revisionist view of Bill Simmons' reranking of NBA draft picks from the last six years.

For those who haven't come across the NBA draft before, it's a method to distribute the top young talent from colleges and teams across the world to the NBA teams. There is an egalitarian approach to the draft: each year, the teams that don't make the playoffs are put in the lottery to see who gets the 1st pick. The worse your record in the regular season, the higher your chances are of landing that top pick. Therefore, you would expect the top draft picks to turn into the elite players in the league.

Well, this visualisation is a study to determine whether the best players do in fact go to the teams choosing first. Do some teams fare better than others?

This visualisation also acts as my entry into the Third Tableau Iron Viz Competition for 2014, as it takes advantage of the new story points feature in Tableau Desktop version 8.2. This is a really interesting feature that allows the author to guide their audience through the story they have found in the data. But Tableau is focused on allowing the user to explore data as well, and the viewer can still explore the visualisations in the story without affecting the overall flow and what is coming up.

The feature still has a lot of room to develop (more formatting control, altering tooltips, creating multiple charted tabs etc.), but no longer having to build multiple versions of the same chart when you want to show different views of it is a massive step forward.

As always, looking forward to the feedback on whether you enjoy the work or learn anything from it.

Wednesday 9 July 2014

Tableau on Tour - Day Two

Tim Harford - (Mis)Information is Beautiful keynote

- If you are a data visualisation guru you will have come across the Florence Nightingale story of visualisation many times. She was an innovative visualiser, but she was still arguing a point, hence her use of a coxcomb diagram.

- Florence had the dubious honour of creating one of the first infographics. Tim did point out she saved '000s of people but still... Have a read of wtfviz.net - you'll thank Tim for pointing it out.

- Misinformation is an issue in infographics. When height is used to scale icons, the designer often forgets to work out the implications for area.

- Here is the link to Tim's reference to Saturday Morning Breakfast Cereal - http://www.smbc-comics.com/index.php?db=comics&id=3167. A quick reminder of why infographics have to be treated with care.

- Tim showed the example of inequality along the New York subway lines. Here is the London equivalent for house prices - http://www.standard.co.uk/news/london/interactive-map-of-london-underground-shows-how-capitals-house-prices-stack-up-9106901.html

- Tim played Debtris (UK version) by Information is Beautiful, which is available on YouTube. He then pulls apart some of the "costs" referenced by McCandless. Not comparing apples with apples. Visualisation is becoming easy but... "You don't have to think, but I recommend it. Pro tip"

- Tim highlights it's not enough to be beautiful, it has to be sound work. We understand when someone 'smooth talks' us without substance, but not everyone has developed the ability to see the gaps in what is being presented.

- Dazzle camouflage on military ships is a good analogy for infographics.

- Be careful of:
1. People that lie with data
2. People who haven't taken care over how they are visualising
3. People who are not careful of the metrics they use
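
Tim's point about icon height and area can be shown with a little arithmetic. Here is a minimal Python sketch, assuming an icon that keeps its aspect ratio as it grows (the function names and the ratio of 3 are my own illustration, not something from the talk). Scaling the height by the data ratio scales the area by the square of that ratio, so the honest option is to scale the height by the square root.

```python
import math

def height_scaled(ratio):
    """Naive approach: icon height grows in proportion to the data."""
    return ratio

def area_scaled(ratio):
    """Careful approach: icon area grows in proportion to the data,
    so the height only grows with the square root of the ratio."""
    return math.sqrt(ratio)

def perceived_area(height_factor):
    """The icon keeps its aspect ratio, so width scales with height
    and the area scales with the height factor squared."""
    return height_factor ** 2

ratio = 3.0  # hypothetical: the second value is three times the first

naive = perceived_area(height_scaled(ratio))
careful = perceived_area(area_scaled(ratio))

print("height-scaled icon looks", naive, "times bigger")
print("area-scaled icon looks", round(careful, 2), "times bigger")
```

With height scaling, a value three times larger is drawn with nine times the ink, which is exactly the kind of exaggeration to watch out for.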

Matt Francis - Once Upon a Tableau - The How and Why of Story

- 220 slides (seriously!) of hilarity!

- Tableau Public is a great learning resource - find something that makes you go woah and you can download it to learn how it was done.

- Stories are important because they are: more Memorable, Impactful and Relatable

- Matt talks about Sequential Art - a way to transmit human experience, so why not use Sequential Data Visualisation?

- Data viz should be unbiased and neutral (I'd say can it be?). Data stories lead a reader in a certain direction. You are influencing through a guided tour

- You need to:
1. Consider your plot - plan out what you are going to do
2. Consider your audience - harks back to Paul Banoub's idea of presenting to Homer as well as Lisa Simpson

- Matt went through his thinking on creating his:
1. Sunspots visualisation - http://wannabedatarockstar.blogspot.co.uk/2014/02/how-sun-controls-weather_21.html
2. The Greatest F1 driver - http://wannabedatarockstar.blogspot.co.uk/2014/03/who-is-greatest-f1-driver.html
3. Malaria - the global problem - http://wannabedatarockstar.blogspot.co.uk/2014/04/malaria-global-preventable-diesase.html

- "Hit people with the numbers and make it relatable" create an emotional impact, like the use of the school bus in the Malaria viz

- Matt's do's and don'ts:
1. Don't use it for everything. Single dashboards and multiple dashboards are effective too, so question whether a story is the right choice
2. Don't waste space - "each story point must earn its place"
3. Do share the right story point - the 'Share' URL you have on your screen is for the point you are on when you click share.


The Final Keynote - Kenneth Cukier (data editor at the Economist)
- http://www.economist.com/blogs/graphicdetail is an accessible source for some of the work by Kenneth's team

- Kenneth uses the example of the Consumer Price Index (CPI), which is made up (like a lot of financial indices) of a large number of measures. Changes to all underlying factors can now be visualised in seconds.

- "n=all" - before, you used to have to sample; now you can use all of the data. Data visualisation allows you to see all of the data and the trends.

- Information growth in digital sources is exponential and its growth is unlikely to change. Even with this data growth, there is a need to show all of it.

- The use of different visualisations has allowed information never before seen as data to suddenly be visualised and insight taken from it.

- By disaggregating data, you can find new uses and ways to look at this information. For example, employment data like this: http://www.nytimes.com/interactive/2014/06/05/upshot/how-the-recession-reshaped-the-economy-in-255-charts.html?_r=0

- New techniques are still required to tell different and more interesting stories. You have to explore the data without preconceived ideas about what you are going to find. Remove Preconceived Notions! Observe first, answer afterwards.

- Kenneth still battles to get the visualisation that depicts the data best into the magazine, rather than the one that is most easily accessible. He has "failed" multiple times with 'innovative' visualisations, but the home runs he does hit show you need to keep pushing the boundaries.

- The World Cup has been a testbed for the media's large dataset visualisations (social media etc.)

- Need to be aware of access, ethics, privacy, ownership and "data-ism" (a new alchemy that needs to be treated with care and caution). Need to remember the humanity behind the data.

...now off to the London Tableau User Group. Great conference and thank you to all those who made it a special couple of days.


Tuesday 8 July 2014

Tableau on tour - reaches London

Ok, here is my attempt at live blogging so this page will capture my thoughts and key ideas throughout the conference (if the wifi holds up).

Opening keynote:
- 7,000 people expected in Seattle for the global conference this year. Should be epic!

- A great demonstration of Tableau Public through Paul Banoub's cup stacking viz

- Francois Ajenstat giving really good insights into the founding of Tableau. Chris, Christian and Pat had to produce multiple visualisations to describe some code - they wanted to find an easy way to do this.

- Key areas for development for Tableau:
1. Seamless access to data - new data load in 8.2 a key part of this
2. Analytics and statistics for everyone - sophisticated modelling in one click. Mapping the big development in 8.2. Data search to be developed in future releases.
3. Visual Analytics everywhere - Dave Story (former Lucasfilm BI guru, now mobile and strategic growth VP) highlights the 'edit' function on Server, which lets you explore a view further.
4. Storytelling - PowerPoint just isn't enough anymore! Check out my Viz of the Day in the article below to see more.
5. Enterprise - Tableau Online and Public show scalability. New administration functions in 8.2 to add, edit and delete users / workbooks are the latest iteration.
6. Fast, easy, beautiful - Tableau want you to have a "conversation with the data so the software fades away". Subtle changes like responsive marks on maps, seamless movement around maps etc

Paul Banoub's talk on building a Tableau Centre of Excellence
- UBS using Tableau for IT, Business, Finance and HR

- It's not just about Tableau it's about Visualisation best practice.

- Design for Homer as well as Lisa Simpson - don't just build shareable work for people like you. It won't have the same benefit if not.

- Proofs of concept and communities are key ways to find who gets hooked on Tableau. Similar experience for me too, apart from the people who see Tableau over your shoulder and go "I want to do that"

- Virtual Hosting, Monitoring (great work by Mike Roberts formerly of Interworks that I will need to read after), Configuration all need to be considered to create the polished user experience that Tableau is designed for. Landing pages on intranet and style guides help users to get better results sooner (ref. Mark Jackson's blog ugamarkj.blogspot.com)

- Purchasing and onboarding new users is crucial to the user experience. It can really put people off Tableau if you falter and are not timely. *nods*

- Monthly and Quarterly structured training sessions (including Tableau doctor sessions) have helped people excel and drive themselves further.

Bethany Lyons - data blending
- Data blending behaves like a left outer join: it keeps everything from the left (primary) source and brings in just the stuff that matches from the right (secondary) source. The effect of an inner join is also possible by excluding any nulls. You can't do full outer joins through blends.

- Try to aggregate through linking fields for better performance. These will be the most common request anyway.

- The combination of highlighted link icons defines the unique combination of data points to be joined. Over-selecting data links leads to data reduction (data points disappearing) if you just select everything but are not actually concerned with that unique combination of data points.

- To prevent data loss, you need to pad the domain - this means adding null values to ensure there is a crossover between both data sets on all of the linking fields.

- * value of doom (hopefully will be called the Death Star) - occurs when your secondary data source has a many-to-one relationship with the primary data source. This doesn't happen with joins as joins create additional rows of data but likely will duplicate the measure value.

- If you want to filter between two data sources, joins are the way to go. When a field from one source is used as a filter but the second data source has no concept of that field, Tableau can't filter it.

- You could then join these two sources but blend in the measures to avoid duplication of values.
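
To make the left-join behaviour and the * value a bit more concrete, here is a rough Python sketch. It only mimics the behaviour described in the session, it is not how Tableau works internally, and the function name, field names and data are all made up for illustration. The secondary source is aggregated to the linking field first, every primary row is kept, and a dimension with more than one matching value in the secondary source comes back as "*".

```python
def blend(primary, secondary, link, dim, measure):
    """Mimic a blend: keep every primary row and bring in matching,
    pre-aggregated values from the secondary source (hypothetical helper)."""
    totals = {}  # link value -> summed measure from the secondary source
    labels = {}  # link value -> distinct dimension values seen
    for row in secondary:
        key = row[link]
        totals[key] = totals.get(key, 0) + row[measure]
        labels.setdefault(key, set()).add(row[dim])

    result = []
    for row in primary:
        key = row[link]
        seen = labels.get(key, set())
        if not seen:
            label = None              # no match: null, but the row is kept
        elif len(seen) == 1:
            label = next(iter(seen))  # exactly one matching value
        else:
            label = "*"               # several values for one primary row
        result.append({**row, dim: label, measure: totals.get(key)})
    return result

# Made-up data: South has two managers in the secondary source
primary = [{"region": "North"}, {"region": "South"}, {"region": "West"}]
secondary = [
    {"region": "North", "manager": "Ann", "target": 100},
    {"region": "South", "manager": "Bob", "target": 50},
    {"region": "South", "manager": "Cat", "target": 70},
]

blended = blend(primary, secondary, "region", "manager", "target")
for row in blended:
    print(row)

# Excluding the nulls gives the effect of an inner join
inner = [row for row in blended if row["target"] is not None]
```

Note the measure for South is summed once (120) rather than duplicated across rows, which is the difference from a join that Bethany described.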

Jock Mackinlay
- Story arc - Question or problem, Logical sequence or narrative, and Conclusion or Resolution

- Three types of use of visualisation in storytelling - 1. Find, 2. Tell and 3. Explore

- Tell - Jock used a great connected scatterplot on road fatalities in motor vehicles by Hannah Fairfield. The chart runs through the story with annotation all around it, and the body of text sits in the large dead white space. Very much a tell style.

- Use "Information Scent" (highlight function in Tableau) to lead others to explore your data.

- Explore - people don't expect data views to be a place to explore. You need to show them you can and how. D3 line charts used heavily in good examples. Allows people to explore.

- Collaborate - share and you never know where someone will take your data and story to.

- This will truly create a tree of knowledge

Andy Kriebel and Dan Murray
- Dan - "Andy is the smartest dumb guy you'll ever meet" nice!

- The 1st user meeting was in 2009 in Atlanta; Andy Cotgreave created the 1st User Group in the UK in 2010 (a contentious point!)

- 10 rules of running a good user group:
1. Multiple companies and leaders mean that there is a good supply of organisers and spaces to choose from
2. Make it a no sales zone - listen to what the users want
3. Make it routine - monthly or quarterly so people can get it in to their diaries
4. Central location, easy accessibility key to remove excuses not to attend
5. Webinars make it tough as you don't really get to meet people. Allow webex, but just don't promote it
6. Andy - "beer helps"
7. 3-4 hours is the ideal duration
8. What? Guest speaker (can be wider than Tableau), hands on training (helps newbies, don't go too advanced), give a random data set and allow people to build.
9. You want people to get hooked to come back so the takeaway learnings need to be good
10. Always have the next meeting ready to go...

Bethany Lyons - Table Calculations
- All the major work is done within the advanced settings of the table calc. Using these rather than the drop-downs means that as you change the visualisation, the table calculation keeps calculating what you want, by what you want

- Difference from average: SUM([Sales]) - WINDOW_AVG(SUM([Sales]))

- You can nest as many table calculations as you want. You'd be crazy or a zen master to do this but you could

- 'dayofyear' - returns the number of the day within the year and is used as a date part within DATEPART calculation syntax (I've never come across it)

- Last() is the table calculation that I always forget about but saves working on date calculations

- ATTR is a way to aggregate dimensions to be used within table calcs where you are using aggregated measures

- Jittering in a box-plot - use index() and change the partition to be your dimension
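
For anyone who thinks better in code, here is a small Python sketch of what a few of these table calculations compute over a single partition. It is my own approximation with made-up sales numbers, not Tableau's implementation: the difference from average subtracts the window average from each mark, index() is the 1-based position in the partition, and last() counts how many rows remain until the final one.

```python
def window_avg(values):
    """Average across the whole partition, like WINDOW_AVG."""
    return sum(values) / len(values)

def difference_from_average(values):
    """Each value minus the partition average."""
    avg = window_avg(values)
    return [v - avg for v in values]

def index(values):
    """1-based position of each row in the partition, like INDEX()."""
    return list(range(1, len(values) + 1))

def last(values):
    """Rows remaining until the final row, like LAST(); the last row is 0."""
    n = len(values)
    return [n - 1 - i for i in range(n)]

# Hypothetical partition: summed sales for four months
monthly_sales = [100, 150, 200, 350]

print(difference_from_average(monthly_sales))
print(index(monthly_sales))
print(last(monthly_sales))

# Keeping only the latest row without writing any date logic
latest = [v for v, l in zip(monthly_sales, last(monthly_sales)) if l == 0]
```

The final line is why last() saves working on date calculations: filtering on last() = 0 always gives the most recent mark, whatever the dates are.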