Tuesday, 10 October 2017

Vegas Data17 - Opening Keynote Brief Notes

Adam Selipsky
61,000 customers in 100+ countries
Data Myths - created to replace the unknown and create reasons for what happens
#1 AI will replace the analyst. Actually AI is likely to assist the analyst. Tableau is smart software. Drag and drop of Clusters and Trend. Natural Language Processing will take out some of the technical barriers.
#2 Data is only for the analysts. More data programs in Higher Education. 800 million knowledge workers. Excel used to be taught in university, now school children use it. Tableau aiming for the same.
#3 Data governance means no. Data is valuable so needs to be protected. But that’s the old model - the bottleneck has been removed. Governance should means secure enablement.
Honeywell deals in human safety. It needs to have appropriate data ready at appropriate times. Giving multiple environments with clear transitions between them helps to manage creation at all levels. 20,000+ users within two years as the right mix of options gives governance and flexibility.
#4 There can be one, perfect source of truth. Innovation is so rapid that you can predict the different sources and combinations of these. We live in a world of many sources of truth. We have to embrace that. Tableau invests in all of the flexibility that you need. Go past the hype and try the tools. 

Francois Ajenstat
Myth - BI platforms take power away from the people. Often designed for specialist. Tableau focuses on people. 

100+ features added in Tableau over the last year. 50+ came from the community.

10.5 in beta today
Hyper
Linux
Viz in Tooltips 

Data Engine
Hyper is the new Data Engine. Tableau Data Engine was great. But time to scale up. Hyper instant compared to 25 second load where heavy calculations are used. Extract creation will also be massively improved (3 million rows live SQL) double the speed in Hyper. Hyper doesn’t sort the data like a TDE. 500 million rows isn’t possible with TDE, Hyper is possible. No migration necessary - just use 10.5 and it’s there.
Data prep
Still working with existing data prep partners but not everyone has this - introducing Maestro. Data profiles gives you a sense of your data. Filter outliers with normal exclude functionality that you find in Tableau. Grouping through Fuzzy Clustering to sort poorly entered data. Clear view of the changes made through the transformation. Drag and drop joins and unions. Very simple join and cleaning of joins. Maestro in beta this quarter
Extensions API
Makes additions in to Tableau rather than Tableau in to other applications. Dashboards becoming their own applications. Dashboard Extensions create two-way communication with the data source.


Tuesday, 6 June 2017

What’s Next for Tableau - Andrew Beers


“This next hour isn’t about me, it’s all about you”

Three principles and how Tableau is answering them:

  1. People who know the data should ask the questions
Profile: bring domain knowledge rather than the SQL-coding skills to get to the data to answer the questions

2. Software should be designed for deeper thinking
Profile: Not simplistic products, simple ways to do complex things
After the first five minutes of simplicity there has to be more [complexity]...to let you have the impact you want to have

3. Analytics at Scale can drive change
Profile: Analytics at scale can solve the big problems in the world

Last 12 months in the Tableau world has gone from 10.0 to 10.3
Half of the 130 improvements came from ideas on the Ideas forum
Loving that Tableau is on a ‘quarterly’ cadence

So how does Tableau help us achieve this?

Connect
Data Prep: “If you can’t connect to the data then you can’t analyse it”
So need to connect to all data (pdf, cloud files, Eloqua, Service Now etc)
Hybrid engine: Fast Live & In-memory, Cross DB Federation
Data Prep: Self Service, export to csv

PDF - shows all the tables in the pdf when loaded into Tableau. Can pivot
JSON - Tableau detects the structure. When connection to JSON, if you have multiple rows then Tableau will create the LoD measures removing the duplication for you

Answer More Questions
More ways to change lines (regular, stepped or jump lines)
Tooltip selections: Using dimensions in the tooltip to enable highlighting other other marks that allow the common elements to be highlighted.
Dashboard element sizing - ‘Distribute Evenly’
Python / R integration - use SCRIPT_REAL function

Collaboration at Scale
Sharing / Governance
Discussion
Altering
Anywhere Web & Mobile
Tableau is now watching how the whole community is using the common joins done to ‘recommend’ what tables to join on
Alerting - can schedule regularity of alerts

Big changes coming soon:
Linux
Hyper
Maestro

Linux: cost saving / ease of management / technical skills of IT / more security control
Hyper: support for very large data sizes (billions of rows), faster queries, much faster extract creation, scalable (mid summer public beta)
Maestro: Complete visibility for immediate understanding (coordinated views aid understanding)



Saturday, 25 March 2017

Iron Viz - We can't breathe

It’s that time again with IronViz feeder competitions starting up again. Every year, I take this opportunity to explore a technique I haven’t used before, or alternatively, a subject that I care about.

As a keen cyclist in London, Air Quality is a really important subject to me. With increased cycling infrastructure, the move to electronic cars and banning older vehicles from the city centre, I was expecting to download a set of data that showed dramatic falls in the levels of pollutants that are all around us Londoners. Nothing could be further from the truth.



Using data from londonair.org.uk, I looked at the Borough level of detail as there are only three or four monitoring points per Borough. The data proved a challenge with sporadic measurement across an inconstant set of pollutants. This meant that finding trends and understanding geographical patterns using the new Spatial file connector in Tableau was a good challenge.

The biggest break-through I had was whilst researching the subject, was coming across the European legal limits that have come in to force across different pollutants, at various points since 2005. This is shown as the reference line on the trend data to show how often the limits are broken across London.

The answer in many cases is the daily average often doesn’t fall below some of the limits. Worrying stuff!

The spatial mapping allows the reader to see the impact on the city centre. Although Lambeth, outside of the true centre of the City, seems to be hit particularly hard by pollution levels.

As with any IronViz entry, it’s a chance to use a little imagination and have some fun with charting. When I played with some charts, it looked like exhaust fumes so, I kept it and features it at the top of my visualisation.

Overall, the visualisation has led me to want to dig in further in to this subject and what I can do to start making a difference as currently there isn’t any improvement of any note.

Thursday, 10 November 2016

Data16 - Keep it simple stupid by Chris Love

Keep it simple stupid


“Our world is unbounded by complexity”. Tolomy was living at a time of rapid development and built up an enourmous amount of data. He looked at the data and he mapped planetary movements. But he had the earth at the centre of the movement. It made sense.


The helio-centric model made more sense but faced more significant challenges due to the challenges to the culture this theory made.


Sankeys
Made originally but an engineer captain Sankey to look at steam power and energy loss throughout the process of a steam engine.
Minard’s Napolean’s march into Russia is probably the first that is largely used and most infamous


Sankeys are tough to build in Tableau. There are a lot of steps and a lot of techniques. Chris uses the example of Pablo’s Spanish migration and how a basic set of small multiple histogram show up the trends a lot more clearly


It's not just Pablo. A lot of people have used and built a lot of sankeys. It's not wrong but is it the best way to achieve what you are trying to achieve.


Chris highlights Joe Radburn as he is looking at complex subjects but is visualising them simply. This is arguable just a challenging skill to master.


Kuhn - new discoveries are only made when you don't have a preconception as to what the answer is and the best tool to achieve that preconception. Only exploration will lead to discovery.


Data sources should allow people to explore for themselves, or dashboards to let them answer their curiosity, story points are heavily guided.


Design for mobile is making us think more simply about the message we are trying to convey.


Wednesday, 9 November 2016

Pimp my Viz: Tokyo Drift - Jewel Loree

Jewel Loree


Taking crazy tips on how to create cool dashboards


Viz 1: Jewel’s Pokemon Go viz
For device designer, text doesn't resize but images do so take images of your titles
Helper buttons are great and useful - create a basic view of just a shape and the customise the tooltip
Jewel’s way to build unit charts is great - she's going to post the calcs on her blog and I will add a photo later [add photo here - remind me if I don’t]
Bring custom shapes in in the same order as the dimension so you can just click assign palette to save having to allocate them all individually
Mapbox used to create a similar map style to match Pokemon Go’s maps
To have different vizzes on your device specific dashboards, then just have the floating element just catching the edge of the dash (please make this easier tableau)


Viz 2: Data Breaches by Marc Schonwadt
There is a lot of custom formatting that you can do with text.
Check out the layout tab to see how someone has built their viz
Hiding index menus using layout containers (set up the hidden element as sitting at negative whatever the width of the layout container.
Jewel just copied across the dashboard which pulled in Robert Rouse’s technique dash to take his helper data with it. Make sure you don’t fit the width as it no longer collapse.


Everyday Pimping
Use custom colours
Do more with the Marks Card
Create custom headers
Always fix your tooltips
Thoughtful interactions


Putting a label on the bar and take away the header. Put the rows on the label and align the text to the left. Then fatten your bars.
Sort your bubble charts but sorting your dimension and sort on what is setting the size
Create custom header called Canva




Data16 - Developers on Stage

Analytics
Automatic drill - level of detail goes deeper as you zoom in
Map scaling added in to the maps
Spatial file connector coming in 10.2 beta
Python integration - using python scripts in calculated fields
Tooltip selection - click on the categorical fields in the tooltip to highlight by that selections
Date filters - filter to latest date rather than being stuck on what you originally published as.
Step and Jump lines - squarer trend lines
Advanced conditional formatting


Dashboards and Stories
Distribute evenly to space out your objects on a dashboard
Can add margin around all objects on your dashboard. No more blanks to separate your charts
Expressive text editor - add images and URLs in to text objects (including tooltips)
Web authoring has more right click functionality than before
Story points on the web
Full screen viz on the web


Mobile
Direct linking from subscription emails and condition warnings
Smooth tooltips for mobile. Also easier selection of small marks in a movie cuz. Like selecting where to enter text on a phone when holding down your finger.
Commenting on the go (not just in Desktop and Server)
Offline interactivity


Data
Joins on calculated fields
Database unions
New data connectors: pdfs, JSON


Extensibility
Server client library makes it easier to write simple scripts
RET API: JSON and CORS support
GetData() 3rd Party Charting libraries
Mobile App bootstrap - on GitHub now




Tuesday, 8 November 2016

Cross Database Joins - Bethany Lyons and Alex Ross

Bethany Lyons and Alex Ross


The unexpected solution to many tough analytics problems


Bethany has looked forward to delivering this session for a long time as CDJ (Cross Database Joins) can be used to solve so much


Often most analysts have read only access so don't have the chance to create data
Identify (create vs deriving)
Understand (complexity and volume vs performance)
Use - focus on CDJs as the solution


Example 1
Taking a 12 month subscription from just one row and spreading it across the 12 months (on a monthly basis) to show the monthly revenue. Use a simple excel scaffold of month number and a key of 1 to create the product join.


To create the filter of finding when the contracts will earn revenue up to, Bethany used a Boolean filter calf but then added it to the data source filters to cut down on the processing done locally on your machine.


V10.2 adding joins based on calcs


Example 2


Counting staff employed at any single point in time - use missing values to fill the gaps on the table calcs


A scaffold of all dates is needed though and the just return those that are after hire and pre-termination date.


What if the scaffold creates huge amounts of rows?
I.e. If you have seconds a tool is active, you can scaffold on a day level and the create a calc that then counts a full day’s worth of seconds (86,400) but if a partial day then use a datediff() calc