DV Posts

2400 years ago the concept of Data Visualization was less known, but even then Plato said “Those who tell stories rule society”.


I have witnessed multiple times how storytelling triggered Venture Capitalists (VCs) to invest. Usually my CEO (the biggest BS master on our team) would start with a 60-second short Story (VCs call it an “Elevator Pitch”). Then, if interested, the VCs would do long Due Diligence Research on the Data (and Specs, Docs and Code) presented by our team, and after that they would spend a comparable amount of time analyzing Data Visualizations (Charts, Diagrams, Slides etc.) of our Data, trying to prove or disprove the original Story.

Some of the conclusions from all this startup storytelling activity were:

  • Data: without Data nothing can be proved or disproved (Action needs Data!)

  • View: best way to analyze Data and trust it is to Visualize it (Seeing is Believing!)

  • Discovery of Patterns: visually discoverable trends, outliers, clusters etc. which form the basis of the Story and follow-up actions

  • Story: the Story (based on that Data) is the Trigger for the Actions (Story shows the Value!)

  • Action(s): start with a drill-down to the needle in the haystack and embed Data Visualization into the business; it is not Eye Candy but a practical way to improve the business

  • Data Visualization has 5 parts: Data (main), View (enabler), Discovery (visually discoverable Patterns), Story (trigger for Actions) and finally the 5th Element – Action!

  • Life is not fair: Storytellers were the people who benefited the most in the end… (no Story no Glory!).


And yes, Plato was correct – at least partially and for his time. The diagram above uses an analogy with the 5 Classical Greek Elements. Plato wrote about the four classical elements (earth, air, water, and fire) almost 2400 years ago (citing an even more ancient philosopher) and his student Aristotle added a fifth element, aithêr (aether in Latin, “ether” in English) – both men are in the center of the 1st picture above.

Back to our time: Storytelling is a hot topic; enthusiasts are saying that “Data is easy, good storytelling is the challenge” http://www.resource-media.org/data-is-easy/#.URVT-aVi4aE or even that “Data Science is Storytelling”: http://blogs.hbr.org/cs/2013/03/a_data_scientists_real_job_sto.html . Nothing could be further from the truth: my observation is that most Storytellers (with a few known exceptions like Hans Rosling or Tableau founder Pat Hanrahan) ARE NOT GOOD at visualizing but they still wish to participate in our hot Data Visualization party. All I can say is “Welcome to the party!”

It may be a challenge for me and you, but not for the people who held a conference about storytelling this winter, on 2/27/13 in Nashville, TN: http://www.tapestryconference.com/

Some more reasonable people refer to storytelling as data journalism and narrative visualization: http://www.icharts.net/blogs/2013/pioneering-data-journalism-simon-rogers-storytelling-numbers

Tableau founder Pat Hanrahan recently talked about “Showing is Not Explaining”. In parallel, Tableau is planning (after version 8.0) to add features that support storytelling by constructing visual narratives and effective communication of ideas, see it here:

A collection of resources on the storytelling topic can be found here: http://www.juiceanalytics.com/writing/the-ultimate-collection-of-data-storytelling-resources/

You may also want to check what Stephen Few thinks about it here: http://www.perceptualedge.com/blog/?p=1632

Storytelling is an important part of Data Visualization (in the Greek Analogy it is the 4th Classical Element (Air), after Data (Earth), View (Water) and Discovery (Fire) and before Action (Aether)), and it has a practical effect on the Visualization itself, for example:

  • if a Data View is not needed for the Story or for further Actions, then it can be hidden or removed,

  • if the number of Data Views in a Dashboard is affecting the impact of the (preferably short) Data Story, then the number of Views should be reduced (usually to 2 or 3 per dashboard),

  • if the number of DataPoints per View is too large and is affecting the triggering power of the story, then it can be reduced too (in conversations with Tableau they even recommended 5000 Datapoints per View as a threshold between Local and Server-based rendering).


Below you can find samples of Guidelines and Good Practices for Data Visualization (mostly with Tableau), which I used recently.

Some of these samples are Tableau-specific, but others (maybe with modifications) can be reused for other Data Visualization Platforms and tools. I will appreciate feedback, comments and suggestions.

Naming Convention for Tableau Objects

  • Use CamelCase Identifiers: Capitalize the 1st letter of each concatenated word

  • Use Suffix for Identifiers with preceding underscore to indicate the type (example: _tw for workbooks).
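The naming convention above can be checked mechanically. Below is a minimal sketch in Python; only the `_tw` suffix for workbooks comes from the guideline, while the other suffixes, the function name `check_name` and the sample names are hypothetical and are there only for illustration.

```python
import re

# Only "_tw" (workbooks) comes from the guideline above;
# the other suffixes are hypothetical examples.
TYPE_SUFFIXES = {"_tw": "workbook", "_td": "dashboard", "_ds": "data source"}

# CamelCase stem: one or more words, each starting with a capital letter.
CAMEL_CASE = re.compile(r"^(?:[A-Z][a-z0-9]*)+$")


def check_name(name):
    """Return the object type if the name follows the convention, else None."""
    for suffix, object_type in TYPE_SUFFIXES.items():
        if name.endswith(suffix):
            stem = name[: -len(suffix)]
            return object_type if CAMEL_CASE.match(stem) else None
    return None  # no known type suffix


print(check_name("SalesByRegion_tw"))    # follows the convention
print(check_name("sales_by_region_tw"))  # stem is not CamelCase
```

A check like this could run over a list of published object names before a release, so that misnamed objects are caught early.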

Workbook Sizing Guidelines

  • Use fewer than 5 Charts per Dashboard; minimize the number of Visible TABs/Worksheets

  • Move Calculations and Functions from Workbook to the Data.

  • Use fewer than 5000 Data-points per Chart/Dashboard to enable Client-side rendering.

  • To enable Shared Sessions, don’t use filters and interactivity if it is not needed.
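The two numeric limits above (fewer than 5 charts, fewer than 5000 data points) are easy to turn into a quick sanity check. The sketch below is an illustration only: it assumes you already know the number of data points per chart (for example, from row counts of the underlying queries), and the function name `sizing_warnings` is hypothetical, not part of any Tableau API.

```python
MAX_CHARTS_PER_DASHBOARD = 5         # guideline: fewer than 5 charts
MAX_POINTS_FOR_CLIENT_RENDER = 5000  # guideline: fewer than 5000 data points


def sizing_warnings(chart_point_counts):
    """chart_point_counts: number of data points for each chart on one dashboard."""
    warnings = []
    if len(chart_point_counts) >= MAX_CHARTS_PER_DASHBOARD:
        warnings.append(
            f"{len(chart_point_counts)} charts on the dashboard, "
            f"guideline is fewer than {MAX_CHARTS_PER_DASHBOARD}"
        )
    for number, points in enumerate(chart_point_counts, start=1):
        if points >= MAX_POINTS_FOR_CLIENT_RENDER:
            warnings.append(
                f"chart {number} has {points} data points, "
                f"guideline is fewer than {MAX_POINTS_FOR_CLIENT_RENDER}"
            )
    return warnings


for message in sizing_warnings([1200, 7500, 300]):
    print(message)
```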

Guidelines for Colors, Fonts, Sizes

  • To express desirable/undesirable points, use green for good, red for bad, yellow for warning.

  • When you are not describing a “Good-Bad situation” (thanks to the feedback of a visitor under the alias “SF”), try to use pastel, neutral and color-blind-friendly colors, e.g. similar to the “Color Blind 10” Palette from Tableau.

  • Use “web-safe” fonts, to approximate what users can see from Tableau Server.

  • Use either auto-resize or standard (target smaller screen) sizes for Dashboards

Data and Data Connections used with Tableau

  • Try to avoid pulling more than 15000 rows for Live Data Connections.

  • For Data Extract-based connections 10M rows is the recommended maximum.

  • For widely distributed Workbooks use Application IDs instead of Personal Credentials.

  • Job failure due to expired credentials leads to suspension from the Schedule, so try to keep embedded credentials up to date


Tableau Data Extracts (TDE)

  • If a Refresh of a TDE takes more than 2 hours, consider redesigning it.

  • Reuse and share TDEs and Data Sources as much as possible.

  • Use Incremental Data Refresh instead of Full Refresh when possible.

  • Designate a Unique ID for each row when Incremental Data Refresh is used.

  • Try to use the free Tableau Data Extract API instead of the licensed Tableau Server to create Data Extracts
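To show why a Unique ID per row matters for Incremental Data Refresh, here is a minimal sketch of the idea in plain Python. It is not the Tableau Data Extract API: rows are modeled as dictionaries, the column name `row_id` is hypothetical, and the sketch assumes the ID only grows over time, so "new" rows are those with an ID larger than anything already in the extract.

```python
def incremental_refresh(extract_rows, source_rows, id_key="row_id"):
    """Append only the source rows whose ID is greater than the largest ID in the extract."""
    last_id = max((row[id_key] for row in extract_rows), default=None)
    new_rows = [
        row for row in source_rows
        if last_id is None or row[id_key] > last_id
    ]
    return extract_rows + new_rows


extract = [{"row_id": 1, "sales": 10}, {"row_id": 2, "sales": 20}]
source = extract + [{"row_id": 3, "sales": 30}, {"row_id": 4, "sales": 40}]

refreshed = incremental_refresh(extract, source)
print([row["row_id"] for row in refreshed])
```

Without a reliable ID there is no cheap way to tell which rows are new, and the only safe option is a Full Refresh.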

Scheduling of Background Tasks with Tableau

  • Serial Schedules are recommended; avoid the usage of hourly Schedules.

  • Avoid scheduling during peak hours (8am-6pm), consider weekly instead of daily schedules.

  • Optimize Schedule Size: group tasks related to the same project into one Schedule; if the total task execution time exceeds 8 hours, split the Schedule into a few with similar Names but preferably with different starting times.

  • Maximize the usage of Monthly and Weekly Schedules (as opposed to Daily Schedules) and the usage of weekends and nights.
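The 8-hour rule above can be sketched as a simple greedy split. This is only an illustration under stated assumptions: the task durations are estimates you supply yourself, the function name `split_schedule` and the `_1`, `_2` naming of the resulting Schedules are hypothetical, and nothing here calls Tableau Server.

```python
MAX_SCHEDULE_HOURS = 8.0  # guideline: split when total execution exceeds 8 hours


def split_schedule(name, task_hours):
    """task_hours: list of (task_name, estimated_hours) pairs, kept in order.

    Returns a dict mapping schedule name to the list of task names in it.
    """
    schedules = []
    current, current_total = [], 0.0
    for task, hours in task_hours:
        # Start a new schedule when adding this task would exceed the limit.
        if current and current_total + hours > MAX_SCHEDULE_HOURS:
            schedules.append(current)
            current, current_total = [], 0.0
        current.append(task)
        current_total += hours
    if current:
        schedules.append(current)
    if len(schedules) == 1:
        return {name: schedules[0]}
    # Similar names for the split parts, as the guideline suggests.
    return {f"{name}_{i}": tasks for i, tasks in enumerate(schedules, start=1)}


print(split_schedule("Sales", [("a", 3), ("b", 4), ("c", 2), ("d", 5)]))
```

Each resulting Schedule can then be given its own starting time, preferably at night or on weekends.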

Guidelines for using Charts

  • Use Bars to compare across categories, use Colors with Stacked or Side-by-Side Bars for deeper Analysis

  • Use Line for Viewing Trends over time, consider Area Charts for Multi-lines

  • Minimize the usage of Pie Charts; when appropriate, use them for showing proportions. It is recommended to limit pie wedges to six.

  • Use Map to show geocoded data, consider using maps as interactive filters

  • Use Scatter to analyze outliers, clusters and construct regressions


You can find more about Guidelines and Good Practices for Data Visualization here: http://www.tableausoftware.com/public/community/best-practices

The most popular approach to visualization (among business users) is to use a Data Visualization (DV) tool like Tableau (or Qlikview or Spotfire), where a lot of features are already implemented for you. Recent proof of this amazing popularity is that at least 100 million people (as of February 2013) used Tableau Public as their Data Visualization tool of choice, see


However, to make your documents and stories (and not just your data visualization applications) driven by your data, you may need the other approach: to code the visualization of your data into your story, and visualization libraries like the popular D3 toolkit can help you. D3 stands for “Data-Driven Documents”. The Author of D3, Mr. Mike Bostock, designs interactive graphics for the New York Times – one of the latest samples is here:


and NYT allows him to do a lot of Open Source work, which he demonstrates at his website here:

https://github.com/mbostock/d3/wiki/Gallery .


Mike was a “visualization scientist” and a computer science PhD student at Stanford University and a member of the famous group of people now called the “Stanford Visualization Group”:


This Visualization Group was the birthplace of Tableau’s prototype – sometimes they called it “a Visual Interface” for exploring data, and the other name for it is Polaris:


and we know that the creators of Polaris started Tableau Software. Another of the Group’s popular “products” was a graphical toolkit for Visualization (mostly in JavaScript, as opposed to Polaris, which was written in C++), called Protovis:


- and Mike Bostock was one of Protovis’s main co-authors. Less than 2 years ago the Visualization Group suddenly stopped developing Protovis and recommended that everybody switch to the D3 library


authored by Mike. This library is Open Source (only 100KB in ZIP format) and can be downloaded from here:



In order to use D3, you need to be comfortable with HTML, CSS, SVG, JavaScript programming, DOM (and other Web Standards); an understanding of the jQuery paradigm will be useful too. Basically, if you want to be at least partially as good as Mike Bostock, you need to have the mindset of a programmer (I guess in addition to a business user mindset), like this D3 expert:


Most of the successful early D3 adopters combine even 3+ mindsets: programmer, business analyst, data artist and sometimes even data storyteller. For your programmer’s mindset you may be interested to know that D3 has a large set of Plugins, see:


and a rich API, see https://github.com/mbostock/d3/wiki/API-Reference

You can find hundreds of D3 demos, samples, examples, tools, products and even a few companies using D3 here: https://github.com/mbostock/d3/wiki/Gallery


This is Part 2 of the guest blog post: the Review of Visual Discovery products from Advizor Solutions, Inc., written by my guest blogger Mr. Srini Bezwada (his profile is here: http://www.linkedin.com/profile/view?id=15840828 ), who is the Director of Smart Analytics, a Sydney-based professional BI consulting firm that specializes in Data Visualization solutions. Opinions below belong to Mr. Srini Bezwada.

ADVIZOR Technology

ADVIZOR’s Visual Discovery™ software is built upon strong data visualization tools technology spun out of a distinguished research heritage at Bell Labs that spans nearly two decades and produced over 20 patents. Formed in 2003, ADVIZOR has succeeded in combining its world-leading data visualization and in-memory-data-management expertise with extensive usability knowledge and cutting-edge predictive analytics to produce an easy to use, point and click product suite for business analysis.

ADVIZOR readily adapts to business needs without programming and without implementing a new BI platform, leverages existing databases and warehouses, and does not force customers to build a difficult, time consuming, and resource intensive custom application. Time to deployment is fast, and value is high.

With ADVIZOR, data is loaded into a “Data Pool” in main memory on a desktop or laptop computer, or server. This enables sub-second response time on any query against any attribute in any table, and instantaneous updates of all visualizations. Multiple tables of data are easily imported from a variety of sources.

With ADVIZOR, there is no need to pre-configure data. ADVIZOR accesses data “as is” from various data sources, and links and joins the necessary tables within the software application itself. In addition, ADVIZOR includes an Expression Builder that can perform a variety of numeric, string, and logical calculations as well as parse dates and roll-up tables – all in-memory. In essence, ADVIZOR acts like a data warehouse, without the complexity, time, or expense required to implement a data warehouse! If a data warehouse already exists, ADVIZOR will provide the front-end interface to leverage the investment and turn data into insight.
Data in the memory pool can be refreshed from the core databases / data sources “on demand”, or at specific time intervals, or by an event trigger. In most production deployments data is refreshed daily from the source systems.

Data Visualization

ADVIZOR’s Visual Discovery™ is a full visual query and analysis system that combines the excitement of presentation graphics – used to see patterns and trends and identify anomalies in order to understand “what” is happening – with the ability to probe, drill-down, filter, and manipulate the displayed data in order to answer the “why” questions. Conventional BI approaches (pre-dating the era of interactive Data Visualization) to making sense of data have involved manipulating text displays such as cross tabs, running complex statistical packages, and assembling the results into reports.

ADVIZOR’s Visual Discovery™ makes the text and graphics interactive. Not only can the user gain insight from the visual representation of the data, but now additional insight can be obtained by interacting with the data in any of ADVIZOR’s fifteen (15) interactive charts, using color, selection, filtering, focus, viewpoint (panning, zooming), labeling, highlighting, drill-down, re-ordering, and aggregation.

Visual Discovery empowers the user to leverage his or her own knowledge and intuition to search for patterns, identify outliers, pose questions and find answers, all at the click of a mouse.

Flight Recorder – Track, Save, Replay your Analysis Steps

The Flight Recorder tracks each step in a selection and analysis process. It provides a record of those steps, and can be used to repeat previous actions. This is critical for providing context to what an end-user has done and where they are in their data. Flight records also allow setting bookmarks, and can be saved and shared with other ADVIZOR users.
The Flight Recorder is unique to ADVIZOR. It provides:
• A record of what a user has done. Actions taken and selections from charts are listed. Small images of charts that have been used for selection show the selections that were made.
• A place to collect observations by adding notes and capturing images of other charts that illustrate observations.
• A tool that can repeat previous actions, in the same session on the same data or in a later session with updated data.
• The ability to save and name bookmarks, and share them with other users.

Predictive Analytics Capability

The ADVIZOR Analyst/X is a predictive analytic solution based on a robust multivariate regression algorithm developed by KXEN – a leading-edge advanced data mining tool that models data easily and rapidly while maintaining relevant and readily interpretable results.
Visualization empowers the analyst to discover patterns and anomalies in data by noticing unexpected relationships or by actively searching. Predictive analytics (sometimes called “data mining”) provides a powerful adjunct to this: algorithms are used to find relationships in data, and these relationships can be used with new data to “score” or “predict” results.


Predictive analytics software from ADVIZOR doesn’t require enterprises to purchase platforms. And, since all the data is in-memory, the Business Analyst can quickly and easily condition data and flag fields across multiple tables without having to go back to IT or a DBA to prep database tables. The interface is entirely point-and-click; there are no scripts to write. The biggest benefit from the multi-dimensional visual solution is how quickly it delivers analysis, solving critical business questions, facilitating intelligence-driven decision making, providing instant answers to “what if?” questions.

Advantages over Competitors:

• The only product in the market offering a combination of Predictive Analytics + Data Visualisation + In memory data management within one Application.
• The cost of entry is lower than the market leading data visualization vendors for desktop and server deployments.
• Advanced Visualizations like Parabox, Network Constellation in addition to normal bar charts, scatter plots, line charts, Pie charts…
• Integration with leading CRM vendors like Salesforce.com, Blackbaud, Ellucian, Information Builder
• Ability to provide sub-second response time on query against any attribute in any table, and instantaneously update all visualizations.
• Flight recorder that lets you track, replay, and save your analysis steps for reuse by yourself or others.

Update on 5/1/13 (by Andrei): ADVIZOR 6.0 is available now with substantial enhancements: http://www.advizorsolutions.com/Bnews/tabid/56/EntryId/215/ADVIZOR-60-Now-Available-Data-Discovery-and-Analysis-Software-Keeps-Getting-Better-and-Better.aspx

I doubt that Microsoft is paying attention to my blog, but recently they declared that Power View now has 2 versions: one for SharePoint (thanks, but no thanks) and one for Excel 2013. In other words, Microsoft decided to have its own Desktop Visualization tool. In combination with PowerPivot and SQL Server 2012 it can be attractive for some Microsoft-oriented users, but I doubt it can compete with the Data Visualization Leaders – too late.

Most interesting is the note about Power View 2013 on the Microsoft site: “Power View reports in SharePoint are RDLX files. In Excel, Power View sheets are part of an Excel XLSX workbook. You can’t open a Power View RDLX file in Excel, and vice versa. You also can’t copy charts or other visualizations from the RDLX file into the Excel workbook.”

But most amazing is that Microsoft decided to use the dead Silverlight for Power View: “Both versions of Power View need Silverlight installed on the machine.” And we know that Microsoft switched from Silverlight to HTML5 and no new development is planned for Silverlight! Good luck with that…

And yes, you can now add maps (Bing of course), see it here:

(this is a repost from my other Data Visualization blog: http://tableau7.wordpress.com/2012/05/31/tableau-as-container/ )

Often I use small Tableau (or Spotfire or Qlikview) workbooks instead of PowerPoint, which proves at least 2 concepts:

  • A good Data Visualization tool can be used as the Web or Desktop Container for Multiple Data Visualizations (it can be used to build hierarchical Container Structures with more than 3 levels; currently 3: Container-Workbooks-Views)

  • It can be used as the replacement for PowerPoint; in the example below I embedded into this Container 2 Tableau Workbooks, one Google-based Data Visualization, 3 image-based Slides and a Textual Slide: http://public.tableausoftware.com/views/TableauInsteadOfPowerPoint/1-Introduction

  • Tableau (or Spotfire or Qlikview) is better than PowerPoint for Presentations and Slides

  • Tableau (or Spotfire or Qlikview) is the Desktop and the Web Container for Web Pages, Slides, Images, Texts

  • Good Visualization Tool can be a Container for other Data Visualizations

  • Sample Tableau Presentation above contains the Introductory Textual Slide

  • Sample Tableau Presentation above contains a few Tableau Visualizations:

    1. The Drill-down Demo

    2. The Motion Chart Demo (6 dimensions: X, Y, Shape, Color, Size, Motion in Time)

  • This Tableau Presentation contains a Web Page with the Google-based Motion Chart Demo

  • This Tableau Presentation contains a few Image-based Slides:

    1. The Quick Description of Origins and Evolution of Software and Tools used for Data Visualizations during last 30+ years

    2. The Description of Multi-level Projection from Multidimensional Data Cloud to Datasets, Multidimensional Cubes and to Chart

    3. The Description of 6 stages of Software Development Life Cycle for Data Visualizations

(this is a repost from my Tableau blog: http://tableau7.wordpress.com/2012/04/02/palettes-and-colors/ )

I have always been intrigued by colors and their usage, ever since my mom told me that maybe (just maybe, there is no direct proof of it anyway) the Ancient Greeks did not know what the color BLUE is – that puzzled me.

Later in my life, I realized that Colors and Palettes play a huge role in Data Visualization (DV), and it eventually led me to try to understand how they can be used and pre-configured in advanced DV tools to make Data more Visible and to express Data Patterns better. For this post I used Tableau to produce some palettes, but similar techniques can be found in Qlikview, Spotfire etc.

Tableau published a good article on how to create customized palettes here: http://kb.tableausoftware.com/articles/knowledgebase/creating-custom-color-palettes and I followed it below. As this article recommended, I modified the default Preferences.tps file; see it below with images of the respective Palettes embedded.

For the first, regular Red-Yellow-Green-Blue Palette of known colors with well-established names, I even created a Visualization in order to compare their Red-Green-Blue components, and I even tried to place the respective Bubbles on a 2-dimensional surface, even though originally it is clearly a 3-dimensional Dataset (click on image to see it in full size):

For the 2nd Red-Yellow-Green-NoBlue Ordered Sequential Palette, I tried to implement the extended “Set of Traffic Lights without any trace of BLUE Color” (so Homer and Socrates would understand it the same way as we do) while trying to use only web-safe colors. Please keep in mind that Tableau does not have a simple way to have more than 20 colors in one Palette, like Spotfire does.

The other 5 Palettes below are useful too as ordered-diverging, almost “mono-chromatic” palettes (except Red-Green Diverging, which can be used in Scorecards where Red is bad and Green is good). So see below the Preferences.tps file with my 7 custom palettes.
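To compare the Red-Green-Blue components of named colors outside of a DV tool, the hex codes can be split into their three components with a few lines of Python. This is a minimal sketch; the function name `hex_to_rgb` is hypothetical, and the two sample colors are taken from the regular palette in this post.

```python
def hex_to_rgb(hex_color):
    """Convert a color like '#E25822' into a (red, green, blue) tuple of 0-255 integers."""
    value = hex_color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


for color in ("#E25822", "#FFA07A"):
    print(color, hex_to_rgb(color))
```

The resulting three numbers per color are exactly the 3-dimensional Dataset mentioned above.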

<?xml version='1.0'?> <workbook> <preferences>
<color-palette name="RegularRedYellowGreenBlue" type="regular">
<color>#FF0000</color> <color>#800000</color> <color>#B22222</color>
<color>#E25822</color> <color>#FFA07A</color> <color>#FFFF00</color>
<color>#FF7E00</color> <color>#FFA500</color> <color>#FFD700</color>
<color>#F0e68c</color> <color>#00FF00</color> <color>#008000</color>
<color>#00A877</color> <color>#99cc33</color> <color>#009933</color>
<color>#0000FF</color> <color>#00FFFF</color> <color>#008080</color>
<color>#FF00FF</color> <color>#800080</color> </color-palette>

<color-palette name="RedYellowGreenNoBlueOrdered" type="ordered-sequential" >
<color>#ff0000</color> <color>#cc6600</color> <color>#cccc00</color>
<color>#ffff00</color> <color>#99cc00</color> <color>#009900</color> </color-palette>

<color-palette name="RedToGreen" type="ordered-diverging" >
<color>#ff0000</color> <color>#009900</color> </color-palette>

<color-palette name="RedToWhite" type="ordered-diverging" >
<color>#ff0000</color> <color>#ffffff</color></color-palette>

<color-palette name="YellowToWhite" type="ordered-diverging" >
<color>#ffff00</color> <color>#ffffff</color></color-palette>

<color-palette name="GreenToWhite" type="ordered-diverging" >
<color>#00ff00</color> <color>#ffffff</color></color-palette>

<color-palette name="BlueToWhite" type="ordered-diverging" >
<color>#0000ff</color> <color>#ffffff</color> </color-palette>
</preferences> </workbook>
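Writing such a file by hand makes it easy to lose a closing tag or a quote, so a palette file with the same structure can also be generated. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `build_preferences` and the dictionary layout are hypothetical, and the element and attribute names simply mirror the Preferences.tps file above.

```python
import xml.etree.ElementTree as ET


def build_preferences(palettes):
    """palettes: dict mapping palette name to (palette type, list of hex colors).

    Returns the text of a Preferences.tps-style file as a string.
    """
    workbook = ET.Element("workbook")
    preferences = ET.SubElement(workbook, "preferences")
    for name, (palette_type, colors) in palettes.items():
        palette = ET.SubElement(
            preferences, "color-palette", {"name": name, "type": palette_type}
        )
        for hex_color in colors:
            ET.SubElement(palette, "color").text = hex_color
    return "<?xml version='1.0'?>\n" + ET.tostring(workbook, encoding="unicode")


tps_text = build_preferences({
    "RedToGreen": ("ordered-diverging", ["#ff0000", "#009900"]),
    "BlueToWhite": ("ordered-diverging", ["#0000ff", "#ffffff"]),
})
print(tps_text)
```

Because the XML is built as a tree, every `color-palette` element is closed automatically; the output would still need to be saved as Preferences.tps in the location described in the Tableau article.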

If you wish to use the colors you like, this site is very useful for exploring the properties of different colors: http://www.perbang.dk/rgb/
