Popular


Visitors to this blog keep asking me to estimate Tableau Software prices (including for Tableau Online), even though Tableau publishes all non-server prices on its website here: https://tableau.secure.force.com/webstore . However, this does not include discounts (especially for enterprise-volume purchases), pricing for servers of any kind (at least 2 kinds of server licenses exist), or pricing for consulting and training.

Thanks to the website of Tableau Partner “Triad Technology Partners” we have a good estimate of all Tableau prices (they are always subject to negotiation) in the form of the so-called GSA Schedule (General Services Administration, Federal Acquisition Service, Special Items: No. 132-33 Perpetual Software Licenses, No. 132-34 Maintenance of Software as a Service, No. 132-50 Training Courses) for Tableau Software Products and Services, see it here:

http://www.triadtechpartners.com/vendors/tableau-software/ , with TRIAD’s other GSA contracts here (for example, they include prices for IBM Cognos and others):
http://www.triadtechpartners.com/contracts/ and the specific Tableau prices here:
http://www.triadtechpartners.com/wp-content/uploads/Tableau-GSA-Price-List-April-2013.pdf

I grouped Tableau’s prices (please keep in mind that TRIAD published the GSA schedule in April 2013, so these are 1-year-old prices, but they are good enough for estimating purposes) into the 5 groups below: Desktop, Server licensed for Named Users (makes sense if you have fewer than a hundred “registered” users), Core Licenses for Tableau Server (recommended when you have more than 150 “registered” users), Consulting, and Training:

The Google sheet for the spreadsheet above is here:

https://docs.google.com/spreadsheets/d/1oCyXRR3B6dqXcw-8cE05ApwsRcxckgA6QdvF9aF6_80/edit?usp=sharing
and an image of it – for those whose browsers misbehave – is below:
TableauPrices2013

Again, please keep in mind that the above is just an estimate of prices (except for Tableau Online), based on the 2013 GSA Schedule, and a good negotiator can always get a good discount (I got one each time I tried). You may also wish to review a more general article from Boris Evelson here:

http://blogs.forrester.com/boris_evelson/14-04-22-a_common_denominator_for_pricing_and_negotiating_business_intelligence_bi_and_analytics_software#comment-27689

A note about the choice between a Core License and a Server License with Named Users: I know organizations that chose to keep Named User licensing instead of switching to a Core License even with more than 300 registered users, because it allows them to use much more capable hardware (with many more CPU cores).
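
To make that trade-off concrete, here is a minimal sketch (Python) of the break-even arithmetic; both prices below are hypothetical placeholders, not GSA numbers – substitute whatever quotes you negotiate:

```python
# Break-even between Named User and Core licensing for Tableau Server.
# Both prices are hypothetical placeholders - use your negotiated quotes.
NAMED_USER_PRICE = 1_000    # per registered ("named") user, hypothetical
CORE_PRICE = 40_000         # per CPU core, hypothetical
CORES_NEEDED = 8            # cores required for the expected workload

core_total = CORE_PRICE * CORES_NEEDED
break_even_users = core_total / NAMED_USER_PRICE

print(f"Core licensing costs ${core_total:,}, i.e. it wins above "
      f"~{break_even_users:.0f} registered users")
# Caveat from the note above: a Core License also caps your hardware -
# every extra CPU core needs a license, while Named User licensing does not.
```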

Observing and comparing multiple (similar) multidimensional objects over time and visually discovering multiple interconnected trends is the ultimate Data Visualization task, regardless of the specific research area – it can be chemistry, biology, economics, sociology, publicly traded companies or even so-called “Data Science”.

For the purposes of this article I like the dataset published by the World Bank: 1000+ Measures (they call it World Development Indicators) for 250+ countries over 50+ years – theoretically more than 10 million DataPoints:

http://data.worldbank.org/data-catalog/world-development-indicators?cid=GPD_WDI

Of course some DataPoints are missing, so I restricted myself to 20 countries, 20 years and 25 measures (a more reasonable Dataset with about 10,000 DataPoints). That gave me 500 Time Series for 20 Objects (Countries), and I tried to imitate how Analysts and Scientists would use Visualizations to “discover” Trends and other Data Patterns in such a situation and to extrapolate, if possible, this approach to more massive Datasets in practical projects. My visualization of this Dataset can be found here:

http://public.tableausoftware.com/views/wdi12/Trends?amp;:showVizHome=no
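
For readers who want to reproduce that reduction, below is a minimal pandas sketch. It assumes the layout of the WDI bulk-download CSV (one row per country/indicator pair, one column per year); the file name and the country/indicator subsets are illustrative:

```python
# Reduce the World Bank WDI bulk CSV to a small set of time series.
# Assumes the WDI bulk-download layout: one row per (country, indicator),
# one column per year. File name and subsets below are illustrative.
import pandas as pd

wdi = pd.read_csv("WDI_Data.csv")

countries = ["United States", "China", "Japan", "India"]   # up to 20 in the post
indicators = ["NY.GDP.MKTP.CD", "SP.DYN.LE00.IN"]          # up to 25 in the post
years = [str(y) for y in range(1993, 2013)]                # 20 years

subset = wdi[wdi["Country Name"].isin(countries)
             & wdi["Indicator Code"].isin(indicators)]

# Melt the year columns into rows: one (country, indicator, year, value)
# record per row, i.e. one time series per (country, indicator) pair.
series = subset.melt(id_vars=["Country Name", "Indicator Code"],
                     value_vars=years, var_name="Year", value_name="Value")
series = series.dropna(subset=["Value"])   # many DataPoints are missing
print(series.groupby(["Country Name", "Indicator Code"]).size())
```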

In addition to the Trends Line Chart (please choose an Indicator in the Filter at the bottom of the Chart), I added (in my Tableau Visualization above) a Motion Chart for any chosen Indicator(s) and a Motion Map Chart for the GDP Indicator. A similar Visualization of this Dataset was done by Google here: http://goo.gl/g2z1b6 .

As you can see below with samples of just 6 indicators (out of 1000+ published by the World Bank), the behavior of the monitored objects (countries) is vastly different.

GDP trends: the clear Leader is the USA, with China the fastest-growing among economic leaders and Japan almost stagnant for the last 20 years (please note that I use the “GDP Colors of each country” for all other 1000+ indicators and Line Charts):

GDPTrends

Life Expectancy: Switzerland and Japan provide the longest life to their citizens, while Indian and Russian citizens are expected to live less than 70 years. Australia is probably improving life expectancy faster than the other countries in this subset.

LifExpectancy

Health Expenditures Per Capita: a group of 4 – Switzerland, Norway (fastest growing?), Luxembourg and the USA – spends about $9000 per person per year on health, while India, Indonesia and China spend less than $500:

HealthExpenditurePerCapita

Consumer Price Index: prices in Russia, India and Turkey are growing faster than elsewhere, while prices in Japan and Switzerland are almost unchanged over the last 20 years:

CPI

Mobile Phones Per 100 Persons: Russia has 182 mobile phones per 100 people (fastest growing in the last 10 years), while India has fewer than 70 cellular phones per 100 people.

CellPhonesPer100

Military Expenses as a Percentage of Budget (a lot of missing data when it comes to military expenses!): the USA, India and Russia spend more than the others – guess why that is:

MilitaryExpensesPercentageOfBudget

 

You can find many examples of Visual Monitoring of multiple objects over time. One example is https://www.tradingview.com/ , where over 7000 objects (publicly traded companies) are monitored across hundreds of indicators (like share prices, Market Capitalization, EBITDA, Income, Debt, Assets etc.). Here is an example I made for a previous blog post: https://www.tradingview.com/e/xRWRQS5A/

Data Visualization Readings, Q1 2014, selected from the Google+ extensions of this blog:
http://tinyurl.com/VisibleData and
http://tinyurl.com/VisualizationWithTableau

dvi032914

Data Visualization Index (using DATA+QLIK+TIBX+MSTR; click on the image above to enlarge):
From 11/1/13 until 3/15/14, DATA stock grew 50%, QLIK 11% and MSTR 6%, while TIBX lost 1%.
Current Market Capitalization: Tableau – $5.5B, QLIK – $2.6B, TIBCO – $3.5B, Microstrategy – $1.4B.
Number of Job Openings Today: Tableau – 231, QLIK – 135, Spotfire (estimate) – 30, Microstrategy – 214.
However, during the last 2 weeks of March 2014, DATA shares lost 24%, QLIK lost 14%, and TIBX and MSTR both lost about 10%.
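
For anyone who wants to rebuild such an index, the arithmetic is just a rebasing of each ticker to 100 at the start date. Here is a sketch with made-up closing prices (a real version would pull the series from a market-data feed):

```python
# A "DV Index" as rebased stock prices: each ticker starts at 100 on 11/1/13.
# The closing prices below are made-up samples, not real quotes.
closes = {                      # (close on 11/1/13, close on 3/15/14)
    "DATA": (70.0, 105.0),
    "QLIK": (27.0, 30.0),
    "TIBX": (24.0, 23.8),
    "MSTR": (130.0, 138.0),
}

for symbol, (first, last) in closes.items():
    print(f"{symbol}: {(last / first - 1.0) * 100:+.0f}%")

# The index itself: the average of the four rebased series on the end date.
index_value = sum(last / first * 100 for first, last in closes.values()) / len(closes)
print(f"DV Index (start = 100): {index_value:.1f}")
```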

Why use R? Five reasons.
http://www.econometricsbysimulation.com/2014/03/why-use-r-five-reasons.html

Studying Tableau Performance Characteristics on AWS EC2
http://tableaulove.tumblr.com/post/80571148718/studying-tableau-performance-characteristics-on-aws-ec2

Head-to-head comparison of Datawatch and Tableau
http://datawatch.com/datawatch-vs-tableau

Diving into TIBCO Spotfire Professional 6.0
http://www.jenunderwood.com/2014/03/25/diving-into-tibco-spotfire-professional-6-0/

TIBCO beats Q1 2014 estimates but Spotfire falters
http://diginomica.com/2014/03/20/tibco-beats-estimates-spotfire-falters/

Qlik Doesn’t Fear Tableau, Oracle In Data Analytics
http://news.investors.com/031314-693154-qlik-focuses-on-easy-to-use-data-analytics.htm?p=full

Best of the visualisation web… February 2014
http://www.visualisingdata.com/index.php/2014/04/best-of-the-visualisation-web-february-2014/

Datawatch: ‘Twenty Feet From Stardom’
http://seekingalpha.com/article/2101513-datawatch-twenty-feet-from-stardom

Tableau plans to raise $345M — more than its IPO — with new stock offering
http://venturebeat.com/2014/03/16/tableau-plans-to-raise-345m-more-than-its-ipo-with-new-stock-offering/

TIBCO Spotfire Expands Connectivity to Key Big Data Sources
http://www.marketwatch.com/story/tibco-expands-connectivity-to-key-big-data-sources-2014-03-11

Tableau and Splunk Announce Strategic Technology Alliance
http://www.splunk.com/view/SP-CAAAKH5?awesm=splk.it_hQ

The End of The Data Scientist!?
http://alpinenow.com/blog/the-end-of-the-data-scientist/

bigData

Data Science Is Dead
http://slashdot.org/topic/bi/data-science-is-dead/

Periodic Table of Elements in TIBCO Spotfire
http://insideinformatics.cambridgesoft.com/InteractiveDemos/LaunchDemo/?InteractiveDemoID=1

Best of the visualisation web… January 2014
http://www.visualisingdata.com/index.php/2014/03/best-of-the-visualisation-web-january-2014/

Workbook Tools for Tableau
http://powertoolsfortableau.com/tableau-workbooks/workbook-tools/

Tapestry Data Storytelling Conference
http://www.tapestryconference.com/attendees
http://www.visualisingdata.com/index.php/2014/03/a-short-reflection-about-tapestry-conference/
ReadingLogo

URL Parameters in Tableau
http://interworks.co.uk/business-intelligence/url-parameters-tableau/

Magic Quadrant 2014 for Business Intelligence and Analytics Platforms
http://www.gartner.com/technology/reprints.do?id=1-1QLGACN&ct=140210&st=sb

What’s Next in Big Data: Visualization That Works the Way the Eyes and Mind Work
http://insights.wired.com/profiles/blogs/what-s-next-in-big-data-visualization-that-works-the-way-the-eyes#axzz2wPWAYEuY

What animated movies can teach you about data analysis
http://www.cio.com.au/article/539220/whatanimatedmoviescanteachaboutdata_analysis/

Tableau for Mac is coming, finally
http://www.geekwire.com/2014/tableau-mac-coming-finally/

Authenticating an External Tableau Server using SAML & AD FS
http://www.theinformationlab.co.uk/2014/02/04/authenticating-external-tableau-server-using-internal-ad/

Visualize this: Tableau nearly doubled its revenue in 2013
http://gigaom.com/2014/02/04/visualize-this-tableau-nearly-doubled-its-revenue-in-2013/

Qlik Announces Fourth Quarter and Full Year 2013 Financial Results
http://investor.qlik.com/releasedetail.cfm?ReleaseID=827231

InTheMiddleOfWinter2

Tableau Mapping – Earthquakes, 300,000,000 marks using Tableau 8.1 64-bit
http://theywalkedtogether.blogspot.com/2014/01/tableaumapping-earthquakes-300000000.html

Data Science: What’s in a Name?
http://www.linkedin.com/today/post/article/20130215205002-50510-the-data-scientific-method

Gapminder World Offline
http://www.gapminder.org/world-offline/

Advanced Map Visualisation in Tableau using Alteryx
http://www.theinformationlab.co.uk/2014/01/15/DrawingArrowsinTableau

Motion Map Chart
https://apandre.wordpress.com/2014/01/12/motion-map-chart/

One of Bill Gates’s favorite graphs redesigned
http://www.perceptualedge.com/blog/?p=1829

Authentication and Authorization in Qlikview Server
http://community.qlik.com/blogs/qlikviewdesignblog/2014/01/07/authentication-and-authorization

SlopeGraph for QlikView (D3SlopeGraph QlikView Extension)
http://www.qlikblog.at/3093/slopegraph-for-qlikview-d3slopegraph-qlikview-extension/

Revenue Model Comparison: SaaS v. One-Time-Sales
http://www.wovenware.com/blog/2013/12/revenue-model-comparison-saas-v-one-time-sales#.UyimffmwIUo

Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It
http://www.wired.com/wiredscience/2013/10/topology-data-sets/all/

Posting data to the web services from QlikView
http://community.qlik.com/docs/DOC-5530

It’s your round at the bar
http://interworks.co.uk/tableau/radial-bar-chart/

Lexical Distance Among the Languages of Europe
http://elms.wordpress.com/2008/03/04/lexical-distance-among-languages-of-europe/

SnowInsteadOfRainJan2014-SNOW

For the last 6 years, each and every February my inbox has been bombarded by messages from colleagues, friends and visitors to this blog, containing references to, quotes from and PDFs of Gartner’s Magic Quadrant (MQ) for Business Intelligence (BI) and Analytics Platforms; the latest can be found here: http://www.gartner.com/technology/reprints.do?id=1-1QLGACN&ct=140210&st=sb .

Last year I was able to ignore this noise (funnily enough, I was busy migrating thousands of users from Business Objects and Microstrategy to Tableau-based Visual Reports for a very large company), but in February 2014 I got so many questions about it that I am basically forced to share my opinion.

  • First of all, as I have said on this blog many times, BI is dead and has been replaced by Data Visualization and Visual Analytics. That was finally acknowledged by Gartner itself, which placed Tableau, QLIK and Spotfire in the “Leaders” Quadrant of the MQ for the 2nd year in a row.

  • Secondly, the last 6 MQs (2009-2014) are suspicious to me because in all of them Gartner (with complete disregard for reality) placed all 6 “Misleading” vendors of wasteful BI platforms (IBM, SAP, Oracle, SAS, Microstrategy and Microsoft) in the Leaders Quadrant! Those 6 vendors convinced customers to buy (over the last 6 years) their BI software for over $60B, plus much more than that spent on maintenance, support, development, consulting, upgrades and other IT expenses.

There is nothing magic about these MQs: they are the results of Gartner’s 2-dimensional understanding of BI, Analytics and Data Visualization (DV) Platforms, features and usage. The 1st Measure (X axis), according to Gartner, is the “Completeness of Vision” and the 2nd Measure (Y axis) is the “Ability to Execute”, which allows Gartner to distribute DV and BI Vendors among 4 Quadrants: RightTop – “Leaders”, LeftTop – “Challengers”, RightBottom – “Visionaries” and LeftBottom – “Niche Players” (or you can say LeftOvers).
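
The mechanics of that placement fit in a few lines of code; here is a toy sketch with made-up scores, just to show how two measures produce four quadrants – these are not Gartner’s actual numbers or methodology:

```python
# Gartner's MQ placement reduced to its mechanics: two scores, four quadrants.
# Scores are assumed normalized to 0..1 with 0.5 as the quadrant boundary.
def quadrant(completeness_of_vision: float, ability_to_execute: float) -> str:
    right = completeness_of_vision >= 0.5   # X axis
    top = ability_to_execute >= 0.5         # Y axis
    if right and top:
        return "Leaders"
    if top:
        return "Challengers"
    if right:
        return "Visionaries"
    return "Niche Players"                  # or, as said above, LeftOvers

# Made-up scores, for illustration only.
for vendor, x, y in [("VendorA", 0.8, 0.8), ("VendorB", 0.7, 0.3),
                     ("VendorC", 0.3, 0.7), ("VendorD", 0.2, 0.2)]:
    print(f"{vendor}: {quadrant(x, y)}")
```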

mq2014

I decided to compare my opinions (expressed on this blog many times) vs. Gartner’s (they wrote 78 pages about it!) by taking the TOP 3 Leaders from Gartner, then the TOP 3 Visionaries (projecting all Vendors except the TOP 3 Leaders onto Axis X), then the TOP 3 Challengers (projecting all Vendors except the TOP 3 Leaders and TOP 3 Visionaries onto Axis Y), then the TOP 3 “Niche Players” from the rest of Gartner’s list, and making “similar” choices myself (my list is wider than Gartner’s, because Gartner missed DV Vendors important to me, like Visokio, while vendors like Datawatch and Advizor Solutions were not included in the MQ in order to please Gartner’s favorites). See the comparison of opinions below:

12DVendors

If you noticed, in order to be able to compare opinions, I had to use Gartner’s terms like Leader, Challenger etc., which is not exactly how I see it. Basically my opinion overlaps with Gartner’s in only 25% of cases in 2014, which is slightly higher than in previous years – I guess the success of Tableau and QLIK is the reason for that.

The BI Market in 2013 reached $14B, and at least $1B of it was spent on Data Visualization tools. Here is a short Summary of the state of each Vendor mentioned above in the “DV Blog” column:

  1. Tableau: $232M in Sales, $6B MarketCap, 82% YoY growth (fastest in the DV market), Leader in DV Mindshare; its declared goals are “Data to the People” and ease of use.

  2. QLIK: $470M in Sales, $2.5B MarketCap, Leader in DV Marketshare; attempts to improve BI, but will remove Qlikview Desktop from Qlik.Next.

  3. Spotfire: sales under $200M; has the most mature Platform for Visual Analytics and the best DV Cloud Services. Spotfire is limited by its corporate Parent (TIBCO).

  4. Visokio: a private DV Vendor with limited marketing and sales, but with some of the richest and most mature DV functionality.

  5. SAS: has the most advanced Analytics functionality (not easy to learn and use); targets Data Scientists and Power Users who can afford it instead of free R.

  6. Revolution Analytics: as the provider of a commercial version of and commercial support for R, it is a “cheap” alternative to SAS.

  7. Microsoft: has the most advanced BI and DV technological stack for software developers, but has no real DV Product and no plans to have one in the future.

  8. Datawatch: $33M in sales, $281M MarketCap; has mature DV, BI and real-time visualization functionality, plus an experienced management and sales force.

  9. Microstrategy: $576M in sales, $1.4B MarketCap; a BI veteran with complete BI functionality; recently realized that the BI Market is not growing and made a desperate attempt to get into the DV market.

  10. Panorama: a BI Veteran with an excellent, easy-to-use front-end to the Microsoft BI stack; has good DV functionality plus social and collaborative BI features.

  11. Advizor Solutions: a private DV Veteran with an almost complete set of DV features and the ability to do Predictive Analytics interactively, visually and without coding.

  12. RapidMiner: a commercial provider of an open-source-based and easy-to-use Advanced Analytics Platform, integrated with R.

A similar MQ for “Advanced Analytics Platforms” can be found here: http://www.gartner.com/technology/reprints.do?id=1-1QXWEQQ&ct=140219&st=sg – have fun:

mq2014aap

In addition to the differences mentioned in the table above, I have to say that I do not see Big Data as defined well enough to be mentioned 30 times in a review of “BI and Analytical Platforms”, and I do not see that the Vendors mentioned by Gartner are ready for it – but maybe that is a topic for a different blogpost…

Update: 

My Best Wishes for 2014 to all visitors of this Blog!

New2014

2013 was a very successful year for the Data Visualization (DV) community, for Data Visualization vendors and for this Data Visualization Blog (the number of visitors grew from an average of 16,000 to 25,000+ per month).

From a certain point of view, 2013 was the year of Tableau – it went public, it now has the largest Market Capitalization among DV Vendors (more than $4B as of today), its strategy (Data to the People!) became the most popular among DV users, and it had (again) the largest YoY revenue growth (almost 75%!) among DV Vendors. Tableau already employs more than 1100 people and still has 169+ job openings as of today. I wish Tableau to stay the Leader of our community and to keep its YoY growth above 50% – this will not be easy.

Qliktech is the largest DV Vendor; it will pass the half-billion-dollar benchmark in revenue in 2014 (probably closer to $600M by the end of 2014) and will employ almost 2000 people. Qlikview is one of the best DV products on the market. I wish that in 2014 Qliktech will create Cloud Services similar to Tableau Online and Tableau Public, and that Qlikview.Next will keep Qlikview Desktop Professional (in addition to the HTML5 client).

I wish TIBCO would stop trying to improve BI or make it better – you cannot reanimate a dead horse; instead I wish Spotfire would embrace the approach “Data to the People” and act accordingly. For Spotfire my biggest wish is that TIBCO will spin it off the same way EMC did with VMWare. And yes, I wish Spotfire Cloud Personal were free and able to read at least local flat files and local DBs like Access.

2014 (or maybe 2015?) may witness a new, 4th DV player coming to the competition: Datawatch recently bought Panopticon, and if it completes the integration of all products correctly and adds the features which the other DV vendors above already have (like Cloud Services), it can be a very competitive player. I wish them luck!

TibxDataQlikQwchFrom051713To122413

Microsoft released a lot of advanced and useful DV-related functionality in 2013, and I wish (I have been recycling this wish for many years now) that Microsoft would finally package most of its Data Visualization functionality into one DV product and add it to Office 20XX (like they did with Visio) and Office 365, instead of a bunch of plug-ins for Excel and SharePoint.

It is a mystery to me why Panorama, Visokio and Advizor Solutions are still relatively small players, despite all 3 of them having excellent DV features and products. Based on the 2013 IPO experience of Tableau, maybe the best way for them is to go public and get new blood? I wish them to learn from Tableau’s and Qlikview’s success and try this path in 2014-15…

For Microstrategy my wish is very simple – they are the only traditional BI player who realised that BI is dead; they started their transition into the DV market in 2013 (actually even before 2013), and I wish them all the success they can handle!

I also think that a few thousand Tableau, Qlikview and Spotfire customers (say 5% of the customer base) will need (in 2014 and beyond) deeper Analytics, and they will try to complement their Data Visualizations with Advanced Visualization technologies they can get from vendors like http://www.avs.com/

My best wishes to everyone! Happy New Year!

y16_84590563

Since we are approaching (in the USA, that is) Thanksgiving Day 2013 and shopping is not a sin for a few days, multiple blog visitors have asked me what hardware advice I can share for their Data Science and Visualization Lab(s). First of all, I wish you get a good Turkey for Thanksgiving (below is what I got last year):

Turkey2012

I cannot answer DV Lab questions individually – everybody has their own needs, specifics and budget – but I can share my shopping thoughts about the needs of a Data Visualization Lab (DV Lab). I think a DV Lab needs many different types of devices: smartphones, tablets, a projector (at least 1), maybe a couple of Large Touchscreen Monitors (or LED TVs connectable to PCs), multiple mobile workstations (depending on the size of the DV Lab team), at least one or two super-workstation(s)/server(s) residing within the DV Lab, etc.

Smartphones and Tablets

I use a Samsung Galaxy S4 as of now, but for DV Lab needs I would consider either the Sony Xperia Z Ultra or the Nokia 1520, with the hope that the Samsung Galaxy S5 will be released soon (and maybe it will be the most appropriate for the DV Lab):

sonyVSnokia

My preference for a Tablet would be the upcoming Google Nexus 10 (2013 or 2014 edition – it is not clear, because Google is very secretive about it) and in certain cases the Google Nexus 7 (2013 edition). Until the next-generation Nexus 10 is released, I guess the two leading choices are the ASUS Transformer Pad TF701T

t701

and the Samsung Galaxy Note 10.1 2014 edition (below is a relative comparison of the size of these 2 excellent tablets):

AsusVsNote10

Projectors, Monitors and maybe Cameras.

The next piece of hardware on my mind is a projector with support for full HD resolution and large screens. I think there are many good choices here, but my preference would be the BENQ W1080ST for $920 (please advise if you have a better projector in mind in the same price range):

benq_W1080ST

So far you cannot find many Touchscreen Monitors for a reasonable price, so maybe these two 27″ touchscreen monitors (DELL P2714T for $620 or Acer T272HL bmidz for $560) are good choices for now:

dell-p2714t-overview1

I also think that a good digital camera can help a Data Visualization Lab, and I am considering something like this (can be bought for $300) for myself: the Panasonic Lumix DMC FZ72, with 60X optical zoom and the ability to record Motion Pictures as HD Video at 1,920 x 1,080 pixels:

panasonic_lumix_dmc_fz72_08

Mobile and Stationary Workstations and Servers.

If you need to choose a CPU, I suggest starting with Intel’s Processor Feature Filter here: http://ark.intel.com/search/advanced . In terms of mobile workstations you can get a quad-core notebook (like the Dell Precision M4700 for $2400, or the Dell Precision M4800 or HP ZBook 15 for $3500) with 32 GB RAM and a decent configuration with multiple ports; see a sample here:

m4700

If you are OK with 16GB of RAM for your workstation, you may prefer the Dell M3800 with an excellent touchscreen monitor (3200×1800 resolution) and only 2 kg of weight. For a stationary workstation (or rather a server) good choices are the Dell Precision T7600 or T7610 or the HP Z820 workstation. Any of these workstations (it will cost you!) can support up to 256GB RAM, up to 16 or even 24 cores (in the case of the HP Z820), multiple high-capacity hard disks and SSDs, excellent Video Controllers and multiple monitors (4 or even 6!). Here is an example of the backplane of the HP Z820 workstation:

HP-z820

I wish the visitors of this blog Happy Holidays and good luck with their DV Lab shopping!

With the releases of Spotfire Silver (soon to be Spotfire Cloud) and Tableau Online, and the attempts of a few Qlikview Partners (but not Qliktech itself yet) to move to the Cloud and provide their Data Visualization Platforms and Software as a Service, the Attributes, Parameters and Concerns of such VaaS or DVaaS (Data Visualization as a Service) are important to understand. Below is an attempt to review those “Cloud” details, at least at a high level (with the natural limitations of space and time applied to such a review).

But before that, let’s underscore that Clouds are not in the skies but rather in huge weird buildings with special Physical and Infrastructure security, like this Data Center in Georgia:

GoogleDataCenterInGeorgiaWithCloudsAboveIt2

You can see some real old-fashioned clouds above the building, but they are not what we are talking about. Inside the Data Center you can see a lot of Racks, each with 20+ servers, which, together with all the secure network and application infrastructure, contain these modern “Clouds”:

GoogleDataCenterInGeorgiaInside2

The Attributes and Parameters of a mature SaaS (and VaaS as well) include:

  • A Multitenant and Scalable Architecture (this topic is too big and needs its own blogpost or article). You can review Tableau’s whitepaper about Tableau Server scalability here: http://www.tableausoftware.com/learn/whitepapers/tableau-server-scalability-explained
  • SLA – a service level agreement with up-time, performance, security-related and disaster recovery metrics, plus certifications like SSAE16.
  • UI and Management tools for User Privileges, Credentials and Policies.
  • System-wide Security: SLA-enforced and monitored Physical, Network, Application, OS and Data Security.
  • Protection and/or Encryption of all fields/columns, or at least sensitive ones (like SSN).
  • Application Performance: transaction processing speed, Network Latency, Transaction Volume, webpage delivery times, query response times.
  • 24/7 high availability: facilities with reliable and backup power and cooling, Certified Network Infrastructure, N+1 Redundancy, and 99.9% (or 99.99%, or whatever your SLA with clients promises) up-time – see the downtime sketch right after this list.
  • Detailed historical availability, performance and planned-maintenance data, with Monitoring and Operational Dashboards, Alerts and Root Cause Analysis.
  • A Disaster recovery plan with multiple backup copies of customers’ data in near real time at the disk level, and a multilevel backup strategy that includes disk-to-disk-to-tape data backup, where tape backups serve as a secondary level of backup, not as the primary disaster recovery data source.

  • Fail-over that cascades from server to server and from data center to data center in the event of a regional disaster, such as a hurricane or flood.
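
Since the up-time percentage in an SLA is what customers actually feel as downtime, here is a small sketch translating those 9s into allowed downtime:

```python
# Translate an SLA up-time percentage into allowed downtime.
def allowed_downtime(uptime_pct: float) -> tuple[float, float]:
    down = 1.0 - uptime_pct / 100.0
    minutes_per_month = down * 30 * 24 * 60   # per 30-day month
    hours_per_year = down * 365 * 24
    return minutes_per_month, hours_per_year

for sla in (99.9, 99.99):
    per_month, per_year = allowed_downtime(sla)
    print(f"{sla}% up-time -> {per_month:.0f} min/month, {per_year:.1f} h/year")
# 99.9%  -> 43 min/month, 8.8 h/year
# 99.99% -> 4 min/month, 0.9 h/year
```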

While Security, Privacy, Latency and Hidden Costs are usually the biggest concerns when considering SaaS/VaaS, other Cloud Concerns are surveyed and visualized below. A recent survey and diagram were published by Charlie Burns this month:

CloudConcerns2013

Another survey and diagram were published by Shane Schick in October 2011, and one by KPMG in February 2013. Here are the concerns captured by the KPMG survey:

CloudConcernsKPMG

As you saw above, a Rack in a Data Center can contain multiple Servers and other devices (like Routers and Switches), often redundant (at least 2, or sometimes N+1). Recently I designed a hosting VaaS Data Center for Data Visualization and Business Intelligence Cloud Services; here is a simplified version of it, just one Rack, as a Sample.

You can see the redundant network, redundant Firewalls, redundant Switches for the DMZ (the so-called “Demilitarized Zone”, where users from outside the firewall can access servers like WEB or FTP), redundant main Switches and redundant Load Balancers, redundant Tableau Servers, redundant Teradata Servers, redundant Hadoop Servers, redundant NAS servers etc. (not all devices are shown on the Diagram of this Rack):

RackDiagram

20 months ago I checked how many job openings the leading DV Vendors had. On 12/5/11 Tableau had 56, Qliktech had 46 and Spotfire had 21 openings. This morning I checked their career sites again and noticed that both Tableau and Qliktech have almost doubled their thirst for new talent, while Spotfire is basically staying at the same level of hiring needs:

  • Tableau has 102(!) openings, 43 of them engineering positions (I counted their R&D positions and openings in the Operations department too) – that is huge! Update: as of 9/18/13 Tableau has exactly 1000 employees; the 1000th employee can be found in this picture: Tableau1000Employees091813

  • Qliktech has 87 openings, 29 of them are engineering positions (I included R&D, IT, Tech Support and Consulting).

  • TIBCO/Spotfire has 24 openings, 16 of them are engineering positions (R&D, IT, Tech.Support).

BostonSkylineFromWindow

All 3 companies are Public now, so I decided to include their Market Capitalization as well. Since Spotfire is hidden inside its corporate parent TIBCO, I used my estimate that Spotfire’s Capitalization is about 20% of TIBCO’s (which is $3.81B as of 8/23/13, see https://www.google.com/finance?q=TIBX ). As a result I have these Market Capitalization numbers for 8/23/13 as the closing day:

Those 3 DV Vendors together have almost $8B in market capitalization as of the evening of 8/23/13!

Market Capitalization update as of 8/31/13: Tableau – $4.3B, Qliktech – $2.9B, Spotfire (as 20% of TIBCO) – $0.72B.

Market Capitalization update as of 9/4/13, 11pm: Tableau – $4.39B, Qliktech – $3B, Spotfire (as 20% of TIBCO) – $0.75B. Also as of today Qliktech employs 1500+ people (approx. $300K revenue per year per employee), Tableau about 1000 (approx. $200K revenue per year per employee) and Spotfire about 500+ (a very rough estimate; also approx. $350K revenue per year per employee).

I got many questions from this Data Visualization Blog’s visitors about the differences between compensation for full-time employees and contractors. It turned out that many visitors are actually contractors, hired because of their Tableau or Qlikview or Spotfire skills, and some visitors are considering converting to consulting, or vice versa: from consulting to full-time. I am not an expert in all these compensation and especially benefits-related questions, but I promised myself that this blog will be driven by visitors’ requests, so I googled a little about Contractor vs. Full-Time compensation, and below is a brief description of what I got.

The Federal Insurance Contributions Act mandates a Payroll Tax split between employer and employee (each pays 6.2% Social Security, capped at $7,049.40 for 2013, plus 1.45% Medicare on all income), with the total (2013) being 15.3% of gross compensation.

Historical_Payroll_Tax_Rates

In addition you have to take into account the employer’s contribution to the employee’s medical benefits (for a family it is about $1000 per month), Unemployment Taxes, the employer’s contribution to 401(k), STD and LTD (short- and long-term disability insurance), pension plans etc.

I also added into my estimate of the contractor rate some “protection” for at least a 1-month gap between contracts, and 1 month of salary as a bonus for full-time employees.

RR20120507-BCC-2

Basically the result of my minimal estimate is the following: as a contractor you need to get a rate at least 50% higher than the base hourly rate of a full-time employee. This base hourly rate of a full-time employee I calculate as the employee’s base salary divided by 1872 hours: 2080 hours (52 weeks × 40 hours) minus 208 hours (3 weeks of vacation + 5 sick days + 6 holidays – the minimum for reasonable PTO, Personal Time Off) = 1872 working hours per year.

I did not take into account any variations related to the use of W2 or 1099 forms or Corp-to-Corp arrangements, or many other fine details (like relocation requirements and the overhead associated with middlemen like headhunters and recruiters), or other differences between the compensation of a full-time employee and a consultant working on contract – this is just my rough estimate. Please consult with experts and do not ask me any questions related to MY estimate, which is this:

  • The Contractor Rate should be 150% of the base rate of a FullTimer.

RS-COLLEGE LOAN SCAMS low res

In general, using Contractors (especially for business analytics) instead of full-timers is basically the same mistake as outsourcing and off-shoring: companies doing that do not understand that their main assets are their full-time people. Contractors are usually not engaged, and they are not in the business of preserving the intellectual property of the company.

Capitalist
For reference, see the results of the Dr. Dobb’s 2013 Salary Survey for Software Developers, which are very comparable with the salaries of Qlikview, Tableau and Spotfire developers and consultants (except that, in my experience, salaries of Data Visualization Consultants are 10-15% higher than salaries of software developers):

Fig01SalaryByTitle_full

This means that for 2013 the average rate for Qlikview, Tableau and Spotfire developers and consultants should be around 160% of the base rate of an average FullTimer, which ESTIMATES the Effective Equivalent Pay to a Contractor for 1872 hours per year as $155,200 – and this is only for an average consultant... If you take less, then somebody tricked you; but if you read the above, you already know that.
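
The whole estimate fits in a few lines of code; here it is as a sketch. The $97,000 base salary is an assumed round number chosen so the output matches the figures above – it is not taken from the survey itself:

```python
# The contractor-rate arithmetic from this post, as code.
WEEKS, HOURS_PER_WEEK = 52, 40
PTO_DAYS = 15 + 5 + 6        # 3 weeks vacation + 5 sick days + 6 holidays

working_hours = WEEKS * HOURS_PER_WEEK - PTO_DAYS * 8   # 2080 - 208 = 1872

base_salary = 97_000         # assumed average full-timer salary (illustrative)
base_rate = base_salary / working_hours                 # ~$51.8/hour
contractor_rate = 1.6 * base_rate                       # the 160% rule above
equivalent_pay = contractor_rate * working_hours        # = 1.6 * base_salary

print(f"{working_hours} h/year; base ${base_rate:.2f}/h; "
      f"contractor ${contractor_rate:.2f}/h -> ${equivalent_pay:,.0f}/year")
# -> 1872 h/year; base $51.82/h; contractor $82.91/h -> $155,200/year
```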

On May 3rd of 2012 the Google+ extension http://tinyurl.com/VisibleData of this Data Visualization blog reached 500+ followers; on July 9 it got to 1000+ users, on October 11 it already had 2000+ users, and on 11/27/12 my G+ Data Visualization Page had 2190+ followers and was still growing every day (updated as of 12/01/12: 2500+ followers).

One of the reasons, of course, is simply the popularity of Data Visualization related topics; another reason is covered in an interesting article here:

http://www.computerworld.com/s/article/9232329/Why_I_blog_on_Google_And_how_ .

In any case, it helped me to create a reading list for myself and other people, based on the feedback I got. According to CircleCount, as of the 11/13/12 update, my Data Visualization Google+ Page ranks as the #178 most popular page in the USA. Thank you, G+! Updates:

5/25/13: the G+ extension of this blog now has 3873+ followers,
as of 7/15/13 it has 4277+ followers,
and as of 11/11/13 it has 5013+ followers:

DVFollowersOnGPlus111113

I also have a 2nd G+ extension of this blog, see it here: http://tinyurl.com/VisualizationWithTableau , with 375 followers as of 11/11/13:

DVWithTableauFollowersOnGPlus111113

 

The short version of this post: as far as Data Visualization is concerned, the new Power View from Microsoft is a marketing disaster, an architectural mistake and a generous gift from Microsoft to Tableau, Qlikview, Spotfire and dozens of other vendors.

For the long version – keep reading.

Assume for a minute (OK, just for a second) that the new Power View Data Visualization tool from Microsoft SQL Server 2012 is almost as good as Tableau Desktop 7. Now let’s compare the installation, configuration and hardware involved:

Tableau:

  1. Hardware: almost any modern Windows PC/notebook (at least dual-core, 4GB RAM).
  2. Installation: a) one 65MB setup file; b) minimal or no skills needed.
  3. Configuration: 5 minutes – follow the instructions on screen during installation.
  4. Price – $2K.

Power View:

  1. Hardware: you need at least 2 server-level PCs (each at least quad-core, 16GB RAM recommended). I would not recommend using 1 production server to host both SQL Server and SharePoint; if you are desperate, at least use VM(s).
  2. Installation: a) each Server needs Windows 2008 R2 SP1 – a 3GB DVD; b) the 1st Server needs SQL Server 2012 Enterprise or BI Edition – a 4GB DVD; c) the 2nd Server needs SharePoint 2010 Enterprise Edition – a 1GB DVD; d) a lot of skills and experience.
  3. Configuration: hours or days, plus a lot of reading, previous knowledge etc.
  4. Price: $20K, or if only for development about $5K (Visual Studio with an MSDN subscription), plus the cost of skilled labor.

As you can see, Power View simply cannot compete on the mass market with Tableau (and Qlikview and Spotfire), and the time for the assumption at the beginning of this post has expired. Instead, now is the time to remind you that Power View is 2 generations behind Tableau, Qlikview and Spotfire. And there is no Desktop version of Power View; it is only available as a web application through a web browser.

Power View is a Silverlight application packaged by Microsoft as a SQL Server 2012 Reporting Services Add-in for Microsoft SharePoint Server 2010 Enterprise Edition. Power View is an (ad-hoc) report designer providing users with an interactive data exploration, visualization and presentation web experience. Microsoft stopped developing Silverlight in favor of HTML5, but Silverlight survived (another mistake) within the SQL Server team.

The previous report designers (still available from Microsoft: BIDS, Report Builder 1.0, Report Builder 3.0, Visual Studio Report Designer) are capable of producing only static reports, but Power View enables users to visually interact with data and drill down in all charts and Dashboards, similar to Tableau and Qlikview.

Power View is a Data Visualization tool integrated with the Microsoft ecosystem. Here is a Demo of how the famous Hans Rosling Data Visualization can be reimplemented with Power View:

Compared with previous report builders from Microsoft, Power View offers many new features, like Multiple Views in a Single Report, Gallery preview of Chart Images, export to PowerPoint, sorting within Charts by Measures and Categories, multiple Measures in Charts, highlighting of selected data in reports and Charts, synchronization of Slicers (Cross-Filtering), Measure Filters, search in Filters (convenient for long lists of categories), dragging data fields onto the Canvas (to create a table) or onto Charts (to modify the visualization), converting measures to categories (“Do Not Summarize”), and many other features.

As with any 1st release from Microsoft, you can find some bugs in Power View. For example, KPIs are not supported in Power View in SQL Server 2012; see it here: http://cathydumas.com/2012/04/03/using-or-not-using-tabular-kpis/

Power View is not Microsoft’s 1st attempt to be a full player in the Data Visualization and BI Market. The previous attempts failed and can be counted as Strikes.

Strike 1: the ProClarity acquisition in 2006 failed – there have been no new releases since v. 6.3; remnants of ProClarity can be found embedded in SharePoint, but there is no Desktop Product anymore.

Strike 2: Performance Point Server was introduced in November 2007 and discontinued two years later. Remnants of Performance Point can be found embedded in SharePoint as Performance Point Services.

Both failed attempts were focused on the growing Data Visualization and BI space, specifically at fast-growing competitors such as Qliktech, Spotfire and Tableau. Their remnants in SharePoint are functionally far behind the Data Visualization leaders.

The path to Strike 3 started in 2010 with the release of PowerPivot (a very successful half-step, since it is just a backend for Visualization) and xVelocity (originally released under the name VertiPaq). Power View is a continuation of these efforts to add a front-end to the Microsoft BI stack. I do not expect that Power View will gain as much popularity as Qlikview and Tableau, and in my mind Microsoft will be the subject of a 3rd strike in the Data Visualization space.

One reason I described at the very beginning of this post; the 2nd reason is the absence of Power View on the desktop. It is a mystery to me why Microsoft did not implement Power View as a new part of Office (like Visio, which is a great success) – as a new desktop application, or as a new Excel Add-In (like PowerPivot), or as new functionality in PowerPivot or even in Excel itself, or as a new version of their Report Builder. None of these options would prevent a Web reincarnation of it, and such a reincarnation could be done as part of (native SSRS) Reporting Services – why involve SharePoint (which is – and I have said it many times on this blog – basically a virus)?

I wonder what Donald Farmer thinks about Power View after being part of the Qliktech team for a while. From my point of view, Power View is a generous gift and a true relief to Data Visualization Vendors, because they will not need to compete with Microsoft for a few more years, or maybe forever. Now the IPO of Qliktech makes even more sense to me, and the upcoming IPO of Tableau makes much more sense to me too.

Yes, Power View means new business for consulting companies and Microsoft partners (because many client companies and their IT departments cannot handle it properly), and Power View has good functionality, but it will be counted in history as Strike 3.

I recently started a new Data Visualization Google+ page as an extension of this blog, here:

https://plus.google.com/111053008130113715119/posts


The Internet has a lot of articles, pages, blogs, data, demos, vendors, sites, dashboards, charts, tools and other materials related to Data Visualization, and this Google+ page will try to point to the most relevant items and sometimes comment on the most interesting of them.


What was unexpected is the fast success of this Google+ page – in a very short time it got 200+ followers, and that number keeps growing!


On Friday July 8, 2011, the closing price of Qliktech’s shares (symbol QLIK) was $35.43. Yesterday, January 6, 2012, QLIK closed at $23.21. If you take yesterday’s price as 100%, then 6 months ago QLIK (blue line below) traded at about 153% of that – i.e. it lost roughly a third of its July value – while the Dow Jones (red line below) lost only 2-3%:

Since Qliktech’s Market Capitalization as of yesterday evening was about $1.94B, it means that Qliktech lost about 1 billion dollars in capitalization over the last 6 months! That is a sad observation to make, and it made me wonder why it happened.

I see nothing wrong with the Qlikview software; in fact everybody knows (and this blog is proof of it) that I like Qlikview very much.

So I tried to guess the reasons (for that loss) below, but these are just my guesses, and I will be glad if somebody proves me mistaken and explains to me the behavior of QLIK stock during the last 6 months…

2011 was supposed to be the year of Qliktech: it had a successful IPO in 2010, it doubled the size of its workforce (I estimate it had more than 1000 employees by the end of 2011), its sales grew almost 40% in 2011, it kept updating Qlikview, and it generated a lot of interest in its products and in the Data Visualization market. In fact, Qliktech dominated its market, with a marketshare of about 50% (of the Data Visualization market).

So I will list below my guesses about the factors which influenced QLIK stock; I do not think it was only one or 2 major factors, but rather a combination of them (I may guess wrong or miss some possible reasons – please correct me):

  1. The P/E (price-to-earnings) ratio for QLIK is 293 (and it was even higher), which may indicate that the stock is overvalued and investors’ expectations are too high.

  2. Company insiders (Directors and Officers) have lately been very active in selling their shares, which may have affected the price of QLIK shares.

  3. 56% of Qliktech’s sales come from Europe, and the European market is not growing lately.

  4. 58% of Qliktech’s sales come from existing customers, which can limit the speed of growth.

  5. Most new hires after the IPO were sales, pre-sales, marketing and other non-R&D types.

  6. Qliktech’s offices are too dispersed for its size (PA, MA, Sweden etc.), and what is especially unhealthy (in my view) is that R&D resides mostly in Europe while Headquarters, marketing and other major departments reside far from R&D – in the USA (mostly in Radnor, PA).

  7. 2011 turned out to be the year of Tableau (as opposed to my expectation that it would be the year of Qlikview), and Tableau is winning the battle for mindshare with its Tableau Public web service and its free desktop Tableau Reader, which allow distributing Data Visualizations without any Web/Application Servers or IT personnel involved. Tableau is growing much faster than Qliktech and generating huge momentum, especially in the USA, where Tableau’s R&D, QA, Sales, Marketing and Support all co-reside in Seattle, WA.

  8. Tableau has the best support for Data Sources; for example – important due to the soon-to-be-released SQL Server 2012 – Tableau has the unique ability to read Multidimensional OLAP Cubes from SQL Server Analysis Services and local Multidimensional Cubes from PowerPivot. Qlikview has so far ignored Multidimensional Cubes as data sources, and I think that is a mistake.

  9. Tableau Software, while 3 or 4 times smaller than Qliktech, managed to have more job openings than Qliktech, many of them in R&D, which is a key to future growth! Tableau’s sales in 2011 reached $72M, its workforce is 350+ now (160 of them hired in 2011!), and its number of customers is more than 7000 now…

  10. I am aware of more and more situations where Qlikview is starting to feel (and sometimes lose to) stiff competition; one of the latest cases is documented (free registration may be required) here: http://searchdatamanagement.techtarget.co.uk/news/2240112678/Irish-Life-chooses-Tableau-data-visualisation-over-QlikView-Oracle – and it happened in Europe, where Qlikview is supposed to be stronger than its competitors. My recent Data Visualization poll also has Tableau as the winner, with Qlikview only in 3rd place so far.

  11. In case you missed it, 2011 was successful for Spotfire too. In the Q4 2011 Earnings Call Transcript, TIBCO “saw demand simply explode across” some product areas. According to TIBCO, “Spotfire grew over 50% in license revenue for the year and has doubled in the past two years”. If that is true, it means Spotfire’s sales actually approached $100M in 2011.

  12. As Neil Charles noted, Qliktech does not have transparent pricing, and “Qlikview’s reps are a nightmare to talk to. They want meetings; they want to know all about your business; they promise free copies of the software. What they absolutely will not do is give you a figure for how much it’s going to cost to deploy the software onto x analysts’ desktops and allow them to publish to a server.” I tend to agree that Qliktech’s pricing policies are pushing many potential customers away from Qlikview toward Tableau, where almost all prices are known upfront.

I hope I will wake up some morning (or next week, or next month, or next quarter) and Qliktech will somehow have solved all these problems (maybe perceived as problems just by me) and QLIK shares will be priced higher ($40 or above?) than today – at least that is what I wish for my Qliktech friends in the new year 2012…

Update on the evening of 3/2/12: it looks like QLIK shares are reading my blog and trying to please me: during the last 2 months they regained almost $9 (more than 30%), ending the 3/2/12 session at $29.99 and regaining more than $550M in market capitalization (click on the chart to get a full-size image of it):

I guess if QLIK goes in the wrong direction again, I just have to blog about it, and it will correct itself!

I have said on this blog many times that 80% of Data Visualization (DV) is … Data.

SQL Server 2012 is here.

And the technology and process by which these Data are collected, extracted, transformed and loaded into the DV backend and frontend are key to DV success. It seems to me that one of the best possible technologies for building a DV backend is around the corner, as SQL Server 2012 will be released soon – the Release Candidate for it is out…

And the famous Microsoft marketing machine is not silent about it. The SQL Server 2012 Virtual Launch Event is planned for March 7, 2012, with the real release probably at the end of March 2012.

Columnstore Index.

I already mentioned on this blog the feature most interesting to me – the introduction of the Columnstore Index (CSI), which can transform SQL Server into a Columnar Database (for DV purposes) and accelerate DV-relevant Queries by 10X or even 100X. Oracle does not have it!


Some reasonable rules and features apply to the CSI: each table can have only one CSI; the CSI uses Row grouping (about a million rows, like paging for columns); a table with a CSI cannot be replicated. A new memory manager (unified for small and large memory allocations) is optimized for Columnstore Indexes and supports Windows 8 maximum memory and logical processors.
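
To make the CSI concrete, here is a minimal sketch of creating one from Python via pyodbc; the database, table and column names are hypothetical, and the connection string assumes a local SQL Server 2012 instance with Windows authentication:

```python
# Create a Columnstore Index on a hypothetical fact table (SQL Server 2012).
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server Native Client 11.0};"
    "SERVER=localhost;DATABASE=DVDemo;Trusted_Connection=yes;",  # assumed local instance
    autocommit=True,
)
# In SQL Server 2012 the CSI is nonclustered, one per table, and makes
# the table read-only until the index is dropped or disabled.
conn.execute("""
    CREATE NONCLUSTERED COLUMNSTORE INDEX csi_FactSales
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, SalesAmount);
""")
conn.close()
```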

Power View.

SSRS (Reporting Services) got massive improvements, including the new Power View as a Builder/Viewer of interactive Reports. I like this feature: “even if a table in the view is based on an underlying table that contains millions of rows, Power View only fetches data for the rows that are visible in the view at any one time”, plus the UI features (some of them standard for existing Data Visualization tools), like multiple views in Power View reports (see the gallery of thumbnails at the bottom of the screenshot below):


“2 clicks to results”, export to PowerPoint etc. See also the video here:


Power View is still far behind Tableau and Qlikview as a Visualizer, but at least it makes SSRS reports more interactive and their development easier. Below are some thumbnails of Data Visualization samples produced with Power View and presented by Microsoft:

Support for Big Data.

SQL Server 2012 has a lot of new features, like “deep” HADOOP support (including a Hive ODBC Driver) for “big data” projects, ODBC drivers for Linux, the grouping of databases into Availability Groups for simultaneous failover, and Contained Databases (which enable easy migration from one SQL Server instance to another) with contained Database users.

Parallel Data Warehouse, Azure, Data Explorer.

And don’t forget PDW (the SQL Server-based Parallel Data Warehouse; its massively parallel processing (MPP) provides scalability and query performance by running independent servers in parallel, with up to 480 cores) and the SQL Azure cloud services with their high-availability features…


The new Data Explorer allows you to discover data in the cloud, import it from standard and new data sources (like OData, Azure Marketplace, HTML etc.), and visualize and publish your Data to the cloud.

LocalDB.

LocalDB is a new free lightweight deployment option for SQL Server 2012 Express Edition, with fewer prerequisites, that installs quickly. It is an embedded SQL Server database for desktop applications (especially for DIY DV apps) or tools. LocalDB has all of the same programmability features as SQL Server 2012 Express, but runs in user mode with applications and not as a service. An application that uses LocalDB simply opens a file. Once a file is opened, you get SQL Server functionality when working with that file, including things like ACID transaction support. It is not intended for multi-user scenarios or to be used as a server. (If you need that, you should install SQL Server Express.)
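
Here is a minimal sketch of what “simply opening a file” looks like from Python via pyodbc; the .mdf path is hypothetical, and (localdb)\v11.0 is the default instance name of SQL Server 2012 LocalDB:

```python
# Open a LocalDB database file directly - no service, no server to manage.
import pyodbc

conn = pyodbc.connect(
    r"DRIVER={SQL Server Native Client 11.0};"
    r"SERVER=(localdb)\v11.0;"                # default SQL Server 2012 LocalDB instance
    r"AttachDbFileName=C:\Data\DVApp.mdf;"    # hypothetical database file
    r"Trusted_Connection=yes;"
)
row = conn.execute("SELECT @@VERSION;").fetchone()
print(row[0])   # a full SQL Server engine, running in user mode
conn.close()
```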

BIDS.

SQL Server 2012 restores a very desirable feature which has been missing from Visual Studio 2010 for 2+ years – something called BIDS (the BI Development Studio, which was available as part of Visual Studio 2008 and SQL Server 2008). For that, a developer needs VS2010 with SP1 installed, and then installs the “SQL Server Data Tools” (currently in the CTP4 state, but I guess it will be the real thing when SQL Server 2012 is released to production).

SSAS, Tabular Mode, PowerPivot, DAX.

The most important improvements for BI and Data Analytics will of course be the changes in SSAS (SQL Server Analysis Services), including the addition of the Tabular Mode, the restoration of BIDS (see above), the ability to design local multidimensional cubes with PowerPivot and Excel and then deploy them directly from Excel as SSAS Cubes, the new DAX language shared between PowerPivot and SSAS, and the availability of all those Excel Services directly from SSAS without any need for SharePoint. I think the DV tools which are able to connect to those SSAS and PowerPivot Cubes will have a huge advantage. So far only Tableau has it (and Omniscope has it partially).

Backend for Data Visualization.

All of these features make SQL Server 2012 a leading BI stack and backend for Data Visualization applications and tools. I just wish that Microsoft would develop its own DV front-end tool, similar to Tableau or Qlikview, and integrate it with Office 201X (like they did with Visio), but I guess the DV market (approaching $1B in 2012) is too small compared with the markets for Microsoft Office and SQL Server.

Pricing.

Now it is time for some “bad news”: the SQL Server 2012 CAL price will increase by about 27%. The new pricing you can see below, and I predict you will not like it:

Data, Story and Eye Candy.

Data Visualization has at least 3 parts: the largest is the Data, the most important is the Story behind those Data, and the View (or Visualization) is just the Eye Candy on top of it. However, only the View allows users to interact, explore, analyze and drill down into those Data and discover Actionable Info, which is why Data Visualization (DV) is such a Value for the business user in the Big (and even midsize) Data Universe.

Productivity Gain.

One rarely covered aspect of advanced DV usage is the huge productivity gain for application developers. I recently had an opportunity to estimate the time needed to develop an interactive DV reporting application in 2 different groups of DV & BI environments.

Samples of Traditional and Popular BI Platforms.

  1. Open Source toolsets like Jaspersoft 4/ Infobright 4/ MySQL (5.6.3)
  2. MS BI Stack (Visual Studio/C#/.NET/DevExpress/SQL Server 2012)
  3. Tried and True BI like Microstrategy (9.X without Visual Insight)

Samples of Advanced DV tools, ready to be used for prototyping

  1. Spotfire (4.0)
  2. Tableau (6.1 or 7.0)
  3. Qlikview (11.0)

The results proved the productivity gain I have observed for many years now: the first 3 BI environments needed a month or more to complete the application, while the last 3 DV toolsets required about a day to complete the entire application. The same observation was made by … Microstrategy when they added Visual Insight (in an attempt to compete with leaders like Qlikview, Tableau, Spotfire and Omniscope) to their portfolio (see below a slide from a Microstrategy presentation earlier this year; this slide does not count the time to prepare the data and assumes they are ready to upload):

I have used this productivity gain for many years, not only for DV production but for Requirements gathering, functional Specifications, and most importantly for quick Prototyping. Many years ago I used Visio for interactions with clients and for collecting business requirements; see the Visio-produced slide below as an approximate example:

DV is the best prototyping approach for traditional BI

This leads me to a surprising point: modern DV tools can save a lot of development time in a traditional BI environment as … a prototyping and requirements-gathering tool. My recent experience is that you can go to a development team which is completely committed, for historical or other reasons, to a traditional BI environment (Oracle OBIEE, IBM Cognos, SAP Business Objects, SAS, Microstrategy etc.), prototype for such a team dozens or hundreds of new (or modified) reports in a few days or weeks, and give them to the team to port to their traditional environment.

These DV-based prototypes behave completely differently from the previous generation of (mostly MS-Word and PowerPoint based) BRDs (Business Requirement Documents), Functional Specifications, Design Documents and Visio-based application Mockups and prototypes: they are living interactive applications with real-time data updates, functionality refreshed within a few hours (in most cases on the same day a new request or requirement is collected), and readiness to be deployed into production at any time!

However, my estimate is that 9 out of 10 such BI teams, even if impressed by the prototyping capabilities of DV tools (and some will use them for prototyping!), will stay with their environment for many years for political (can you say job security?) or other (strange to me) reasons, but 1 out of 10 teams will seriously consider switching to Qlikview/Tableau/Spotfire. I see this as a huge marketing opportunity for DV vendors, but I am not sure that they know how to handle such a situation…

Example: using Tableau for Storytelling:

Spreadsheets (VisiCalc, or the “Visible Calculator”, was released by Dan Bricklin and Bob Frankston in October 1979 – 32 years ago – originally for the Apple II computer) were among the very first Business Intelligence (BI) software (VisiCalc sold over 700,000 copies in six years).

It was released on October 19, 1979 – see Dan’s original diary about it (also see the notes of Peter Jennings here and here, and especially Bob Frankston’s detailed article here):

For historical purposes I have to mention that VisiCalc was actually not the first spreadsheet program invented (for example, I am aware of multi-user spreadsheet software written before VisiCalc in the USSR, in PL/1, for mainframes with IBM’s IMS Database as a backend), but it was the first commercial spreadsheet introduced to the American market, and it was a turning point for the PC industry.

The “Visible Calculator” went on sale in November of 1979 and was a big hit. It retailed for US$100 and sold so well that many dealers started bundling the Apple II with VisiCalc. The success of VisiCalc turned Apple into a successful company, selling tens of thousands of the pricey 32 KB Apple IIs to businesses that wanted them only for the spreadsheet (no matter how hard Bob Frankston tried, he could not fit VisiCalc into the 16 KB of RAM of the low-end Apple II; VisiCalc would only be available for the much more expensive 32 KB Apple II). The version of VisiCalc for the Atari even retailed for $200!

VisiCalc was published without any patent, and it is living proof that the patent system is currently useless for people and abused by large corporations for their own benefit; it is actually a brake on innovation and does not protect inventors. The absence of patent protection for VisiCalc enabled the spreadsheet revolution and its innovations (SuperCalc, Lotus 1-2-3, Quattro Pro, Excel, OpenOffice's Calc, Google Spreadsheets and many others) and tremendously accelerated the PC industry.

As Dan Bricklin himself said, “We all borrowed from each other”, and as George Bernard Shaw said: “If you have an apple and I have an apple and we exchange these apples, then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas.”

The application of spreadsheets in the BI field began with the integration of OLAP (On-Line Analytical Processing) and pivot tables. In 1991, Lotus (in addition to 1-2-3) released Improv with pivoting functionality (also see Quantrix, a rebirth [originally in 1994-95] of Improv), followed by Microsoft's release (in Excel 5) of the PivotTable (trademarked by Microsoft) in 1993. 500+ million people currently use Excel, and at least 5% of them use it for BI and Data Visualization purposes. PowerPivot added to Excel 2010 a speedy and powerful in-memory columnar database, which enables millions of end users to have self-service BI.

Essbase was the first scalable OLAP software to handle the large data sets that early spreadsheet software could not. This is where its name comes from: Extended Spread Sheet Database (Essbase is owned by Oracle now). Currently one of the best OLAP and BI products is SSAS (Analysis Services from Microsoft SQL Server 2008 R2 and the upcoming SQL Server 2012 with its new Tabular mode), and Excel 2010 with its PowerPivot, PivotTables and Pivot Charts is one of the most popular front ends for SSAS.

There is no doubt that Excel is the most commonly used software for “BI purposes”. While Excel is general business software, its flexibility and ease of use make it popular for data analysis, with millions of users worldwide. Excel has an install base of hundreds of millions of desktops: far more than any other “BI platform”. It has become a household name. With certain precautions it can be used for good, or at least prototype-grade, Data Visualization (most of the charts below were created with Excel):

From educational and home use to prototype (or approximate) Data Visualization and enterprise implementations, Excel has proven incredibly indispensable. Most people with commercial or corporate backgrounds have developed a proficient Excel skillset. This makes Excel the ultimate self-service BI platform, and spreadsheet technology a common ground for all viable Data Visualization technologies on the market.

7 months ago I published a poll on LinkedIn and got a lot of responses: 1340 votes (on average 1 vote per hour) and many comments. People have asked me many times to repeat this poll. I guess it is time to re-poll. Based on the feedback I got, I added 2 more choices: Omniscope and Visual Insight/Microstrategy (LinkedIn allows a maximum of 5 choices in its polls, which is clearly not enough for this poll). I also got some angry voters complaining that certain vendors are funding this poll. This is completely FALSE: I am unaffiliated with any of the vendors mentioned in this poll, and I work for a software company completely independent from those vendors; see the About page of this blog.


Microsoft finally released SQL Server 11 “Denali” as CTP3 (Community Technology Preview) for public … preview. Microsoft (these are the politest words I can type) stubbornly refuses to build its own Data Visualization product. I doubt the Crescent “experience” can be considered a product, especially because it is Silverlight-based, while the world has already moved to HTML5.

If you have 7 minutes, you can watch the Crescent demo from WPC11, which shows that, while trailing a few years behind the DV leaders and Google, Microsoft is giving its die-hard followers something to cheer about:

I have to admit that while there is nothing new (for a DV expert) in the video above, it is huge progress compared with the Excel-based Data Visualizations that Microsoft tried to promote as a replacement for ProClarity and PerformancePoint Server. Even Microsoft itself positions Crescent (which is 32-bit only!) as a replacement for the SSRS Report Builder, so the DV leaders can sleep well another night.

However, Microsoft's BI stack is number 4 or 5 on my list of DV leaders, and CTP3 is so rich with cool new functionality that it deserves to be covered on this blog.

Of course the major news is the availability of the Tabular Data Model, which means the VertiPaq in-memory columnar engine, similar to the PowerPivot engine but running on a server without any SharePoint (which is a slow virus, as far as I am concerned) and without the stupid SharePoint UI and limitations. I quote Microsoft: “In contrast with the previous release, where VertiPaq was only available in PowerPivot for SharePoint, you can now use VertiPaq on a standalone Analysis Services instance with no dependency on SharePoint.”!

SSAS (SQL Server Analysis Services) has new features (they may have existed before, but before CTP3 everyone who knew about them was under NDA) like memory paging (allows models to be larger than the physical memory of the server, which means unlimited scalability and big data support), row-level security (user identity used to hide/show visible data), KPIs and partitions; CTP3 removes the 4 GB maximum file size limit for string storage files and removes the limit of 2 billion rows per table (each column is still limited to a maximum of 2 billion distinct values, but in a columnar database that is a much more tolerable restriction!).

A new version of PowerPivot has been released with support for the Tabular Model, and I quote: “You can use this version of the add-in to author and publish PowerPivot workbooks from Excel 2010 to Microsoft SQL Server” – which means no SharePoint involvement, again! As Marco Russo put it: “Import your existing PowerPivot workbooks in a Tabular project (yes, you can!)”, and I agree 100% with Marco when he says, 4 times: learn DAX!

After 3 years of delays, Microsoft finally has BIDS for Visual Studio 2010, and that is huge too; I quote again: “The Tabular Model Designer … is now integrated with Microsoft SQL Server “Denali” (CTP 3) Business Intelligence Development Studio.” It means that BIDS is now not just available but is the main unified development interface for both Multidimensional and Tabular Data Models. Now we can forget about Visual Studio 2008 and finally use the more modern VS2010!

Another feature extremely important for Data Visualization is not in SSAS but in SQL Server itself: the columnstore index is finally released, and I quote one more time: “The … SQL Server (CTP 3) introduces a new data warehouse query acceleration feature based on a new type of index called the columnstore. This new index … improves DW query performance by hundreds to thousands of times in some cases, and can routinely give a tenfold speedup for a broad range of decision support queries… columnstore indexes limit or eliminate the need to rely on pre-built aggregates, including user-defined summary tables, and indexed (materialized) views. Furthermore, columnstore indexes can greatly improve ROLAP performance” (ROLAP can be used for real-time cubes and real-time Data Visualizations).
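To make this concrete, here is a minimal T-SQL sketch of creating such an index; the fact table and its columns are my own hypothetical illustration, not taken from Microsoft's documentation:

    -- Hypothetical star-schema fact table (names are illustrative only).
    CREATE TABLE dbo.FactOrders (
        DateKey     INT           NOT NULL,
        ProductKey  INT           NOT NULL,
        RegionKey   INT           NOT NULL,
        OrderAmount DECIMAL(18,2) NOT NULL
    );

    -- Denali's new index type: a nonclustered columnstore index covering
    -- the columns that decision-support queries actually scan.
    CREATE NONCLUSTERED COLUMNSTORE INDEX csi_FactOrders
        ON dbo.FactOrders (DateKey, ProductKey, RegionKey, OrderAmount);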

All this cool new SQL Server 11 stuff is coming soon to the Azure cloud, and this can be scary for any DV vendor unless it knows how to be friendly with Microsoft (Tableau does; Qliktech and Spotfire still ignore SSAS).

As we now know, the term BISM (Business Intelligence Semantic Model), newly coined by Microsoft, was a marketing attempt to have a “unified” umbrella

for 2 different data models and data engines: Multidimensional Cubes (invented by Mosha Pasumansky 15 years ago and the foundation for SSAS – SQL Server Analysis Services – and MDX) and the Tabular Model (used in PowerPivot and the VertiPaq in-memory columnar database, with the new DAX language, which is going to be very important for future Data Visualization projects).

The new CTP3-released BIDS 2010 (finally the almighty Visual Studio 2010 will have a “Business Intelligence Development Studio”, after 3+ years of unjustified delays!) will, UI-wise, be able to handle these 2 data models, but it gives me a clue as to why Mosha left Microsoft for Google. And the lack of a DV product is a clue for me as to why Donald Farmer (the face of Microsoft BI) left Microsoft for Qliktech.

Even more: if you need both data models to be present, you need to install 2 (TWO!) different instances of “Analysis Services”: one with the Multidimensional engine and one with the new Tabular (VertiPaq/PowerPivot) engine. That seems to me not ONE “BI” architecture but TWO “BI” architectures, glued together on the surface by the BIDS 2010 interface and on the back end by all kinds of data connectors. Basically, Microsoft is in a confused BI state now, because financially it can afford 2 BI architectures and NO Data Visualization product!

I cannot believe I am saying this, but I wish Bill Gates would come back from retirement (it would be good for Microsoft shares and for Microsoft's market capitalization too – just ask Apple's shareholders about Steve and they will say he is a god)!

Permalink: https://apandre.wordpress.com/2011/07/14/tabular-model/

Below is Part 3 of the guest post by my guest blogger Dr. Kadakal (CEO of Pagos, Inc.). This article is about how to build dashboards and Data Visualizations with Excel. The topic is large: the first portion of the article (published on this blog 3 weeks ago) contains the general introduction and Part 1, “Use of Excel as a BI Platform Today”. Part 2, “Dos and Don'ts of Building Dashboards in Excel”, was published 2 weeks ago, and Part 3, “Publishing Excel Dashboards to the Internet”, starts below; its full text is here.

As I have said many times, BI is just a marketing umbrella for multiple products and technologies, and Data Visualization has recently become one of the most important among them. Data Visualization (DV) is, so far, a very focused technology, and the article below shows how to publish Excel Data Visualizations and dashboards on the web. Several vendors provide tools to publish Excel-based dashboards on the web, including Microsoft, Google, Zoho, Pagos and 4+ others:

I leave it to the reader to decide whether other vendors can compete in the business of publishing Excel-based dashboards on the web, but the author of the article below provides 3 very good criteria for selecting the vendor, tool and technology for it (and when I applied them myself, I was left with only 2 choices – the same ones described in the article).

Author: Ugur Kadakal, Ph.D., CEO and founder of Pagos, Inc. 

Publishing of Excel Dashboards on the Internet

Introduction

In the previous article (see “Excel as BI Platform” here) I discussed Excel's use as a Business Intelligence platform and why it is exceedingly popular software among business users. In the 2nd article (“Dos & Don'ts of Building Successful Dashboards in Excel”) I talked about some of the principles to follow when building a dashboard or a report in Excel. Together they form a discussion of why Excel is the most powerful self-service BI platform.

However, one of the most important facets of any BI platform is web enablement and collaboration. It is important for business users to be able to create their own dashboards but it is equally important for them to be able to distribute those dashboards securely over the web. In this article, I will discuss two technologies that enable business users to publish and distribute their Excel based dashboards over the web.

Selection Criteria

The following criteria were selected in order to compare the products:

  1. Ability to convert a workbook with most Excel-supported features into a web-based application with little to no programming.
  2. Dashboard management, security and access control capabilities that can be handled by business users.
  3. On-premise, server-based deployment options.

Criterion #3 eliminates online spreadsheet products such as Google Docs or Zoho. As much as I support cloud-based technologies, in order for a BI product to be successful it should have on-premise deployment options. Without on-premise deployment you forgo the possibility of integration with other data sources within an organization.

There are other web-based Excel conversion products on the market, but none of them meet the criterion of supporting most Excel features relevant to BI; therefore, they were not included in this article about how to publish Excel dashboards on the web.

Below is Part 2 of the guest post by my guest blogger Dr. Kadakal (CEO of Pagos, Inc.). This article is about how to build dashboards and Data Visualizations with Excel. The topic is large: the first portion of the article (published on this blog last week) contains the general introduction and Part 1, “Use of Excel as a BI Platform Today”.

Part 2, “Dos and Don'ts of Building Dashboards in Excel”, is below, and Part 3, “Publishing Excel Dashboards to the Internet”, is coming soon. It is easy to fall into a trap with Excel, but if you avoid the risks described in the article below, Excel can become one of the most valuable BI and Data Visualization (DV) tools for a user. Dr. Kadakal said to me recently: “if the user doesn't know what he is doing he may end up spending lots of time maintaining the file or create unnecessary calculation errors”. So we (Dr. Kadakal and I) hope the article below can save time for visitors of this blog.

BI in my mind is a marketing umbrella for multiple products and technologies, including RDBMS, Data Collection, ETL, DW, Reporting, Multidimensional Cubes, OLAP, Columnar and in-Memory Databases, Predictive and Visual Analytics, Modeling and DV.

Data Visualization (aka DV), on the other hand, is a technology which enables people to explore, drill down into and visually analyze their data, and to visually search for data patterns like trends, clusters, outliers, etc. So BI is a super-abused marketing term, while DV is, so far, a focused technology, and the article below shows how to use Excel as a great dashboard builder and Data Visualization tool.

Dos & Don'ts of Building Successful Dashboards in Excel

Introduction (click to see the full article here)

In the previous week's post (see also the article “Excel as BI Platform” here) I discussed Excel's use as a Business Intelligence platform and why it is exceedingly popular software among business users. In this article I will talk about some of the principles to follow when building a dashboard or a report in Excel.

One of the greatest advantages of Excel is its flexibility: it puts few or no constraints on the user's ability to create their ideal dashboard environments. As a result, Excel is being used as a platform for solving practically any business challenge. You will find individuals using Excel to solve a number of business-specific challenges in practically any organization or industry. This makes Excel the ultimate business software.

On the other hand, this same flexibility can lead to errors and long-term maintenance issues if not handled properly. There are no constraints on data separation, business logic or the creation of a user interface. Inexperienced users tend to build their Excel files by mixing them up. When these facets of a spreadsheet are not properly separated, it becomes much harder to maintain those workbooks, and they become prone to errors.

In this article, I will discuss how you can build successful dashboards and reports by separating data, calculations and the user interface. The rest of this post can be found in the article “Dos and Don'ts of Building Dashboards in Excel” here.

It discusses how to prepare data (both static and external) for dashboards; how to build formulas and calculation models, UI and input controls for dashboards; and of course pivots, charts, sparklines and conditional formatting for innovative and powerful Data Visualizations in Excel.

This is Part 1 of a surprise guest post. My guest is Ugur Kadakal, Ph.D., the CEO and founder of Pagos, Inc., which he started almost 10 years ago.

Dr. Kadakal is an expert in Excel, Business Intelligence, Data Analytics and Data Visualization. His comprehensive knowledge of Excel, along with his ambitious inventions and ideas, supplies the foundation for all Pagos products, which include SpreadsheetWEB (which converts Excel spreadsheets into web applications), SpreadsheetLIVE (a fully featured, browser-based spreadsheet application environment) and the Pagos Spreadsheet Component (which integrates Excel spreadsheets into enterprise web applications).

Pagos started and hosts the largest free collection and repository of professional Excel spreadsheet templates on the web: http://spreadsheetzone.com . The 3 Excel-based dashboards below, built by Dr. Kadakal, can be found in this very popular repository:

Dashboard 1 : Human Resources Dashboard: http://spreadsheetzone.com/templateview.aspx?i=498

Dashboard 2 : Business Activity Dashboard in EuroZone: http://spreadsheetzone.com/templateview.aspx?i=490

Dashboard 3 : Energy Dashboard for Euro Zone: http://spreadsheetzone.com/templateview.aspx?i=491

The topic is large, so this guest article is split into 3 blog posts. The first portion of the article contains the introduction and Part 1, “Use of Excel as a BI Platform Today”; I then expect Dr. Kadakal to do at least 2 more posts: Part 2, “Dos and Don'ts of Building Dashboards in Excel”, and Part 3, “Moving Excel Dashboards to the Web”.

Excel as a Business Intelligence Platform – Part 1

Introduction

Electronic spreadsheets were among the very first Business Intelligence (BI) software. While the availability of spreadsheet software and its use as a tool for data analysis date back to the 1960s, its application in the BI field began with the integration of OLAP and pivot tables. In 1991, Lotus released Improv, followed by Microsoft's release of the PivotTable in 1993. However, Essbase was the first scalable OLAP software to handle the large data sets that the early spreadsheet software was incapable of. This is where its name comes from: Extended Spread Sheet Database.

There is no doubt that Microsoft Excel is the most commonly used software for BI purposes. While Excel is general business software, its flexibility and ease of use make it popular for data analysis, with millions of users worldwide. Excel has an install base of hundreds of millions of desktops: far more than any other BI platform. It has become a household name. From educational utilization to domestic applications and enterprise implementation, Excel has proven incredibly indispensable. Most people with commercial or corporate backgrounds have developed a proficient Excel skillset. This makes Excel the ultimate self-service BI platform. However, like all systems, Excel has some weaknesses that make it difficult to use as a BI tool under certain conditions.

Use of Excel as a BI Platform Today

Small Businesses

Traditionally, small businesses are not considered an important market segment by most BI vendors. Their data analysis and reporting needs are limited, primarily due to their smaller commercial volumes. However, this is changing quickly as smaller organizations begin to collect large amounts of data, thanks to the Internet and social media, and require tools to manage that data. What is not changing is the limited financial resources available to them. Small businesses cannot afford to spend large amounts of money on BI software or on consultants to aid them in the creation of applications. That is why Excel is the ideal platform for them and will most probably remain so for the foreseeable future. The reasons are clear: (1) most of them already have Excel licenses, (2) most of their users know how to use Excel and (3) their needs are simpler and can be met with Excel.

Mid-Range Businesses

Mid-range businesses are a quickly growing market segment for BI vendors. Traditionally, Excel as a BI platform has been more popular among these businesses. Cost and availability are the primary factors in this. However, two factors have been steering them toward searching for alternatives: (1) Excel can no longer handle their growing data volumes and (2) other BI vendors have started offering cost-effective alternatives.

As a result, Excel's market share in this field is in decline, although it still remains the most popular. On the other hand, with the release of Office 2010 and its extended capabilities for handling very large data sets, Excel stands a good chance of reversing this decline.

Large Enterprises

The situation with large enterprises is rather complex. Most of them already have a large-scale BI implementation in place. Those implementations often connect various databases and data warehouses within the organizations. They have made significant investments, and continue doing so, to expand and maintain their BI systems. They already have a number of dashboards and reports designed to serve their business units. However, business users always need new and different dashboards and reporting tools. The only software that gives them the ultimate flexibility in creating their own reports is Excel. As a result, even in large enterprises, the use of Excel for BI purposes is common. Business users often go to their data warehouses or BI tools, get a data extract and bring it into Excel. They can then prepare their analysis and build their reports in Excel.

Enterprises will continue using their existing platforms because they have made huge investments building those systems. However, Excel use by business users as their secondary BI and reporting tool will continue to rise unless the alternative vendors significantly improve their self-service capabilities.

Summary

Excel is one of the ultimate business platforms and offers unparalleled features and capabilities to non-programmers. This makes it an ideal self-service BI platform. In this article, we examined the use of Excel as a BI platform in companies of different sizes. In the next article of this series, we will discuss how to use Excel more efficiently as a BI platform, from handling data to calculations and visual interactions.

TIBCO released Spotfire 3.3, and the first thing (see what is new here) that jumped out at me was how mature this product is. For example, among the new features is improved scalability – each additional simultaneous user of a web analysis initially claims very little additional system memory:

Many Spotfire customers will be able to support a greater number of web users on their existing hardware by upgrading to 3.3. Spotfire Web Player 3.3 includes significant improvements in memory consumption (as shown above for certain scenarios). The goal is to minimize the amount of system memory needed to support larger numbers of simultaneous users on the same analysis file. The main use case here: the larger the file and the greater the number of simultaneous web users on that file, the less initial system memory is required to support each additional user; it is greatly reduced compared to version 3.2.1 and earlier.

A comparison with the competition and thorough testing of the new Spotfire scalability still have to be done (similar to what Qliktech did with Qlikview here), but my initial reaction is as I said in the title: we are witnessing very mature software. Apparently the Defense Intelligence Agency (DIA) agrees with me: the Defense Intelligence Agency selected TIBCO Spotfire Analytics solutions for the Department of Defense Intelligence Information System community. “With more than 16,500 military and civilian employees worldwide, DIA is a major producer and manager of foreign military intelligence.”

Spotfire 3.3 also includes collaborative bookmarking, which enables all Spotfire users to capture a dashboard – its complete configuration, including markings, drop-down selections and filter settings – and share that visualization immediately with other users of the same dashboard, regardless of the client in use. Spotfire is actually not just a piece of Data Visualization software but a real analytical platform with a large portfolio of products, including the completely integrated S-PLUS (a commercial version of the R library, which has millions of users), the best web client (you can go zero-footprint with Spotfire Web Player and/or the partially free Spotfire Silver), a free iPad client version 1.1.1 (requires iTunes, so be prepared for Apple intrusion), a very rich API, an SDK, integration with Visual Studio, support for IronPython and JavaScript, well-thought-out web architecture, a set of extension points, etc.

System requirements for Spotfire 3.3 can be found here. Coinciding with the 3.3 release, the Spotfire VAR program was expanded too. Spotfire has a very rich set of training options; see them here. You can also find a set of good Spotfire videos in Colin White's screencast library, especially the 2011 webcasts.

My only – but large – concern with Spotfire is its focus, since it is part of the large corporation TIBCO, which has 50+ products and 50+ reasons to focus on something else. Indirectly this can be confirmed by sales: my estimate is that Tableau is growing much faster than Spotfire (sales-wise), and Qlikview sales are probably 3 times larger (dollar-wise) than Spotfire sales. Since TIBCO bought Spotfire in 2007, I expected Spotfire to be integrated with other great TIBCO products, but after 4 years it is still not the case… And TIBCO has no reason to change its corporate policies, since its business is good and the stock is doing well:

(at least a 500% increase in share price since the end of 2008!). Also see the article written by Ted Stamas for SeekingAlpha and a comparison of TIBX vs. an ETF here:

I think it is interesting to note that TIBCO recently rejected a buyout offer from HP!

For many years, Gartner has annoyed me every January by publishing its so-called “Magic Quadrant for Business Intelligence Platforms” (MQ4BI for short), and most vendors mentioned in it (this is funny; even Donald Farmer quotes MQ4BI) almost immediately republish it, either in the so-called reprint area of the Gartner website (e.g. here – for a few months) or on their own websites; some of them also make this “report” available to web visitors in exchange for contact info – for free. To channel my feelings toward Gartner into something constructive, I decided to produce my own “quadrant” for Data Visualization platforms (DV “Quadrant”, or Q4DV for short) – it is below, is a work in progress, and will be modified and republished over time:

The 3 DV leaders (green dots in the upper right corner of Q4DV above) have been compared with each other and with the Microsoft BI stack on this blog, as well as voted on in my DV poll on LinkedIn. The MQ4BI report actually contains a lot of useful info, and it deserves to be used as one of the possible data sources for my new post, which has a more specific target: Data Visualization platforms. As I said above, I will call it a quadrant too: Q4DV. But before I do that, I have to comment on Gartner's annual MQ4BI.

The MQ4BI customer survey included vendor-provided references as well as survey responses from BI users on Gartner's BI summit and inquiry lists. There were 1,225 survey responses (funnily enough, almost the same number of responses as in my DV poll on LinkedIn), with 247 (20%) from non-vendor-supplied reference lists. Gartner promised to publish the Magic Quadrant customer survey's results in 1Q11. Gartner has somewhat reasonable “inclusion and exclusion criteria” (for the Data Visualization Q4DV I excluded some vendors from Gartner's list and included a few too) and an almost tolerable but fuzzy BI market definition (based on 13 loosely pre-defined capabilities organized into 3 categories of functionality: integration, information delivery and analysis).

I also partially agree with the definition and usage of “Ability to Execute” as one (the Y axis) of the 2 dimensions of the bubble chart above (named the same way as the entire report, “Magic Quadrant for Business Intelligence Platforms”). However, I disagree with Gartner's ordering of vendors by their ability to execute, and for DV purposes I had to completely change the order of DV vendors on the X axis (“Completeness of Vision”).

For Q4DV purposes I am reusing Gartner's MQ as a template. I also excluded almost all vendors classified by Gartner as niche players with lower ability to execute (the bottom-left quarter of MQ4BI), except Panorama Software (Gartner put Panorama in last place, which is unfair), and will add the following vendors: Panopticon, Visokio, Pagos and maybe some others after further testing.

I am going to update this DV “quadrant” using the method suggested by Jon Peltier: http://peltiertech.com/WordPress/excel-chart-with-colored-quadrant-background/ – thank you, Jon! I hope I will have time for it before the end of 2011…

Permalink: https://apandre.wordpress.com/2011/02/13/q4dv/

On New Year's Eve I started the poll “What tool is better for Data Visualization?” on LinkedIn, and 1340 people voted (an unusually high return for LinkedIn polls, most of which get fewer than 1000 votes) – on average one vote per hour over 8 weeks, which is statistically significant and reflects the fact that the Data Visualization market has 3 clear leaders (probably at least a generation ahead of all other competitors): Spotfire, Tableau and Qlikview. Spotfire is the top vote-getter as of 2/27/11, 1pm EST: Spotfire got 450 votes (34%), Tableau 308 (23%), Qlikview 305 (23%; Qlikview's result improved during the last 3 weeks of the poll), PowerPivot 146 (11%, more votes than all “other” DV tools) and all other DV tools together got just 131 votes (10%). The poll got 88 comments (more than 6% of voters commented!), will be open for more unique voters until 2/27/11, 7pm, and its results have been consistent during the last 5 weeks, so statistically it represents the user preferences of the LinkedIn population:

The URL is http://linkd.in/f5SRw9, but you need to log in to LinkedIn.com to vote. Also see some demographic info about poll voters (in a somewhat ugly visualization by … LinkedIn) below:

It is interesting that Tableau voters are younger than those for other DV tools, and more than 82% of voters in the poll are men. A summary of some comments:

  • the poll's question is too generic, because the answer partially depends on what you are trying to visualize;
  • the poll is limited by LinkedIn restrictions, which allow no more than 5 possible answers to a poll's question;
  • the poll's results may correlate with the number of Qlikview/Tableau/Spotfire groups (and the size of their membership) on LinkedIn, and also with the ability of employees of the respective vendors to vote in favor of the tool produced by their company (I don't see that this happened). LinkedIn has 85 groups related to Qlikview (with almost 5000 members), 34 groups related to Tableau (with 2000+ members total) and 7 groups related to Spotfire (with about 400 members total).
  • Randall Hand posted interesting comments about my poll here: http://www.vizworld.com/2011/01/tool-data-visualization/#more-19190 . I disagreed with some of Randall's assessments: that “Gartner is probably right” (in my opinion Gartner is usually wrong when it talks about BI; I have posted about this on this blog, and Randall agreed with me) and that “IBM & Microsoft rule … markets”. In fact IBM is very far behind (Qlikview, Spotfire and Tableau), and Microsoft, while it has excellent technologies (like PowerPivot and SSAS), is behind too, because Microsoft made a strategic mistake and does not have a visualization product, only the technologies for one.
  • Spotfire fans got some “advice” from Facebook: http://www.facebook.com/TIBCOSpotfire (the post said “TIBCO Spotfire LinkedIn users: Spotfire needs your votes! Weigh in on this poll and make us the Data Visualization tool of choice…” – there is nothing I can do to prevent people from doing that, sorry). I think the poll is statistically significant anyway, and voters from Facebook may have added just a couple dozen votes for … their favorite tool.
  • Among other Data Visualization tools mentioned in the 88 comments so far were JMP, R, Panopticon, Omniscope (from Visokio), BO/SAP Explorer and Xcelsius, IBM Cognos, SpreadsheetWEB, IBM's Elixir Enterprise Edition, iCharts, UC4 Insight, Birst, Digdash, Constellation Roamer, BIme, Bissantz DeltaMaster, RA.Pid, Corda Technologies, Advizor, LogiXML, TeleView, etc.

Permalink: https://apandre.wordpress.com/dvpoll/

“Big Data Analytics” (BDA) is going to be a new buzzword for 2011. The same companies and some new ones (and in some cases even the same people) who tried for 20+ years to use the term BI to sell their underused software are now trying to use the new term BDA in the hope of increasing their sales and relevancy. Suddenly, one of the main reasons why BI tools are underused is the rapidly growing size of data.

Now a new generation of existing tools (Teradata, Exadata, Netezza, Greenplum, PDW, etc.) and of course “new” tools (can you say VoltDB, Aster Data (Teradata now!), the nPario “platform”, Hadoop, MapReduce, Cassandra, R, HANA, Paradigm4, MPP appliances, etc., which are all cool and hot at the same time) and companies will enable users to collect, store, access and manipulate much larger datasets (petabytes).

For users, the level of noise will now be much bigger than before (and the SNR – signal-to-noise ratio – will be lower), because BDA solves a HUGE back-end problem (massive amounts of data are everywhere, from genomes to RFID to application and network logfiles to health data, etc.), while users interact with the front end and care about trends, outliers, clusters, patterns, drill-downs and other visually intensive data phenomena. However, the SNR can be increased if BDA technologies are used together with, and as supporting tools for, the signal-producing tools, which are … Data Visualization tools.

An example of this is the recent partnership between Tableau Software and Aster Data (Teradata bought Aster Data in March 2011!). I know for sure that EMC is trying to partner Greenplum with the most viable Data Visualizers, Microsoft will integrate its PDW with PowerPivot and Excel, and I can imagine how to integrate Spotfire with BDA. Integration of Qlikview with BDA may be more difficult, since Qlikview currently can manipulate only data in its own memory. In any case, I see DV tools as the main attraction and selling point for end users, and I hope BDA vendors can and will understand this simple truth and behave accordingly.

Permalink: https://apandre.wordpress.com/2011/01/16/bigdata/

I have never before seen one man move from one company to another and 46+ people almost immediately comment on it. But this is what happened during the last few days, when Donald Farmer, the Principal Program Manager for the Microsoft BI platform for 10 years, left Microsoft for Qliktech. Less than a year ago, Donald compared Qlikview and PowerPivot, and while he was respectful toward Qlikview, his comparison favored PowerPivot and the Microsoft BI stack. I can think of or guess at multiple reasons why he did it (and I quote him: “I look forward to telling you more about this role and what promises to be a thrilling new direction for me with the most exciting company I have seen in years”), for example:

  • Microsoft does not have a DV product (and one can guess that Donald wants to be the “face” of a product),
  • Qliktech had a successful IPO and secondary offering (money talks, especially when a 700-strong company has a $2B market capitalization and growing),
  • lack of confidence in Microsoft's BI vision (one can guess that Donald has a different “vision”),
  • SharePoint is a virus (SharePoint created a billion-dollar industry, which one can consider wasted),
  • Qlikview makes a DV developer much more productive (a cool 30 to 50 times more productive) than Microsoft's toolset (Microsoft did not even migrate BIDS 2008 to Visual Studio 2010!),
  • and many others (Donald said that for him it is mostly user empowerment and user inspiration by Qlikview – it sounds like he was under-inspired by the Microsoft BI stack, so is it just a move from Microsoft rather than a move to Qliktech? – I guess I need a better explanation),

but Donald did explain it in his next blog post: “QlikView stands out for me, because it not only enables and empowers users; QlikView users are also inspired. This is, in a way, beyond our control. BI vendors and analysts cannot prescribe inspiration”. I have to be honest – and I repeat it again – I wish for a better explanation… For example, one of my friends made a “ridiculous guess” that Microsoft sent Donald inside Qliktech to figure out whether it makes sense to buy Qliktech, and when (I think it is too late for that, but at least it is an interesting thought: a good/evil buyer/VC/investor will do due diligence first, preferably internal and “technical due diligence” too), and who should stay and who should go.

I actually know of other people who recently moved to Qliktech (e.g. from Spotfire), but I have a question for Donald about his new title, “QlikView Product Advocate”. According to http://dictionary.reference.com/ an advocate is a person who defends, supports and promotes a cause. I will argue that Qlikview does not need any of that (there is certainly no need to defend it, and Qlikview has plenty of supporters and promoters); instead Qlikview needs a strong strategist and visionary

(and Donald is the best at it) who can lead and convince Qliktech to add new functionality in order to stay ahead of the competition, which includes at least Tableau, Spotfire and Microsoft. One of many examples would be the ability to read … Microsoft's SSAS multidimensional cubes, as Tableau 6.0 and Omniscope 2.6 can now.

Almost unrelated – I updated this page:  https://apandre.wordpress.com/market/competitors/qliktech/

Permalink: https://apandre.wordpress.com/2011/01/09/farmer_goes_2_qlikview/

Happy holidays to the visitors of this blog, and my best wishes for 2011! December 2010 was so busy for me that I did not have time to blog about anything. I will just mention some news in this last post of 2010.

Tableau's sales will exceed $40M in 2010 (and they are planning to employ 300+ by the end of 2011!), which is almost 20% of Qliktech's 2010 sales. My guesstimate (if anybody has better data, please comment) is that Spotfire's 2010 sales are about $80M. Qliktech's market capitalization recently exceeded $2B, more than twice Microstrategy's cap ($930M as of today)!

I recently noticed that Gartner is trying to coin a new catchphrase because the old one (BI, which never worked because intelligence is an attribute of humans, not of businesses) does not work. Now they are saying that for the last 20+ years, when they talked about business intelligence (BI), they meant an intelligent business. I think this is confusing, because (at least in the USA) business is all about profit, and Chief Business Intelligent Dr. Karl Marx would agree with that. I respect the phrase “profitable business”, but “intelligent business” reminds me of the old phrase “crocodile tears”. Gartner is also saying that BI projects should be treated as a “cultural transformation”, which reminds me of a road paved with good intentions.

I also noticed the huge attention paid by Forrester to advanced Data Visualization, probably for 4 good reasons (I have different reasoning, but I am not part of Forrester):

  • data visualization can fit many more data points (tens of thousands) into one screen or page compared with numerical information in a data grid (hundreds of data points per screen);
  • the ability to visually drill down and zoom through interactive and synchronized charts;
  • the ability to convey the story behind the data to a wider audience through data visualization;
  • analysts and decision makers cannot see patterns (and in many cases also trends and outliers) in data without data visualization – as in the 37+ year old example known as Anscombe's quartet, which comprises four datasets that have identical simple statistical properties yet appear very different when visualized. They were constructed by F. J. Anscombe to demonstrate the importance of Data Visualization (DV):
Anscombe's quartet (x and y values for datasets I-IV):

         I             II            III            IV
     x      y      x      y      x      y      x      y
   10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
    8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
   13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
    9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
   11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
   14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
    6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
    4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
   12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.56
    7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
    5.0   5.68    5.0   4.74    5.0   5.73    8.0   6.89
In the 2nd half of 2010 all 3 DV leaders released new versions of their beautiful software: Qlikview, Spotfire and Tableau. Visokio's Omniscope 2.6 will be available soon – I have been waiting for it since June 2010… In 2010 Microsoft, IBM, SAP, SAS, Oracle, Microstrategy, etc. all tried hard to catch up with the DV leaders, and I wish them all the best of luck in 2011. Here is a list of some other things I still remember from 2010:

  • Microsoft officially declared that it prefers BISM over OLAP and will invest in their futures accordingly. I am very disappointed with Microsoft, because it did not include BIDS (Business Intelligence Development Studio) in Visual Studio 2010. Even with the release of the supercool and free PowerPivot, it is now likely that Microsoft will not be a leader in DV (Data Visualization), given that it discontinued ProClarity and PerformancePoint, and considering the ugliness of SharePoint. Project Crescent (a new visualization “experience” from Microsoft) was announced 6 weeks ago, but there are still not many details about it, except that it is mostly done with Silverlight 5 and a Community Technology Preview will be available in the 1st half of 2011.
  • SAP bought Sybase and released the new version 4.0 of Business Objects and the HANA “analytic appliance”.
  • IBM bought Netezza and released Cognos 10.
  • Oracle released OBIEE 11g with ROLAP and MOLAP unified.
  • Microstrategy released its version 9 Release 3, with much faster performance, integration with ESRI and support for web-serviced data.
  • EMC bought Greenplum and started a new DCD (Data Computing Division), which is an obvious attempt to join the BI and DV market.
  • Panorama released NovaView for PowerPivot, which connects natively to PowerPivot in-memory models.
  • Actuate's BIRT was downloaded 10 million times (!) and has over a million (!) developers.
  • Panopticon 5.7 was released recently (on 11/22/10) and adds the ability to display real-time streaming data.

David Raab, one of my favorite DV and BI gurus, published on his blog an interesting comparison of some leading DV tools. According to David's scenario, one possible ranking of DV tools is: Tableau 1st, then Advizor (version 5.6 available since June 2010), Spotfire and Qlikview (it seems to me David implied that order). In my recent DV comparison, “my scenario” gave a different ranking: Qlikview is slightly ahead, while Spotfire and Tableau share 2nd place (but are very competitive with Qlikview) and Microsoft is a distant 4th – but it is possible that David knows something I don't…

In addition to David, I want to thank Boris Evelson, Mark Smith, Prof. Shneiderman, Prof. Rosling, Curt Monash, Stephen Few and others for their publications, articles, blogs and demos dedicated to Data Visualization in 2010 and before.

Permalink: https://apandre.wordpress.com/2010/12/25/hny2011/

Microsoft is reusing its patented VertiPaq column-oriented DB technology in the upcoming SQL Server 11.0 release by introducing columnstore indexes, where each column is stored in a separate set of disk pages. Below is a “compressed” extraction from a Microsoft publication, and I think it is very relevant to the future of Data Visualization technologies. Traditionally an RDBMS uses a “row store”, where a

heap or a B-tree contains multiple rows per page. In a columnstore index, the columns are stored in different groups of pages. The benefits of this are:

  • only the columns needed to solve a query are fetched from disk (this is often fewer than 15% of the columns in a typical fact table),
  • it’s easier to compress the data due to the redundancy of data within a column, and
  • buffer hit rates are improved because data is highly compressed, and frequently accessed parts of commonly used columns remain in memory, while infrequently used parts are paged out.

“The columnstore index in SQL Server employs Microsoft's patented Vertipaq™ technology, which it shares with SQL Server Analysis Services and PowerPivot. SQL Server columnstore indexes don't have to fit in main memory, but they can effectively use as much memory as is available on the server. Portions of columns are moved in and out of memory on demand.” SQL Server is the first major database product to support a pure columnstore index. Columnstore indexes are recommended for fact tables in a data warehouse, for large dimensions (say, with more than 10 million records) and for any large tables designated as read-only.

“In memory-constrained environments when the columnstore working set fits in RAM but the row store working set doesn't fit, it is easy to demonstrate thousand-fold speedups. When both the column store and the row store fit in RAM, the differences are smaller but are usually in the 6X to 100X range for star join queries with grouping and aggregation.” Your results will of course depend on your data, workload, and hardware. Columnstore index query processing is most heavily optimized for star join queries. OLTP-style queries, including point lookups and fetches of every column of a wide row, will usually not perform as well with a columnstore index as with a B-tree index.
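To show what such a “star join query with grouping and aggregation” looks like, here is a sketch against a hypothetical star schema (my own illustration, not from the Microsoft paper):

    -- Typical decision-support query: join the fact table to dimension
    -- tables, filter, group and aggregate. This is exactly the query
    -- shape the columnstore index is optimized for.
    SELECT d.CalendarYear, p.Category, SUM(f.SalesAmount) AS TotalSales
    FROM dbo.FactSales AS f
    JOIN dbo.DimDate    AS d ON d.DateKey    = f.DateKey
    JOIN dbo.DimProduct AS p ON p.ProductKey = f.ProductKey
    WHERE d.CalendarYear >= 2008
    GROUP BY d.CalendarYear, p.Category;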

Columnstore indexes compressed data by a factor of 4 to a factor of 15 on different fact tables. The columnstore index is a secondary index; the row store is still present, though during query processing it is often not needed and ends up being paged out. A clustered columnstore index, which will be the master copy of the data, is planned for the future. This will give significant space savings.

Tables with columnstore indexes can't be updated directly using INSERT, UPDATE, DELETE and MERGE statements, or bulk load operations. To move data into a columnstore table you can switch in a partition, or disable the columnstore index, update the table, and rebuild the index. Columnstore indexes on partitioned tables must be partition-aligned. Most data warehouse customers have a daily, weekly or monthly load cycle and treat the data warehouse as read-only during the day, so they will almost certainly be able to use columnstore indexes. You can also create a view that uses UNION ALL to combine a table with a columnstore index and an updatable table without a columnstore index into one logical table. This view can then be referenced by queries. This allows dynamic insertion of new data into a single logical fact table while still retaining much of the performance benefit of the columnstore capability.
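A minimal sketch of that UNION ALL pattern (the table and view names are my own hypothetical examples):

    -- The large, read-only history table carries the columnstore index;
    -- a small row-store "delta" table accepts the day's inserts.
    CREATE VIEW dbo.FactSalesAll AS
        SELECT DateKey, ProductKey, SalesAmount
        FROM dbo.FactSalesHistory   -- columnstore-indexed, read-only
        UNION ALL
        SELECT DateKey, ProductKey, SalesAmount
        FROM dbo.FactSalesDelta;    -- plain row store, updatable

    -- Queries reference dbo.FactSalesAll; on the nightly load cycle the
    -- delta rows are moved into the history table, e.g. by switching in
    -- a partition or by rebuilding the columnstore index.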

Most important for DV systems is this statement: “Users who were using OLAP systems only to get fast query performance, but who prefer to use the T-SQL language to write queries, may find they can have one less moving part in their environment, reducing cost and complexity. Users who like the sophisticated reporting tools, dimensional modeling capability, forecasting facilities, and decision-support specific query languages that OLAP tools offer can continue to benefit from them. Moreover, they may now be able to use ROLAP against a columnstore-indexed SQL Server data warehouse, and meet or exceed the performance they were used to in the past with OLAP, but save time by eliminating the cube building process”. This sounds like Microsoft has finally figured out how to compete with Qlikview (technology-wise only, because Microsoft still does not have – maybe intentionally(?) – a DV product).

Permalink: https://apandre.wordpress.com/2010/12/03/columnstore-index/

Microsoft used to be the greatest marketing machine in the software industry. But after losing the search business to Google and the smartphone business to Apple and Google, they lost their winning skills. It is now clear that this is also true in the so-called BI market (Business Intelligence is just a marketing term). Microsoft bought ProClarity and it disappeared; they released PerformancePoint Server and it is disappearing too. They have (or had?) the best BI stack (SQL Server 2008 R2 and its Analysis Services, Business Intelligence Development Studio 2008 (BIDS), Excel 2010, PowerPivot, etc.), and they failed to release any BI or Data Visualization product, despite having all the technological pieces and components. Microsoft even released Visual Studio 2010 without any support for BIDS, and recently, when they talked about their roadmap for BI, they again delayed any mention of BIDS 2010 and declared NO plans for BI or DV products! Instead they are talking about a “new ad hoc reporting and data visualization experience codenamed ‘Project Crescent’”!

And then they have the BISM model as part of the roadmap: “A new Business Intelligence Semantic Model (BISM) in Analysis Services that will power Crescent as well as other Microsoft BI front end experiences such as Excel, Reporting Services and SharePoint Insights”.

An experience and a model instead of a product? What Microsoft did with PowerPivot is clear: they gave some users a reason to upgrade to Office 2010, and as a result Microsoft preserved and protected (for another 2 years?) their lucrative Office business but diminished their chances of getting a significant piece of the $11B (and growing 10% per year) BI market. The new BISM (Business Intelligence Semantic Model) is a clear sign of a losing technological edge:


I have to quote (because they finally admitted that BIDS will be subsumed – when “Project Juneau” becomes available): “The BI Semantic Model can be authored by BI professionals in the Visual Studio 2010 environment using a new project type that will be available as part of “Project Juneau”. Juneau is an integrated development environment for all of SQL Server and subsumes the Business Intelligence Development Studio (BIDS). When a business user creates a PowerPivot application, the model that is embedded inside the workbook is also a BI Semantic Model. When the workbook is published to SharePoint, the model is hosted inside an SSAS server and served up to other applications and services such as Excel Services, Reporting Services, etc. Since it is the same BI Semantic Model that is powering PowerPivot for Excel, PowerPivot for SharePoint and Analysis Services, it enables seamless transition of BI applications from Personal BI to Team BI to Organizational (or Professional) BI.”

The funniest part of the quote above is that Microsoft honestly believes that SharePoint is not a virus but a viable product, and that it will escape the fate of its “step-brother”, PerformancePoint Server. Sweet dreams! It is clear that Microsoft has failed to understand that Data Visualization is the future of the BI market, and they keep recycling for themselves the obvious lie that “Analysis Services is the industry leading BI platform in this space today”! Indirectly they acknowledged it in the very next statement: “With the introduction of the BI Semantic Model, there are two flavors of Analysis Services – one that runs the UDM (OLAP) model and one that runs the BISM model”. Hello?

Why do we need 2 BI models instead of 1 BI product? BIDS 2008 itself is already a buggy and much less productive development environment than Qlikview, Spotfire and Tableau, but now Microsoft wants to confuse us with 2 co-existing approaches, OLAP and BISM? And now get this: “you should expect to see more investment put into the BISM and less in the UDM (OLAP)”!

Dirty Harry would say in such a situation: “Go ahead, make my day!” And I guess Microsoft does not care that Apple's market cap is larger than Microsoft's now.

Afterthought (looking at this from a 2011 point of view): I think now that I know why Donald Farmer left Microsoft 2 months after the BISM announcement above.

p010: http://wp.me/pCJUg-7r

It looks like the honeymoon for Qlikview after Qliktech's IPO is over. In addition to Spotfire 3.2/Silver, we now have a 3rd great piece of software in the form of Tableau 6. Tableau 6.0 was released today (both 32-bit and 64-bit) with a new in-memory data engine (very fast – say, 67 million rows in 2 seconds) and quick data blending from multiple data sources while normalizing across them. The Data Visualization software is available as a server (with web browsers as free clients) and as a desktop (Pro for $1999, Personal for $999, Reader for free).

New data sources include local PowerPivot files (!) and Aster Data; new data connections include OData and the (recently released) Windows Azure Marketplace DataMarket; a data connection can be direct/live or go through the in-memory data engine. Tableau 6 does full or partial automatic data updates; supports parameters for calculations, what-if modeling and selectable display fields on a chart's axes; offers combo charts of any pair of charts; has new project views; and supports Motion Charts


(à la Hans Rosling), etc. Also see Ventana Research and comments by Tableau followers. This post may be expanded, since it is officially the 1st day of the release.

n009: http://wp.me/sCJUg-tableau6

Data Visualization stands on the shoulders of giants – previously tried and true technologies like columnar databases, in-memory data engines and multidimensional data cubes (also known as OLAP cubes).

An OLAP (online analytical processing) cube, on the one hand, extends a 2-dimensional array (a spreadsheet table, or an array of facts/measures and keys/pointers to dictionaries) to a multidimensional DataCube; on the other hand, a DataCube uses data warehouse schemas like the star schema or the snowflake schema.


The OLAP cube consists of facts, also called measures, categorized by dimensions (there can be many more than 3 dimensions; dimensions are referenced from the fact table by “foreign keys”). Measures are derived from the records in the fact table, and dimensions are derived from the dimension tables, where each column represents one attribute (also called a dictionary; a dimension can have many attributes). Such a multidimensional DataCube organization is close to columnar DB data structures. One of the most popular uses of data cubes is visualizing them in the form of pivot tables, where attributes are used as rows, columns and filters, while the values in the cells are appropriate aggregates (SUM, AVG, MAX, MIN, etc.) of the measures.
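A minimal SQL sketch of this fact/dimension organization (hypothetical names, simplified to two dimensions):

    -- Dimension tables: each column is one attribute ("dictionary").
    CREATE TABLE DimDate (
        DateKey       INT PRIMARY KEY,
        CalendarYear  INT,
        CalendarMonth INT
    );
    CREATE TABLE DimProduct (
        ProductKey INT PRIMARY KEY,
        Category   VARCHAR(50),
        Brand      VARCHAR(50)
    );

    -- Fact table: measures plus foreign keys into the dimensions.
    CREATE TABLE FactSales (
        DateKey     INT NOT NULL REFERENCES DimDate(DateKey),
        ProductKey  INT NOT NULL REFERENCES DimProduct(ProductKey),
        SalesAmount DECIMAL(18,2) NOT NULL,  -- measure
        Quantity    INT NOT NULL             -- measure
    );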

OLAP operations are the foundation for most of the UI and functionality used by Data Visualization tools. The DV user (sometimes called an analyst) navigates through the DataCube and its DataViews to a particular subset of the data, changing the data's orientation and defining analytical calculations. The user-initiated process of navigating interactively, by calling for page displays through the specification of slices via rotations and drill down/up, is sometimes called “slice and dice”. Common operations, illustrated in SQL after the definitions below, include slice and dice, drill down, roll up, and pivot:

Slice:

A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.

Dice:

The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices).

Drill Down/Up:

Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down).

Roll-up:

(Aggregate, Consolidate) A roll-up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined.

Pivot:

This operation is also called rotation. It rotates the data in order to provide an alternative presentation – the report or page display takes a different dimensional orientation.
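For readers who think in SQL rather than MDX, here is a rough sketch of relational analogues of two of these operations, using the star schema sketched above (an OLAP engine performs them on the cube itself, so this is only an analogy):

    -- Slice: fix a single member of one dimension.
    SELECT p.Category, SUM(f.SalesAmount) AS Sales
    FROM FactSales AS f
    JOIN DimDate    AS d ON d.DateKey    = f.DateKey
    JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
    WHERE d.CalendarYear = 2010          -- the slice
    GROUP BY p.Category;

    -- Roll-up: aggregate upward along a hierarchy; T-SQL's ROLLUP adds
    -- year-level subtotals and a grand total to the month-level rows.
    SELECT d.CalendarYear, d.CalendarMonth, SUM(f.SalesAmount) AS Sales
    FROM FactSales AS f
    JOIN DimDate   AS d ON d.DateKey = f.DateKey
    GROUP BY ROLLUP (d.CalendarYear, d.CalendarMonth);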

The OLAP servers with the most market share are: SSAS (Microsoft SQL Server Analysis Services), Intelligence Server (Microstrategy), Essbase (Oracle also has the so-called Oracle Database OLAP Option), SAS OLAP Server, NetWeaver Business Warehouse (SAP BW), TM1 (IBM Cognos), Jedox-Palo (I cannot recommend it), etc.

Microsoft had (and still has) the best IDE for creating OLAP cubes (a slightly redressed version of Visual Studio 2008, known as BIDS – Business Intelligence Development Studio – usually delivered as part of SQL Server 2008), but Microsoft failed (for more than 2 years) to update it for Visual Studio 2010 (the update is coming together with SQL Server 2012). So people are forced to keep using BIDS 2008 or to use some tricks with Visual Studio 2010.

Permalink: https://apandre.wordpress.com/2010/06/13/data-visualization-and-cubes/

Recently I had a few reasons to review the Data Visualization technologies in Google’s portfolio. In short: Google (if it decided to do so) has all the components needed to create a good visualization tool – but the same can be said about Microsoft, and Microsoft decided to postpone the production of a DV tool in favor of other business goals.

I remember that a few years ago Google bought Gapminder (Hans Rosling did some very impressive demos with it a while ago) and converted it into a Motion Chart “technology” of its own. A Motion Chart (for the Motion Chart demo I did below, please choose a few countries – e.g. check the checkboxes for US and France – and then click the “Right Arrow” button in the bottom-left corner of the chart; see also here a sample I did myself, using Google’s Motion Chart) allows 5-6 dimensions to be crammed into a 2-dimensional chart: shape, color and size of the bubbles, the X and Y axes as usual (above these are Life Expectancy and Income per Person), and an animated time series (see the light-blue “1985” in the background above – all bubbles move as “time” goes by). Google uses this and other visualization technologies of its own in its very useful Public Data Explorer.

Google Fusion Tables is a free service for sharing and visualizing data online. It allows you to upload and share data, merge data from multiple tables into interesting derived tables, and see the most up-to-date data from all sources. It has Tutorials, a User’s Group, a Developer’s Guide with sample code, as well as examples. You can check a video here:

The Google Fusion Tables API enables programmatic access to Google Fusion Tables content. It is an extension of Google’s existing structured-data capabilities for developers. A developer can populate a table in Google Fusion Tables with data, from a single row to hundreds at a time. The data can come from a variety of sources, such as a local database, a .CSV file, a data-collection form, or a mobile device. The Google Fusion Tables API is built on top of a subset of the SQL query language. By referencing data values in SQL-like query expressions, you can find the data you need and then download it for use by your application. Your app can do any desired processing on the data, such as computing aggregates or feeding it into a visualization gadget. Data can also be synchronized: when you add or change data in the tables of your offline repository, you can ensure the most up-to-date version is available to the world by synchronizing those changes up to Google Fusion Tables.
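As a rough illustration (a sketch, not a definitive client): the Fusion Tables API accepted SQL-like queries over HTTP, roughly as below. The table id and API key here are hypothetical placeholders, and the service has since been retired by Google.

```python
import requests

TABLE_ID = "1abcDEF"   # hypothetical Fusion Table id
API_KEY = "my-key"     # hypothetical API key

# SQL-like query expression referencing data values in the table.
sql = f"SELECT Country, Population FROM {TABLE_ID} WHERE Population > 1000000"
resp = requests.get("https://www.googleapis.com/fusiontables/v1/query",
                    params={"sql": sql, "key": API_KEY})

# The app can then post-process the rows, e.g. compute aggregates
# or feed them into a visualization gadget.
rows = resp.json().get("rows", [])
total = sum(float(population) for _country, population in rows)
```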

Everybody knows about Google Analytics for your web traffic – visitors, visits, pageviews, length and depth of visits – presented in very simple charts and dashboards; see the sample below:

Fewer people know that Panorama Software has an OEM partnership with Google, enabling Google Spreadsheets with SaaS Data Visualizations and Pivot Tables.

Google has a Visualization API (and interactive Charts, including all the standard Charts, GeoMap, Intensity Map, Map, DyGraph, Sparkline, WordCloud and other Charts) which enables developers to expose their own data, stored in any data store connected to the web, as a Visualization-API-compliant datasource. The Google Visualization API also provides a platform that can be used to create, share and reuse visualizations written by the developer community at large. Google provides samples, a Chart/API Gallery (JavaScript-based visualizations) and a Gadget Gallery.
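As a taste of how simple that stack could be, here is a sketch using Google’s image-based Chart endpoint, which rendered a chart from URL parameters alone – note this is the (now long deprecated) Image Charts flavor, not the JavaScript Visualization API itself:

```python
# Build a chart URL for the (deprecated) Google Image Charts endpoint.
from urllib.parse import urlencode

params = {
    "cht": "p",           # chart type: pie
    "chs": "400x200",     # chart size in pixels
    "chd": "t:60,25,15",  # data series, text encoding
    "chl": "A|B|C",       # slice labels
}
url = "https://chart.googleapis.com/chart?" + urlencode(params)
print(url)  # pasting this URL into a browser rendered the chart image
```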

And last but not least, Google has excellent back-end technologies needed for big Data Visualization applications, like BigTable (a compressed, high-performance, proprietary database system built on the Google File System (GFS), the Chubby Lock Service, and a few other Google programs; it is currently not distributed or used outside of Google, although Google offers access to it as part of Google App Engine) and MapReduce. Add to this list Google Maps and Google Earth.
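For readers unfamiliar with MapReduce, here is a toy, in-process sketch of the idea (purely illustrative: the real system distributes the map, shuffle and reduce phases across thousands of machines):

```python
from collections import defaultdict

docs = ["big data visualization", "big tables for big data"]

# Map phase: emit (key, value) pairs from each input record.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle phase: group the emitted values by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: aggregate each group into a final result.
counts = {word: sum(values) for word, values in groups.items()}
print(counts)  # e.g. {'big': 3, 'data': 2, ...}
```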

Then ask yourself: what is stopping Google from producing a competitor to the Holy Trinity (Qlikview + Spotfire + Tableau) of DV?

Permalink: https://apandre.wordpress.com/2011/02/08/dvgoogle/

Data Visualization can be a good thing for Trend Analysis: it allows you to “see this” before you “analyze this”, taking advantage of the human eye’s ability to recognize trends quicker than any other method. Dr. Ahlberg (after selling Spotfire to TIBCO and claiming that “Second place is first loser”) started “Recorded Future” – basically to sell … future trends, mostly in the form of Sparklines; he succeeded at least in selling RecordedFuture to investors from the CIA and Google. Trend analysis is an attempt to “spot” a pattern, or trend, in data (in most cases a well-ordered set of datapoints, e.g. ordered by timestamps) or to predict future events.

Visualizing trends means, in many cases, either a Time Series chart (can you spot a pattern here with your naked eye?):

or a Motion Chart (both best done by … Google, see it here: http://visibledata.blogspot.com/p/demos.html ) – can you predict the future here(?):

or Sparklines (I like the Sparkline implementations in Qlikview and Excel 2010) – sparklines are a scale-less visualization of “trends”:

or maybe a Scatter chart (Excel is good for that too):

and in some cases a Stock Chart (Volume-Open-High-Low-Close, best done with Excel). For example, Microsoft’s stock has fluctuated around the same level for many years, so I guess there is no visible trend here, which may spell trouble for Microsoft’s future (compare with the visible trends of Apple and Google stocks):

Or you can see Motion, Timeline, Sparkline and Scatter charts live/online below. For the Motion Chart demo, please choose a few countries (e.g. check the checkboxes for US and France) and then click the “Right Arrow” button in the bottom-left corner of the Motion Chart below:

In statistics, trend analysis often refers to techniques for extracting an underlying pattern of behavior in a well-ordered dataset which would otherwise be partly hidden by “noise data”. It means that if one cannot “spot” a pattern by visualizing such a dataset, then (and only then) it is time to apply regression analysis and other mathematical methods (unless you are smart or lucky enough to remove the noise from your data). As I said in the beginning: try to see it first! However, extrapolating the past into the future can be a source of very dangerous mistakes (just check the history of almost any empire: Roman, Mongol, British, Ottoman, Austrian, Russian, etc.)
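For instance, when the eye fails to spot a pattern, a one-line regression may recover it; here is a minimal sketch on synthetic noisy data, using numpy’s ordinary least-squares polynomial fit:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(100)                          # well-ordered "time" axis
y = 0.5 * t + 10 + rng.normal(0, 8, 100)    # hidden linear trend + noise

# Fit a degree-1 polynomial (a straight trend line) to the noisy series.
slope, intercept = np.polyfit(t, y, deg=1)
trend = slope * t + intercept               # the extracted pattern
print(f"estimated slope: {slope:.2f} (true slope: 0.5)")
```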

The human eye has its own “Curse of Dimensionality” (a term suggested in 1961 by R. Bellman and described independently by G. Hughes in 1968). In most cases the data (before being visualized) is organized in multidimensional Cubes (n-Cubes) and/or Data Warehouses – or, speaking more “cloudy”, in a Data Cloud – and needs to be projected into lower-dimensional datasets (small-dimensional Cubes, e.g. 3-d Cubes) before it can be exposed, through the 2-dimensional surface of a computer monitor, in the form of Charts (preferably an interactive and synchronized set of charts, sometimes called a dashboard).

Projection of DataCloud to DataCubes and then to Charts

During the last 200+ years people have kept inventing all types of charts to be printed on paper or shown on screen, so most charts show 2- or 3-dimensional datasets. Prof. Hans Rosling led Gapminder.org to create the web-based, animated, 6-dimensional Color Bubble Motion Chart (Trendalyzer), which he used in his famous demos: http://www.gapminder.org/world/ . The 6 dimensions in this specific chart are (almost a record for a 2-dimensional chart to carry; see the sketch after the list below):

  • X coordinate of the Bubble = Income per person,
  • Y coordinate of the Bubble = Life expectancy,
  • Size of the Bubble = Population of the Country,
  • Color of the Bubble = Continent of the Country,
  • Name of the Bubble = Country,
  • Year = animated 6th Dimension/Parameter as time-stamp of the Bubble.
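For illustration, a later library – plotly express, not Gapminder’s own Trendalyzer – ships a similar animated bubble chart together with a built-in gapminder sample dataset, which makes the 6-dimension mapping above explicit:

```python
import plotly.express as px

df = px.data.gapminder()            # the classic Gapminder sample data
fig = px.scatter(
    df,
    x="gdpPercap",                  # X = income per person
    y="lifeExp",                    # Y = life expectancy
    size="pop",                     # bubble size = population
    color="continent",              # bubble color = continent
    hover_name="country",           # bubble name = country
    animation_frame="year",         # 6th dimension: animated time
    log_x=True, size_max=55,
)
fig.show()
```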

Trendalyzer was bought from Gapminder by Google in 2007 and converted into the Google Motion Chart, but Google somehow is in no rush to enter the Data Visualization (DV) market.

The dimensionality of this Motion Chart can be pushed even further, to 7 dimensions (dimension as an expression of measurement without units), if we use different Shapes (in addition to filled Circles we can use Triangles, Squares, etc.), but that would literally push the limit of what the human eye can handle. If you add to the consideration the tendency of DV Designers to squeeze more than one chart onto a screen (how about overcrowded Dashboards with multiple synchronized interactive Charts?), we are approaching the limits of both the human eye and the human brain, regardless of the dimensionality of the Data Warehouse in the backend.

Below I approximately assessed the dimensionality of the datasets behind some popular charts (please feel free to send me corrections). For each dataset and its respective chart I estimated the number of measures (usually a real or integer number; can be a calculation from other dimensions of the dataset), the number of attributes (in many cases categories or enumerations, or of string datatype), and 0 or 1 parameters (representing a well-ordered set, like time (for time series), date, year, a sequence (can be used for Data Slicing), or a natural, integer or real number), with Dimensionality (the number of Dimensions) as the total number of measures, attributes and parameters in the given dataset.

Chart                | Measures | Attributes | Parameter | Dimensionality
Gauge, Bullet, KPI   | 0        |            |           | 0
Monochromatic Pie    | 1        |            |           | 1
Colorful Pie         | 1        | 1          |           | 2
Bar/Column           | 1        | 1          |           | 2
Sparkline            | 1        |            | 1         | 2
Line                 | 1        |            | 1         | 2
Area                 | 1        |            | 1         | 2
Radar                | 1        | 1          |           | 2
Stacked Line         | 1        | 1          | 1         | 3
Multiline            | 1        | 1          | 1         | 3
Stacked Area         | 1        | 1          | 1         | 3
Overlapped Radar     | 1        | 1          | 1         | 3
Stacked Bar/Column   | 1        | 1          | 1         | 3
Heatmap              | 1        | 2          |           | 3
Combo                | 1        | 2          |           | 3
Mekko                | 2        | 1          |           | 3
Scatter (2-d set)    | 2        | 1          |           | 3
Bubble (3-d set)     | 3        | 1          |           | 4
Shaped Motion Bubble | 3        | 1          | 1         | 5
Color Shaped Bubble  | 3        | 2          |           | 5
Color Motion Bubble  | 3        | 2          | 1         | 6
Motion Chart         | 3        | 3          | 1         | 7


The diversity of Charts and their Dimensionality adds another complexity for the DV Designer: which Chart(s) to choose. You can find some good suggestions about that on the web. Dr. Andrew Abela created a Chart Chooser Diagram

Choosing a good chart by Dr. Abela

and it was even converted into an online “application“!

Permalink: https://apandre.wordpress.com/2011/03/02/dimensionality/

“How do I know what I think until I see what I say?” Or let me rephrase Mr. E. M. Forster: “How do YOU know what I think until I blog about it?”

I resisted the idea of having a blog since 1996, because I perceived blogging as very similar to fasting in a desert (actually, after a few months of blogging I am amazed – according to WordPress statistics – that my blog has hundreds and hundreds of visitors every day!). But recently I got a few excellent pushes to start my own blog, because when I posted comments on somebody else’s blogs, they got deleted against my will. It turned out that the owners of those blogs can delete my comments and thoughts anytime they do not like what I said. It happened to me on one of Forrester’s blogs, and it happened to me on my own LinkedIn profile – I posted a so-called “update” and some LinkedIn employee decided to delete it. In both cases the administrators did not even bother to send me my own thoughts for archiving purposes – they just disappeared!

So I decided to start the blog about Data Visualization (DV),

because I have been doing DV for many years and have accumulated many DV implementations and thoughts about DV, DV tools, DV Vendors, the DV Market, etc. For now I will have 8 main pages (and they will be used as root pages for a hierarchy of sub-pages):

  • Home Page of this blog is the place where all posts and comments will go,
  • Visualization Page (with sub-pages) is for DV Samples and Demos,
  • DataViews Page (and its sub-pages) is about … Data Views, Charts and Chartology,
  • Tools Page is designated for DV Software and comparison of DV Tools,
  • Solutions Page will describe possible DV solutions, DV Systems, products and DV services I can provide,
  • Market Page is dedicated to DV Vendors and DV market news and analyses,
  • Data Page is about ETL processes, Data Collection and Data Sources,
  • About Page gives you some info about me.

Another argument (for me to do DV blogging) was made 2500 years ago by Confucius: “Choose a job you love, and you will never have to work a day in your life.” And finally, I have to mention this 500-year-old story, in hope that it will help me filter out all unneeded pieces from this blog: an admirer asked Michelangelo how he sculpted the famous statue of David that now sits in the Academia Gallery in Florence. How did he craft this masterpiece of form and beauty? Michelangelo offered a strikingly simple description: he first fixed his attention on the slab of raw marble; he studied it and then “chipped away all that wasn’t David.”

p001: http://wp.me/pCJUg-3