By Kurt Cagle
What does your data look like?
In the era of big data, information comes from an astonishing array of sources: structured data from relational databases, semi-structured documents in various forms of markup, lightweight data structures stored as JSON, Twitter and Facebook status messages, and an explosion of data from smartphones and other “sensors.” Natural language processing and big data tools such as Hadoop are also turning legacy log data and media data (audio, video, voice, printed material) into valid sources of data.
So the question is “what does that data look like?”
The reality for most organizations is that data contained in databases or data streams is effectively opaque. The human brain gives up trying to put numbers in perspective after about 10 items. Spreadsheets (and the graphing and visualization tools that have been added to them) have helped, but even there, there’s a clear upper limit beyond which one has too much information to understand.
This is where the Data Visualization Analyst (DVA) comes in. Data visualization is often treated as an offshoot of user experience, but in many respects it is more appropriately seen as a (potentially major) player within a data science team. The role of the visualization analyst is to take the vast amounts of information that companies are increasingly called upon to handle and condense it into visualizations: graphs, charts, maps and similar infographics. What makes this position so challenging is the growing expectation that such infographics are also dynamic: at a minimum animated and, more often than not, fully interactive and working with live data.
As such, the data visualization analyst has an especially rare set of skills. They need a sufficient understanding of analytics and analytics tools to process data from various sources and draw meaningful conclusions (as well as some experience with ETL to get the data into the required form). They need fairly sophisticated programming skills in order to either build or, at a minimum, design such interactive applications. They need a good sense of visual and user interface design, and they need to be able to tell a story concisely with pictures accompanied by a light smattering of words.
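To make the ETL and “condensing” steps concrete, here is a minimal sketch in Python. It is an illustration only, not a method described in this article: the two sample sources, the field names (region, amount) and the helper functions are all hypothetical. It pulls rows from a CSV source and a JSON source, reduces them to one total per region, and renders the result as a crude text bar chart.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample inputs standing in for two real data sources.
CSV_SOURCE = """region,amount
West,120.0
East,80.5
West,30.0
"""

JSON_SOURCE = '[{"region": "East", "amount": 19.5}, {"region": "South", "amount": 50}]'


def extract(csv_text, json_text):
    """Read rows from both sources into one list of (region, amount) pairs."""
    rows = [(r["region"], float(r["amount"]))
            for r in csv.DictReader(io.StringIO(csv_text))]
    rows += [(r["region"], float(r["amount"])) for r in json.loads(json_text)]
    return rows


def summarize(rows):
    """Condense raw rows into one total per region, largest first."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def text_bar_chart(summary, width=30):
    """Render the summary as text bars scaled to the largest total."""
    largest = (summary[0][1] if summary else 0.0) or 1.0  # avoid dividing by zero
    lines = []
    for region, total in summary:
        bar = "#" * round(width * total / largest)
        lines.append(f"{region:<6} {bar} {total:.1f}")
    return "\n".join(lines)


summary = summarize(extract(CSV_SOURCE, JSON_SOURCE))
print(text_bar_chart(summary))
```

In practice the last step would hand the summary to a charting or interactive visualization tool instead of printing text. The point of the sketch is the shape of the work: many raw rows from mixed sources go in, and a handful of comparable values come out.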
Few of these are “magic bullet” solutions by themselves, however. One of the key strengths of the visualization analyst is the ability to both understand what constitutes important data for a given situation and to effectively tell a story with that data, regardless of the tools he or she uses. This also means that the visualization analyst should understand business thinking and realities as well as the technical realm of data analysis, and be able to communicate visually between these two domains. As such, this analyst role should be seen as an adjunct to the C-suite, as the presentations that they build may very well make the difference between making the sale and blowing it.
Given that such analysts are comparatively rare, the best strategy for developing a solid visualization analyst is to find someone with strong user experience (UX) skills and assign them to work with your company’s lead data scientist or as part of a data science team (if your company has one). It’s also possible to come from the data sciences side directly, but in general people with strong analytical skills tend to be fairly weak at the kind of storytelling and presentation skills that UX people learn early on.
Similarly, if you are personally interested in becoming a data visualization analyst, a good place to start is to work with the various toolsets described above, as well as to gain some familiarity with the tools that have become prerequisites in the field, such as Pentaho, Alteryx, Matlab or the R programming language. These tend to have powerful tools for statistical operations but limited visualization (R in particular tends to have fairly mundane visualization tools, despite becoming a prerequisite for data scientists).
Data visualization analysts are poised to become an indispensable part of the emerging data science teams of modern corporations. They make sense of the ever-increasing barrage of data coming from both within and outside the company, provide ways of making such data meaningful without specialized knowledge, and help to establish branding for one of the most important commodities your company has: its data.
Kurt Cagle is a technology writer and data scientist working with Avalon Consulting, LLC. He is the author of 18 books on Web technologies, data architecture and semantics. He lives in Seattle, where he’s finishing up his first novel.