Challenge: optimising data presentation on a mobile device

How do we maximize the perception and understanding of data on the small screen of a mobile device? This blog discusses some interesting areas of human perception, such as illusions, and then relates these to the issues involved in displaying data on a mobile device, to ensure that the real meaning of the data is communicated clearly and quickly to those ‘on the move’.

As a strategy for displaying data on mobile devices, it is better to maximise the ratio between the ‘ink’ used for displaying data and the total ‘ink’ on the page. If a lot of ink is used for decorative purposes rather than data display purposes, then the report consumer’s visual processing system is required to do more work in order to understand the message being displayed. It is important to maximise the data/ink ratio in order to simplify the message that is being given in the graph.
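As a rough illustration, the data/ink ratio can be thought of as the data ‘ink’ divided by the total ‘ink’ in the graphic. The sketch below is illustrative only; the pixel counts are hypothetical, not taken from any real chart.

```python
# Tufte-style data/ink ratio: the share of a graphic's ink that encodes data.
# The pixel counts below are hypothetical, for illustration only.

def data_ink_ratio(data_ink: float, decoration_ink: float) -> float:
    """Return data ink divided by total ink, in the range 0..1."""
    total = data_ink + decoration_ink
    if total == 0:
        raise ValueError("chart contains no ink")
    return data_ink / total

# A chart where gridlines, borders and a logo use as much ink as the bars:
print(data_ink_ratio(5000, 5000))   # 0.5
# The same chart after muting gridlines and dropping the logo:
print(data_ink_ratio(5000, 1000))   # ~0.83
```

The higher the ratio, the less non-essential ink the viewer’s visual system has to discard before the message comes through.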

There are a number of different issues involved in ensuring that data is presented clearly and concisely in a way that the human brain can process in an optimal manner. One major issue is the area of illusions, in which the image presented to the eye can cause a distraction that can mean that the data is not communicated clearly on a small mobile screen.

One common illusion that can apply here is the Hermann grid illusion. This illusion was discovered by the physiologist Ludimar Hermann while he was reading a book by the Irish physicist John Tyndall. The illusion can make a black and white matrix appear as if it contains grey as well: although the same light intensity is transmitted along the white spaces in the matrix, the intersections can appear to be grey rather than white. An example follows:

It is known that the Hermann grid illusion is not restricted to black and white: it has been found in other colours as well. For example, a red and white matrix may appear to have reddish intersections. The requirement is that there must be a high contrast between the two colours involved in the grid.

This illusion occurs as a result of the cell activity of the retina in the eye in resolving the ‘gain’, or the average power of the light intensity, in the image. Although a television or computer screen has only one ‘brightness’ setting, the eye can vary the ‘brightness’ setting for different parts of the image. The fovea, or centre of the retina, is the region of highest visual acuity: it has a greater concentration of cells and thus has finer-grained control over the ‘brightness’ settings. So, if you look directly at a grid intersection, the intersection is resolved by the fovea and it appears white. On the other hand, the neighbouring intersections can appear grey. The peripheral areas of the retina do not have the same fine-grained control over the brightness setting. This means that the neighbouring intersections can appear grey as the ‘gain’, or average light intensity, of the image is turned down.

The Hermann grid illusion will not work if there is a low contrast between the colours involved in the grid. As a sidenote, this is why Excel has light blue cells with a white background: there is not enough contrast between the light blue and the white colours to confuse the visual system, so the illusion does not appear.

Another illusion which has a direct impact on the display of data on a mobile device is the Moiré effect. Moiré patterns are a focus of interest in the study of visual systems, and more generally in physics, where they appear in investigations of the fundamental substance of the universe. A Moiré effect can be described as a visually disturbing experience that occurs when the viewed image contains a series of lines or dots superimposed at an angle on a parallel series of lines or dots. An example can be found below:

Here, the Moiré effect can be seen in the horizontal bars that have been produced as a result of a series of vertical lines which have been overlaid by an identical series of lines at an angle. Moiré effects can be used to create some beautiful patterns as well, such as Kolomyjeck’s Moire images.
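The interference behind the effect can be sketched numerically. The following Python sketch is illustrative (it is not from the original post): it overlays two sinusoidal gratings whose spatial frequencies differ slightly, and recovers the slow ‘beat’ component that the eye perceives as coarse moiré bars. A small rotation of one grating has a similar effect, since rotation changes the grating’s effective period along an axis.

```python
import numpy as np

# Moiré as a beat pattern: overlaying two gratings whose spatial frequencies
# differ slightly produces a slow 'beat' at the difference frequency.
f1, f2 = 10.0, 11.0                      # cycles per unit length (invented)
x = np.linspace(0.0, 1.0, 2000, endpoint=False)
g1 = 0.5 * (1.0 + np.cos(2 * np.pi * f1 * x))   # transmission of grating 1
g2 = 0.5 * (1.0 + np.cos(2 * np.pi * f2 * x))   # transmission of grating 2

overlay = g1 * g2   # light passing through both gratings stacked

# cos(a)cos(b) = 0.5*(cos(a-b) + cos(a+b)): the product contains a component
# at |f1 - f2| = 1 cycle per unit -- the coarse moiré bars the eye picks out.
spectrum = np.abs(np.fft.rfft(overlay - overlay.mean()))
beat_bin = 1 + int(np.argmax(spectrum[1:5]))    # search the low bins only
print(beat_bin)   # 1, i.e. |f1 - f2|
```

The fine gratings themselves sit at 10 and 11 cycles per unit; the moiré bars appear at the much lower difference frequency, which is why they dominate perception even though neither grating contains them.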

Moiré effects are also thought to occur in nature. Recently, some string theorists in physics have focused on Moiré patterns and the creation of the universe. M theory proponents have held that the universe was originally formed out of two membranes which collided; the vibrations that occurred at the intersection points of the two membranes are posited to be responsible for all of the observed particles in the universe. The resulting pattern between these two membranes is like a Moiré effect, and some work is being conducted to understand the different Moiré patterns that could be responsible for producing the fundamental laws of physics. A pictorial representation of superstrings and M theory can be found below:

Thus, it is important to avoid the Hermann grid illusion and the Moiré effect in the design of data display, particularly for mobile devices. This directly affects the colours and lines used in the display.

With respect to portraying the data so that the meaning of the data is clearly presented, some recommendations are given below:
  • It is recommended that report creators avoid high-contrast grids. A high-contrast grid may invoke the Hermann grid illusion in the report consumer, and this may distract the report consumer from the ultimate message of the data. If possible, mute gridlines or even avoid them altogether. Gridlines, whether vertical or horizontal, increase the ‘ink’ on the page; the main aim here is to increase the meaning of the data, rather than the amount of ink that is not strictly necessary.
  • Solid colour should be used in bar charts, pie charts and so on; not bitmaps, hatches or grids. A ‘hatched’ graph can seem to ‘buzz’ in the eye of the viewer. This can produce a Moiré effect in the perception of the report user, which can be annoying and distract attention away from the true meaning of the data.
  • There is no need to ‘box’ the colours in a bar chart by drawing black lines around them. This may produce the Hermann grid illusion by producing ‘fuzziness’ at the intersecting black points. Also, this decreases the data/ink ratio. It is argued here that the eye is very capable of resolving fine-grained detail, as evidenced by the high visual acuity found in the fovea. Thus, the eye can distinguish clearly between blocks of colour. This is enough to get the clarity of meaning across in the report, as well as to reduce load on the visual system by reducing complexity.
  • Use 2D images: they are processed more quickly than 3D images. Although 3D graphs may look nice, they can sometimes be difficult to interpret due to the additional hatching that may be involved in producing the 3D effect. In order to ensure that the data is understood without imposing an overhead on the visual system, it is recommended that graphs are kept to 2D. Further, this also helps to maximise the data/ink ratio, which is recommended as an overall strategy.
  • Restrict the display to presenting the data meaningfully: non-essential images, such as company logos, should not be included. There is no need to include them since they do not add to the meaning of the data; instead, they serve to reduce the data/ink ratio. Additionally, it is not possible to control elements of a company logo which may produce effects such as the Hermann grid illusion or the Moiré effect, so it is best to remove them altogether.
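As a sketch, the recommendations above might translate into charting code as follows. This example uses matplotlib as a stand-in charting library; the product names and sales figures are invented for illustration.

```python
import io
import matplotlib
matplotlib.use("Agg")                    # render off-screen
import matplotlib.pyplot as plt

# Hypothetical data, purely for illustration.
products = ["A", "B", "C", "D"]
sales = [120, 95, 140, 80]

fig, ax = plt.subplots(figsize=(4, 3))   # small, mobile-sized canvas
ax.bar(products, sales,
       color="steelblue",                # one solid colour, no hatching
       edgecolor="none")                 # no black 'boxes' around the bars
for side in ("top", "right"):            # drop frame lines that carry no data
    ax.spines[side].set_visible(False)
ax.grid(False)                           # no high-contrast grid
ax.set_ylabel("Units sold")

buf = io.BytesIO()
fig.savefig(buf, format="png")           # plain 2D output, no 3D effects
png_bytes = buf.getvalue()
```

Each line that removes something (gridlines, bar edges, spines) is a direct application of the data/ink principle: the remaining ink is almost entirely the bars themselves.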

To summarise, this blog has discussed some of the physiological and psychological elements of visual processing, and has applied these principles to displaying data on a mobile device. From these discussions, some recommendations regarding data display have been given here.


Data Mining – useful or not?

What distinguishes Data Mining from other methods of exploring data, and what is its usefulness? Critics might say that if you torture the data enough, it will eventually confess! Computers contain lots of data, but people need help to turn this data into intelligence. As Frawley (1992) nicely puts it, “Computers have promised us a fountain of wisdom but delivered a flood of data.” Data mining can help us to move from being data keepers to being intelligence gatherers. There are lots of definitions out there, but I like the Gartner one best: “A broad category of applications and technologies for gathering, storing, analysing, sharing and providing access to data to help enterprise users make better business decisions.” Data mining is particularly useful when applied to customer insight, and this blog will aim to do just that.

Customer insight is a cornerstone of the activity of any business. It is at the core of ensuring that the business continues to devise products and services that will entice new customers in addition to turning satisfied customers into long-term, profitable, repeat customers. Customer insight has different perspectives:

  • Demographics – understanding the characteristics of customers who purchase from the organisation. This includes location, age, sex and so on.
  • Behaviour classification – understanding the classifications of customer behaviour and lifecycle in contact with the business. This can involve an analysis of their spend, and the frequency of their touchpoints with the business.
  • Propensity to Churn – analysing customers who have previously churned to discover patterns in their behaviour. In turn, these patterns can be applied to existing customers to pinpoint those who might churn early, so that strategies can be put in place to avoid this.
  • Propensity to Buy – analysing the characteristics of customers who purchase certain products, and searching for those patterns amongst other customers in order to predict their probability of purchasing these products as well. These propensity models are useful for understanding which customers are most likely to purchase a given set of products, and can assist in decision making and in focusing marketing efforts.

As a first step to customer insight, analytical tools can summarise and aggregate historical information about customers. One particular technology which is good for summarising and aggregating data is called OLAP (On Line Analytical Processing). This could provide analysts with information such as the top ten performing products for a given month, or the top customers who purchased most of a specific type of product.
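As an illustration of this kind of summarisation, here is a hedged sketch using pandas as a stand-in for an OLAP query; the order data below is invented for illustration.

```python
import pandas as pd

# OLAP-style summarisation sketched with pandas (standing in for a cube query).
# The order data is invented purely for illustration.
orders = pd.DataFrame({
    "month":   ["2009-01"] * 5,
    "product": ["Widget", "Gadget", "Widget", "Sprocket", "Gadget"],
    "revenue": [100.0, 250.0, 80.0, 40.0, 300.0],
})

# 'Top performing products for a given month' = slice, aggregate, rank:
top = (orders[orders["month"] == "2009-01"]
       .groupby("product")["revenue"].sum()
       .sort_values(ascending=False)
       .head(10))
print(top.index[0])   # Gadget (550.0 in revenue)
```

The slice/aggregate/rank shape is exactly what an OLAP cube precomputes at scale; the sketch just makes the operations explicit.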

Historical analytics can help to support the marketing process. This can be augmented by predictive analytics, alternatively known as data mining, which can help to identify patterns in customer behaviour. The patterns found can be used to make more informed, quicker decisions about targeting customers. Thus, a customer data warehouse can progress from being simply a data store to becoming an intelligence tool which is used to inform decisions and direct marketing spend.

Historical and predictive analytics are not mutually exclusive, but instead work together to inform the marketing process. For example, in a marketing campaign scenario, it could be possible to evaluate the success and impact of a marketing campaign by looking at historical customer activity, and then looking at the end-to-end process of CRM analytics by using data mining to search for related patterns in the customer behaviour.

Data Mining (DM) offers three main activities: data exploration, pattern discovery and prediction. The key differentiator is that data mining performs predictions, by using predictive techniques to find patterns in data. Microsoft offers data mining at no extra cost as part of SQL Server 2005 and 2008, geared towards the average Excel user. It is proposed here that companies who already have SQL Server 2005 or 2008 could consider these tools as a way of starting to conduct predictive analytics, or data mining, on their customer data – many companies already have the software, so why not try the data mining functionality that comes with it? Businesses already hold a great deal of operational data about their customers, which could be leveraged in the predictive analytics process. In particular, the existing data could be analysed to discover patterns in customer behaviour using data mining. This can offer a different way of looking at the data; it may reveal patterns that the consumer had not previously seen, thereby triggering new avenues of opportunity.

Data mining is useful since it can help to investigate the loss of customers; because it can cost a lot to acquire new customers, an area of interest is to retain existing customers by identifying which of them are most likely to ‘churn’, or leave the service. If the business can understand the causes of customer churn, then it can start to try to prevent it from happening in the first place. The issue is that a customer who cancels a service has often already started a relationship with a competitor before they make the cancellation call; at that point, the customer has already been lost.

Data mining can help to investigate the propensity of customers to churn so that they can be distinguished and addressed. Reduction of customer churn has a number of business benefits: the lifetime value of the customer might be increased, and as an adjunct, marketing spend can be targeted more effectively. Further, this could help to reduce new customer churn, where the customer acquisition cost has not been recouped at the point where the customer leaves the service. In some cases, reducing customer churn could even promote growth in areas that are considered to be already mature.

Reduction in customer churn is particularly important for companies in the services industries. A distinction is often made between voluntary and involuntary churn. Voluntary, or commercial, churn is the situation where the customer has decided to leave and take up a competitor’s service; involuntary churn is where the customer has had no choice but to leave the company, due to factors such as relocation or invalidity. The churn rate can be expressed as a percentage, which can be obtained using the following formula:

  • Churn Rate (%) = (Lost Customers / Total Customers) × 100
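As a quick sketch of the formula, with hypothetical figures:

```python
def churn_rate(lost_customers: int, total_customers: int) -> float:
    """Churn rate as a percentage: lost customers / total customers * 100."""
    if total_customers <= 0:
        raise ValueError("total_customers must be positive")
    return 100.0 * lost_customers / total_customers

# Hypothetical figures: 150 of 6000 customers left during the period.
print(churn_rate(150, 6000))   # 2.5 (percent)
```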

It is possible to investigate the propensity of customers to churn by using data mining techniques to look for patterns in the data that might distinguish customers with a propensity to churn, and the type of churn itself. The benefit here is that this group of potential ‘churn’ customers can be specifically identified, and different approaches explored in order to retain these customers.

Understanding the patterns in customer data can have many different applications. For example, it is suggested here that the business team can use these pattern-discovery techniques to find any patterns within the ‘customer churn’ group. These patterns can then help to predict the customers who are most likely to churn, before they begin to develop relationships with competitors. These customers can then become the target of a specifically tailored marketing campaign. It could then be possible to explore ways in which to turn these customers from ‘nearly satisfied’ to ‘satisfied’, and then from ‘satisfied’ to ‘loyal’. The customer data set could be configured to actively exclude ‘involuntary’ churn customers from the data to be explored, so that the results are more accurate.

Reducing customer churn is a two-pronged approach:

  • Prediction – predicting the churn probability of customers
  • Discovery – discovering the root causes of churn

As part of the prediction process, it is important to understand the profitability of a given group of customers; for example, if a customer is not very profitable, then perhaps it is possible to simply let the customer go, and direct the marketing resources towards retaining a more profitable set of customers.

As a part of the discovery process, the business can start to understand what specific factors lead to the churn itself. These factors can then be addressed in order to reduce internal causes of churn.
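The two prongs above can be sketched in miniature. This is a hedged Python illustration, not the Microsoft tooling the text describes: the customer records and the ‘contract type’ factor are invented, and a simple per-segment churn rate stands in for a real mining algorithm such as decision trees or logistic regression.

```python
from collections import defaultdict

# Invented customer history: (contract_type, churned)
customers = [
    ("monthly", True), ("monthly", True), ("monthly", False),
    ("annual", False), ("annual", False), ("annual", True),
    ("annual", False), ("monthly", True),
]

# Discovery: which factor values are associated with churn?
counts = defaultdict(lambda: [0, 0])          # segment -> [churned, total]
for contract, churned in customers:
    counts[contract][0] += int(churned)
    counts[contract][1] += 1
churn_by_segment = {seg: c / t for seg, (c, t) in counts.items()}
print(churn_by_segment)   # {'monthly': 0.75, 'annual': 0.25}

# Prediction: score an active customer by their segment's historical rate.
def churn_probability(contract_type: str) -> float:
    return churn_by_segment.get(contract_type, 0.0)

print(churn_probability("monthly"))   # 0.75 -> a retention-campaign target
```

Here the discovery output ("monthly contracts churn far more often") is itself a candidate root cause to address, while the prediction output flags individual customers for retention campaigns.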

Data mining techniques can be mapped to the processes of prediction and discovery. Data mining is quite a new technology, and it can be difficult to know where to start. Microsoft’s tools assume that the user has no prior knowledge of data mining algorithms, but the terminology might be a bit off-putting. To try and simplify it, here is a table which describes the main Microsoft data mining algorithms, along with the tasks they are suited to (‘Yes’ = a main use, ‘Potential’ = a possible use):

| Algorithm | Description | Exploration | Classification | Estimation | Association | Segmentation | Forecast |
|---|---|---|---|---|---|---|---|
| Decision Trees | Finds the odds of an outcome based on values in a training set | Yes | Yes | Yes | Yes | | |
| Naïve Bayes | Clearly shows the differences in a particular variable for various data elements | Yes | Yes | Potential | | | |
| Neural Nets | Seeks to uncover non-intuitive relationships in data | Potential | Yes | Yes | | | |
| Association Rules | Identifies relationships between cases | Yes | Potential | | Yes | | |
| Clustering | Classifies cases into distinctive groups based on any attribute sets | Yes | Potential | Potential | Potential | Yes | |
| Sequence Clustering | Groups or clusters data based on a sequence of previous events | Yes | Potential | Potential | Potential | Yes | |
| Time Series | Analyzes and forecasts time-based data | | | Yes | | | Yes |
| Linear Regression | Determines the relationship between columns in order to predict an outcome | | Potential | Yes | | | |
| Logistic Regression | Determines the relationship between columns in order to evaluate the probability that a column will contain a specific state | Potential | Yes | Yes | | | |

There are a number of different models to choose from, and the choice should be based on the type of data analysis being undertaken. In order to do data mining properly, it is also necessary to test and re-test results to validate the predictions and models. Modelling exercises should try several of the different algorithms provided by Microsoft, in order to choose the best model by comparing results.
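The compare-and-choose step can be sketched as follows. This is a hedged illustration using scikit-learn and a synthetic dataset as stand-ins for the SQL Server mining tools and real customer data; cross-validation is the test-and-re-test discipline described above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a customer data set (8 invented attributes).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Try more than one algorithm and compare validated results.
candidates = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
}
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}

best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Whichever model wins on held-out data, not on the training data, is the one to carry forward; re-running the comparison as new data arrives keeps the choice honest.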

To summarise, data mining is a key technology for exploring data and, primarily, for predictive analysis. Here, it has been applied to customer churn. In terms of Microsoft specifically, the data mining features are responsive and intuitive, and provided in a format that users trust: Excel. This functionality allows users to turn data into intelligence, thereby making customer interactions more timely, focused and rewarding.
