Cognitive AI can help you solve complex problems that would otherwise need a lot of manpower.
This short article explains how a business can efficiently leverage cognitive AI to classify pictures.
We focused on a company that does underwater exploration and would like to protect its pictures.
Classifying pictures is a key step in protecting them and preventing data leaks.
To build this example, we have created a dataset of 1,000 images (manually classified) for the supervised classification.
The following themes of images have been assumed to be Confidential for this example:
1) Underwater ROVs
2) Oil/Petroleum Rigs
4) Pipelines
The dataset also includes negative samples to help the machine differentiate between confidential content and content that closely resembles it (520 images for the Confidential class and 480 images for the Non-Confidential class).
Training Metrics:
The system returns the class and the probability of the prediction in its response. We then used a probability threshold (confidence) to tag the images with the appropriate class.
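As a minimal sketch of this tagging step, assuming the response can be reduced to a class label and a probability: the `Prediction` shape, the 0.8 threshold and the choice to fall back to Non-Confidential are all hypothetical and are not taken from the system described here.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    label: str          # class returned by the classifier (assumed field name)
    probability: float  # confidence reported for that class (assumed field name)


def tag_image(pred: Prediction, threshold: float = 0.8) -> str:
    """Tag as Confidential only when the classifier is confident enough.

    The 0.8 default is a placeholder; the real threshold was tuned on validation data.
    """
    if pred.label == "Confidential" and pred.probability >= threshold:
        return "Confidential"
    return "Non-Confidential"


print(tag_image(Prediction("Confidential", 0.93)))
print(tag_image(Prediction("Confidential", 0.55)))
```

A stricter setup might send low-confidence predictions to a manual review queue instead of tagging them as Non-Confidential.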
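Metrics observed during the validation phase and testing phase are given below. Precision gives the accuracy of the classifier (the share of images tagged Confidential that really are confidential), while recall gives the coverage of the classifier (the share of truly confidential images that were tagged as such).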
To avoid overfitting, the confidence (probability) threshold was tuned during the validation phase, and the training data was adjusted after looking at the testing results. The lower precision is due to the dataset chosen for testing, as the images were hand-picked without specific domain knowledge of what is confidential or not.
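To illustrate how precision and recall move as the threshold is tuned, here is a small sketch. The scores and labels are made up for the example and are not the metrics we observed; the function name is our own.

```python
def precision_recall(scores, labels, threshold):
    """scores: predicted probability of Confidential; labels: True if truly confidential."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)      # correctly tagged
    fp = sum(1 for s, y in pairs if s >= threshold and not y)  # wrongly tagged
    fn = sum(1 for s, y in pairs if s < threshold and y)       # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical validation data, invented for illustration only
scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.20]
labels = [True, True, False, True, False, False]

for threshold in (0.5, 0.65, 0.8):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, raising the threshold trades recall for precision, which is the balance the validation phase has to settle.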
Photos correctly classified as confidential
Photos correctly classified as non confidential
Data visualization is the visual representation of data, and it can be used across different industries. This form of visual communication allows users to better understand a set of data. When you combine it with good software, it allows businesses to gain real insights in a short span of time. Data visualization is a branch of data analytics in big data technology. International Data Corp (IDC) predicts that this industry will grow at a compound annual growth rate (CAGR) of 23 per cent through 2019, reaching $48.6 billion. What is the appeal of data visualization within the rapidly growing field of big data? We will start by looking at the advantages of data visualization.
The current digitalization trend has allowed data to be captured easily, which in turn has allowed it to grow rapidly. According to an IDC research study, we will see a 50-fold increase in data from 2010 to 2020. A 50-fold increase in data is better represented and appreciated with a chart.
The chart is a basic form of visualization that allows you to better understand the scale of a 50-fold growth in data. This form of data visualization has been around since the time of Microsoft Office. The representation of data is both an art (e.g. a color scheme that stands out) and a science (e.g. choosing the right data).
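To put the 50-fold figure in perspective, here is a minimal sketch that works out the yearly growth it implies and prints a crude text chart. It assumes steady compound growth between 2010 and 2020, which the IDC figure does not claim; only the two end points come from the text above.

```python
START_YEAR, END_YEAR, FOLD = 2010, 2020, 50

years = END_YEAR - START_YEAR
# Assumption: the same growth factor applies every year
annual_growth = FOLD ** (1 / years)


def volume(year: int) -> float:
    """Data volume relative to START_YEAR under the steady-growth assumption."""
    return annual_growth ** (year - START_YEAR)


print(f"Implied yearly growth: {(annual_growth - 1) * 100:.1f}%")
for year in range(START_YEAR, END_YEAR + 1, 2):
    v = volume(year)
    print(f"{year} {'#' * max(1, round(v)):<50} {v:5.1f}x")
```

Under that assumption, data volume grows by roughly 48 per cent each year, and most of the 50-fold increase arrives in the last few years.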
While it might be easy to visualize 50-fold growth for just 2 variables, things can get complicated if you want to know how the prices of different foods change in supermarkets around the country. Stop for a moment to understand the scale of this task.
There are various types of food, from meat to beverages to dairy products. You would also need to measure the change in percentage terms, which makes the whole process quite granular. Without a proper visualization tool, it would take you ages to compile and sort the data and to present it with impact.
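The calculation behind such a view is simple; the volume is what makes it hard. As a sketch, with invented prices for a handful of cities and categories, and an unweighted average per city (both are our assumptions, not real figures):

```python
# Hypothetical average prices (old, new) per city and food category
prices = {
    ("New York", "dairy"): (4.00, 4.08),
    ("New York", "meat"): (10.00, 10.10),
    ("St. Louis", "dairy"): (3.50, 3.64),
    ("Portland", "dairy"): (3.80, 4.18),
}


def pct_change(old: float, new: float) -> float:
    """Price change in percentage terms."""
    return (new - old) / old * 100


# Collect the category-level changes for each city
by_city = {}
for (city, category), (old, new) in prices.items():
    by_city.setdefault(city, []).append(pct_change(old, new))

# Assumption: a plain average across categories, with no basket weighting
city_inflation = {city: sum(c) / len(c) for city, c in by_city.items()}

for city, pct in sorted(city_inflation.items(), key=lambda kv: kv[1]):
    print(f"{city:<10} {pct:5.1f}%")
```

Repeat this for every product, store and week in the country and the case for a visualization tool becomes clear.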
However, if you use the Tableau visualization software, this is the visual that you can get within an hour. At a glance, we can now understand that food inflation is mild in New York, moderate in St. Louis (where it covers a larger area) and severe in Portland.
This visualization, which includes charts, maps and area points, means different things to different people. A housewife living in Portland might be tempted to drive from the area of severe inflation (likely the city) to the area of low inflation (likely the farm area). A businessman might look for the reason behind such disparity within such a small area and address the market needs there.
Our eyes are able to take in and process visuals subconsciously within milliseconds without the need for focused attention. There are several things which humans perceive automatically and subconsciously, such as different shapes and tones of colour. By leveraging our subconscious analytical capabilities, we free up our conscious mind for other intellectual tasks. This allows us to enhance our perception and understanding of data.
Early research has already shown that we can spot patterns subconsciously when there are visual cues, as seen in the simple example above. In the personal relationship graphic below, our eyes are able to automatically take in the size of the groups and understand their importance immediately.
While this knowledge was discovered over 2 decades ago, it is only recently that technology has matured enough for us to apply visualization across deep data sets and spot patterns for easy recognition and presentation. This has drastically reduced the time and effort needed for visualization and made it feasible for mass adoption.
Earlier we mentioned that data visualization can be used to provide more insights to your company. This is an example from Tableau, a data visualization software company.
At first glance, you will notice that we are not only comparing the relationship between 2 groups of data, which is typical of a chart. We are comparing 3 variables side by side: region, sales and profit. In this example, you can see that technology has stronger sales and profit in the south than in the north. You can also look at the sales of each of the 3 categories scattered along the chart by simply moving your mouse over it.
The impressive part is that this can be done within 5 minutes with ALL your data. If you tried to do that with Excel, even assuming that all your data was already cleaned, it would take you days. As a business owner, you can now decide to find out the reason for your advantage in the southern region for technology and push for more sales there. Alternatively, you can decide to transfer the best sales practices for your furniture in the north to boost sales in the south. In other words, you are doing this better and faster.
Another area in which visualization creates both personal and business value is relationships, especially when you look at the 6 degrees of separation. It is a daunting task to understand and figure out the web of relationships as the number of people expands. Data visualization allows us to understand it better when applied to social networking platforms like Facebook and LinkedIn.
At a glance, you can see who knows whom and the strength of their relationship. For example, you would know from this relationship web that Travis (in the big blue circle) and Terrance (in the big orange circle) are the 2 key influencers in this group.
They have the widest circle of friends and are the people you should get to know first if you want to enter their community.
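The idea of "widest circle of friends" can be sketched as a simple count of connections per person. Only the names Travis and Terrance come from the example above; every other name and every friendship in this list is invented, and the strength of each relationship is not modelled.

```python
from collections import defaultdict

# Hypothetical friendships (undirected); not the data behind the graphic
edges = [
    ("Travis", "Ann"), ("Travis", "Ben"), ("Travis", "Cara"), ("Travis", "Terrance"),
    ("Terrance", "Dev"), ("Terrance", "Eli"), ("Terrance", "Cara"),
    ("Ann", "Ben"),
]

# Build each person's set of direct friends
friends = defaultdict(set)
for a, b in edges:
    friends[a].add(b)
    friends[b].add(a)

# Rank by number of friends, breaking ties alphabetically
ranked = sorted(friends, key=lambda name: (-len(friends[name]), name))

for name in ranked[:2]:
    print(name, len(friends[name]))
```

A visualization conveys the same ranking through circle size, which is why the two key influencers stand out immediately.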
Beyond using data analytics for a specific information set, you can use data visualization for a dashboard of all the important corporate goals. At one glance, you would know how far you are from your targets.
This dashboard would keep you focused on your corporate goals, and it can be customized according to your requirements. The vital information could be highlighted or enlarged to make it stand out.
It might be hard to understand the different profitability levels of each transaction when you are a large company. This is when a scatter plot would provide you with the visuals that you would need to analyze them.
The above is a scatter plot of sales and profitability after considering the cost of sales of all the transactions in your company.
You can easily see which deals your company should be chasing and which sales it should step back from. These insights might surprise you and lead to higher overall profitability.
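The same split can be sketched numerically. The transactions and the 10 per cent margin cut-off below are hypothetical; in practice the cut-off is a business decision and the scatter plot lets you see it rather than compute it.

```python
# Hypothetical transactions: (deal id, sales, cost of sales)
deals = [
    ("A", 12000, 9000),
    ("B", 50000, 52000),
    ("C", 8000, 4000),
    ("D", 30000, 28500),
]

MIN_MARGIN = 0.10  # assumed cut-off, chosen only for illustration


def margin(sales: float, cost: float) -> float:
    """Profit as a share of sales."""
    return (sales - cost) / sales


chase = [deal for deal, sales, cost in deals if margin(sales, cost) >= MIN_MARGIN]
step_back = [deal for deal, sales, cost in deals if margin(sales, cost) < MIN_MARGIN]

print("Chase:", chase)
print("Step back:", step_back)
```

Note that the largest deal by sales (B) is the one that loses money, which is exactly the kind of surprise a scatter plot makes visible.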
Now that you have understood the benefits, it is time to understand the challenges of producing these data visualizations. You will have to overcome these challenges to obtain your return on investment.
The quality of the data which you feed into the system is vital to the results which you get subsequently. As with all analytical processes, when you feed in inferior data, you will get inferior results. It takes a lot of understanding to get the data you need into the right shape, and this means that you will have to understand the metadata: what the data is, when it was collected, where, how and by whom.
You will also have to ensure the accuracy of the data. Missing key information would lead to bad business decisions. Outliers are especially important for good data visualization, and you should resist the temptation to remove them.
If you are analyzing financial information, it is the outliers that indicate the inefficiency of the market, which is the source of your profitability. While you are ensuring the quality of the data, you will also have to be able to extract the information in a reasonable period of time.
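One way to act on this is to report missing values and flag outliers for a closer look instead of deleting them. The sketch below uses the common 1.5 x IQR rule, which is our choice for illustration and not something prescribed above; the return figures are invented.

```python
import statistics

# Hypothetical daily returns in per cent; None marks a missing reading
returns = [0.2, -0.1, 0.3, None, 0.1, 4.8, -0.2, 0.0, 0.2, -3.9]

# Positions of missing values, so they can be chased up at the source
missing = [i for i, v in enumerate(returns) if v is None]
values = [v for v in returns if v is not None]

# Quartiles and the interquartile range (IQR)
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Flag, do not drop: these may be the most interesting points
outliers = [v for v in values if v < low or v > high]

print("Missing at positions:", missing)
print("Flagged for review:", outliers)
```

The flagged values stay in the data set; they are simply highlighted so that someone who understands the data can decide what they mean.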
You will have to select the right subset of data to generate the insight which management requires. While getting the right hardware and software would help, you would require a good understanding of the data to be able to do both. You should also guard against any biased views which may cause you to handle data unfairly.
Our brains are evolutionarily trained to spot patterns, and our pre-attentive processing capabilities allow us to spot them subconsciously. This also means that data can be wrongly presented and lead us to draw incorrect conclusions or even confuse us.
Here are 3 examples of misrepresented data:
These mistakes are common and they can result in wrong conclusions being drawn by executives. You would require a skilled consultant to advise you on that.
If you decide to use data visualization as part of your business intelligence strategy, you will still have to choose the right software for your organization. Tableau, MicroStrategy and QlikView are all on the market. Each of them has its own advantages and disadvantages.
Then there is the part where you define the service level agreement with these software providers. You would have to decide whether it makes sense to use a cloud service or a server, which would involve different layers of cost, some of which might be hidden.
Finally, you would have to educate your staff on how to use this cool but expensive data visualization software. Expectations would have to be managed while they learn which data is important to unearth the business insights.
Data visualization can help your organization identify and respond to issues quickly. It can simplify complex data and at the same time drill down to details if necessary. Patterns hidden in thousands of spreadsheet lines can be shown clearly and new insights found. Different teams can collaborate better as those with different pattern recognition skills can see the same picture quickly.
In order to bring the benefits of data visualization into your organization, you would need to engage the services of a team of consultants to navigate the maze of challenges. The consultants would also assist you in selecting the right software and hardware. This will give your company an edge over your competition. It is not about getting more information but about understanding the information you have on hand better and faster. Contact us for more information.
Leverage your natural ability to understand visual information and find your answers faster!
IBM Watson Analytics offers a simple interface to explore data. Using cognitive science, it is able to detect the type of your data by itself and offers natural language (NLP) queries. It also suggests visualizations if you don't know where to start!
It also indicates how good the data is and can suggest cleaning it or removing outliers which might ruin your predictions!
Watson Analytics offers predictive analytics capabilities for up to 5 outputs. For each output (the value you would like to predict), a spiral chart shows the different inputs and combinations of input variables that are predictors of the output. They are conveniently ordered and scored by a predictive strength, which measures how accurately they predict a target, without flooding you with statistical test results.
A decision tree is used for the prediction, and the results are visual and simple for business users to understand and interpret. While there are cases where a decision tree might not work well, it should be sufficient for usual predictions, and we assume that more models will come in the future!
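To give a feel for why decision trees are easy to read, here is a sketch of the simplest possible tree: a single split on one input. This is not Watson Analytics' algorithm, which is not described here; real decision trees use several levels and impurity measures such as Gini or entropy. The customer data and the "support calls" input are invented.

```python
# Hypothetical customers: (support calls last month, left the company?)
customers = [
    (0, False), (1, False), (1, False), (2, False),
    (3, True), (4, True), (5, True), (2, True),
]


def best_split(rows):
    """Find the threshold whose rule 'calls > threshold means leaving' makes the fewest errors."""
    best_threshold, best_errors = None, len(rows) + 1
    for threshold in sorted({calls for calls, _ in rows}):
        # Count rows where the rule's prediction disagrees with what happened
        errors = sum((calls > threshold) != left for calls, left in rows)
        if errors < best_errors:  # strict comparison keeps the first of tied thresholds
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


threshold, errors = best_split(customers)
print(f"Rule: more than {threshold} calls -> likely to leave ({errors} error(s) on {len(customers)} rows)")
```

The resulting rule can be read aloud to a business user, which is the main appeal of this family of models.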
IBM Watson Analytics is a very interesting platform for business users who are looking for a simple and fast solution for their analytics needs: determining customer attrition rates, targeting clients, etc.
It requires little learning effort and integrates easily into existing systems as it is a cloud-based platform.
However, the difficult part in any analytics project is the less visible work: understanding the business's analytics needs, finding what questions it is trying to answer and, armed with that, exploring, preparing and engineering the data before a platform such as Watson Analytics can deliver its full potential!