Solving Computer Vision Challenges as an AI Leader.

Dr. David Swanagon
Aug 5
5 min read

In the age of AI, leaders are required to do three things:

Lead machines.
Lead people that build machines.
Lead organizations that adopt AI.

The scenario below presents an interesting dilemma for AI leaders. Occlusion is a common occurrence with Computer Vision models. Most use cases are impacted by objects partially blocking the camera's viewpoint. This occlusion results in sub-optimal processing of the image. In this situation, a large retail store has adopted a Computer Vision model to track inventory levels. The camera scans the store shelves to determine if apparel is running low. The problem is that customer shopping behavior makes it hard for the camera to perceive the back row. Customers pick up apparel, unfold it, and then put the clothing they decide not to purchase back in the front row. This results in multiple items being stacked on top of each other, which limits visibility to the back row. This may seem like a simple issue for a human. However, a complex machine needs help to address the occlusion problem.

How would you approach this scenario?

As an AI leader, you are responsible for aligning technical solutions with the company's enterprise strategy. This means focusing on the four things a CEO cares about, namely revenue, growth, profit, and reputation. Sometimes, the most advanced technical solution is not the preferred approach because of its negative impact on costs and the skill requirements involved. Other times, simple solutions are insufficient because model mistakes lead to a meaningful degradation of the company's operating profits. This is the challenge for an AI leader. How do you find the solution that optimizes technical requirements while addressing costs, employee skills, and the customer? The answer is the Machine Leadership Model.

You will find that most use cases do not have a single 'most obvious' solution. AI leaders must make decisions based on competing alternatives that have both strengths and weaknesses.

The model stipulates that AI adoption is optimized when Machine Autonomy, Trust, and AI Competencies are balanced. This is known as the AI Innovation Frontier. It is the union where models positively influence enterprise strategy without undermining the company's culture. The challenge is knowing how to find the equilibrium point, especially when dealing with a technical issue. In this situation, the leader is presented with five choices. Let's examine how each choice influences the balance between Machine Autonomy, Trust, and AI Competencies.

Machine Leadership infographic showing a Computer Vision challenge related to Occlusion. Readers must find a solution that balances technical requirements with cost constraints, employee skills, and the customer experience. — Occlusion is a Common Issue in Computer Vision

Leading in the Age of AI requires making choices. The first option focuses on cost optimization by utilizing data augmentation to improve the training process. This is a cost-effective way of helping the model adjust when a clear image is not available. Since the store is 50,000 square feet, it is understandable that IT would pursue this approach, as it helps improve the model without increasing computational costs. By comparison, the second option focuses on model performance. Amodal segmentation uses an advanced labeling system to create sophisticated datasets that help the model understand various scenarios where retail apparel is not fully visible. This approach often leads to the most accurate results. However, it is costly to implement and requires AI engineers to have advanced skills in Computer Vision. That said, Amodal segmentation would also lead to a reduction in unnecessary inventory ordering.
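To make the first option concrete, here is a minimal sketch of one common form of occlusion-oriented data augmentation: blanking out a random rectangle in a training image so the model sees partially hidden shelves during training. The function name `add_random_occlusion`, the patch-size range, and the nested-list image format are assumptions made for illustration; the article does not specify which augmentation the retailer would use, and a production pipeline would normally rely on an array or imaging library instead of plain lists.

```python
import random


def add_random_occlusion(image, rng, max_fraction=0.4, fill_value=0):
    """Return a copy of a grayscale image with one random rectangle blanked out.

    `image` is a list of rows of pixel values (a hypothetical stand-in
    for a real image array). The patch covers between 10% and
    `max_fraction` of each dimension.
    """
    height, width = len(image), len(image[0])
    patch_h = max(1, int(height * rng.uniform(0.1, max_fraction)))
    patch_w = max(1, int(width * rng.uniform(0.1, max_fraction)))
    # randint is inclusive on both ends, so the patch always fits.
    top = rng.randint(0, height - patch_h)
    left = rng.randint(0, width - patch_w)

    occluded = [row[:] for row in image]  # leave the original untouched
    for r in range(top, top + patch_h):
        for c in range(left, left + patch_w):
            occluded[r][c] = fill_value
    return occluded


rng = random.Random(0)  # fixed seed so the example is repeatable
shelf_image = [[255] * 64 for _ in range(64)]  # synthetic all-white "shelf"
augmented = add_random_occlusion(shelf_image, rng)
```

Applying a function like this to each training image, with a fresh random patch every time, is inexpensive because it changes only the training data and not the model or the store hardware.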

The third option emphasizes practicality. By installing additional cameras, the model can create a 3D-Reconstruction of the blocked apparel items. This would lead to more accurate inventory counts without incurring the significant costs associated with Amodal segmentation. Nevertheless, there is a risk that customers will find the additional cameras to be an invasion of their privacy. To see all sides of the shelves, many additional camera angles would be needed.

Separately, the fourth option utilizes a technical compromise. Temporal analysis employs GLBM filters to analyze the history of the store shelves and determine if changes have been made. This can result in a more accurate count than data augmentation. However, it does not require the high cost or skill investment associated with Amodal segmentation. This compromise approach seeks to balance model accuracy with cost and skill requirements. The goal is to improve inventory tracking in a way that produces the maximum possible benefit to the store.
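The idea of using shelf history can be illustrated with a very simple stand-in: comparing each snapshot of a shelf with the previous one and flagging the moments when a noticeable share of pixels changed. This is plain frame differencing, not the filter-based method named above, and the function names, thresholds, and tiny synthetic frames are all assumptions chosen for illustration.

```python
def changed_fraction(previous, current, pixel_tolerance=10):
    """Share of pixels whose value moved by more than `pixel_tolerance`."""
    total = 0
    changed = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_px, cur_px in zip(prev_row, cur_row):
            total += 1
            if abs(prev_px - cur_px) > pixel_tolerance:
                changed += 1
    return changed / total


def shelf_change_events(frames, change_threshold=0.05):
    """Return indices of frames that differ noticeably from the frame before."""
    return [
        i
        for i in range(1, len(frames))
        if changed_fraction(frames[i - 1], frames[i]) > change_threshold
    ]


# Synthetic 8x8 grayscale snapshots of one shelf (hypothetical data).
tidy = [[200] * 8 for _ in range(8)]
disturbed = [row[:] for row in tidy]
for r in range(4):
    for c in range(4):
        disturbed[r][c] = 50  # a customer left items in one corner

frames = [tidy, tidy, disturbed, disturbed]
events = shelf_change_events(frames)
```

Here `events` contains only index 2, the first snapshot after the shelf was disturbed. A system could use such events to decide when the last unobstructed view of the back row is still a reliable basis for the count.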

The final option integrates humans into the process. It does so in a manner that relies on the worker's knowledge, basic arithmetic, and physical skills to validate the estimates produced by the Computer Vision model. Manual counts are added to the employee job duties, alongside their existing responsibilities of selling items and maintaining high customer service. This approach saves on computational costs and potentially maximizes the productivity of individual employees. However, it has the downside risk of human counting errors.
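As a rough sketch of how manual counts might validate the model, the snippet below compares the model's estimate for each shelf with an employee's count and flags shelves where the two disagree by more than a tolerance. The function name `reconcile_counts`, the 10% tolerance, and the shelf identifiers and numbers are hypothetical; the article does not describe a specific reconciliation rule.

```python
def reconcile_counts(model_estimates, manual_counts, tolerance=0.10):
    """Flag shelves where the model and the manual count disagree.

    Returns a dict mapping shelf id to (model_estimate, manual_count)
    for every shelf whose relative gap exceeds `tolerance`. Shelves
    without a model estimate are skipped.
    """
    flagged = {}
    for shelf_id, manual in manual_counts.items():
        estimate = model_estimates.get(shelf_id)
        if estimate is None:
            continue
        baseline = max(manual, 1)  # avoid dividing by zero on empty shelves
        relative_gap = abs(estimate - manual) / baseline
        if relative_gap > tolerance:
            flagged[shelf_id] = (estimate, manual)
    return flagged


# Hypothetical numbers for illustration only.
model_estimates = {"shelf_a": 40, "shelf_b": 12, "shelf_c": 25}
manual_counts = {"shelf_a": 41, "shelf_b": 20}

flagged = reconcile_counts(model_estimates, manual_counts)
```

In this example only `shelf_b` is flagged, since the model saw 12 items where the employee counted 20. Because manual counts can also be wrong, a flagged shelf is a prompt for a recount or a closer look, not proof that the model failed.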

An image showing a human and machine working together to solve a computer vision problem. — Integrating Machine and Human Productivity Helps Unlock Performance

Each option could work in the right circumstances. The best way to decide is to determine which approach helps balance Machine Autonomy with Trust and AI Competencies. This should be an iterative process that incorporates all stakeholder viewpoints.

Leading in the Age of AI requires building the competencies needed to engage in the technical conversation. AI leaders must be able to evaluate a use case through a technical lens, then apply business principles to ensure the proposed solution optimizes the company's broader enterprise strategy. This is challenging because the technologies within each industry are constantly changing. AI innovation is moving faster than leaders can build the skills required to keep up. This means AI leaders need a set of principles to guide them towards effective decision-making. It is impossible to know the details of every tool or platform. Instead, the proper way to tackle an AI problem is to apply the Machine Leadership Model.

What type of Machine Autonomy should be developed for this use case?
What is the current trust level for AI in the organization?
Do we have the skills needed to proactively manage this use case?
Are there change programs that need to take place to drive AI adoption?
How does the proposed solution impact revenue, growth, profit, and reputation?

Machine Leadership principles help executives handle the most complex use cases. Focusing on the equilibrium between Machine Autonomy, Trust, and AI Competencies is what drives success.

Even though each option in this scenario could work, the right answer is the one that balances Machine Autonomy with Trust and AI Competencies. This means ensuring that the model's accuracy is high enough that employees trust the findings. Likewise, it suggests that design changes should reasonably align with employee skills. This means pursuing a model that is sophisticated enough to solve the problem, without being too difficult to technically manage.
