

Is your company struggling with AI adoption? A likely cause is that the trust and AI competencies of your non-engineering workforce are low. A great way to solve this problem is to embed non-technical employees in the engineering process through a company-sponsored initiative that capitalizes on the holistic knowledge of the entire workforce. This process requires the leadership of both the CIO and CHRO, with their respective organizations working together to unlock the benefits of the project.


AI adoption is driven by the trust and skills of the workforce. This can only be achieved when the CIO and CHRO organizations work together.

CEOs can encourage this process by sponsoring Domain-Adapted LLM initiatives. This involves taking a pre-trained model such as Llama or Mistral and optimizing its output on enterprise-relevant data. A useful analogy comes from baseball: a 'jack of all trades' utility infielder versus a middle-innings relief pitcher. In the world of LLMs, Llama is the general-purpose utility infielder, while the Domain-Adapted LLM is the middle-innings relief pitcher. Both players serve an important purpose, but the benefit of the Domain-Adapted LLM is that it focuses on a narrowly defined task.


Machine Leadership infographic showing the importance of Domain-Adapted LLMs in driving AI adoption in an organization.

A Domain-Adapted LLM initiative helps drive AI adoption in several ways. First, the process requires input from non-technical stakeholders. These models use custom tokenizers to break down text and retrieval-augmented generation (RAG) to retrieve information. Non-technical stakeholders validate the outputs using their domain-specific knowledge. For example, a Sales Account Manager will understand the dynamics influencing customer decisions during pitch meetings: process steps, proof-of-concept tools, sales jargon, and follow-up standards. Llama or GPT-4 Turbo will not understand these nuances, nor will the AI/ML engineers.
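To make the retrieval step concrete, here is a minimal sketch of how RAG ranks domain documents against a question. It uses a toy bag-of-words similarity in place of real embeddings, and the sales documents and query are invented examples:

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Bag-of-words vector: lowercase word counts (a toy stand-in for embeddings)."""
    return Counter(re.findall(r"[a-z0-9'-]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine_similarity(q, tokenize(d)), reverse=True)
    return ranked[:top_k]

# Hypothetical domain documents a Sales Account Manager might curate.
docs = [
    "Discovery call checklist: confirm budget, timeline, and decision makers.",
    "Proof-of-concept template for enterprise pilots, including success criteria.",
    "Follow-up standard: send a recap email within 24 hours of the pitch meeting.",
]
print(retrieve("What is the follow-up standard after a pitch meeting?", docs)[0])
```

A production system would swap the bag-of-words vectors for learned embeddings and a vector database, but the ranking logic the domain expert validates is the same.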


By involving the Sales Account Manager, the Domain-Adapted model trains on highly specialized data ('insights'), which leads to more accurate output. Likewise, non-technical stakeholders develop trust in the model through their involvement, which leads to higher AI adoption across the enterprise. Common tasks that non-engineers perform include identifying quality data sources, labeling data to match specific categories, participating in prompt-engineering simulations, and offering sample 'high quality' responses the model can learn from, including specific language from the industry.
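As a rough sketch, one labeled record a domain expert might contribute could look like this. The prospect, wording, and label schema are invented for illustration, and exact formats vary by fine-tuning framework:

```python
import json

# A hypothetical supervised fine-tuning record contributed by a Sales Account
# Manager; the 'high quality' output uses real sales jargon that a
# general-purpose model would not reliably produce.
record = {
    "instruction": "Draft a follow-up email after a first pitch meeting.",
    "input": "Prospect: ACME Corp. Stage: post-discovery. Champion identified.",
    "output": (
        "Thanks for the discovery call today. As discussed, I will share the "
        "proof-of-concept success criteria by Friday and loop in your "
        "technical champion for the pilot scoping session."
    ),
    "labels": {"domain": "sales", "task": "follow_up_email"},
}

# One JSON object per line (JSONL) is a common format for fine-tuning pipelines.
jsonl_line = json.dumps(record)
print(jsonl_line)
```

The engineer owns the pipeline; the non-engineer owns the content and labels of records like this one.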


Second, the Total Cost of Ownership (TCO) is significantly lower for these models. A study by Sharma et al. (2024) found that Domain-Adapted models decreased TCO by 90-95% for chip-design coding-assistance projects. The reasons are not surprising: Domain-Adapted models have significantly fewer parameters than general-purpose models (e.g., 10 billion versus 70 billion). This means fewer GPUs, faster response times, and lower infrastructure and operating expenditures. As a result, the CEO can redirect IT expenditures toward change management and learning programs. By investing a portion of the infrastructure savings in learning courses, the CEO can build AI skills across the entire workforce. Over time, this approach will lead to a more digitally driven culture.
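To make the GPU arithmetic concrete, here is a back-of-the-envelope sketch. The hourly GPU rate, parameters-per-GPU capacity, and model sizes are illustrative assumptions, not figures from Sharma et al. (2024), whose 90-95% result also reflects factors beyond raw GPU count:

```python
import math

def annual_serving_cost(params_b: float, gpu_hourly_usd: float = 2.50,
                        params_per_gpu_b: float = 20.0,
                        hours_per_year: int = 24 * 365) -> float:
    """Rough annual serving cost: GPUs needed (sized by parameter count)
    times an assumed hourly rate times hours per year."""
    gpus_needed = math.ceil(params_b / params_per_gpu_b)
    return gpus_needed * gpu_hourly_usd * hours_per_year

general_purpose = annual_serving_cost(params_b=70)  # e.g., a 70B general model
domain_adapted = annual_serving_cost(params_b=10)   # e.g., a 10B adapted model
savings = 1 - domain_adapted / general_purpose
print(f"Savings from fewer GPUs alone: {savings:.0%}")
```

Even this simplified model shows substantial savings from parameter count alone; the remaining gap to 90-95% comes from operational factors the paper accounts for.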


For AI leaders, it’s important to evaluate the bigger picture before relying exclusively on general-purpose models such as Claude 3 or GPT-4 Turbo. Those are excellent platforms. However, their design does not encourage AI adoption in the same collaborative way as Domain-Adapted LLMs. In the age of AI, CEOs must find ways to integrate non-technical employees into the engineering process. This means sponsoring initiatives that require the CIO and CHRO organizations to work together. Over time, this approach will improve trust and build AI skills in a transformative manner. Non-engineers will understand the design of AI models, while engineers will develop stakeholder-management abilities.

Domain-Adapted LLMs are a pathway for integrating machines into the workplace in a manner that improves accuracy, lowers costs, and builds AI skills.


Citation:


Sharma, A., Ene, T. D., Kunal, K., Liu, M., Hasan, Z., & Ren, H. (2024, June). Assessing economic viability: A comparative analysis of total cost of ownership for domain-adapted large language models versus state-of-the-art counterparts in chip design coding assistance. In 2024 IEEE LLM Aided Design Workshop (LAD) (pp. 1-6). IEEE.




The Age of AI requires engineers to lead machines, lead the people who build machines, and lead organizations that adopt AI. A key part of this framework is developing the right cognitive skills. Our research indicates that six elements play a significant role in AI/ML engineering performance. The number sequence is purposeful: Big C Creativity (Frontier Thinking) should be cultivated first, before placing constraints on the problem set. The ability to create something from nothing diminishes the more an individual constrains their thinking to a specific problem, even if multiple solutions are allowed. Let your engineers THINK BIG and boundless before introducing divergent and convergent tasks that require specific solutions.


The Age of AI requires leaders to do three things: lead machines, lead the people who build machines, and lead organizations that adopt AI. Cognitive processes are a critical factor, as they allow leaders to interact with AI tools in a dynamic manner.

Working Memory is a major factor in AI performance. MLOps platforms, tools, and deployed models continue to scale exponentially, and the volume of routine updates is significant for engineers. The ability to store nine or more chunks of information plays a key role in use-case design. We have found that many AI/ML engineers can hold more information in working memory than the traditional 7 ± 2 noted by George Miller. This includes the ability to form associations, chunk information, and use spaced retrieval practice to transfer concepts to long-term memory.


Wayfinding is interesting. Did you know that one of the highest dropout rates in Delta Force selection (outside of physical requirements) comes from land navigation? Finding your way from Point A to Point B efficiently is critically important in AI. The computational costs, hyperparameters, and context windows of LLMs create budgetary constraints for most companies. This requires engineers to identify pathways that reduce dimensionality and optimize FLOPs.


Spatial Intelligence is a skill that improves once working memory is firing on all cylinders. Think about it for a moment. If you tried to mentally rotate a complex object in your mind, you would need to have a strong working memory to store the image iterations that occur as part of each rotational fold. 


Leading in the Age of AI is not simply about learning technology platforms. Success is also determined by the quality of the engine inside the engineer. These skills can be fine-tuned through specific exercises to maximize creativity, working memory, and spatial intelligence. If you want to prepare your kiddos to succeed in the digital age, make sure they are cultivating these skills, even if you choose to limit their technology exposure early on.


Machine Leadership infographic showing the top cognitive processes for AI including creativity, memory, convergent thinking, divergent thinking, wayfinding, and spatial intelligence.
Top Cognitive Processes for AI

Large Language Models (LLMs) that utilize self-attention mechanisms are challenged by the context window. When the sequence becomes too long, models can make mistakes with tokens that are located far apart or in the middle of the text. The trick is finding the right balance between cost and decision requirements.


Check out what would be missed if some of the world's greatest novels were left out of the training data. No one would know Michael became the Godfather.


A key aspect of AI leadership is balancing data consumption with computational costs and process complexity. LLMs need a broad context window to interpret text. However, this goal must be balanced with the financial realities associated with 1M+ tokens.
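A quick back-of-the-envelope estimate shows why. The words-per-token ratio, the novel's word count, and the per-million-token price below are all rough assumptions for illustration, not vendor figures:

```python
def estimate_tokens(word_count: int) -> int:
    """Rough heuristic: English prose averages ~0.75 words per token
    (an approximation, not an exact tokenizer count)."""
    return round(word_count / 0.75)

def input_cost_usd(tokens: int, usd_per_million_tokens: float = 3.0) -> float:
    """Input cost at a hypothetical per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens

# The Godfather runs very roughly 150,000 words (approximate figure).
novel_tokens = estimate_tokens(150_000)
print(novel_tokens, f"${input_cost_usd(novel_tokens):.2f}")
```

A single novel fits comfortably inside a 1M-token window, but at enterprise scale these per-request costs compound quickly, which is why the cost-versus-context tradeoff matters.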

An interesting extension of this issue is copyright protection. Right now, language models are allowed to train on copyrighted materials under the "fair use" provision. This essentially allows GPT and its competitors to scale without having to pay for data access. This is true even if the work is minted as a non-fungible token with a digital certificate of ownership. The copyright is only infringed if the LLM reproduces output that is consistent with the copyrighted work.


Why does this matter? 


Claude recently announced that it reached the 1M-token threshold for its context window. GPT, Gemini, and Llama have similar models. If these companies are allowed to train on copyrighted material without seeking permission or licensing agreements, the "fair use" provision will essentially eliminate copyright protections. How could an individual copyright owner utilize their works if the LLM is the main vehicle for society's content delivery?


Another risk is that language models begin interpreting the copyrighted work in a manner that avoids infringement. Since millions of people rely on LLMs for information, the meaning behind published works could "change" simply to avoid legal challenges. In other words, the language model would create output that reimagines the meaning of the research or innovation to avoid lawsuits. Over time, the LLM's interpretation becomes the de facto meaning.


The pace of AI innovation is allowing technology firms to design models that pursue this very path, while our regulators lack the skills to provide meaningful constraints on the way LLMs capture, train on, and utilize data. Collective licenses are a potential pathway. The risk is that large firms (e.g., banks) purchase the collective licenses over time, similar to M&A. If this happens, then the companies that provide capital will also have indirect ownership over the data.


The best solution is for individuals to develop the skills needed to understand their data rights, pursue digital licensing agreements that protect their innovations, and control how LLMs train on their data. As the economists say, "There is no such thing as a free lunch." That motto should be applied to tech companies and their LLMs.



Machine Leadership infographic showing the role the context window has on LLM processing and the number of tokens needed to interpret some of the world's greatest literature.
The context window is determined by the number of tokens
