LLM: Have an Intelligent Business Discussion with your Data!

Walter Wartenweiler
June 19, 2024

Recent developments in AI are delivering incredible value. Large Language Models combine a mastery of the written word with a compressed version of the information on the internet. As a result, they can interact with a human at a shared level of understanding. The user gets not just a static, Google-search-like response but a discussion that leads to clarity. This article explores how this can and will be mirrored in the Data Analytics, Dashboarding and Actionable Insights space.

The last few years in AI

Large language models (LLMs) are changing the world faster than almost any recent innovation. They rely on fairly simple algorithms inspired by biological neural networks, and it is the available computing power, enough to simulate the equivalent of billions of neurons, that makes them work. The result surprised the world: a system that is effective at talking with an interlocutor. The Turing Test, regarded since the beginnings of theoretical computer science as the way to evaluate whether a system has become intelligent, would arguably be passed today and is becoming a moving goalpost.

Meanwhile, Stable Diffusion is changing the world of image and video generation, promising to bring to life a multimodal intelligent agent that will match or even outperform humans at almost any task.

Naturally, we try to figure out how to apply AI in any area of society where humans are currently active, from art to music to marketing (even the picture at the top of this blog post is AI-made) to, well, everything else.

AI and Data Analytics: a Powerful and Risky Combination

Established analytics companies are infusing their existing systems with AI, while new companies arise that merge AI and data analytics. So let’s look into what that means.

What an out-of-the-box LLM does very well is converse in natural language while carrying a fairly reasonable, if not perfect, understanding of the world. It uses the context from the end user's prompts to give good answers, which the user can then judge, refine, and follow up on. This dialogue is key to making sure that the information coming out of the interaction is correct, complete, and useful. Without people in the loop, who would catch the hallucinations and misunderstandings of the neural network?

In a perfect world, with perfect LLMs, both the LLM and the human would know all the data concepts, and every query would lead to reasonable results. We would dream about talking directly to the AI as if it were a factual embodiment of the company. This is somewhat similar to the initiative of Jordan Peterson, who wants us to be able to converse with the Bible after training an LLM on it. It's worth considering, though, that an LLM hallucinating about the Bible's content has a different impact from one hallucinating about your business. Here we all need to be careful: we must keep human control high and the traceability of the system's responses transparent. This is especially true if we take immediate action based on the insights the AI/data system generates.
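One way to make that traceability concrete is to keep an audit record of every exchange and to block any action until a person has signed off. The following is only a minimal sketch under stated assumptions: the field names, the `example-model` label and the approval flow are hypothetical illustrations, not a description of any particular product.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(prompt: str, response: str, model: str) -> dict:
    """Build a traceable record of one LLM exchange, pending human review."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,  # hypothetical label for whichever model was used
        "prompt": prompt,
        "response": response,
        # The hash lets you detect later whether the stored answer was altered.
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
        "approved_by": None,  # filled in only after a person has reviewed the answer
    }


def approve(record: dict, reviewer: str) -> dict:
    """Return a copy of the record marked as reviewed by a named person."""
    return {**record, "approved_by": reviewer}


def may_act_on(record: dict) -> bool:
    """Only answers that a human has approved may trigger an action."""
    return record["approved_by"] is not None


record = audit_record(
    "Which region grew fastest?", "Region B grew fastest.", model="example-model"
)
print(json.dumps(record, indent=2))
print("can act before review:", may_act_on(record))
print("can act after review:", may_act_on(approve(record, "analyst")))
```

Storing such records as one JSON object per line keeps them easy to search when someone later asks where a decision came from.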

Where could we usefully plug in an AI then?

An LLM comes pre-trained on an extremely large sample of the information on the internet. After that, it is fine-tuned by humans to become good at answering questions in the expected way. The resulting system is therefore extremely impressive at holding almost any type of conversation on any topic in a way that satisfies the end user.

So it comes out of the box with the knowledge of a manager, a data scientist, an analyst, a software engineer, and an AI specialist, plus much of the general domain knowledge relevant to your organization.

This means that, even without any further technical or integration work, it can already deliver value to someone who needs to make sense of their data. We can simply ask it how best to present data, which analytics make sense in a given business situation, and more. That’s like working with an external (AI) Data Coach.

AI Use Cases in (embedded) Data Analytics

  • If we go one step further, we can present it with the details of our own data model and ask for its help in engineering those analytics in the target technologies that make sense in our data stack. With proper integration, this can even mean having an (AI) Data Analyst that designs analytics with knowledge of your business and data and can execute them directly on your data set.
  • One step further again, we can present the data along with the questions and have the AI find patterns that we wouldn’t have discovered otherwise. This means that, as part of the conversation, your (AI) Data Analyst becomes aware not only of the semantics of your company but also of its exact STATUS. It would be like inviting an AI Data Expert to work on your data.
  • Finally, we could imagine adding a further level of fine-tuning that trains the LLM into an AI that knows the world, understands conversations, and natively understands your business and data model. This would be like training an Artificial Expert Employee who joins your team to deliver value.

Each of these steps comes with tighter integration between the starting LLM and the existing system, and thus needs to be carefully assessed, for instance in terms of which data ends up outside your organization. Beware of Compliance Risk here!
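To illustrate one possible mitigation, the sketch below pseudonymizes sensitive fields before a row is placed into a prompt for an externally hosted LLM. It is only an illustration: the column names and the secret key are hypothetical, and whether keyed hashing satisfies your compliance requirements is something to confirm with your own legal and security teams.

```python
import hashlib
import hmac

# Assumption: hypothetical column names; adapt them to your own data model.
SENSITIVE_COLUMNS = {"first_name", "last_name", "email"}


def pseudonymize(value: str, secret: bytes) -> str:
    """Replace a sensitive value with a stable keyed hash so rows stay joinable."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def redact_row(row: dict, secret: bytes) -> dict:
    """Return a copy of the row that is safer to include in an external prompt."""
    return {
        key: pseudonymize(str(val), secret) if key in SENSITIVE_COLUMNS else val
        for key, val in row.items()
    }


# Illustrative row, not real customer data.
row = {
    "customer_id": 42,
    "first_name": "Mary",
    "email": "mary@example.com",
    "country": "France",
}
safe_row = redact_row(row, secret=b"keep-this-key-inside-your-organization")
print(safe_row)
```

Because the same input and key always produce the same token, the LLM can still group or join on the pseudonymized columns without ever seeing the underlying values.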

The Levels of Conversation

AI Data Coach

When you prompt ChatGPT or another LLM with questions about the metrics relevant to your business, you generally get very good answers back, and you can continue the conversation if you need more information.

An example from ChatGPT: 

“I'm working in a data oriented software company and would like to assess the effectiveness of our marketing initiatives based on our marketing funnel conversion rates. What would a Data Scientist recommend to achieve that goal? Make a high level summary”

To assess the effectiveness of your marketing initiatives based on marketing funnel conversion rates, a Data Scientist would recommend:

1. Define Metrics: Identify key performance indicators (KPIs) for each stage of the funnel.

2. Data Collection: Ensure robust data collection methods to track user interactions and conversions.

3. Analysis: Perform statistical analysis and A/B testing to compare different marketing strategies.

4. Attribution Modelling: Use models to determine the impact of various marketing touchpoints.

5. Predictive Analytics: Forecast future trends and identify areas for optimization.

6. Continuous Monitoring: Regularly review metrics and adjust strategies based on data-driven insights.
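As a small worked example of the first steps in that answer, stage-to-stage conversion rates can be computed from the number of people reaching each funnel stage. The stage names and counts below are hypothetical figures chosen purely for illustration.

```python
# Hypothetical funnel counts, for illustration only (not real data).
funnel = [
    ("visitors", 10_000),
    ("leads", 1_200),
    ("trials", 300),
    ("customers", 60),
]


def conversion_rates(stages):
    """Return (from_stage, to_stage, rate) for each consecutive pair of stages."""
    rates = []
    for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        rate = count_b / count_a if count_a else 0.0
        rates.append((name_a, name_b, rate))
    return rates


rates = conversion_rates(funnel)
for src, dst, rate in rates:
    print(f"{src} -> {dst}: {rate:.1%}")

# End-to-end conversion: last stage divided by first stage.
overall = funnel[-1][1] / funnel[0][1]
print(f"overall: {overall:.2%}")
```

Tracking these per-stage rates over time, rather than only the overall figure, shows which step of the funnel a given marketing initiative actually moved.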

AI Data Analyst

When you describe your data model and what you want to assess, the LLM (ChatGPT in this case) can give you back queries and algorithms in your system's formats. Running these in turn gets you the results directly from your data.

An example from our prototype:

# List each customer's first and last name along with the date of their last rental and the name of the film that was rented on the last rental date and order the result by customer's first and last name.

SELECT
    c.first_name AS customer_first_name,
    c.last_name AS customer_last_name,
    MAX(r.rental_date) AS last_rental_date,
    f.title AS film_title
FROM
    customer c
    JOIN rental r ON c.customer_id = r.customer_id
    JOIN inventory i ON r.inventory_id = i.inventory_id
    JOIN film f ON i.film_id = f.film_id
WHERE
    r.rental_date = (
        SELECT MAX(rental_date)
        FROM rental r2
        WHERE r2.customer_id = r.customer_id
    )
GROUP BY
    c.customer_id
ORDER BY
    c.first_name, c.last_name;
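The general shape of such an integration can be sketched as follows: combine the data model and the question into a prompt, then refuse to execute anything the model returns unless it looks like a single read-only query. This is a minimal sketch and not the prototype's actual code. The two-table schema, the function names and the hard-coded `generated_sql` (standing in for a real model response) are all assumptions, and the read-only check is deliberately rough; a production setup would also rely on a read-only database role.

```python
import sqlite3

# Assumption: a simplified, hypothetical slice of a rental data model.
SCHEMA = """
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, customer_id INTEGER, rental_date TEXT);
"""


def build_prompt(schema: str, question: str) -> str:
    """Combine the data model and the business question into one prompt."""
    return (
        "You are a data analyst. Given this schema:\n"
        f"{schema}\n"
        "Write a single read-only SQL query that answers:\n"
        f"{question}\n"
    )


def is_read_only(sql: str) -> bool:
    """Very rough guard: accept one statement that starts with SELECT or WITH."""
    statement = sql.strip().rstrip(";")
    if ";" in statement:  # more than one statement
        return False
    return statement.lower().startswith(("select", "with"))


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute model-generated SQL only if it passes the read-only guard."""
    if not is_read_only(sql):
        raise ValueError("refusing to run a statement that is not read-only")
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO customer VALUES (1, 'Ada', 'Lovelace')")
conn.executemany(
    "INSERT INTO rental VALUES (?, ?, ?)",
    [(1, 1, "2024-01-05"), (2, 1, "2024-03-09")],
)

prompt = build_prompt(SCHEMA, "When did each customer last rent a film?")
# Stand-in for the model's answer; a real integration would send `prompt` to an LLM.
generated_sql = """
SELECT c.first_name, c.last_name, MAX(r.rental_date) AS last_rental_date
FROM customer c JOIN rental r ON r.customer_id = c.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
"""
rows = run_generated_sql(conn, generated_sql)
print(rows)
```

Keeping the guard separate from the execution step makes it easy to add a human approval step between the two for queries that touch sensitive tables.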

AI Data Expert

When you prompt the LLM with a formal description of your data model ALONG with the data and what you want to assess, it can give you back direct insights into your data. This is where it becomes tricky: you get an answer, but it is very difficult to formally assess its quality, because potential mistakes can be hidden in the reasoning of the LLM. ChatGPT 4o currently exposes the Python code it produced to answer your question, which a human can review before acting on the result. Nevertheless, if handled with care, it will come up with answers that sometimes surprise you and that would have required a very long Data Science project to reach by yourself, especially on large datasets.

An example from our prototype: 

# Which country should our company be focusing on to max out sales in 2025 ?

Based on the information provided, it's not possible to definitively state which country a company should focus on for maximizing sales in 2025. However, we can look at trends and potential factors that might influence sales performance:

1. Japan: Sales have been consistently high with amounts of $100,000 and $90,000 recorded from this country. This suggests steady performance.

2. France: Although sales were relatively low in 2023 ($4,000), the amount for Japan was similarly lower than usual ($50,000). The potential exists for a significant increase if efforts are focused on improving sales strategies or market penetration in France.

3. USA: Sales amounted to $2,000 in 2024 and $5,000 in 2023. This country shows some variability but has room for improvement.

4. China: Consistent performance with $70,000 and $50,000 in sales suggests stability and potential for growth through targeted marketing or expanding market share.

5. Spain and Italy: Both have shown relatively low sales figures ($5,000 each) but consistent amounts year-over-year. There is a possibility for growth if strategic initiatives are introduced to boost sales.

6. General Trend: Japan appears consistently strong in terms of sales amount compared to other countries based on the data provided.

Decision-Making Process:

- Market Potential: Analyze market sizes, potential demand for products/services in each country.

- Growth Rates: Consider historical growth rates and forecasted future growth trends.

- Strategic Alignment: Determine if your company's strategic goals align better with any specific markets or countries based on their fit with overall corporate strategies (e.g., technology focus, geographic expansion plans).

- Competitive Analysis: Understand how competitors are performing in these regions to identify potential market gaps and opportunities.

- Economic and Political Factors: Consider economic stability, government policies, and regulations that might affect business operations.

Conclusion:

Based solely on the sales data from 2023 and 2024, Japan stands out as having consistently high sales figures. However, for decision-making in 2025:

- If maximizing short-term profits: Considering Japan's consistent performance may be prudent.

- If looking at long-term growth: Countries with higher potential market sizes or faster projected growth rates (like China) might offer more opportunity.

- If seeking to diversify risk: Focusing on countries like France or Spain could help spread risks across different markets.

To make a well-informed decision, further analysis of these factors and possibly gathering more current data would be beneficial.
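Because the figures quoted in such an answer can hide mistakes, it helps to recompute the key numbers deterministically before acting on them. The sketch below is one way to do that, under stated assumptions: the sales rows and country labels are hypothetical and are not the data behind the answer above, and `llm_claimed_top` stands in for a claim extracted from the model's response.

```python
from collections import defaultdict

# Hypothetical sales rows (country, year, amount); illustrative only.
sales = [
    ("Country A", 2023, 90_000), ("Country A", 2024, 100_000),
    ("Country B", 2023, 50_000), ("Country B", 2024, 70_000),
    ("Country C", 2023, 4_000), ("Country C", 2024, 6_000),
]


def totals_by_country(rows):
    """Sum the sales amounts per country across all years."""
    totals = defaultdict(int)
    for country, _year, amount in rows:
        totals[country] += amount
    return dict(totals)


def growth_by_country(rows, start, end):
    """Relative change in sales between two years, per country."""
    by_year = {(c, y): a for c, y, a in rows}
    countries = {c for c, _y, _a in rows}
    return {
        c: (by_year[(c, end)] - by_year[(c, start)]) / by_year[(c, start)]
        for c in countries
        if (c, start) in by_year and (c, end) in by_year and by_year[(c, start)]
    }


totals = totals_by_country(sales)
growth = growth_by_country(sales, 2023, 2024)
top_by_total = max(totals, key=totals.get)
top_by_growth = max(growth, key=growth.get)

# Compare the recomputed figures with what the model claimed before acting on it.
llm_claimed_top = "Country A"  # hypothetical claim taken from the model's answer
print("claim matches totals:", llm_claimed_top == top_by_total)
print("fastest growing:", top_by_growth)
```

In this toy data the country with the highest total is not the one growing fastest, which is exactly the kind of nuance a reviewer should check against the model's narrative.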

AI Expert Employee

If you fine-tune the LLM on your data sets and analysis history, you can get even more relevant information out of the system, but making this real-time or near real-time will be costly in terms of hardware, skills, and manpower. I don’t doubt that in the coming years many innovations, both in the science and engineering behind LLMs and in the way computing power is sourced, will bring that cost down. At this stage, however, this level of workload and compute investment will only be worth it for very large companies with very complex needs.

Our AI prototypes

We are prototyping variants around the AI Data Analyst and AI Data Expert options, and in a few weeks we will have videos available showing how they fit into icCube. In the meantime, if you would like to hear what we already have in action, please Schedule a Discussion with Our Team and indicate that you want to speak about AI. We'd love to talk!