Introduction to Large Action Models (and How They’re Built)

Last Updated May 23, 2024


If you’re developing Large Action Models, accessing data is key. And we can help get you there.

Large Action Models (LAMs) represent a significant advancement in artificial intelligence, extending the capabilities of Large Language Models (LLMs) by enabling AI to perform complex, action-oriented tasks.

While LLMs excel at understanding and generating human-like text, LAMs can execute a sequence of actions, which lets them handle tasks that require interaction with various applications and interfaces.

How is that possible? LAMs learn from massive datasets of user actions and use that data for strategic planning and proactive action in real time.

If you’re working in the LAM space, accessing meaningful data is key. And we can help get you there. Your first action? Read the rest of the blog.

What Are Large Action Models?

LAMs are designed to perform specific actions with high precision, reducing the likelihood of errors in automated tasks. 

By directly interacting with applications, LAMs can execute tasks more efficiently than LLMs, which may require multiple steps and contextual understanding. 

The integration of symbolic reasoning allows LAMs to explain their actions and decisions, making them more transparent and understandable. 

LAMs are a big step forward in AI technology, enabling more natural and efficient human-computer interactions by performing tasks that go beyond simple text generation.  

Their ability to learn from user interactions and adapt to new environments positions them as powerful tools in the automation and AI landscape. 

Key Features Of Large Action Models

Unlike LLMs that focus on text processing, LAMs can perform tasks like booking flights, filling out forms, and managing online shopping. They interact with software and systems to carry out these tasks autonomously. 

LAMs feature a hybrid architecture, combining neural networks with symbolic reasoning. The latter refers to the use of structured, rule-based logic to understand and execute tasks, while the former enables the model to learn from each task and improve upon each subsequent one. 

This approach combines the interpretability and precision of traditional AI with the adaptive capabilities of modern machine learning. As a result, LAMs can understand and execute tasks based on both structured logic and adaptive learning. This blend helps in achieving higher accuracy and efficiency in task execution. 
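As a rough illustration of that blend, here is a minimal sketch in Python. It assumes the symbolic side can be expressed as explicit preconditions on each action and that the neural side supplies a score per action. The `Action` class, the fact names, and the score values are all hypothetical and only stand in for whatever a real LAM would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    requires: frozenset  # facts that must hold before the action is allowed


def allowed(action, facts):
    """Symbolic part: an explicit rule check that is easy to explain."""
    return action.requires <= facts


def choose(actions, facts, scores):
    """Pick the highest-scoring action among those the rules permit.

    `scores` is a placeholder for the output of a learned model.
    """
    candidates = [a for a in actions if allowed(a, facts)]
    if not candidates:
        return None
    return max(candidates, key=lambda a: scores.get(a.name, 0.0))


actions = [
    Action("search_flights", frozenset({"destination_known", "dates_known"})),
    Action("enter_payment", frozenset({"flight_selected"})),
    Action("ask_for_dates", frozenset({"destination_known"})),
]
# Hypothetical scores; a real system would compute these from context.
scores = {"search_flights": 0.7, "enter_payment": 0.9, "ask_for_dates": 0.4}

facts = frozenset({"destination_known"})
chosen = choose(actions, facts, scores)
print(chosen.name)  # ask_for_dates
```

Even though `enter_payment` has the highest score, the rules block it until a flight has been selected, and that check is the part that can be explained to a user.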

LAMs also improve by learning from user interactions and demonstrations, which lets them adapt to different interfaces and extend their capabilities over time.

More on that later. 

What Large Action Models Can Do

Task: Booking a flight on a travel website like Kayak. 

Process: A LAM can take user inputs such as destination, dates, and budget, navigate the website, search for available flights, select the best option based on the criteria, and complete the booking process, including filling in passenger details and payment information. 

Task: Automatically filling out forms on websites like Google Docs. 

Process: A LAM can identify the required fields in a form, retrieve the necessary information (e.g., name, address, date of birth) from a database or user profile, and input this information into the appropriate fields, ensuring accuracy and saving time. 

Task: Shopping on a platform like Instacart. 

Process: A LAM can accept a shopping list, search for the specified items on the platform, add them to the cart, compare prices and deals, and proceed to checkout, handling payment and delivery details. 

Task: Creating a playlist on music streaming services like Spotify. 

Process: A LAM can take user preferences for genres, artists, and moods, search the music library, select songs that match the criteria, and compile them into a playlist, ensuring a cohesive listening experience. 

Task: Generating summaries of articles or documents. 

Process: A LAM can read lengthy texts, identify key points and essential information, and produce a concise summary that retains the main ideas, making it easier for users to grasp the content quickly. 

These examples illustrate how LAMs can automate and simplify complex, multi-step tasks across different applications, enhancing productivity and user experience. 
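To make the form-filling example more concrete, here is a minimal sketch in Python. It assumes the form's required fields are already known by name and that the user profile is a plain dictionary. The function `fill_form`, the field names, and the sample profile are hypothetical, and a real LAM would also have to locate the fields on the page and type into them.

```python
def fill_form(required_fields, profile):
    """Return (filled, missing) for the given field names."""
    filled, missing = {}, []
    for field in required_fields:
        value = profile.get(field)
        if value:  # skip empty or absent values rather than guessing
            filled[field] = value
        else:
            missing.append(field)
    return filled, missing


# Hypothetical profile data for illustration only.
profile = {"name": "Ada Example", "address": "1 Sample Street", "date_of_birth": ""}
required = ["name", "address", "date_of_birth"]

filled, missing = fill_form(required, profile)
print(filled)   # fields that could be completed from the profile
print(missing)  # fields the user still has to supply
```

Returning the missing fields separately, instead of inventing values, is one way to keep the accuracy the example calls for.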

How Human Decision Tracking Helps Improve LAMs

LAMs need to learn how to make human-like decisions, so training them requires examples of people making actual choices online.

Picture a tool that allows a freelance team of linguists and subject matter experts to sign up for an open task via a job portal.  

To execute the assignment, they will simply browse the internet and go about tasks we all do in everyday life. This allows data scientists and LAMs to learn from the paths people walk to get the task done.  

Some people will: 

  • Search Google or Bing; others already know where to look for the information. 
  • Combine information themselves; others will ask LLMs to help them. 
  • Be susceptible to distraction or suggestions visible on the pages they visit; others will be focused and make it to the finish line first. 
  • Decide and not go back; others will re-evaluate and maybe change the choices they made.

Everyone is different. We all start from different backgrounds, with different experiences, and with a different, intrinsic, unstated goal.

Human Decision Tracking in Action

Let’s say, for example, you give five people a specific budget to purchase a laptop for a 12-year-old. Some people will: 

  • Look for the cheapest laptop from a shop they trust. 
  • Do a Google search and compare the same or different cheap laptops in different shops.  
  • Repeat the above but using Bing. 
  • Compare technical specs. 
  • Choose based on brand name.  
  • Be attracted to free accessories. 
  • Purchase based on warranty.  
  • Spend a chunk of the budget on a nice case.  
  • Choose not to spend the full amount.
  • Consume the budget entirely. 

The tool can track all these paths: whatever people type, select, sort, filter, etc.  

The different paths to the same final goal are registered, and the produced log file can be used to compare the different paths and train the LAM on how people make decisions. 
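What such a log and comparison might look like can be sketched in Python. The log format here is an assumption (one JSON object per line with hypothetical `user`, `action`, and `value` keys), not the actual output of the tool. Path similarity is measured with the standard library's `difflib.SequenceMatcher`, which is just one simple option.

```python
import json
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical log: one JSON event per line, in the order it happened.
LOG = """\
{"user": "p1", "action": "search", "value": "cheap laptop"}
{"user": "p2", "action": "open_shop", "value": "trusted-shop.example"}
{"user": "p1", "action": "filter", "value": "price under budget"}
{"user": "p2", "action": "sort", "value": "price ascending"}
{"user": "p1", "action": "select", "value": "laptop A"}
{"user": "p2", "action": "select", "value": "laptop B"}
{"user": "p1", "action": "purchase", "value": "laptop A"}
{"user": "p2", "action": "purchase", "value": "laptop B"}
"""


def load_paths(log_text):
    """Group logged events into one ordered list of actions per user."""
    paths = defaultdict(list)
    for line in log_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        paths[event["user"]].append(event["action"])
    return dict(paths)


def similarity(path_a, path_b):
    """Score from 0.0 (nothing shared) to 1.0 (identical action sequences)."""
    return SequenceMatcher(None, path_a, path_b).ratio()


paths = load_paths(LOG)
print(paths["p1"])  # ['search', 'filter', 'select', 'purchase']
print(paths["p2"])  # ['open_shop', 'sort', 'select', 'purchase']
print(similarity(paths["p1"], paths["p2"]))  # 0.5
```

Here the two participants start differently but end the same way, so half of their steps line up.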

The LAM learns from the different paths taken.
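As a deliberately simplified stand-in for that learning step, the sketch below counts which action tends to follow which across several recorded paths and then suggests the most common next step. This is a basic frequency count, not how a production LAM is trained. The function names and the sample paths are hypothetical.

```python
from collections import Counter, defaultdict


def count_transitions(paths):
    """Count how often each action is followed by each other action."""
    transitions = defaultdict(Counter)
    for path in paths:
        for current, following in zip(path, path[1:]):
            transitions[current][following] += 1
    return transitions


def most_common_next(transitions, action):
    """Return the most frequent follow-up action, or None if never seen."""
    if action not in transitions:
        return None
    return transitions[action].most_common(1)[0][0]


# Hypothetical recorded paths for the laptop-purchase task.
recorded_paths = [
    ["search", "compare", "select", "purchase"],
    ["search", "filter", "select", "purchase"],
    ["open_shop", "sort", "select", "purchase"],
    ["search", "compare", "compare", "select", "purchase"],
]

transitions = count_transitions(recorded_paths)
print(most_common_next(transitions, "search"))    # compare
print(most_common_next(transitions, "select"))    # purchase
print(most_common_next(transitions, "purchase"))  # None
```

More varied paths would shift these counts, which is why collecting many different routes to the same goal matters.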

In conclusion, it’s important to note this technology is in its early stages. Only through testing and feedback will its actual value and competitive differentiation be revealed.

Contact us to see how your company can leverage AI and get ahead of the competition.
