Amazon on Monday unveiled Nova Act, a general-purpose AI agent that can take control of a web browser and independently perform some simple actions.

Amazon is releasing the Nova Act SDK, a toolkit that enables developers to construct agent prototypes with Nova Act, in conjunction with the new agentic AI model.

Nova Act, which was developed by Amazon’s recently established AGI lab in San Francisco, will also power key features of the company’s forthcoming Alexa+ upgrade, a generative AI-enhanced version of Amazon’s popular voice assistant.

The Nova Act toolkit is available to developers on a new website, nova.amazon.com, which also serves as a showcase for Amazon’s various Nova foundation models.

Amazon’s Nova Act is an effort to compete with OpenAI’s Operator and Anthropic’s Computer Use by developing its own general-purpose AI agent technology.

Several prominent technology companies believe that AI agents capable of navigating the web for users will make today’s AI chatbots considerably more useful.

Amazon may not have been the first to develop this type of agentic technology; however, it may have the broadest reach through Alexa+.

According to Amazon, developers using the Nova Act SDK should be able to automate basic tasks on behalf of users, such as making dinner reservations or ordering salads from Sweetgreen.

Developers can assemble a set of tools that enable an AI agent to navigate web pages, complete forms, or select dates from a calendar using the Nova Act toolkit.

Amazon asserts that Nova Act outperforms agents from OpenAI and Anthropic in numerous internal tests.

For instance, Nova Act scored 94% on ScreenSpot Web Text, a benchmark that measures how well an AI agent interacts with text on a screen. That score surpasses those of OpenAI’s CUA (88%) and Anthropic’s Claude 3.7 Sonnet (90%).

However, Amazon did not benchmark Nova Act on more common agent evaluations, such as WebVoyager.

Nova Act is the first public product to emerge from Amazon’s AGI lab, which is co-led by former OpenAI researchers David Luan and Pieter Abbeel.

Before Amazon lured them away last year to lead its AI agent efforts, both had founded their own startups: Luan founded Adept, while Abbeel cofounded Covariant.

Luan told TechCrunch that although it may seem odd for an AGI lab to be building AI agents that order salads, he sees agents as a critical step toward superintelligent AI systems.

Luan defines AGI as “an AI system that can assist you in performing any task that a human can perform on a computer.”

According to Luan, his team built the Nova Act SDK to reliably automate short, simple tasks and to let developers specify exactly when they want a human to step in during an agentic workflow.

He expects it to help developers build agentic applications that are more dependable, if not fully autonomous.
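
To make that idea concrete, here is a minimal Python sketch of a workflow in which the developer marks the steps that need human sign-off. It does not use the Nova Act SDK: `Step`, `run_workflow`, `needs_human`, and the `approve` callback are all hypothetical names invented for illustration, and the lambdas merely stand in for real browser actions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One short, basic task in an agentic workflow (hypothetical structure)."""
    description: str
    action: Callable[[], str]   # stand-in for a real browser action
    needs_human: bool = False   # the developer decides where a person steps in


def run_workflow(steps: List[Step], approve: Callable[[str], bool]) -> List[str]:
    """Run steps in order, pausing for approval wherever the developer asked for it."""
    log: List[str] = []
    for step in steps:
        # Stop the whole workflow if a flagged step is not approved.
        if step.needs_human and not approve(step.description):
            log.append(f"stopped before: {step.description}")
            break
        log.append(step.action())
    return log


steps = [
    Step("open the restaurant site", lambda: "opened site"),
    Step("add a salad to the cart", lambda: "added salad"),
    Step("submit payment", lambda: "paid", needs_human=True),
]

# Simulated reviewer who declines the payment step.
result = run_workflow(steps, approve=lambda description: False)
print(result)
```

In this sketch the first two steps run unattended and only the payment step waits on a person, which mirrors the trade-off Luan describes: short tasks are automated, while the developer chooses where autonomy ends.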

Amazon is introducing its first generalist AI agent into a crowded market. Still, the technology is hugely important to the company, and its success will depend on how well it is executed.

Early evaluations of Nova Act could offer a glimpse of how capable the long-delayed Alexa+ will be, at a critical moment for Amazon’s AI efforts.

Early AI agents from OpenAI, Google, and Anthropic struggle with reliability across a variety of domains.

TechCrunch has found that these systems are slow, struggle to operate independently for extended periods, and are prone to errors that a human would not make.

We will soon find out whether Amazon has cracked the code or whether its agents suffer from the same flaws that plague competitors.
