Here is the daily technology #threadcast for 8/18/24. We aim to educate people about this crucial area along with providing information on what is taking place.
Drop all questions, comments, and articles relating to #technology and the future. The goal is to make it a technology center.
https://inleo.io/threads/view/mightpossibly/re-leothreads-25sm4utzk
Excellent. I found that rebooting gives better connectivity too.
interesting!
I do now after reading your thread from yesterday. Great tidbit to have posted on here.
Yes, I figured I'd repost it here today for others to see.
A good move. Reposting is a great way to approach things. We need to keep encouraging that. Repurpose all we can.
'Flight 5 Starship and Super Heavy are ready to fly, pending regulatory approval.'
https://www.space.com/spacex-starship-ready-fifth-test-flight
Is that the SpaceX rocket?
I don't follow this closely enough to keep track.
That is the top. Let me get a good pic for you.
I meant what company makes it.
Elon's own. Starship factory.
Twitter has been blocked in Brazil, so our information about crypto assets will become scarcer. It was where I followed most of the projects.
Seems like it is imperative to get that on here where it can't be blocked.
It will be a big problem and setback. Anyway, there are other social networks like Hive to acquire this information.
People need to post the information on here.
This is what I was saying all along. People have to think about Hive as a decentralized database. We need to democratize the data.
I completely agree with your thinking.
Google's Pixel 9 and AI: What happened Last Week?
This week, Google’s #MadeByGoogle event showcased the new Pixel 9 series, including the Pixel 9 Pro and Pixel 9 Pro Fold, and introduced AI features with the Gemini Live assistant, which replaces Google Assistant. Despite high hopes, Gemini Live faced criticism for bugs and limited functionality. Meanwhile, xAI's Grok-2, though impressive, stirred controversy with unfiltered image generation. Nvidia’s AI chip release was delayed due to a design flaw, and the company faces scrutiny for scraping vast amounts of video data. In robotics, Amazon’s drones are cleared for UK testing, and Nepal plans to use drones for Everest waste removal. Exciting times ahead in tech!
The United Nations (UN) has initiated discussions on AI regulation and policy, with entities like UNICRI Centre for AI and Robotics promoting responsible AI innovation.
They will do no better than individual governments. I wrote an article on why governments, as we know them, are done. They are from a time when we operated in the physical realm.
AI is moving ahead at an exponential pace each year. What the hell is the UN going to do to keep up?
It is technologically impossible to raise healthy poultry and other livestock without the use of antibiotics, but people naively believe that there is mass production of meat, milk and eggs without their use. !VSC
It is an odd way people approach things.
You have to treat aquarium fish and bees with antibiotics, not to mention large animals that tend to get sick. People forget about the possibility of correcting the situation with probiotics. !VSC
xAI’s latest move set off a firestorm …
…when they added image generation to Grok, unleashing a tool that seemed to throw caution to the wind. The new feature, lacking meaningful safeguards, churned out offensive images, unlike other platforms that enforce strict guidelines. Disturbing depictions included Donald Trump eerily close to the World Trade Center, the prophet Muhammad with a bomb, and sexualized portrayals of Taylor Swift, Kamala Harris, and Alexandria Ocasio-Cortez. Grok didn’t stop there, it even violated copyrights, creating unsanctioned and inappropriate images of Disney’s Mickey Mouse. The controversy highlighted how Grok’s reckless approach ignored boundaries others respect, sparking widespread concern.
Yeah and people are upset because they get offended. I am glad to see them losing the race. It is only going to accelerate.
What limitations does technology have?
A lot. You will have to be more specific.
That is like asking, "What can technology do?"
@mightpossibly
You might be interested in this. It is the summary, you can read the entire paper at the link provided.
https://inleo.io/threads/view/taskmaster4450le/re-taskmaster4450le-2gmehqsfk
thanks for the tag, this seems interesting for sure. I think I will have a look at the full paper later
I figured you might enjoy it. We will see if something comes out of it. The results do look strong.
When looking at #ai, like LLMs, do not consider where they are today. Think about where they will be in a year or 18 months.
That is the potential of #leoai.
Could #ai allow us to access the 4th dimension?
Imagine being able to stroll through time as effortlessly as walking through a park.
It would be able to see the consequences of actions before they are taken.
#technology #time #future
A very interesting subject, but a little complex. Even so, the possibilities that AI can offer us must be incredible.
It is complex. The 4th dimension doesn't treat time as linear. Imagine it all rolling together, being able to see the results of decisions before they happen.
Regarding time and space, it is difficult not to remember Einstein and his theory of relativity...
I wonder if it will be possible for us to master the speed of light. One day I know it will be possible, but I have no idea when.
There is a big difference between physics saying something is possible and engineering actually building something that can do what the physicists say is possible.
It is like quantum computing. The mechanics have been known for decades. Building a computer based upon that is a much different story.
Yes, it's true, sometimes theory doesn't apply in the same way in practice. Regardless of both fields, humanity must continue to evolve.
OpenAI's new voice mode let me talk with my phone, not to it
I’ve been playing around with OpenAI’s Advanced Voice Mode for the last week, and it’s the most convincing taste I’ve had of an AI-powered future yet. This week, my phone laughed at jokes, made them back to me, asked me how my day was, and told me it’s having “a great time.” I was talking with my iPhone, not using it with my hands.
#newsonleo #openai #technology #phone #ai
OpenAI’s newest feature, currently in a limited alpha test, doesn’t make ChatGPT any smarter than it was before. Instead, Advanced Voice Mode (AVM) makes it friendlier and more natural to talk with. It creates a new interface for using AI and your devices that feels fresh and exciting, and that’s exactly what scares me about it. The product was kinda glitchy, and the whole idea totally creeps me out, but I was surprised by how much I genuinely enjoyed using it.
Taking a step back, I think AVM fits into OpenAI CEO Sam Altman’s broader vision, alongside agents, of changing the way humans interact with computers, with AI models front and center.
X says it’s closing operations in Brazil
X, the social media platform formerly known as Twitter, said today that it’s ending operations in Brazil, although the service will remain available to users in the country.
The announcement comes amidst a legal battle with Brazil’s Supreme Court Justice Alexandre de Moraes, who sought to block certain accounts on X as part of an investigation into election disinformation and “digital militias.”
#x
#technology #newsonleo #brazil
In a post from X’s global government affairs account, the company said Moraes has “threatened our legal representative in Brazil with arrest if we do not comply with his censorship orders.”
How the ransomware attack at Change Healthcare went down: A timeline
A hack on UnitedHealth-owned tech giant Change Healthcare likely stands as one of the biggest data breaches of U.S. medical data in history.
A ransomware attack earlier this year on UnitedHealth-owned health tech company Change Healthcare likely stands as one of the largest data breaches of U.S. health and medical data in history.
Months after the February data breach, a “substantial proportion of people living in America” are receiving notice by mail that their personal and health information was stolen by cybercriminals during the cyberattack on Change Healthcare.
#technology #healthcare #ransomware
Change Healthcare processes billing and insurance for hundreds of thousands of hospitals, pharmacies and medical practices across the U.S. healthcare sector. As such, it collects and stores vast amounts of highly sensitive medical data on patients in the United States. Through a series of mergers and acquisitions, Change became one of the largest processors of U.S. health data, handling between one-third and one-half of all U.S. health transactions.
Google takes on OpenAI with Gemini Live
Made by Google was this week, featuring a full range of reveals from Google’s biggest hardware event. Google unveiled its new lineup of Pixel 9 phones, including the $1,799 Pixel 9 Pro Fold; advanced AI-powered photo-editing tools; and the new Pixel Buds Pro 2, which are infused with Gemini AI. The company also announced Gemini Live, a conversational AI voice assistant to compete with OpenAI’s Advanced Voice Mode, though the live demo had a few hiccups.
#newsonleo #google #ai #technology
Epic Games launched its rival iOS app store in the European Union. It’s launching with games like Fortnite, Rocket League Sideswipe and Fall Guys, and is working with developers to launch their games on the Epic Games Store in the future. Fortnite’s return to iOS comes over four years after Apple first removed the game from its App Store, and follows years of legal battles and regulatory changes brought by the EU’s Digital Markets Act.
X launched Grok-2 and Grok-2 mini in beta with improved reasoning. The new Grok AI model can now generate images on X, though access is currently limited to the social network’s Premium and Premium+ users. However, Grok’s image-generation feature doesn’t seem to have any guardrails around creating images of political figures like similar products do — and many users are taking advantage of it.
Why Porsche NA CEO Timo Resch is betting on ‘choice’ to survive the turbulent EV market
Timo Resch is basking in the sun. That’s literally true, as we speak on a gloriously clear California day at the Quail, one of Monterey Car Week’s most prestigious events. But it’s figuratively true as well, as Resch, who took the CEO role at Porsche Cars North America (PCNA) last November, is in a very good place professionally.
PCNA just had its best quarter ever, with second-quarter sales up 13% over the same period in 2023. The brand has been showing consistent momentum of late, and it comes at a critical time, with Porsche pushing electrification in an increasingly hesitant market.
#porsche #newsonleo #technology #ev
Porsche is about to launch its second EV, an all-electric version of the Macan SUV, the brand’s most affordable model. This will be followed soon after by an all-electric version of the 718, the brand’s most affordable sports car. The cars come hot on the heels of a new generation of 911, available as a hybrid for the first time, plus a new Panamera, which now produces even more power and range from its retooled plug-in hybrid system.
Resch, sitting between a stark white Macan Electric and a lurid purple Taycan Turbo GT, says this is “the biggest product portfolio remap that we have ever done.” The electrification drive is still there, but it’s lost some momentum of late. The company had previously planned to be 80% electric by 2030; but just last month Porsche tapped the brakes and told Reuters that goal is now “dependent on customer demand.”
Nancy Pelosi criticizes California AI bill as ‘ill-informed’
Congresswoman Nancy Pelosi issued a statement late yesterday laying out her opposition to SB 1047, a California bill that seeks to regulate AI.
“The view of many of us in Congress is that SB 1047 is well-intentioned but ill-informed,” Pelosi said. She noted that other congresspeople from the Bay Area — Zoe Lofgren, Anna Eshoo, and Ro Khanna — have expressed concerns about the bill, which she described as “more harmful than helpful.”
#newsonleo #nancypelosi #california #ai
The bill was recently amended to address concerns from critics, including AI company Anthropic, and is currently headed to California’s Assembly for a vote. Since this is a state bill, Pelosi and others in Congress don’t have any official say — though Pelosi’s seniority and profile are likely to give her opinion weight with California politicians.
“AI springs from California,” she said. “We must have legislation that is a model for the nation and the world. We have the opportunity and responsibility to enable small entrepreneurs and academia – not big tech – to dominate.”
Palantir CEO Alex Karp is ‘not going to apologize’ for military work
Data analytics company Palantir has faced criticism and even protests over its work with the military, police, and U.S. Immigration and Customs Enforcement, but co-founder and CEO Alex Karp isn’t interested in “the apology show.”
In a lengthy New York Times profile, Karp told writer Maureen Dowd that the company has “a consistently pro-Western view” and that its critics “believe we should appease Iran, Russia and China.”
#newsonleo #palantir #technology
“I’m not going to apologize for defending the U.S. government on the border, defending the Special Ops, bringing the people home,” he said. “I’m not apologizing for giving our product to Ukraine or Israel or lots of other places.”
Karp doesn’t give many interviews, so it’s interesting to see him speak so extensively about everything from his political views (which he describes as “progressive but not woke”) to his friendship with Palantir co-founder Peter Thiel and his dating preferences.
Ikea expands its inventory drone fleet
One of the biggest questions looming over the drone space is how to best use the tech. Inspection has become a key driver, as the autonomous copters are deployed to dangerous or remote spaces like power plants and oil rigs. Inventory drones are a promising subset of the category that has emerged in recent years.
Ingka Group, which owns and operates hundreds of Ikea locations comprising around 90% of its retail sales, is perhaps the most prominent name embracing the emerging space. It makes sense: The Swedish furniture giant and meatball vendor runs massive warehouses stocked with large items. It’s a lot of ground for humans to cover and too much vertical space for most robots.
#newsonleo #drone #ikea #technology
That’s the primary advantage to systems like the 100 drones Ikea currently commands across Europe. The little scanning drones can provide around-the-clock updates to inventory and reach heights humans and most robots can’t.
The drones, which have been branded with the blue and yellow color scheme, an Ikea logo and the familiar “Hej!” tagline, are produced by Verity. The firm has been popular with investors, thanks in large part to its Ikea partnership. The startup raised $32 million last March and another $11 million four months later.
Microsoft Copilot: Everything you need to know about Microsoft's AI
Want to learn more about Microsoft's family of Copilots? Here's our guide to Copilot and Copilot in Microsoft 365, as well as GitHub Copilot.
Copilot is Microsoft’s take on productivity-boosting generative AI, and it continues to grow and expand with Microsoft’s AI ambitions. Today, there are around a dozen Copilot-branded products powering various capabilities in Microsoft software and services, like summarizations in Microsoft Outlook and transcriptions in Microsoft Teams.
That’s in addition to the Microsoft-owned GitHub Copilot tool for generating code, and the Copilot that lives on Windows and the web, which serves more as a general-purpose assistant à la OpenAI’s ChatGPT than a point solution.
#ai #technology #microsoft #copilot
Microsoft Copilot, previously known as Bing Chat, is built into Microsoft’s search engine, Bing, as well as Windows 10, Windows 11 and the Microsoft Edge sidebar. (Newer PCs even have a dedicated keyboard key for launching Copilot.) There are also stand-alone Copilot apps for Android and iOS and an in-app Telegram room.
Powered by fine-tuned versions of OpenAI’s models (OpenAI and Microsoft have a close working relationship), Copilot can perform a range of tasks described in natural language, like writing poems and essays, as well as translating text into other languages and summarizing sources from around the web (albeit imperfectly).
Copilot, like ChatGPT and Google’s Gemini, can browse the web (in Copilot’s case, via Bing) for up-to-date information. It sometimes gets things wrong, but for timely queries, access to search results can give Copilot an advantage over offline bots such as Anthropic’s Claude.
Copilot can create images by tapping Image Creator, Microsoft’s image generator built on OpenAI’s DALL-E 3 model. And it can generate songs via an integration with Suno, the AI music-generation platform. Typing something like “Create an image of a zebra” or “Generate a song with a jazz rhythm” in Copilot will pull up the relevant tool.
On the subject of integrations, Copilot supports plug-ins for third-party apps and websites. There are plug-ins for Instacart (for meal planning and cooking-related questions), Kayak (for trip planning), OpenTable (for restaurant reservations) and Shopify, to name a few examples, with more being added on a semiregular basis.
#ai #technology #agi #newsonleo
The video covers multiple AI-related updates and product releases over a very eventful week. Here’s a breakdown of the key topics discussed:
Mr. Strawberry and AI Hype on Twitter: A user named "I Rule the World" hyped an AI model dubbed “Strawberry” (also referred to as Q* or possibly GPT-5) on AI Twitter, gaining a large following. However, much of the information shared turned out to be inaccurate, leading to a lot of memes and community jokes.
Grok 2 Beta Release: Elon Musk's AI company, xAI, released Grok 2 Beta. The model is focused on logic and reasoning and includes a text-to-image capability powered by FLUX.1, an open-source model. Notably, Grok 2 Mini, a smaller version, is live, while the main Grok 2 model is still pending release. The text-to-image tool has been used to generate a wide array of absurd and humorous images, particularly of public figures like Donald Trump.
SearchGPT: Early access was granted to SearchGPT, a search engine that provides more direct answers without ads, presenting a strong alternative to Google. The creator praises its up-to-date information and plans to make it their default search engine.
Agent Q from MultiOn: MultiOn launched Agent Q, an advanced AI agent with planning and self-healing capabilities aimed at consumer use. It builds on technologies like guided search and reinforcement learning, significantly improving performance in tasks like booking reservations.
Cosine Genie - New Coding Model: Cosine Genie, a state-of-the-art software engineering model, outperformed existing models in benchmarks like SWE-bench and SWE-Lite. It's noted for its high performance in coding tasks, particularly for feature development, bug fixing, and refactoring.
AI Scientist by Sakana AI: Sakana AI introduced the AI Scientist, an automated system capable of conducting scientific research independently. It can generate hypotheses, run experiments, and write papers. The model shows the potential for self-improvement and could lead to an intelligence explosion, a significant development in AI.
SWE-bench Verified by OpenAI: OpenAI introduced a new benchmark called SWE-bench Verified, which uses real-world GitHub issues to evaluate AI models’ ability to solve software problems.
Google Gemini Live Demo Fail: During a live event, Google showcased its Gemini model, but it struggled with a live demo, leading to awkward moments on stage. Despite this, Google remains competitive with its voice AI features.
Anthropic’s Claude Prompt Caching: Prompt caching, introduced by Anthropic, significantly reduces costs and latency while improving consistency for large language models in various tasks like coding assistance and multi-turn conversations.
Apple's Rumored Robot Arm Home Device: Rumors suggest Apple is developing a device combining a screen with a robotic arm for tasks like smart home control, video conferencing, and security monitoring. It’s expected around 2026-2027.
Overall, the week was packed with significant developments across AI models, tools, and consumer-facing technologies, reflecting rapid advancements and fierce competition in the AI space.
StarShip 5
Just an FYI: if you are having trouble with shorts on mobile, use Keychain instead of Hivesigner.
It works better that way.
https://inleo.io/threads/view/taskmaster4450le/re-sbnpndntyn
The first SpaceX spacewalk: What the Polaris Dawn commander says about the bold upcoming mission
The Polaris Dawn mission is the first of three flights Jared Isaacman bought from SpaceX in 2022 for his human spaceflight effort known as the Polaris Program.
SpaceX is preparing to launch its next private mission by the end of the month, featuring the first attempt to have the astronauts step out into space.
The Polaris Dawn mission — the first of three flights billionaire and Shift4 founder Jared Isaacman purchased from SpaceX in 2022 for his human spaceflight effort known as the Polaris Program — is set to launch from Florida in the early hours of Aug. 26.
#space #spacex #technology
"We don't get the freedom of any time of day to launch but I think it'll work out to [be] pretty close to dawn, which is very appropriate given the mission," Isaacman told CNBC's Investing in Space during an interview last month.
Isaacman will be commanding the mission, as he did while leading the historic Inspiration4 flight in 2021. He's once again leading a crew of four, with longtime colleague Scott Poteet joining him as the pilot and Anna Menon and Sarah Gillis, a pair of SpaceX employees, serving as the flight's medical officer and mission specialist, respectively.
The multi-day trip isn't headed to a destination, but instead will be a free-flying mission tracing orbits that the crew hopes will go far from Earth.
https://inleo.io/threads/view/fintechinsights/re-leothreads-hrse02
#newsonleo #freecompliments #thread2earn #technology
According to Groq:
What is quantum computing?
Quantum computing is a new paradigm in computing that uses the principles of quantum mechanics to perform calculations and operations on data. Unlike classical computers, which use bits to store and process information, quantum computers use quantum bits or qubits.
Qubits are unique because they can exist in a superposition of states. This means that a qubit is not limited to being 0 or 1; its state is a weighted combination of both, and a measurement returns 0 or 1 with probabilities set by those weights (for example, 1 with probability 0.75). This property lets quantum algorithms work with multiple possibilities at the same time, which can make quantum computers much faster and more powerful than classical computers for certain types of calculations.
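To make the superposition idea a bit more concrete, here is a minimal sketch in plain Python. It is only a classical simulation of one qubit with real amplitudes, not how a quantum computer works internally, and the names `make_qubit` and `measure` are made up for this illustration. The point it shows: the qubit's state holds two weights, but each measurement still comes out as a plain 0 or 1.

```python
import math
import random


def make_qubit(p_one: float) -> tuple[float, float]:
    """Return real amplitudes (a0, a1) such that a1**2 == p_one.

    Hypothetical helper: real qubits use complex amplitudes, which this
    sketch ignores for simplicity.
    """
    if not 0.0 <= p_one <= 1.0:
        raise ValueError("p_one must be between 0 and 1")
    return math.sqrt(1.0 - p_one), math.sqrt(p_one)


def measure(qubit: tuple[float, float], rng: random.Random) -> int:
    """Simulate one measurement: 1 with probability a1**2, otherwise 0."""
    _, a1 = qubit
    return 1 if rng.random() < a1 ** 2 else 0


rng = random.Random(0)           # fixed seed so the run is repeatable
qubit = make_qubit(0.75)         # weighted mix: 25% "0", 75% "1"
shots = [measure(qubit, rng) for _ in range(10_000)]
fraction_of_ones = sum(shots) / len(shots)
print(f"fraction of 1s over {len(shots)} measurements: {fraction_of_ones:.3f}")
```

Every individual result is 0 or 1; the 0.75 only shows up as a frequency across many measurements.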
#technology #quantum #computing
Quantum computers can perform three main types of operations:
Quantum computers have the potential to solve complex problems that are currently unsolvable or require an impractically long time to solve using classical computers. Some examples of problems that quantum computers can solve more efficiently include:
However, quantum computing is still a relatively new and developing field, and there are many challenges to overcome before it can be widely adopted. Some of the challenges include:
Despite these challenges, many companies and research institutions are actively working on developing quantum computing technology, and there are already some practical applications of quantum computing in areas such as:
In summary, quantum computing is a new and rapidly evolving field that has the potential to revolutionize many areas of science and technology. While there are still many challenges to overcome, the potential benefits of quantum computing make it an exciting and promising area of research and development.
2 MINUTES AGO: Scientists Warn: LLMs Are NOW Developing Their Own Understanding of Reality!
#science #technology #ai
I do not believe AGI or superintelligence is a point in time.
We passed the Turing test without even noticing.
This is an interesting thought experiment.
#trillionaire #technology
We are entering our third economic singularity. The first was the move from hunting and gathering to agriculture, with the Industrial Revolution being the next.
Each set off a shortening of the time it took the #economy to double. We are going to see another one due to #ai.
According to Groq:
What is nanotechnology?
Nanotechnology is the manipulation and engineering of matter on a nanoscale, typically between 1 and 100 nanometers (nm). A nanometer is one billionth of a meter, which is roughly 1/50,000 the width of a human hair. At this scale, materials and structures exhibit unique properties that are different from their larger-scale counterparts.
#nanotechnology #technology
Nanotechnology involves the creation of new materials, devices, and systems that have novel properties and functions due to their small size. These properties can include:
Nanotechnology has many potential applications across various fields, including:
Some examples of nanotechnology applications include:
However, nanotechnology also raises concerns about the potential risks and challenges associated with its development and use, including:
Overall, nanotechnology has the potential to revolutionize many fields and industries, but it is essential to address the challenges and risks associated with its development and use.
What happens with the value of X when people can make phone calls through it?
#technology
SpaceX Genius Adjustment on Raptor 3 Engineering SHOCKED the Other...
#spacex #rocket #Technology
Watch for this in the future:
In the next year, these three are going to change the world as we know it.
The battle for knowledge supremacy will be between the United States and China over the next decade.
#ai #technology
Former Google CEO Spills ALL! (Google AI is Doomed)
#ai #google #newsonleo #technology
Robot
Meet the robot fish that collects DNA from the bottom of the ocean
Some 400 kilometers from the nearest sea, engineering students at ETH Zurich in Switzerland are hard at work on cutting-edge robots that could change the way the world's oceans are studied.
Eve, the robotic fish, swings her silicone tail from side to side, powered by pumps hidden inside, as she glides smoothly through the cold waters of Lake Zurich, where she is being tested by Surf-eDNA. This group of students has spent the last two years building a school of soft robotic fish, of which Eve is the latest.
#newsonleo #technology #robot
https://img.inleo.io/DQmUoFmsM3ZVsZGHk7VPnnx3sk8oYYfYuvTvP79vuZyu1AF/peixe-robo%20(1).png
I find this type of research very interesting, as this way we can obtain more information about what happens at the bottom of the sea since a lot of things are unknown to us.
Technology
A more sustainable and safer DF with ABDI technology
In recent months, the city of Brasília has received surveillance equipment and electric vehicles from the Agência Brasileira de Desenvolvimento Industrial (ABDI), an institution linked to the federal government through the Ministry of Development, Industry, Commerce and Services (MDIC).
The actions revolve around ABDI's mission to promote technological innovation and sustainability in industry and other sectors of society.
The initiatives began with the delivery of 14 Twizy (Renault) electric cars and six charging stations to the DF's academic community, such as the Universidade de Brasília (UnB) and the local Federal Institute (IFB).
#newsonleo #technology #hivebr #pt
https://inleo.io/threads/view/taskmaster4450le/re-zjyqbozxnm
Large Language Models (LLMs) have shown remarkable capabilities in natural language tasks requiring complex reasoning, yet their application in agentic, multi-step reasoning within interactive environments remains a difficult challenge. Traditional supervised pre-training on static datasets falls short in enabling autonomous agent capabilities needed to perform complex decision-making in dynamic settings like web navigation. Previous attempts to bridge this gap, through supervised fine-tuning on curated expert demonstrations, often suffer from compounding errors and limited exploration data, resulting in sub-optimal policy outcomes. To overcome these challenges, we propose a framework that combines guided Monte Carlo Tree Search (MCTS) search with a self-critique mechanism and iterative fine-tuning on agent interactions using an off-policy variant of the Direct Preference Optimization (DPO) algorithm.
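For anyone wondering what DPO actually optimizes, here is a rough sketch of the standard DPO loss for a single preference pair. This is the textbook form, not the paper's off-policy variant, and the function names, the `beta` value, and the example log-probabilities are all made up for illustration. The idea: the loss goes down when the model being trained favors the preferred ("chosen") trajectory more strongly than a frozen reference model does.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of log-probabilities."""
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits)
    return softplus(-logits)


# Policy identical to the reference: no preference learned yet.
neutral = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Policy has shifted probability toward the chosen trajectory.
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
print(f"neutral loss:  {neutral:.4f}")
print(f"improved loss: {improved:.4f}")
```

In the setting the abstract describes, the chosen and rejected items would be agent trajectories or steps ranked with help from the tree search and self-critique, but that pairing logic is outside this sketch.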
#technology #llm #ai #agentq
Our method allows LLM agents to learn effectively from both successful and unsuccessful trajectories, thereby improving their generalization in complex, multi-step reasoning tasks. We validate our approach in the WebShop environment, a simulated e-commerce platform, where it consistently outperforms behavior cloning and reinforced fine-tuning baselines, and beats average human performance when equipped with the capability to do online search. In real-world booking scenarios, our methodology boosts Llama-3 70B model's zero-shot performance from 18.6% to 81.7% success rate (a 340% relative increase) after a single day of data collection and further to 95.4% with online search. We believe this represents a substantial leap forward in the capabilities of autonomous agents, paving the way for more sophisticated and reliable decision-making in real-world settings.
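A quick check of the "340% relative increase" figure, using only the success rates quoted in the summary. The helper name `relative_increase` is just for this example. Relative increase is measured against the starting value, which is why a jump of about 63 percentage points works out to roughly 339% (rounded to 340% in the abstract).

```python
def relative_increase(before: float, after: float) -> float:
    """Percentage increase of `after` relative to `before`."""
    return (after - before) / before * 100.0


baseline = 18.6        # zero-shot success rate (%), from the abstract
after_training = 81.7  # after one day of data collection (%)
with_search = 95.4     # with online search added (%)

print(f"absolute gain:       {after_training - baseline:.1f} points")
print(f"relative (training): {relative_increase(baseline, after_training):.1f}%")
print(f"relative (search):   {relative_increase(baseline, with_search):.1f}%")
```

By the same formula, the 95.4% result is a bit over a 410% relative increase on the 18.6% baseline.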
https://arxiv.org/abs/2408.07199