Budapest Post

Cum Deo pro Patria et Libertate
Budapest, Europe and world news

Google’s SummAE AI generates abstract summaries of paragraphs

Google researchers propose a novel AI summarization model, SummAE, capable of generating abstractive summaries of paragraphs.

Machines have a tougher time summarizing text than you’d think, at least where the summarization is abstractive rather than extractive. Whereas extraction merely requires selecting and concatenating sentences from the source, abstraction involves paraphrasing it in novel sentences. Progress has been made in the news domain recently, perhaps owing to the abundance of corpora on which algorithmic systems can be trained. But robust summarization of most other forms of writing remains an unsolved problem.

Motivated by this, a team at Google Brain investigated an abstractive summarization system dubbed SummAE that’s largely unsupervised, meaning it needs little to no paired paragraph-summary data for training. Although it can’t summarize anything beyond single five-sentence paragraphs, the researchers claim it “significantly” improves upon the baseline and represents a “major” step in the direction of human-level performance.

The data set and code are freely available on GitHub, along with the configuration settings for the best model.

“As one of the very first works approaching single-document [abstract summarization], we propose a novel neural model — SummAE,” wrote the coauthors. “[We believe it] is therefore desirable to have models capable of automatically summarizing documents abstractively with little to no supervision.”

SummAE is built around a denoising autoencoder that encodes sentences and paragraphs of the target text in a shared space (that is, it maps both to numerical representations of the same kind). A token prepended to the decoder’s input signals whether to decode a sentence or a paragraph, and the system generates a summary by decoding a sentence from the encoded paragraph.
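
The role of the prepended token can be illustrated with a minimal Python sketch. It only shows how a decoder input might be assembled and contains no neural network; the token strings `<sentence>` and `<paragraph>` and the function name `decoder_input` are hypothetical, since the article does not say what the released code uses.

```python
# Hypothetical control tokens; the article does not name the real ones.
SENTENCE_TOKEN = "<sentence>"
PARAGRAPH_TOKEN = "<paragraph>"


def decoder_input(target_tokens, mode):
    """Prepend the token that tells the decoder which kind of text to produce."""
    if mode == "sentence":
        prefix = SENTENCE_TOKEN
    elif mode == "paragraph":
        prefix = PARAGRAPH_TOKEN
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [prefix] + list(target_tokens)


# Training (reconstruction): the mode matches the text that was encoded.
print(decoder_input(["Tom", "lost", "his", "keys", "."], "sentence"))

# Summarization: a paragraph was encoded, but the decoder is asked for a
# sentence, so generation starts from the sentence token alone.
print(decoder_input([], "sentence"))
```

Because sentences and paragraphs share one representation space, switching the prefix is, under this reading, what turns paragraph reconstruction into sentence-length summarization.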

The researchers discovered that most traditional approaches to training the autoencoder resulted in long, multi-sentence summaries. To encourage it to learn higher-level concepts disentangled from their original expression, the team employed two denoising approaches (randomly masking tokens and permuting the order of sentences within paragraphs) that substantially increased the number of training examples. They also experimented with an adversarial critic component that could distinguish between sentences and paragraphs, as well as two pretraining tasks that encouraged the encoder to learn how sentences follow one another narratively within a paragraph.
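
The two denoising approaches might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the team's implementation: the `<mask>` string, the whitespace tokenization, the default masking probability of 0.15, and all function names are hypothetical.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token


def mask_tokens(tokens, mask_prob, rng):
    """Replace each token with MASK independently with probability mask_prob."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def permute_sentences(sentences, rng):
    """Return the paragraph's sentences in a random order (input is not modified)."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def noisy_paragraph(sentences, mask_prob=0.15, seed=None):
    """Build one corrupted copy of a paragraph: shuffle sentences, then mask tokens."""
    rng = random.Random(seed)
    shuffled = permute_sentences(sentences, rng)
    # Whitespace tokenization is a simplification for this sketch.
    return [mask_tokens(sentence.split(), mask_prob, rng) for sentence in shuffled]


paragraph = [
    "Tom lost his keys.",
    "He looked everywhere.",
    "They were in his pocket.",
]
# Different seeds give different corrupted versions of the same paragraph.
for seed in range(2):
    print(noisy_paragraph(paragraph, seed=seed))
```

Each call with a new seed yields a different corrupted input for the same clean target, which is one way to read the claim that denoising multiplies the number of training examples.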

The researchers trained three different variations of SummAE on ROCStories, a corpus of self-contained, diverse, non-technical, and concise prose. They split the original 98,159 training stories into training, validation, and test sets, and collected three human summaries each for 500 validation examples and 500 test examples.

After 100,000 training steps with pretraining, the team reports that the best model significantly outperformed a baseline extractive sentence generator on the Recall-Oriented Understudy for Gisting Evaluation (ROUGE), a set of metrics devised to evaluate automatic summarization. Moreover, they say that in a qualitative study with evaluators recruited through Amazon’s Mechanical Turk, summaries from one of the three SummAE models were rated “fluent” and “information-relevant” 80% of the time.
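
For readers unfamiliar with ROUGE, the simplest variant, ROUGE-1, measures unigram overlap between a generated summary and a human reference. The Python sketch below is a simplified illustration, not the official scorer and not necessarily the configuration the researchers used: it lowercases and splits on whitespace, omits stemming, and scores against a single reference. The function name `rouge_1` and the example sentences are made up for illustration.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram-overlap precision, recall, and F1 between two strings."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand_counts & ref_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1("the cat sat", "the cat sat on the mat")
print(scores)  # precision 1.0, recall 0.5
```

Recall rewards covering the words of the reference, while precision penalizes extra words, so a summary has to be both complete and concise to score well on F1.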

“The paragraph reconstructions show some coherence, although with some disfluencies and factual inaccuracies that are common with neural generative models,” wrote the coauthors. “Since the summaries are decoded from the same latent vector as the reconstructions, improving them could lead to more accurate summaries.”