Budapest Post

Cum Deo pro Patria et Libertate
Budapest, Europe and world news

Google’s SummAE AI generates abstract summaries of paragraphs

Google researchers propose SummAE, a novel AI summarization model capable of generating abstractive summaries of paragraphs.
Machines have a tougher time summarizing text than you’d think, at least where the summarization is abstractive rather than extractive. Extraction merely requires selecting and concatenating sentences from the source, whereas abstraction requires paraphrasing the content in novel sentences. Progress has been made in the news domain recently, perhaps owing to the abundance of corpora on which algorithmic systems can be trained. But robust summarization of most other writing forms remains an unsolved problem.

Motivated by this, a team at Google Brain investigated an abstractive summarization system dubbed SummAE that’s largely unsupervised, meaning it is trained with little to no human-written example summaries. Although it can’t summarize anything beyond single five-sentence paragraphs, the researchers claim it “significantly” improves upon the baseline and represents a “major” step in the direction of human-level performance.


The data set and code are freely available on GitHub, along with the configuration settings for the best model.

“As one of the very first works approaching single-document [abstract summarization], we propose a novel neural model — SummAE,” wrote the coauthors. “[We believe it] is therefore desirable to have models capable of automatically summarizing documents abstractively with little to no supervision.”

SummAE contains a denoising autoencoder that encodes (that is, generates numerical representations of) sentences and paragraphs of the target text in a shared space. The decoder’s input is prepended with a token signaling whether to decode a sentence or a paragraph, and the system generates a summary by decoding a sentence from the encoded paragraph.
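The role of the mode token can be illustrated without any neural network. The sketch below only builds the input sequences such a model would see. The token names `<sent>` and `<para>`, the whitespace tokenization, and all function names are assumptions made for illustration and are not taken from the SummAE code.

```python
# Hypothetical token names; the article only says that a token signals the mode.
SENT_TOKEN = "<sent>"
PARA_TOKEN = "<para>"


def decoder_input(target_tokens, mode):
    """Prepend the mode token that tells the decoder what to produce."""
    if mode not in ("sentence", "paragraph"):
        raise ValueError(f"unknown mode: {mode}")
    prefix = SENT_TOKEN if mode == "sentence" else PARA_TOKEN
    return [prefix] + list(target_tokens)


def training_examples(paragraph):
    """Build (encoder_input, decoder_input) reconstruction pairs.

    `paragraph` is a list of sentence strings. The whole paragraph and each
    of its sentences are reconstructed, each with its own mode token.
    """
    para_tokens = " ".join(paragraph).split()
    examples = [(para_tokens, decoder_input(para_tokens, "paragraph"))]
    for sentence in paragraph:
        sent_tokens = sentence.split()
        examples.append((sent_tokens, decoder_input(sent_tokens, "sentence")))
    return examples


def summary_request(paragraph):
    """At summary time: encode the paragraph, start decoding in sentence mode."""
    para_tokens = " ".join(paragraph).split()
    return para_tokens, [SENT_TOKEN]


paragraph = ["Tom lost his keys .", "He found them ."]
encoder_tokens, decoder_start = summary_request(paragraph)
print(decoder_start)  # ['<sent>']
```

The point of the sketch is the mismatch at summary time: the encoder sees a paragraph while the decoder is asked for a sentence, which is what pushes the output toward a one-sentence summary.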

The researchers discovered that most traditional approaches to training the autoencoder resulted in long, multi-sentence summaries. To encourage it to learn higher-level concepts disentangled from their original expression, the team employed two denoising approaches (randomly masking tokens and permuting the order of sentences within paragraphs) that increased the number of training examples substantially. They also experimented with an adversarial critic component that could distinguish between sentences and paragraphs, in addition to two pretraining tasks that encouraged the encoder to learn how sentences follow one another within a paragraph.
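The two noising operations are simple to sketch. The version below is a minimal illustration, assuming whitespace tokenization, a hypothetical `<mask>` token, and an arbitrary masking probability; none of these values come from the article.

```python
import random

MASK = "<mask>"  # hypothetical mask token


def mask_tokens(tokens, mask_prob, rng):
    """Replace each token with MASK independently with probability mask_prob."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def permute_sentences(sentences, rng):
    """Return the sentences of a paragraph in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def add_noise(paragraph, mask_prob=0.15, seed=None):
    """Apply both noising steps to a paragraph given as a list of sentences.

    Returns a list of token lists, one per (reordered) sentence.
    """
    rng = random.Random(seed)
    reordered = permute_sentences(paragraph, rng)
    return [mask_tokens(sentence.split(), mask_prob, rng) for sentence in reordered]


paragraph = ["Tom lost his keys .", "He looked everywhere .", "He found them ."]
for noisy_sentence in add_noise(paragraph, seed=1):
    print(" ".join(noisy_sentence))
```

Because the noise is random, one clean paragraph can yield many distinct corrupted inputs, which is consistent with the article's remark that the denoising approaches increased the number of training examples.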

The researchers trained three different variations of SummAE on ROCStories, a corpus of self-contained, diverse, non-technical, and concise prose. They split the original 98,159 training stories into three separate collections (a training set, a validation set, and a test set) and collected three human summaries each for 500 validation examples and 500 test examples.
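A split of this kind can be sketched as a seeded shuffle followed by slicing. The article does not give the sizes of the three collections, so the sizes below are placeholders, and `split_stories` is a hypothetical helper rather than the authors' procedure.

```python
import random


def split_stories(stories, n_valid, n_test, seed=0):
    """Shuffle deterministically, then cut into train/validation/test lists."""
    if n_valid + n_test > len(stories):
        raise ValueError("validation and test sizes exceed the corpus size")
    shuffled = list(stories)
    random.Random(seed).shuffle(shuffled)
    valid = shuffled[:n_valid]
    test = shuffled[n_valid:n_valid + n_test]
    train = shuffled[n_valid + n_test:]
    return train, valid, test


# Placeholder corpus with the article's story count; split sizes are assumed.
stories = [f"story {i}" for i in range(98159)]
train, valid, test = split_stories(stories, n_valid=5000, n_test=5000)
print(len(train), len(valid), len(test))  # 88159 5000 5000
```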

After 100,000 training steps with pretraining, the team reports that the best model significantly outperformed a baseline extractive sentence generator on the Recall-Oriented Understudy for Gisting Evaluation (ROUGE), a set of metrics devised to evaluate automatic summarization. Moreover, they say that in a qualitative study involving evaluators recruited through Amazon’s Mechanical Turk, raters judged the summaries from one of the three SummAE models “fluent” and “information-relevant” 80% of the time.
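For readers unfamiliar with ROUGE, its n-gram variant (ROUGE-N) measures how many n-grams a candidate summary shares with a reference summary. The following is a simplified sketch, assuming lowercased whitespace tokenization and a single reference. It is not the official ROUGE toolkit and will not reproduce the scores reported by the researchers.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count the n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall and F1 against one reference."""
    cand = ngram_counts(candidate, n)
    ref = ngram_counts(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("the cat sat", "the cat sat on the mat", n=1)
print(scores["recall"])  # 0.5: three of the six reference unigrams are matched
```

With several human references per example, as in this study, scores are typically aggregated across references. How the authors did so is not stated in the article.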

“The paragraph reconstructions show some coherence, although with some disfluencies and factual inaccuracies that are common with neural generative models,” wrote the coauthors. “Since the summaries are decoded from the same latent vector as the reconstructions, improving them could lead to more accurate summaries.”