Budapest Post

Cum Deo pro Patria et Libertate
Budapest, Europe and world news

Google’s SummAE AI generates abstract summaries of paragraphs

Google’s SummAE AI generates abstract summaries of paragraphs

Google researchers propose a novel AI summarization model - SummAE- capable of generating abstract summaries of paragraphs.
Machines have a tougher time summarizing text than you’d think, at least where the summarization is abstractive rather than extractive. While the extraction requires merely concatenating sentences, abstraction involves the task of paraphrasing using novel sentences. Progress has been made in the news domain recently, perhaps owing to the abundance of corpora on which algorithmic systems can be trained. But robust summarization of most other writing forms remains an unsolved problem.

Motivated by this, a team at Google Brain investigated an abstractive summarization system dubbed SummAE that’s largely unsupervised, meaning it’s able to generalize from a small amount of training data to unseen textual examples. While it couldn’t summarize beyond single five-sentence paragraphs, the researchers claim it “significantly” improves upon the baseline and represents a “major” step in the direction of human-level performance.


Machines have a tougher time summarizing text than you’d think, at least where the summarization is abstractive rather than extractive. While the extraction requires merely concatenating sentences, abstraction involves the task of paraphrasing using novel sentences. Progress has been made in the news domain recently, perhaps owing to the abundance of corpora on which algorithmic systems can be trained. But robust summarization of most other writing forms remains an unsolved problem.

Motivated by this, a team at Google Brain investigated an abstractive summarization system dubbed SummAE that’s largely unsupervised, meaning it’s able to generalize from a small amount of training data to unseen textual examples. While it couldn’t summarize beyond single five-sentence paragraphs, the researchers claim it “significantly” improves upon the baseline and represents a “major” step in the direction of human-level performance.

Recommended videosPowered by AnyClip
Go Eat A McRib
Play

Unmute
Duration
0:59
/
Current Time
0:17

Fullscreen
Up Next

NOW PLAYINGGo Eat A McRib
Scientists Discover What Makes 'Water Bears' Virtually Indestructible
Doctor diagnoses his own cancer with an app
There's A Bigger Danger To Pedestrians Than Walking While Distracted
Prince Harry to edit National Geographic's Instagram
The Secret Culprit Of America's Student Debt Crisis
5 Quotes About The Power of Books

The data set and code are freely available on GitHub, along with the configuration settings for the best model.

“As one of the very first works approaching single-document [abstract summarization], we propose a novel neural model — SummAE,” wrote the coauthors. “[We believe it] is therefore desirable to have models capable of automatically summarizing documents abstractively with little to no supervision.”

SummAE contains a denoising autoencoder that encodes (that is, generates numerical representations of) sentences and paragraphs of the target text in a shared space. Guided by a decoder whose input is prepended with a token signaling whether to decode a sentence or a paragraph, the system generates summaries by decoding each sentence from the encoded paragraphs.

The researchers discovered that most traditional approaches to training the auto-encoder resulted in long, multi-sentence summaries. To encourage it to learn higher-level concepts disentangled from their original expression, the team employed two denoising approaches — randomly masking tokens and permuting the order of sentences within paragraphs — that increased the number of training examples substantially. They also experimented with an adversarial critic component that could distinguish between sentences and paragraphs, in addition to two pretraining tasks that encouraged the encoder to learn how sentences narratively followed within a paragraph.

The researchers trained three different variations of SummAE on the ROCStories, a corpus of self-contained, diverse, non-technical, and concise prose. They split the original 98,159 training stories into three separate collections — a training set, a validation set, and a test set — and collected three human summaries each for 500 validation examples and 500 test examples.

After 100,000 training steps with pretraining, the team reports that the best model significantly outperformed a baseline extractive sentence generator on the Recall-Oriented Understudy for Gisting Evaluation (ROUGE), a set of metrics devised to evaluate automatic summarization. Moreover, they say that in a qualitative study involving evaluators recruited through Amazon’s Mechanical Turk, volunteers rated one of the three SummAE models’ summaries “fluent” and “information-relevant” 80% of the time.

“The paragraph reconstructions show some coherence, although with some disfluencies and factual inaccuracies that are common with neural generative models,” wrote the coauthors. “Since the summaries are decoded from the same latent vector as the reconstructions, improving them could lead to more accurate summaries.”
AI Disclaimer: An advanced artificial intelligence (AI) system generated the content of this page on its own. This innovative technology conducts extensive research from a variety of reliable sources, performs rigorous fact-checking and verification, cleans up and balances biased or manipulated content, and presents a minimal factual summary that is just enough yet essential for you to function as an informed and educated citizen. Please keep in mind, however, that this system is an evolving technology, and as a result, the article may contain accidental inaccuracies or errors. We urge you to help us improve our site by reporting any inaccuracies you find using the "Contact Us" link at the bottom of this page. Your helpful feedback helps us improve our system and deliver more precise content. When you find an article of interest here, please look for the full and extensive coverage of this topic in traditional news sources, as they are written by professional journalists that we try to support, not replace. We appreciate your understanding and assistance.
Newsletter

Related Articles

0:00
0:00
Close
Satirical Sketch Sparks Political Spouse Feud in South Korea
Indonesia Quarry Collapse Leaves Multiple Dead and Missing
South Korean Election Video Pulled Amid Misogyny Outcry
Asian Economies Shift Away from US Dollar Amid Trade Tensions
Netflix Investigates Allegations of On-Set Mistreatment in K-Drama Production
US Defence Chief Reaffirms Strong Ties with Singapore Amid Regional Tensions
Vietnam Faces Strategic Dilemma Over China's Mekong River Projects
Malaysia's First AI Preacher Sparks Debate on Islamic Principles
Meta and Anduril Collaborate on AI-Driven Military Augmented Reality Systems
Russia's Fossil Fuel Revenues Approach €900 Billion Since Ukraine Invasion
Alcohol Industry Faces Increased Scrutiny Amid Health Concerns
U.S. Goods Imports Plunge Nearly 20% Amid Tariff Disruptions
Italy Faces Population Decline Amid Youth Emigration
Trump Accuses China of Violating Trade Agreement
OpenAI Faces Competition from Cheaper AI Rivals
Foreign Tax Provision in U.S. Budget Bill Alarms Investors
Russia Accuses Serbia of Supplying Arms to Ukraine
Gerry Adams Wins Libel Case Against BBC
EU Central Bank Pushes to Replace US Dollar with Euro as World’s Main Currency
U.S. Health Secretary Ends Select COVID-19 Vaccine Recommendations
Trump Warns Putin Is 'Playing with Fire' Amid Escalating Ukraine Conflict
India and Pakistan Engage Trump-Linked Lobbyists to Influence U.S. Policy
U.S. Halts New Student Visa Interviews Amid Enhanced Security Measures
Trump Administration Cancels $100 Million in Federal Contracts with Harvard
SpaceX Starship Test Flight Ends in Failure, Mars Mission Timeline Uncertain
King Charles Affirms Canadian Sovereignty Amid U.S. Statehood Pressure
EU Majority Demands Hungary Reverse Anti-LGBTQ+ Laws
Top Hotel Picks for 2025 Stays in Budapest Revealed
Iron Maiden Unveils 2025 Tour Setlist in Budapest
Chinese Film Week Opens in Budapest to Promote Cultural Exchange
Budapest Airport Launches Direct Flights to Shymkent
Von der Leyen Denies Urging EU Officials to Skip Budapest Pride
Alcaraz and Sinner Advance with Convincing Wins at Roland Garros
EU Ministers Lack Consensus on Sanctioning Hungary Over Rule of Law
EU Nations Urge Action Against Hungary's Pride Parade Ban
Putin's Helicopter Reportedly Targeted by Ukrainian Drones
U.S. Considers Withdrawing Troops from Europe
Russia Deploys Motorbike Squads in Ukraine Conflict
Critics Accuse European Court of Human Rights of Overreach
Spain Proposes 100% Tax on Non-EU Holiday Home Purchases
German Intelligence Labels AfD as Far-Right Extremist
Geert Wilders Threatens Dutch Coalition Over Migration Policy
Hungary Faces Multiple Challenges Amid EU Tensions and Political Shifts
Denmark Increases Retirement Age to 70, Setting a European Precedent
Any trade deal with US must be based on respect not threats', says EU commissioner
UK Leads in Remote Work Adoption, Averaging 1.8 Days a Week
Thirteen Killed in Russian Attacks Across Ukraine
High-Profile Incidents and Political Developments Dominate Global News
Netanyahu Accuses Western Leaders of 'Emboldening Hamas'
Ukraine and Russia Conduct Largest Prisoner Exchange of the War
×