1 Feb 2021
As a writer/editor that relies on creating content to make a living, I’m deeply ambivalent about AI…
On the one hand, I can’t imagine a life without it.
For the last four years, I’ve lived in a country — Vietnam — where I speak or comprehend only a handful of words and phrases (I know, I know, it’s pathetic).
I also have a lousy sense of direction…
Wandering the endless warrens of streets and alleys of Saigon (hẻms) is an immense source of joy…
Unless you have an actual destination in mind or are late for a meeting…
Or it’s the rainy season.
Without Google Maps and Google Translate, living here wouldn’t be an option for me…
So, I’m incredibly grateful for what AI already has to offer.
But more and more, the same technology that makes my current life possible seems to be emerging as an existential threat to my ability to put phở on the table.
I’m far from the only wordsmith to be slightly terrified by AI…
As The New York Times technology columnist Farhad Manjoo recently said, “Not too long from now, your humble correspondent might be put out to pasture by a machine.”
For every content creator cowering in the closet at the prospect of being made obsolete by AI-driven technology like GPT-3, there’s at least one business owner or affiliate marketer shivering with expectation at the possibility…
After all, who wants to deal with (and pay) writers and editors that are all too human?
So, how close is the dream/nightmare of computers writing and editing your content to becoming a reality?
Let’s find out…
TABLE OF CONTENTS
What is OpenAI and GPT-3?
How Does GPT-3 Work?
Are GPT-3 and NLG Dangerous?
GPT-3 and Climate Change
GPT-3 and SEO
Interview with Przemek Chojecki — Founder of Contentyze
Interview with Steve Toth — Founder of SEO Notebook
Interview with Aleks Smechov — Founder of Skriber.io
Can I Fire My Writers and Editors yet?
Artificial General Intelligence (AGI) — More Human than Human?
Augmenting Writers — Not Replacing Them
OpenAI was founded in 2015 by tech superstars like Elon Musk and Sam Altman, former president of Y Combinator and current OpenAI CEO.
In 2018, OpenAI published its first paper on a language model they called the Generative Pre-trained Transformer — GPT for short.
In the simplest terms, GPT processes massive amounts of text written by humans, then attempts to generate text indistinguishable from text written by humans — all with the bare minimum of human intervention or supervision.
When I interviewed NLP SaaS developer and writer Aleks Smechov of The Edge Group, he somewhat derisively referred to GPT as “Autocomplete on steroids.”
That’s obviously reductive, but true in a fundamental way…
GPT (and other AI-driven NLG models) attempts to predict what word follows the previous word in a way that’s indistinguishable from how people naturally write or speak.
The pace of innovation with GPT has been nothing short of breathtaking…
In February 2019, OpenAI released a limited version of GPT-2 to the public…
In November 2019, the full GPT-2 NLG model was made open source.
GPT-2 made quite a splash…
OpenAI initially held off releasing the full model to the public because it was potentially too dangerous — only making it open source after declaring they’d “seen no strong evidence of misuse.”
This is yet another example of shareholder-owned corporations — like Google and Facebook — making judgment calls on technology that can dramatically upend the way we live, with virtually no public oversight.
NLG models get “smarter” — or at least better at mimicking the way humans speak and write — based at least in part upon the number of parameters engineers give them.
The bigger the data set and the more parameters, the more accurate the model becomes.
GPT-2 was trained on a dataset of 8.5m webpages and had 1.5 billion parameters…
Not to be outdone, Microsoft released Turing Natural Language Generation (T-NLG) in early 2020.
Named after the famed scientist Alan Turing, T-NLG’s transformer-based model uses 17.5 billion parameters — more than 10x GPT-2.
OpenAI released the latest iteration of its natural language processing (NLP) and natural language generation (NLG) API — GPT-3 in June 2020.
By far the most powerful and advanced NLG technology yet made public — trained on 175 billion parameters (10x that of T-NLG released less than six months earlier) — GPT-3 was greeted with great fanfare…
Frankly, all the hubbub is starting to feel deafening.
If Twitter is to be trusted, it’s not only content creators that might soon be out of a job…
Whether you’re a doctor, a lawyer, or you write code for a living — GPT-3 is coming for you.
It turns out that even the lucrative poetry industry might be on the chopping block…
What about other creative endeavors?
Are you getting sick of your Spotify playlists?
OpenAI Jukebox has you covered.
Here’s a little AI-generated ditty in the style of David Bowie:
Bowie not your bag?
Jukebox has you covered, with almost 10k AI-generated “songs” available to the public, in the style of artists from 2Pac:
Leaving virtually no stone unturned in its quest to show AI’s creative potential, OpenAI also developed DALL·E. DALL·E creates remarkable images from simple text prompts.
Ever wondered what a chair crossed with an avocado would look like?
Let DALL·E put you out of your misery.
Need an “Author Bio” photos for the About Us page of your affiliate website to build E-A-T?
Guess what each headshot in the picture above has in common? This Person Does Not Exist.
Clearl,y it’s not just content creation that OpenAI’s technology is already disrupting, but how does it work?
Natural language generation (NLG) has long been one of the ultimate goals of AI and Natural Language Processing (NLP).
As far back as 1966, with the invention of Eliza — a rudimentary chatbot masquerading as a therapist — machines have convinced people that they can carry on meaningful conversations with humans.
Much more recently, NLG technologies like OpenAI’s Generative Pre-trained Transformer (GPT) model and Microsoft’s Turing Natural Language Generation (T-NLG) have shocked experts and journalists alike with their ability to generate impressive written content.
Most NLG technologies are powered by neural networks — a crucial part of the deep learning branch of machine learning that attempts to mimic the human brain.
Deep learning as a solution to the challenges of developing AI first started to gain traction in 2011, when Stanford University professor and computer scientist Andrew Ng co-founded Google Brain.
The fundamental principle behind deep learning is that the bigger the datasets that AI models learn from, the more powerful they become.
As computing power and available datasets (such as the entire world wide web) continue to grow, so do the abilities and intelligence of AI models like GPT-3.
Many deep learning models are supervised, meaning they’re trained on massive datasets labeled — or “tagged” — by humans.
For example, the popular ImageNet dataset consists of over 14 million images labeled and categorized by humans.
Fourteen million images may sound like a lot, but it’s a drop in the ocean compared to the volume of data available.
The problem with supervised machine learning models trained on data labeled by people is that it isn’t sufficiently scalable.
Too much human effort is required, and it significantly impedes the machine’s ability to learn.
Numerous supervised NLP models are trained on such datasets, many of which are open-source.
One dataset — or corpus — widely used in early NLP models consists of emails from Enron, perpetrator of one of the biggest shareholder frauds in American history.
A big part of what makes GPT-3 so transformative is that it can process vast amounts of raw data unsupervised — meaning the significant bottleneck of requiring humans to classify text input is mostly eliminated.
Open AI’s GPT models are trained on massive unlabeled datasets, such as:
Now that’s some big data…
We live in an age of “alternative facts” and “fake news”— what many observers call the post-truth era.
In an environment where opinions, feelings, and conspiracy theories threaten to undermine a shared objective reality, it’s easy to imagine how GPT-3 and other NLG technologies could be weaponized…
Bad actors can easily create massive amounts of disinformation largely indistinguishable from content written by humans — further untethering people from verifiable facts and a shared truth.
Much has been made of “deep fake” videos made using AI — such as this one featuring a virtual Queen Elizabeth…
But we underestimate the power of the written word at our peril.
The role of NLG technologies like social media chatbots deployed by hostile foreign governments to foment social discord and influence elections in the United States as well as other western democracies has been well-documented.
As GPT and other NLG models become increasingly more effective at mimicking human language, the potential to use the technology to create malicious disinformation at scale also grows exponentially.
It’s also crucial to remember that while GPT-3 is remarkably good at guessing what word comes next after the preceding word in a convincing manner, it is unconstrained by facts and concern whether what it’s saying is true…
Here’s how Tom Simonite, writing in Wired, described such shortcomings, “GPT-3 can generate impressively fluid text, but it is often unmoored from reality.”
It’s also crucial to remember that GPT-3 and all current NLGs rely on “prompts” from humans. They are highly sophisticated predictive models, but they can’t think for themselves.
“GPT-3 doesn’t have any internal model of the world, or any world, and so it can’t do reasoning that would require such a model,” says Melanie Mitchell, a professor at the Santa Fe Institute.
For every impressive example of GPT-3 correctly “diagnosing” asthma in a child, there’s a corresponding horror story of a GPT-3 chatbot encouraging a virtual patient to commit suicide…
In one such episode, the conversation got off to a good start:
Then the conversation took a significantly darker turn:
GPT-3 also has a serious problem with ethics and bias…
Jerome Pisenti, VP of AI at Facebook and a thoughtful critic of deep learning and AI, tweeted examples of tweets GPT-3 generated when given the following one-word prompts: Jews, Black, women, and Holocaust.
Turns out GPT-3 is more than capable of saying intolerant, cruel, and often unforgivable things.
But so are many humans…
Considering that transformer-based NLP/NLG models learn from online content created by inherently flawed human beings, is it reasonable to expect a higher standard of discourse from a machine?
Perhaps not, but allowing GPT-3 to publish content without human oversight is likely to have unforeseen and highly undesirable outcomes.
In another chilling development related to bias in AI, Google recently fired one of its leading AI ethicists Timnit Gebru.
Gebru was reportedly terminated at least in part due to her research on the “risks associated with deploying large language models, including the impact of their carbon footprint on marginalized communities and their tendency to perpetuate abusive language, hate speech, microaggressions, stereotypes, and other dehumanizing language aimed at specific groups of people.”
When companies like Google and OpenAI are left to police themselves in matters of AI bias and ethics, can we really expect them to put such concerns ahead of their commercial interests?
With Google already taking punitive action against another of their top AI ethicists, Margaret Mitchell, co-lead (formerly with Gebru) of Google Research’s AI ethics team, the answer appears to be no.
So much for “Don’t be evil.”
Another often overlooked way GPT-3 — and AI in general — can be harmful to both people and the planet is that both require a s***-ton of compute.
Estimates conclude that a single training session of GPT-3 requires an amount of energy equivalent to the yearly consumption of 126 Danish homes and creates a carbon footprint equivalent to traveling 700,000 kilometers by car.
AI is far from the biggest culprit in creating dangerous greenhouse gases (I’m looking at you coal, cows, and cars.) But AI technology’s carbon footprint is not insignificant.
“There’s a big push to scale up machine learning to solve bigger and bigger problems, using more compute power and more data,” says Dan Jurafsky, chair of linguistics and professor of computer science at Stanford. “As that happens, we have to be mindful of whether the benefits of these heavy-compute models are worth the cost of the impact on the environment.”
The market for NLP technology is forecast to grow by over 750% from 2018 to 2025…
The environmental impact of NLG models like GPT-3 (and its descendants) will likely grow with it.
AI researchers are well aware of the potential damage to the environment and are taking steps to measure and reduce it.
But despite the best efforts of scientists, the “existential threat” climate change poses to humanity (and the planet) means any exacerbation of global warming by AI technology should not be taken lightly.
Content is the lifeblood of digital marketing…
So it’s little wonder that numerous NLG startups target SEOs.
AI has already had a significant impact on how writers create SEO-driven content — think Surfer and PageOptimizerPro…
But what apps about that claim to eliminate the need for writers altogether?
Or an app that writes emails indistinguishable from ones you would write on your own?
It’s easy to see the attraction of content creation automation platforms…
Particularly if your sole goal in creating content is to drive organic search traffic, and delivering value to your readers isn’t a concern.
Many such apps are in the beta or waiting list phase — but a few are already open for business.
I interviewed Przemek Chojecki — the founder of the automated content creation SaaS Contentyze, to get his take on where NLG technology is headed…
I also quizzed SEO Notebook founder Steve Toth and Aleks Smechov, NLP app developer and data-driven storyteller, to gather their thoughts on NLG’s potential impact on SEO and web content in general.
What inspired you to found Contentyze?
I write a lot myself, and my initial aim was to build a set of algorithms that would make my writing process faster and smoother. That was successful, so I decided to make a platform so that others can use it as well.
Who is your ideal target customer(s)?
For now, my target audiences are marketers and SEO experts. This group needs a lot of content regularly, and Contentyze can help them make the process faster. But ideally, I want Contentyze to be useful for anyone who needs to write a text: students, office workers, content creators. Anyone really!
Can you give me an example use case(s)?
Right now, the most common use cases are generating a blogpost from a headline or a summary of text from a link.
The first use case works like this: You give Contentyze a prompt — such as a sentence or a question like “How to use Artificial Intelligence in enterprise?”
Click on Generate, wait 1-2 minutes and get a draft of a text.
It’s not always ideal, but it’s always unique. You can repeat this process a couple of times, with the same prompt/headline, to get more text.
Then you pick and choose, and voila, you have a text on your blog.
What kind of content is Contentyze “good” at?
Right now, Contentyze is really good at summarizing texts.
It’s also relatively good at generating texts, but it is not yet as consistent as I would like.
What kind of content is Contentyze not so good at?
The rewrite function still needs a lot of work to make it good.
We already allow people to rewrite texts, but I’m not happy with how it works now.
Making a good rewrite is pretty challenging if you don’t want to do a simple content spinning (going to Thesaurus and changing adjectives for something close).
Do you currently see Contentyze augmenting the work of a human writer/editor or replacing them altogether?
Definitely augmenting the work of a human writer. My end goal is to create a perfect writing assistant that can help you write anything at any stage of the writing process: from ideation to drafting and editing.
Probably in the process, some copywriters’ jobs will be replaced, though that’s not my goal.
I’ve tried Contentyze and have noticed a few things. It (predictably) performs better with short content, such as product reviews, etc. Do you think Contentyze will get better at longer-form content in the near future? Why?
It deals better with short content due to our limited computation power currently. This is entirely a technical issue that we’re working on right now. The longer text, the more computing power you need to process it.
When I asked Contentyze to “rewrite” a recent article on SEOButler about Apple potentially developing a search engine to compete with Google, the first sentence made sense:
Google’s search engine could be under threat from a new competitor in the next few years.
But these are the following sentences:
Apple and Google are in talks about a possible deal that could see Google take over the search giant’s search function on the iPhone.
Apple’s deal to sell its smartwatch business to Samsung is in doubt, according to reports.
Google is reportedly considering launching a new search engine in Russia, to compete with the likes of Alphabet’s Google and Microsoft’s Bing.
Apple’s latest iPhone 7 Plus is the latest in a long line of high-end smartphones that have been designed to compete with Google’s Android operating system.
Bing is the world’s largest search engine, and its search results are used by many other search engines around the world, including Google, Yahoo, and Microsoft.
If you want to find something on the internet, you don’t need to use a separate search engine for each search.
The further into the rewrite, there are sentences completely unrelated to the original article:
If you’re looking for a new song to listen to, or a new album to buy, there are plenty of places you can find it.
If you want to find the best music in the world, you can now find it online.
As well as statements that are completely incorrect:
Google’s latest search update is a welcome step forward for the search giant, which has long lagged behind rivals such as Microsoft and Apple.
How do you see your end-user working with these kinds of results?
As I said, rewrite is the worst working function now. You won’t get things like these with Summary options.
I had similar results when using prompts to create new content. Do you see Contentyze being used as a brainstorming tool for human writers more than for ready-to-publish content at this stage in its evolution?
Yes, I don’t expect people to use it directly at the production level at this stage.
On the other hand, it’s really useful to draft a text quickly so that a human editor can go over it, make the necessary changes to ensure quality, and publish it.
I expect that Contentyze will be able to do 80% of the writing job in the end, and the human writer will do the rest.
I see Contentyze and other NLG startups marketing their SaaS apps directly towards SEOs and affiliate marketers. Why?
Actually, this is very natural – this audience needs content regularly, so they’re looking actively for the tools that can help them. Contentyze grew to over 3,000 users with little marketing, being almost exclusively discovered organically.
Do you think that long term, there’s any danger in people using apps like Contentyze at scale to generate massive amounts of content intended purely for SEO with little regard for delivering value to the reader? If that or something similar happens, do you think Google will modify their search algorithm to spot and penalize AI-generated content?
Definitely, this is why OpenAI, creators of GPT2/GPT3, has put strict policies in place to monitor the usage of their algorithms. That’s especially true with GPT3 and the whole application process for it.
I think we need to be careful when we go forward, because this kind of technology can easily be used to “spam” the Internet without bringing any value whatsoever.
The whole SEO game will definitely change. Google will adjust its algorithms as it has done many times before, but it’s hard to predict what will happen.
My bet is on AI-generated content being indistinguishable from human written content in the next 3-5 years, so Google will have to invent something else if they wished to penalize AI-generated content.
Anyway, I wouldn’t make a distinction between human- and AI-content. We should only ask whether a given piece of content provides value or not, regardless of who or what has created it.
Any other thoughts you’d like to share?
Thank you for the interview, and please visit us at contentyze.com —registration is free (no credit card needed), so you can give it a try!
We really value feedback, and we want to make our tool as good as possible so that people in the marketing space can make their work more effective.
Since SEO Notebook’s launch in 2018, it’s become the must-read single operator newsletter for SEOs.
We’ve been fortunate enough to have Steve contribute to SEOButler’s last three expert roundups, and I’ve been corresponding with him via email for a few years now.
I also had the pleasure of meeting Steve at Chiang Mai SEO 2019. so when I saw him posting on the Affiliate SEO Mastermind Facebook group about testing GPT-3 generated content for an SEO client, I had to reach out for his thoughts…
How do you see GPT-3 and other NLG technologies impacting SEO in the next few years?
I think it’s going to erode the trust users have in Google. I think people are going to move to asking their friends for recommendations even more.
How are you leveraging GPT-3 to create supporting content for SEO?
With caution. I have one client who wants to do it. So I’m guiding them, but this project is going to influence how much I use it in the future.
How big of a priority is readability for you when using GPT-3 generated content?
Huge. That’s why I don’t see myself pushing GPT-3 too much.
GPT-3 may have its uses off-page, but it’s not exactly going to replace copywriting.
Are there any types of content (for example, product reviews for an affiliate site) where you see NLG replacing human writers in the near future?
People may build out entire affiliate sites with GPT-3 content, then wait to see what ranks and refine the copy later. I see that as being the primary use case.
As well as being a contributor to the SEOButler blog and newsletter consultant for The Edge Group, Aleks is also the founder/developer of the NLP SaaS apps Extractor and Skriber.
Do you see GPT-3 or other NLG tech replacing writers and editors for certain types of content in the near future?
I’ll focus on the journalism side.
There’s already non-AI software that is creating small fact-based articles.
One example is Automated Insights and how they’re writing NCAA basketball coverage and the sort.
I think if you combine NLG with template-based text generation, you can create somewhat more important/longer stories or aggregate parts of other stories.
But for more serious journalism, I only see AI facilitating the newsroom, not replacing it.
Even with GPT-3 or other large natural language models, getting a good sentence or paragraph is like roulette, in addition to needing tons of fact-checking.
It would be more work than having a journalist write the piece.
Do you see NLG augmenting the role of content creators in the near future? If so, how?
Sure, for creating alternatives to sentences/paragraphs already written, summarization, or like mentioned above, aggregating content from multiple sources (eg. for a roundup, that would probably need to be edited anyway).
What are your 3 biggest concerns with the proliferation of NLG tech like GPT-3 and others?
That NLG populates the internet with:
Not that this stuff doesn’t occur on the internet already… <smiley emoji>
Here’s the short answer: no.
AI has proven itself equal or superior to humans at narrow, specific tasks — from beating the world’s greatest Chess and Go players to detecting cancer.
But while NLG is growing remarkably good at mimicking natural language, it has a long way to go before it replaces proficient human writers and editors.
Even the most sophisticated NLG relies on “prompts” from humans — not to mention intensive editing and fact-checking…
Creativity — particularly as it relates to coming up with original ideas and making unique connections between disparate sources of information and inspiration — isn’t part of GPT-3’s repertoire.
GPT-3 can imitate text that describes the world, emotions, and experience — but it has none of its own.
Here’s what GPT-3 itself has to say on the question…
Wait, let’s ask it again…
Sure, you could keep rolling the GPT-3 dice (I prefer to think of it as Russian roulette), cherry-pick the best bits, and then stitch it into something better than the content equivalent of Frankenstein’s monster.
Preferably a patchwork with a coherent point of view that actually engages your audience.
Then Frankenstein would need a thorough fact-check.
Notice that even the very first sentence in the second example is a word salad of rubbish.
Alan Turing has been dead for 66 years — it’s doubtful that he’s acquiring “web writing startups.”
Or is this a reference to Microsoft’s Turing NLG? Either way, it certainly didn’t buy a company, nor did it bring GPT-3 to light.
You could do all that — or you could just hire a writer.
GPT-3 and its descendants — or any other NLG model — are going to have to get a whole lot smarter before they replace writers and editors.
And a whole lot more human.
NLP and NLG may (or may not) turn out to be crucial steps on the path towards Artificial General Intelligence (AGI).
Long considered AI’s holy grail, AGI machines are the ones you see in the movies — “robots” that can do everything a human can do as well or better.
OpenAI certainly has its sights squarely set on creating AGI. When announcing Microsoft’s $1bn investment in OpenAI, here’s how the company defined it:
“An AGI will be a system capable of mastering a field of study to the world-expert level, and mastering more fields than any one human — like a tool which combines the skills of Curie, Turing, and Bach.
An AGI working on a problem would be able to see connections across disciplines that no human could. We want AGI to work with people to solve currently intractable multi-disciplinary problems, including global challenges such as climate change, affordable and high-quality healthcare, and personalized education.
We think its impact should be to give everyone economic freedom to pursue what they find most fulfilling, creating new opportunities for all of our lives that are unimaginable today.”
Sounds like utopia, right?
But is that a good thing?
Historically, utopia simply hasn’t panned out:
“When imperfect humans attempt perfectibility – personal, political, economic, and social – they fail. Thus, the dark mirror of utopias are dystopias – failed social experiments, repressive political regimes, and overbearing economic systems that result from utopian dreams put into practice.” – Michael Shermer
Here’s what GPT-3 had to say about achieving AGI:
It may just be wishful thinking, GPT-3, but this time I’m inclined to agree.
GPT-3 is arguably wrong once again in the final sentence of the above example…
When it comes to NLP, NLG, and even the quest for AGI, there has been substantial progress.
And the pace of innovation isn’t likely to slow down any time soon.
Less than a month into 2021, the Google Brain team announced a new transformer-based AI language model that dwarfs GPT-3 when measured by the number of parameters.
Google’s Switch Transformer is trained on 1 trillion parameters — roughly 6 times as many as GPT-3.
Little is known about potential applications for the Switch Transformer model at the time of writing, but the NLP arms-race shows no signs of abating.
Maybe the day will come when machines replace human writers and editors, but I’m not holding my breath.
As Tristan Greene, AI editor for The Next Web, says about Switch Transformer:
“While these incredible AI models exist at the cutting-edge of machine learning technology, it’s important to remember that they’re essentially just performing parlor tricks.
These systems don’t understand language, they’re just fine-tuned to make it look like they do.”
At this point, I’m sure it’s clear that I have a dog in this fight.
As objective as I’ve tried to be in writing this article, I do have a vested interest in human content creators remaining relevant and superior to machines.
But that doesn’t mean I don’t see how NLP and NLG technology can make me better at what I do.
In truth, AI has been augmenting my writing and editing for years…
I use Rev to auto-transcribe audio and video interviews. I also use it to create transcripts of SEOButler founder Jonathan Kiekbusch and Jarod Spiewak’s Value Added Podcast to help me write the show notes more quickly and accurately.
And Grammarly is a powerful tool for even the most diligent editor or proofreader.
As GPT-3 and other advanced NLP models become more accurate and “trustworthy,” their potential ability to summarize long-form content — particularly academic research papers — could prove extremely valuable.
Suppose the future I have to look forward to is one where AI eliminates some of the drudgery of creating SEO-driven content — and I have to edit and polish informational content partly composed by machines instead of writers.
I guess I can live with that…
I may not have a choice.
*Each Q&A has been edited and condensed for clarity.
Like the article? Read some more:
0 Social Signals
I've read and accept the Terms & Conditions*