The tech world has been having a blast ever since the best generative AI writing tool, ChatGPT, came out last fall. We're starting to see a lot more AI-written content – and it's getting harder and harder to tell which pieces have been actually written by humans.
Does it matter? It really depends on who's asking.
Good content is good content no matter where it comes from. But beyond the actual quality of writing, ethical concerns of using AI to write have definitely been on the rise.
A few months ago, Google announced guidelines prohibiting AI writing. A few months after that, they revised them again to prohibit spammy-AI content, not standalone AI. And that's a pretty big difference.
Beyond the realm of Google, how are you able to tell?
Some articles, essays, and reports have been slipping through the cracks and publishing robotic misinformation all over the web... and it's quite concerning. I've worked with writers who try to pass off AI-generated essays as their own when I clearly asked them not to use AI.
Not all AI content is bad, and not all bad content is written by AI, but there's definitely a difference between AI-produced fluff and expertly written human content.
I won't sit here and say I hate ChatGPT. It's actually quite the opposite – I use it every day.
But I think it should be up to the publisher to determine where they want their writing to come from. Don't use ChatGPT to complete a business report with sensitive company data if you aren't allowed to, but go ahead & use it to speed up a description for a new product you're inventing. It's becoming a very interesting ethical issue.
With the release of GPT-4, it got even harder to decipher AI. There's no one-size-fits-all approach to predicting AI, especially since it's not definitive. You can't "detect" AI, you can only predict it.
At the end of the day you're just looking at words on a piece of paper, right?
I mean – sometimes it can be pretty obvious. If you forget to tweak some things and submit your robot writing, you might just end up getting made fun of on Twitter.
I've been messing around with generative AI tools like ChatGPT since the first day it came out and sometimes there are some pretty obvious tell-tale signs of AI-generated content.
It's not always direct, and it's definitely not provable, but sometimes it could be helpful to get an idea of where something came from.
Let's go over some technical and non-technical things to look for when trying to check if something was written or generated with AI.
The 2023 Artificial Intelligence Boom
So much buzz about AI these days. You really can't go past a few tweets before running into an announcement about the next magical AI platform that can customize your entire wardrobe, redesign your bedroom, or generate some realistic avatars. These tools are nothing short of amazing & are only really just the start of something bigger.
We're starting to see artificial intelligence tools transition from fun & friendly gimmicks to powerful, complex tools that are literally changing the way we live and work. With that comes a whole new set of challenges, one of which is the rise of AI-generated, spammy content.
If you're a writer, you might be thinking "Great, another thing I have to compete with." And you're not necessarily wrong. AI content is getting better and better and it's only going to continue to improve. As an avid copywriter, I've started to see the uptick in AI-generated content and figured I'd start investigating ways to sort through it.
It's not just blog posts – AI is now being used to generate everything from school research papers to e-commerce product descriptions and even chunks of code. And as AI gets better and better at imitating human writing, it's getting harder and harder to tell the difference between what's been written by a machine and what's written by a human.
But what's the point?
If an AI can write an article that looks and feels just like one that's been written by a human, why does it matter? Well it really depends on the context. I personally don't have an issue with it, as I've used AI to help me write things faster & more in-depth than anything I could do on my own.
On the other hand, in a world where anyone can say anything, it's important to be able to spot the difference between fact and fiction. I've seen blogs writing product reviews (clearly using AI) with fake facts, numbers, and pricing. It's like someone just wanted to rank on Google. Not so good.
So if you're suspicious, how can you predict the content you're reading is the real deal?
How To Tell If An Article Was Written With AI
Beyond the realm of Google, academics & other professionals have seen a huge surge in AI-generated content. So, whether you've come across content in an academic, professional, or casual setting, you might want a way to validate if certain content was written by another human.
Detecting AI-generated content requires multiple samples of writing, various tools and methods, and still involves an aspect of luck. Don't rely on a single method of AI content detection to claim something was written with AI. These are really just contextual guesses!
After months of manually analyzing content, I still find myself getting stumped depending on the complexity of the AI used.
While I thought most AI tools couldn't write past an undergraduate college level, my mind was changed after seeing GPT-4. Luckily, there are a few tools & manual methods you can use to help determine if a piece of text was assisted by an AI.
Here are my personal best tips and tools to spot AI content in 2023:
Method 1: Using Undetectable AI's Multi-Detection Tool
The first tool we'll go over to help predict if something was written with AI is called Undetectable AI. The tool works by checking content through a fine-tuned model that’s been trained off batched documents submitted to each of the AI detectors they feature (Originality, GPTZero, etc).
Behind the scenes, the tool uses that training to assign a likelihood that each detector would flag the content you tested.
So when using Undetectable, the tool basically checks the likelihood of returning positive for AI writing across 8 different detectors at once.
It's not conclusive, but it's a very helpful prediction. None of these tools should be treated as conclusive, and decisions should not be made based on them alone, but they're definitely helpful for giving context that otherwise wouldn't be known.
To use Undetectable's AI Checker, paste your sample of writing inside the input box & submit it for testing! You'll see results from popular detection tools like GPTZero, Writer, Crossplag, Copyleaks, Sapling, Content At Scale, Originality, and ZeroGPT.
Did I mention the tool is free?!
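If you end up with scores from several detectors, it can help to look at them together instead of fixating on one. Here's a minimal Python sketch of that idea. The scores are made-up numbers I typed in by hand for illustration, and the 0.5 cutoff is my own assumption, not something any of these tools publish.

```python
from statistics import mean, median

# Hypothetical scores (0.0 = human, 1.0 = AI), copied by hand from each
# detector's results page. These numbers are invented for illustration.
scores = {
    "GPTZero": 0.12,
    "Writer": 0.30,
    "Crossplag": 0.08,
    "Copyleaks": 0.55,
    "Sapling": 0.20,
    "Content at Scale": 0.15,
    "Originality": 0.70,
    "ZeroGPT": 0.10,
}

def summarize(scores, flag_at=0.5):
    """Summarize several detector scores instead of trusting any single one.

    flag_at is an assumed cutoff, not an official threshold.
    """
    flagged = sorted(name for name, s in scores.items() if s >= flag_at)
    return {
        "mean": round(mean(scores.values()), 3),
        "median": round(median(scores.values()), 3),
        "flagged_by": flagged,
        "agreement": len(flagged) / len(scores),
    }

summary = summarize(scores)
print(summary)
```

If only one or two detectors out of eight flag a sample, I'd treat that as weak evidence at best.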
Method 2: Originality.ai Detector + Text Visualizer (paid)
If you want to go a step further than testing your article across various detection tools, you could use Originality AI to both check & visualize the writing. Originality is the harshest AI detection software I've ever used.
It really aims to crack down on AI-generated writing, but can over-diagnose writing (false positives) more than other tools. If you input AI writing, it's almost certainly going to mark it as AI.
The text visualizer feature is what sets it apart from many other AI writing detectors. If you have writers, check their writing with Originality & then rebuild the article using their visualizer.
This will only work if you're checking writing in Google Docs. But if you are, you can use their Chrome extension to "rebuild" the article and see how it was written. It looks something like this:
Combine this with their copy/paste detection tool and you'll have some really good intuition as to the origins of your suspected writing. In the example above, I actually gave a task to a writer I hired and they used AI to generate about half of it. You can see it clearly when things get copy/pasted before getting tweaked.
Originality uses a combination of GPT-4 and other natural language models (all trained on a massive amount of data) to determine if content seems predictable. Originality seems to be the only AI content detection tool that works well and accurately for both ChatGPT & GPT-4 (the most advanced generative language tools available to the public).
With pricing starting at $0.01 per 100 words, it's pretty reasonable if you're looking for a more professional, industry-level content detection checker. I've had good luck with it and will continue to use it when checking production-level copy.
You can visualize writing like before or can simply paste your text into the input box like all of the other tools. As a bonus feature, plagiarism also gets detected by default.
Remember, a 5% AI score doesn't mean 5% of the sample was written with AI. It means the detection tool puts the chance that the sample was written with AI at about 5%: out of 100 samples scoring like this one, it would guess roughly 5 were AI. Teachers have been getting these percentage values confused and it's ended up getting students in trouble. Not good...
Regarding plagiarism, it's also very impressive. Originality was able to find the exact blog I "copied" the content from and marked the text as being copied from a website (this one!!!). I was honestly impressed at how quickly it was able to find this article. For what it's worth, combining AI detection with a plagiarism checker is an additional measure to be even more confident about the origins of written content.
For anyone looking to automate and easily test writing, Originality has been my go-to tool. Unlike Undetectable.ai, Originality is for more in-depth content checkers.
They will also keep your scans saved in your account dashboard for easy access in the future.
Please remember, nothing is truly definitive and I want to stress that. These tools are all predictors. But to increase the confidence in predictions, you should be using multiple sources to test, validate, and visualize your suspected text – and Originality is currently the best at going in depth.
Acceptable Detection Scores
According to the CEO of Originality, if content is consistently scoring under 10%, it is almost certainly in the clear! Only when content rises close to 40 or 50% AI should you begin to get suspicious about its origins.
The longer the sample you input, the more reliable the detection tends to be (larger sample sizes = more reliable detection), though reliable doesn't mean accurate! Additionally, the more content you scan by the same writer, the better idea you'll have when deciding if their writing is legitimate.
Just be careful as some results end up with false positives and false negatives. It is far better to review a series of articles and make a call on a writer/service compared to passing judgement on a single article or text snippet.
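To make that rule of thumb concrete, here's a small Python sketch that labels scores using the 10% and 40% marks mentioned above and then looks at a whole series of articles from one writer. The label names and the sample scores are my own placeholders, not anything from Originality.

```python
from statistics import mean

# Thresholds from the rule of thumb above; the label names are my own.
CLEAR_BELOW = 10       # consistently under 10% AI: almost certainly fine
SUSPICIOUS_FROM = 40   # close to 40-50% AI: start looking closer

def label(score_percent):
    """Turn one detection score (0-100) into a rough label."""
    if score_percent < CLEAR_BELOW:
        return "likely fine"
    if score_percent >= SUSPICIOUS_FROM:
        return "worth a closer look"
    return "inconclusive"

def review_writer(scores_percent):
    """Judge a series of articles by one writer, not a single article."""
    labels = [label(s) for s in scores_percent]
    return {
        "average": mean(scores_percent),
        "share_suspicious": labels.count("worth a closer look") / len(labels),
        "labels": labels,
    }

# Hypothetical scores for five articles from the same writer.
report = review_writer([4, 7, 62, 9, 3])
print(report)
```

One high score out of five is exactly the kind of result where I'd check more samples before making a call.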
Checking Entire Sites
If there is a pattern of consistently high or low detection scores, that should be your largest indicator of AI-written content. One single article is not enough proof to determine if an entire website (or a set of documents) has been written with AI assistance. It's also important to take these detection tools with a grain of salt (I can't stress this enough!). Checking more articles from one source gives you a greater statistical sample, but so many factors go into detection beyond what a website can do. Some of these factors include syntax, repetition, and lack of complexity, which we'll get into below. Originality recently introduced a tool to check entire websites at once.
Method 3: Using GPTZero (very careful & accurate detection)
I like GPTZero because they seem to be one of the only AI detection companies that care about what they flag. While they can't promise 100% accurate detection, they only tend to mark something as AI if they're confident about it.
They tend to focus more on academic and educational writing, with a goal of being used in the classroom. I use this on my casual articles since it goes the most in-depth. Undetectable is great for briefly checking across multiple tools, Originality is wonderful for visualizing, but GPTZero is my favorite detector.
The tool is run by a team of talented ML & software engineers and built on 7 "components" of tech, likely making it the most accurate and reliable AI detection tool that is publicly available today. Check out GPTZero for free & try it on a bunch of different types of content (you can also upload files directly).
If you try to enter the paragraph above into GPTZero, you'll get a 0% chance of AI (which is true... I'm sitting on the couch writing this right now and I'm pretty sure I'm not a robot). It's pretty impressive that it definitively knows there's a 0% chance that AI wrote it. Not even a little bit.
Method 4: Content at Scale AI Detector (casual writing & free)
The team over at Content at Scale released a free AI detector that is hands-down the best tool for quickly detecting AI writing. The tool is trained on billions of pages of data and can test up to 25 thousand characters at a time (that's nearly 4000 words!)
To use the tool, paste writing into the detection field and submit it for detection. In just a few seconds you'll see a human content score (indicating the likelihood that a sample of text was written by a human) and you'll also see a line by line breakdown highlighting what parts of your content have been flagged as suspicious or blatant AI.
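If you're checking something longer than the 25,000 character limit mentioned above, you'll have to split it up first. Here's a rough Python sketch of one way to do that, breaking on paragraph boundaries so sentences stay intact. The function name is my own, and it assumes paragraphs are separated by blank lines.

```python
MAX_CHARS = 25_000  # per-submission limit mentioned above

def split_for_detector(text, limit=MAX_CHARS):
    """Split text into chunks no longer than `limit`, breaking on paragraphs.

    Assumes paragraphs are separated by a blank line.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # A single oversized paragraph is cut hard into limit-sized pieces.
        pieces = [para[i:i + limit] for i in range(0, len(para), limit)] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks

# Example: six paragraphs of 10,000 characters each.
long_text = "\n\n".join(["a" * 10_000] * 6)
chunks = split_for_detector(long_text)
print(len(chunks), [len(c) for c in chunks])
```

Keep in mind that shorter chunks give the detector less to work with, so I'd keep each chunk as close to the limit as the paragraphs allow.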
A big part of how AI prediction works is by trying to recreate patterns. Patterns are great indicators because AI generators are literally trained on recognizing them to produce what "fits" existing patterns the best. The more your text matches existing formats of writing, the higher the probability it was generated.
Below are two screenshots comparing a ChatGPT output with human writing. After testing, you'll also see a predictability, probability, and pattern score. These scores are a simplified explanation of what's going on behind the scenes. Human-produced writing is not very predictable because it doesn't always follow patterns. AI writing is the opposite: it only knows patterns.
Read these two excerpts and see if you can spot the difference in the writing. The first one seems very professional, but you can almost feel what the next sentence is going to be about. The human result is a lot more sporadic. It's still good writing – it's just got more creativity in it. Check out Content at Scale if you want a highly accurate, line by line explanation of what's going on.
Method 5: CopyLeaks AI Detector
A recent AI detector that's popped up with really great accuracy has been Copyleaks. The detector alerts you if it believes something is AI written or human-generated with not much else. You could hover over sections of text that you think are suspicious (especially text highlighted in red) and you'll be able to see a percentage breakdown. The tool supports GPT-4 and has 2 detection models, basic & enhanced.
They also have a free Chrome extension to check directly within your browser. The tool is free to use for checking individual instances of AI writing but requires a paid plan if you're looking to use an API to scan tons of documents in a short period of time.
If you switch to the enhanced model, you'll be asked to sign in (or create an account). It doesn't seem to change anything on the surface or describe how anything is AI. It could simply be a funnel to get people to sign up. I'd stick to their basic tester for general AI-writing related testing.
Method 6: Giant Language Model Test Room (but it's GPT-2)
Three researchers from the MIT-IBM Watson AI Lab and Harvard NLP group created a great free tool to help detect machine-generated text content named the Giant Language Model Test Room (or GLTR, for short). GLTR is currently the most visual way to predict if casual portions of text have been written with AI. To use GLTR, simply copy and paste a piece of text into the input box and hit "analyze." This tool was built with GPT-2, meaning it won't be as well tuned to GPT-3 or GPT-4 content as a tool built on those models would be. But it still works as a decent way to look for easily generated content.
The tool will give you a prediction of how likely it is that the text was generated by an AI. If you want to learn more about the technical details behind GLTR, you can read more on their official website. Each word is analyzed by how likely each word would be the predicted word given the context to the left. If the word is within the top 10 predicted words, the background is colored green, for the top 100 it will shade yellow, the top 1000 red, otherwise violet. If you see content filled with a lot of green, it's likely generated by an AI.
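The color rule described above is simple enough to sketch in a few lines of Python. This is only the bucketing step, not GLTR itself: getting the actual ranks requires running a language model over the text, so the ranks below are hypothetical numbers I made up.

```python
from collections import Counter

def gltr_color(rank):
    """Map a word's rank in the model's predictions to a color bucket."""
    if rank <= 10:
        return "green"
    if rank <= 100:
        return "yellow"
    if rank <= 1000:
        return "red"
    return "violet"

def color_profile(ranks):
    """Share of words in each bucket for a list of 1-based ranks."""
    counts = Counter(gltr_color(r) for r in ranks)
    return {c: counts[c] / len(ranks) for c in ("green", "yellow", "red", "violet")}

# Hypothetical ranks for a ten-word sentence (1 = the model's top guess).
ranks = [1, 3, 2, 45, 1, 7, 1200, 2, 5, 300]
profile = color_profile(ranks)
print(profile)
```

A sample where the green share is very high is one where almost every word was among the model's top guesses, which is the pattern to watch for.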
Here's a side-by-side comparison of an excerpt of an article written by an AI and one written by a human. You can see that the AI-generated text is much more green than the human-written text.
Again, not foolproof, but a somewhat good indicator. I'd say GLTR is a great visual tool we have to determine AI content, but it doesn't give you an exact score. It's not declarative (take that as you wish). You won't get a percentage or number saying "yeah this is probably AI." By simply pasting a group of text, you can get a good idea of how likely it was written by an AI, but the final call should be based upon your own judgement. Want to see it used compared to Jasper, Hyperwrite, and Lex? Check out this video we made:
Method 7: Writer.com AI Content Detector
Although the parameters for detecting AI content are unclear, Writer.com offers a free and extremely simple AI writing detection tool. You can check text by URL or paste writing directly into their tool to run scans. I've had good success with it but struggle to find out how they determine what gets flagged.
The detector lets you check up to 1500 characters at a time for free, whenever you want. It does fairly well at detecting ChatGPT-generated writing.
Method 8: Technical & Syntactical Signs
The next way to tell if a piece of content has been generated by an AI is to look at the technical aspects of the writing. This isn't as concrete & may seem obvious, but if you're having trouble with the previous tools or just want to further break down writing you've come across, you should look deep at the content. Here are a few things to look for:
1. Watch out for Transitional Words. ChatGPT loves to use transitional words. Every few lines it'll insert another one. Things like Furthermore, Additionally, Moreover, Consequently, and Hence are frequently written but don't always appear in human writing. We don't really "transition" our writing unless it's something more formal or professional.
2. Big vocab words are suspicious. Utilized, implemented, elucidated, and ascertained are often overused, but what human talks like that in a general article they would write? In human conversations, simpler terms like used, explained, and found are more common and relatable.
3. Length of extensive sentences: AI-generated content often includes very short sentences. This is because the AI is trying to mimic human writing, but it hasn't quite mastered extensive sentence complexity as of yet.
This is painfully clear if you're reading a technical blog about something that requires code or step-by-step instructions. We're not at the point where AI can pass that Turing test just yet.
If you've tested content using one of the detection tools and if content is creative & unique, I'd say it's in the clear. It's the technical content that comes off as confidently fishy that you need to look further into.
4. Repetition of words and phrases: Another way to spot AI-generated content is by looking for repetition of words and phrases. This is the result of the AI trying to fill up space with relevant keywords (aka – it doesn't really know what it's talking about).
So, if you're reading an article and it feels like the same word is being used over and over again, there's a higher chance it was written by an AI. Some of the spammy AI-generation SEO tools love keyword-stuffing articles. Keyword stuffing is when you repeat a word or phrase so many times that it sounds unnatural.
Some articles have their target keyword in what feels like every other sentence. Once you spot it, you won't be able to focus on the article. It's also extremely off-putting for readers.
5. Lack of analysis: Another way to tell if an article was written by an AI is if it lacks complex analysis. This is because machines are good at collecting data, but they're not so good at turning it into something meaningful.
If you're reading an article and it feels like it's just a list of facts with no real insight or analysis, there's an even higher chance it was written with AI. With ChatGPT, we're nearing the point where AI is able to start to analyze writing, but I still find responses to be very "robotic."
People are starting to use AI to reply to tweets but don't realize how painfully cookie-cutter their responses are! You'll notice AI generated writing is a lot better for static writing (like about history, facts, etc) compared to creative or analytical writing. The more information a topic has, the better AI can write & manipulate it.
6. Inaccurate data: This one is more common in AI-generated product descriptions, but it can also be found in blog posts and articles. Since machines are collecting data from various sources, they sometimes make mistakes. If a machine doesn't know something but is required to give an output, it'll predict numbers based on patterns (which aren't accurate). This happens all the time and is (in my opinion) the easiest predictor of AI.
So, if you're reading an article and you spot several discrepancies between the facts and the numbers, you can be very confident that what you just read was written using AI. If you come across spammy content, report it to Google. Save someone else the pain of having to waste their time reading something that is clearly inaccurate!
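A few of the signs in this list (transitional words, big vocab words, sentence length, and repetition) are easy to count with a short script. Here's a minimal Python sketch. The word lists come from the examples above, the stopword list is my own small placeholder, and none of these counts prove anything on their own.

```python
import re
from collections import Counter

# Word lists taken from the examples in the list above.
TRANSITIONS = {"furthermore", "additionally", "moreover", "consequently", "hence"}
FORMAL_WORDS = {"utilized", "implemented", "elucidated", "ascertained"}
# A tiny placeholder stopword list so common words don't count as repetition.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "for", "on", "with"}

def writing_signals(text):
    """Count a few surface-level signals; these are hints, not proof."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    content = [w for w in words if w not in STOPWORDS]
    return {
        "transition_words": sum(w in TRANSITIONS for w in words),
        "formal_words": sum(w in FORMAL_WORDS for w in words),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "most_repeated": Counter(content).most_common(3),
    }

sample = ("Furthermore, the product is utilized daily. Moreover, the product is great. "
          "Hence, the product is recommended.")
signals = writing_signals(sample)
print(signals)
```

I'd only use numbers like these to compare several pieces by the same writer, since plenty of humans write in short, formal sentences too.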
Method 9: Verify Your Sources & Author Credibility
This one might seem a bit unnecessary for a single blog, but it's still worth mentioning. If you're reading an article and the domain seems to be randomly associated with the content posted, that's your first red flag. But more importantly, you should check the sources that are being used in the article (if any). If an author is using sources from questionable websites or simply declares things without any source, either the author isn't doing their research or they could simply be automating a bunch of AI-generated content.
If you're trying to check an article on Google, click the menu and see all the information Google has on the site. Here's what that looks like for us:
You can see we were indexed by Google about 2 years ago, but Google doesn't really know too much about us yet. Combine this with your own judgement to make your decision if something seems to be trustworthy.
OpenAI Discontinued Their Official AI Detector
The company behind the madness themselves, OpenAI, released a tool a few months ago to help detect AI writing. Using the official tool, OpenAI had initially claimed only 26% of AI-written samples they tested were identified properly as AI.
The online marketing & writing community had some doubts about the tool's accuracy, and it seems like they were actually correct, as OpenAI discontinued & removed their own AI detection tool from the website on July 20th, 2023:
As of July 20, 2023, the AI classifier is no longer available due to its low rate of accuracy. We are working to incorporate feedback and are currently researching more effective provenance techniques for text, and have made a commitment to develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated.
Source: https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text
My initial thought on the detection tool was that it really looked like a coin toss. I tested many outputs from ChatGPT and got "unable to tell" and "unlikely written by AI." I never used the tool after that.
Gold Penguin's AI Detection Tool
A few weeks ago I got together with a development team and had them create us our very own AI detection tool. I was sick of using tools that over-detected a lot of writing. If it's THAT hard to decipher if something was written with AI or not – I'll just leave it as it is. I didn't want anything to get detected when it wasn't, even if that meant I would let some actual AI get through. But that's fine, this technology can't accurately detect everything anyways.
The tool is free and, like every other tool, should only be taken with a grain of salt. It's great for letting you know if something is OBVIOUSLY AI, but for more intricate checks you should probably use another tool.
Other Online Detection Methods
Beware of random websites that claim they'll check if your content is AI-generated. If you're looking for AI-content detection tools, ensure that they describe how they are checking content – because "AI detection" doesn't mean anything by itself!
Final Thoughts & What's Next?
It's not the easiest to tell if an article was written by an AI because you truthfully can't be sure. To make matters worse, AI just gets so much better each day. What is GPT-5 going to look like in a few months? I can't even imagine.
That said, if you're questioning whether or not an article was written by an AI, your best bet is to use a combination of all of these tools combined with your own judgement. Test multiple papers by the same author for further reliability.
Remember to take the results you see with a grain of salt. Nothing you see is conclusive in any way, shape, or form since there's no concrete way to detect AI. Keep in mind that what you're working with leaves no watermark. You're just looking at words on a screen.
Hopefully these new tools benefit us by allowing skeptics to filter out AI-generated content across the internet, news, and within school systems across the world.
As AI becomes more sophisticated and the line between human and machine-generated content becomes increasingly blurry, it's only a matter of time until we reach the point where AI-generated content becomes indistinguishable!
Let's see what the next few months have in store for us all...