Why You Keep Getting Hallucinations and Fake Citations From AI

By Chris Bigelow

You might wonder why AI prompting guidance appears on a health and fitness site. Innova Vita Fitness is unique in that we teach comprehensive AI prompting techniques in our flagship health and wellness course because many people now turn to AI for nutrition, training, and wellness advice. While we cover prompting strategies and safety guardrails extensively in our course modules and activities, we want to make some of this knowledge publicly available. AI has become increasingly common in daily life, and understanding how to use it properly directly affects the quality of guidance you receive, especially if you are pursuing your fitness and nutrition goals independently. This article will be fairly short and cover three basic topics around avoiding errors in AI outputs: understanding how fabrications occur, why quality prompting is so important, and how to avoid or spot fabricated citations in AI-generated reports.

The Growing Problem of AI Fabrication

Something I've been hearing consistently from people across many different sectors is how they keep getting completely fabricated information from AI chatbots, including convincing-looking arguments supported by fake references. This isn't some random glitch or occasional error; it's actually pretty predictable. Several specific factors contribute to this issue, and understanding these mechanisms is necessary for anyone wanting to use AI tools effectively.

How Default AI Chat Actually Works

When chatbots operate without web search capabilities (which typically requires manual activation), they generate responses based on patterns learned during training rather than retrieving live information. These systems use sophisticated prediction algorithms to determine the most statistically probable next word or phrase based on your query, continuing this process until they complete their response.
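
To make that idea concrete, here is a toy sketch in Python. This is not how any real chatbot is implemented, and the words and probabilities are invented purely for illustration; it just shows what "picking the most statistically probable next word" means in practice.

import random

# Toy illustration only: a language model picks the next word based on
# probabilities learned during training, not by looking anything up.
# These candidate words and numbers are invented for this example.
next_word_probs = {
    "muscle": 0.45,
    "recovery": 0.30,
    "energy": 0.15,
    "hydration": 0.10,
}

prompt = "Protein intake primarily supports"
words = list(next_word_probs.keys())
weights = list(next_word_probs.values())

# Sample one continuation: likely words come up most often,
# but nothing here involves checking a source.
next_word = random.choices(words, weights=weights, k=1)[0]
print(prompt, next_word)

A real model does this over an enormous vocabulary, one word fragment at a time, until the response is complete.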

This is fundamentally different from searching a database or looking up facts like you might do with a traditional search. The AI isn't retrieving information; it's generating text based on learned patterns. When the system encounters a query where it lacks precise information, it doesn't just shrug. Instead, its programming compels it to generate a plausible-sounding response, which can lead to "hallucinations" where the AI produces factually incorrect or entirely fabricated information.

You can partially address this by explicitly instructing the bot to acknowledge uncertainty or to limit responses to verified information. While this approach isn't perfect, it does appear to reduce fabrication rates in my experience.
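
For example, you might add a line such as: "If you are not certain about a fact, say so explicitly rather than guessing, and only include claims you can support from verified information." That exact wording isn't a magic formula, and it won't eliminate hallucinations, but instructions along those lines give the model permission to admit gaps instead of filling them.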

The Relationship Between Input and Output Quality

Another fundamental limitation lies in the direct relationship between input and output quality. If you ask vague or general questions, you'll typically receive equally generic responses. This is where many first-time AI users become frustrated and abandon the tool, never discovering its true potential, because they keep getting responses that mirror the quality of their questions. Another way of saying this is "garbage in, garbage out."

This is where learning prompt engineering becomes valuable. Fortunately, it's not as complicated as it sounds, but it does take practice because of the precision required. Prompt engineering involves crafting specific queries that include all the necessary context, constraints, and details for your desired output. This actually requires some foundational knowledge about your topic, because you need to understand how to formulate effective questions as well as set effective guardrails. Many people struggle because they don't structure their queries properly, resulting in subpar or incorrect, hallucinated responses. Since these systems predict responses based on your input, misaligned input inevitably produces poor output.

This is why we include engineered prompts in our course for different domains of health and wellness. We assume that most users are beginners both in their health and wellness journey and with AI, and won't know how to frame a question to an AI properly about their nutrition, exercise, sleep habits, and so on. The provided prompts act as training wheels. We also have a dedicated AI module that teaches people how to create their own prompts so they don't have to rely on ours forever (or they can even improve upon them for their individual needs).
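
To see the difference a well-framed question makes, here's an illustrative contrast (the details are invented for the example): a vague prompt like "What should I eat to get fit?" invites an equally generic answer, while something like "I'm a 40-year-old beginner lifting weights three times per week; suggest a daily protein target and three simple meal ideas, and flag anything I should confirm with a doctor or dietitian" gives the model the context, constraints, and guardrails it needs to be useful.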

Why Fake Citations Happen

In default chat mode, AI systems often fabricate studies and references because they learned common academic formatting styles like APA and MLA from their training data. When you ask a chatbot in default mode to "generate a report on protein and muscle protein synthesis," it creates appropriately formatted content with potentially accurate general details derived from its training. However, specific in-text citations and reference lists will likely be fabricated.

The system understands the structure and format of citations but doesn't have real-time access to a database of verifiable academic papers in default mode. When you request "proof" from external sources, it generates plausible-sounding studies that fit the context based on learned patterns rather than retrieving actual existing studies. This fabrication occurs as part of the predictive generation process the AI uses to complete your request.

Practical Solutions for Better Results

The most effective way to avoid fabricated citations is enabling your chatbot's web-browsing or research function. Different platforms use various terms like "browsing," "web access," "search," or "research mode." This feature allows the chatbot to access the internet and retrieve real research studies when properly prompted.

You should also note that not all of these web-enabled features are equal. In my experience, a basic 'web search' often just makes the chatbot pull from Reddit or similar forums, since user-generated content tends to appear first in searches, so you may end up getting junk here too. The one I recommend is any type of deep research mode, like the "research mode" mentioned above. Specifically, in research mode, AI chatbots do very well with prompts that include phrasing like "Base your results on peer-reviewed academic research" and "Provide in-text citations and a reference page in APA format with links to the studies." Enabling a 'web search' function can technically do this too, but I've found that it still tends to pull heavily from non-peer-reviewed, user-generated content, which leads to a weird mishmash of actual quality content and things that might be 'mostly correct' pulled from a forum somewhere.
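
Putting those pieces together, a research-mode prompt might look something like this (treat it as a template to adapt to your own question, not a script): "Using peer-reviewed research only, summarize what is known about protein and muscle protein synthesis. Provide in-text citations and an APA-format reference list with links to each study, and note where the evidence is mixed or limited."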

Even with some form of web access enabled, you shouldn't trust results blindly. Always request links to cited studies in your query so you can verify them independently, as in my example above. Reading the full studies yourself represents best practice, even though it requires time investment and some level of academic training. This verification ensures that study data is being applied correctly to your specific question and that you're not missing important context that abstracts might omit. If you don't have much of an academic background, you can ask the bot to explain statistics from the study to you in plain English too. These tools are actually surprisingly good tutors. The problem is that people often use them just to get answers and do the thinking for them, which is proving detrimental on multiple fronts.
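
A follow-up along these lines works well: "Explain what the effect size and confidence interval in this study mean in plain English, and tell me how much weight I should put on a sample of this size." The exact wording is just a starting point; the point is to ask the tool to explain, not to hand you a verdict.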

Why This Matters for Health and Fitness

When you're making decisions about nutrition, training, or recovery, the quality of your information pipeline directly affects your results. AI can speed up research and help you compare options, but it's still your responsibility to set proper constraints and verify sources. When used correctly, these tools save time and improve consistency. When used carelessly, they can lead you toward ineffective or generic strategies that you could get from the most cursory Google search.

For health-related queries, always prioritize higher-quality evidence sources. Ask AI to focus on clinical guidelines, systematic reviews, and reputable databases like PubMed, Cochrane, WHO, or NIH rather than individual studies or blog posts.
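
In practice, that can be as simple as adding a line such as: "Prioritize clinical guidelines, systematic reviews, and meta-analyses from sources like PubMed, Cochrane, WHO, or NIH, and tell me when you are relying on weaker evidence." Again, this is a template to adapt rather than required wording.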

Moving Forward as a Skilled AI User

Hallucinations and fake citations aren't random technical failures. They're predictable outcomes of how these tools work and how we interact with them. With proper settings, clear constraints, and verification habits, you can avoid most common problems while accessing AI's genuine capabilities.

Many people new to AI make the mistake of treating it like a combination of a genie and a Google search. This misconception leads to poor-quality outputs that overshadow what AI can actually accomplish when used by someone who understands its strengths and limitations. The difference between skilled and unskilled AI use often determines whether someone finds these tools useful or dismisses them as unreliable.

The issues mentioned in this article aren't all the causes of AI problems, but in my experience they account for some of the most frequent issues, and they're among the most easily prevented through improved prompting practices.

If this topic interests you, we have additional articles about AI use specifically in health and wellness contexts. This subject is also covered comprehensively in our flagship health and wellness course, available for both individuals and for licensing to businesses and organizations. Check out the links below for related articles and our course information.

Explore Our Course

Other Articles Related To AI and Health and Wellness

Should We Trust AI for Health and Fitness Advice?

AI and the Future of Personal Wellness: How Can You Use It Today? - This is an older article, but it still has some useful insights if you are new to AI and how it applies to wellness.