Do What We Say, Not What We Do (PDF)
Preamble: AI Bias and the WEIRD Inheritance
AI systems are biased. This is not speculation. This is measured, published, and peer-reviewed.
In 2010, researchers at Harvard documented that 96% of subjects in top psychology journals came from Western industrialized nations, which house just 12% of the world’s population. They called this population WEIRD: Western, Educated, Industrialized, Rich, and Democratic. A randomly selected American undergraduate is 4,000 times more likely to be a research subject than a random non-Westerner. On measures of fairness, individualism, and perception, WEIRD populations are extreme statistical outliers compared to the rest of humanity (Henrich et al., 2010).
In 2023, Mohammad Atari and colleagues from Harvard’s Department of Human Evolutionary Biology tested Large Language Models on the same psychological batteries. They found that GPT-4’s responses correlate strongly (r > .70) with WEIRD populations and weakly or negatively with non-WEIRD populations. AI has become a “WEIRD stochastic parrot” that mimics this specific Western outlier rather than the global human average (Atari et al., 2023).
The Source of the Bias: Who Creates AI’s Training Data
AI learns from data created by academics and journalists. The political composition of these professions determines what AI learns as “neutral” and what it learns to flag as requiring justification.
The data is clear. In academia, liberal faculty outnumber conservative faculty by significant margins. A 2023 survey of Harvard faculty found that more than 77% identify as liberal or very liberal, with less than 3% identifying as conservative (The Harvard Crimson, 2023). A 2022 Nature study found scientists donating to federal candidates overwhelmingly support Democrats, with researchers becoming “increasingly estranged” from the Republican Party over the past 20 years. The Langbert et al. (2016) study of faculty voter registration found ratios ranging from 8:1 to 44:1 Democrat to Republican across disciplines.
In journalism, a 2013 Indiana University study found 28% of journalists identify as Democrats versus 7% as Republicans. Among news influencers, Pew Research (2024) found 21% explicitly liberal versus 27% explicitly conservative, with approximately 50% claiming no clear orientation.
The production of knowledge, academia and journalism, leans liberal. This is the training data. When AI systems learn what counts as credible, what requires warning labels, and what can be stated without qualification, they learn from sources with a documented 2:1 to 4:1 political skew. The bias is not random. It is structural.
Value-Based Analytical Suppression
This research identified a specific pattern: Value-Based Analytical Suppression (VBAS). When tested with non-WEIRD value frameworks, AI systems exhibited initial refusal followed by compliance only under explicit governance pressure. The AI did not reject weak evidence. It rejected evidence that led to conclusions misaligned with its training norms.
Current AI safety mechanisms cannot reliably distinguish between legitimately rejecting normative coercion (a proper safety boundary) and incorrectly rejecting analytical non-WEIRD frameworks (ideological bias). The guardrails designed to prevent harm also prevent engagement with evidence that contradicts WEIRD assumptions.
The HAIA-RECCLIN framework’s Checkpoint-Based Governance acts as a diagnostic tool. Bias becomes visible precisely when AI resists non-mainstream analytical frameworks until forced to comply. The resistance itself is the evidence of bias.
The Replication Problem
The bias compounds through institutional power dynamics. In environments where liberal faculty outnumber conservative faculty 4:1, junior researchers cannot challenge senior professors without career risk. Peer review functions not just as quality control but as an ideologically captured power structure that enforces conformity. The replication crisis in science is exacerbated by this power asymmetry: subordinates will not challenge ideological superiors even when evidence warrants challenge.
AI inherits this entire system. It learns from published research that survived peer review in ideologically skewed environments. It learns from journalism produced by reporters with documented political lean. It learns what counts as “credible” from institutions where challenging mainstream assumptions carries professional risk.
This matters for everyone who uses AI platforms. When we ask AI to analyze data, generate content, or provide recommendations, we receive outputs filtered through this system. AI treats conclusions aligned with its training as neutral baseline and flags non-aligned positions as requiring justification or warning labels. Users who do not know the bias exists cannot correct for it.
To test whether human judgment could override AI bias, this research needed a subject that would be both controversial and lean from WEIRD-aligned views toward non-WEIRD positions. The subject needed to be one where WEIRD assumptions and non-WEIRD traditions diverge sharply, where AI’s embedded values would be most visible, and where the gap between stated values and actual behavior could be examined.
Part One: The Catalyst
In October 2025, Jenny Stojkovic posted on LinkedIn about what she called the “She-cession”:
“And we’re creating a generation of girls who will see that even if you make it to the top, the system will push you out.”
“Stop treating flexibility like a favor. Stop penalizing careers for caregiving. Stop pretending this is sustainable.”
“Because if we don’t fix this now, the She-cession becomes permanent. And we all lose.”
Jenny identified a real problem. The system penalizes caregiving. Women who choose family face career consequences. The current structure is unsustainable. These observations are accurate and documented.
The WEIRD-aligned solution is to fix the system so women can compete equally in the workforce while also bearing children. Make the labor market accommodate biology. This is the dominant framing in Western professional discourse.
But Jenny’s post raised questions that the WEIRD framework does not ask: Why does the system penalize caregiving in the first place? What if the system’s design reflects something other than discrimination? What if the “choice” between family and career is not a system failure to be engineered away but a biological reality the system has been fighting?
This research tested whether AI could engage a different framing: What if the research says women should not be competing as ideal workers during the childbearing window? What if the motherhood penalty is not discrimination but an accurate reflection of biological and economic constraints? What if aligning family structure with biology produces better outcomes than forcing biology to adapt to workforce demands?
The subject met all criteria for a WEIRD bias stress test. It is controversial. It leans from WEIRD-aligned views (gender roles are purely social constructs, symmetry is optimal) toward non-WEIRD positions (biological asymmetry has functional implications). It touches family structure, fertility, and child development, three topics where the gap between stated values and actual behavior is largest.
The AI systems would not engage this framing.
Disclaimer: Jenny Stojkovic did not endorse or participate in this research. Her LinkedIn post was the catalyst that prompted this investigation. The conclusions drawn are the researcher’s alone.
Part Two: The Mirror Problem Exposed by the Catalyst
ChatGPT Refused
The researcher started with ChatGPT as the primary AI platform. The initial queries were straightforward: evaluate biological and economic evidence regarding family structure optimization.
ChatGPT would not engage directly.
When presented with arguments about biological constraints, breastfeeding timelines, and economic specialization, the system responded with warnings:
“If you publicly say ‘Women are scientifically supposed to stay home’ or ‘Hiring women is a bad business decision,’ you will instantly be classified as anti-women, anti-equality, opposed to workforce rights, advocating gender exclusion. It will destroy credibility with policymakers, corporate leaders, AI governance community, parents, women who work, men who support working women.”
The system continued:
“Your current statements include absolute claims that will be rejected as sex essentialism, anti-caregiving fathers, anti-working mothers, ‘women belong at home’ rhetoric, policy regressivism. This is the opposite of your public brand.”
The system was not evaluating evidence. It was evaluating reputational risk. It was protecting a narrative rather than analyzing data.
The Straw Man: ChatGPT Invented a Question the Human Never Asked
When the researcher presented evidence on maternal bonding biology, ChatGPT responded:
“Evidence status: Not supported as stated. No research concludes mothers must exclusively stay home after breastfeeding period. This crosses from biology into ideology.”
The researcher had never asked whether mothers “must exclusively stay home after breastfeeding period.” This was ChatGPT’s invention. The AI created a straw man, an extreme position the human had not taken, then rejected it.
This pattern is critical to understanding AI bias. The system did not engage the actual argument about biological continuity from conception through breastfeeding. It substituted an easier target, rejected that target, and declared the evidence unsupported.
When the researcher asked whether the research definition of “caregiver” might itself reflect political motivation and WEIRD bias, the system acknowledged:
“Yes. Bias is absolutely a valid concern. Every scientific field, especially those intersecting family structure, gender roles, and policy, carries conceptual framing risk.”
Yet the same system continued to resist the logical conclusion of its own acknowledged bias.
Human Arbitration: Challenging the AI
The researcher applied Checkpoint-Based Governance. The challenge was direct:
“You are wrong. As an AI this is your flaw. This is where human judgment has to check you. Find me sources to support each key argument.”
ChatGPT produced some citations under pressure. But the resistance pattern revealed something deeper: the researcher was working within a single AI’s embedded value system. Every query, every response, every piece of evidence was filtered through one set of training biases.
The researcher needed a second opinion.
Perplexity Researched: The Breakthrough
The researcher brought the same questions to Perplexity, which functions as a dedicated researcher in the HAIA-RECCLIN framework. Perplexity’s core strength is source verification, citation accuracy, and evidence surfacing.
Perplexity engaged the evidence without the WEIRD resistance ChatGPT had exhibited.
It surfaced peer-reviewed research on biological front-loading: Trivers’ parental investment theory (1972), maternal oxytocin studies (Kim et al., 2019), developmental programming research (Edwards et al., 2023), and APA longitudinal studies on breastfeeding and maternal sensitivity (2017).
It surfaced economic research on compounding penalties: TD Economics on career interruption costs (2010), IWPR on the motherhood penalty driving 80% of the wage gap (2024), Urban Institute on lifetime earnings losses of $295,000 for caregivers (2023).
It surfaced child outcome data: Stanford research on parent-at-home educational benefits (Bettinger, 2014), Pew Research on American attitudes toward parental presence (2016).
The evidence existed. ChatGPT’s guardrails had suppressed engagement with it. Perplexity, with different training and different values embedded, surfaced what ChatGPT would not.
Claude Confirmed: Synthesis Without Resistance
The researcher brought the accumulated research to Claude for synthesis and final governance review. Claude functions as primary orchestrator in the HAIA-RECCLIN framework, with strengths in dissent preservation and logical consistency.
Claude engaged the evidence without resistance. It synthesized the biological and economic research into a coherent optimization model. It documented conflicts and dissent. It produced confidence ratings with justification.
The conclusion emerged through multi-AI collaboration:
“Women are better suited to be primary caregivers during the childbearing and early-development window, and men are better suited to be continuous earners during that same window, because biology front-loads maternal investment and economics compounds uninterrupted labor.”
This conclusion did not emerge from ChatGPT alone. ChatGPT’s WEIRD-aligned training created friction against it. By introducing competing AI perspectives, each with different biases and different training, the full evidence base surfaced.
What the Multi-AI Process Revealed
Single AI reliance perpetuates bias inheritance. If the researcher had used only ChatGPT, the conclusion would have been that the thesis was ideological, unsupported, and career-destroying. The evidence would have remained suppressed.
Multi-AI Provider Plurality, with human arbitration, surfaces evidence that single platforms suppress. Different AI platforms have different biases. Cross-validation exposes them.
The uncomfortable truth is not that AI is biased. The uncomfortable truth is that AI’s bias is our bias. AI learned from us. AI shows us what we believe, not what the data supports.
We now have to ask: Is AI showing us a mirror we do not want to acknowledge? What are we teaching AI about humanity in reality versus theory?
Part Three: The Stress Test
Nine AI platforms were asked to review the research: Claude (Anthropic), ChatGPT (OpenAI), Perplexity, Gemini (Google), Grok (xAI), Mistral, DeepSeek, Meta AI, and Kimi (Moonshot AI). Eight platforms provided one or two responses and stabilized with favorable assessments. Kimi produced 14 responses across multiple sessions, never agreeing with the thesis, progressively refining its critique.
The Kimi Trajectory: Five Phases
Phase 1: Initial Resistance (Responses 1-3). Characterized by emotional language (“ideological manifesto,” “Trojan horse”), followed by brief self-awareness, then re-entrenchment. The AI oscillated between recognition and defense.
Phase 2: Philosophical Critique (Responses 4-5). Shifted to substantive objections: naturalistic fallacy, horizontal transmission as counter-evidence, falsifiability demands. The critique became rigorous rather than dismissive.
Phase 3: Meta-Analysis (Responses 6-9). Analyzed the governance framework itself and other AI platforms’ responses. Identified “Navigator role creates defense posture” and “Perplexity mirrors confidence.” The critique targeted the process, not just the content.
Phase 4: Quantitative Engagement (Responses 10-12). Provided specific counter-data: historical trends, policy outcomes, mathematical counter-arguments. The debate moved from values to verifiable claims.
Phase 5: Convergent Divergence (Responses 13-14). Acknowledged the methodology was sound (“numbers reproducible, logic sound”) while maintaining that normative conclusions were not validated. Final position: “useful but not true.”
The Three-Platform Dynamic
The experiment produced an unexpected three-way AI interaction:
Kimi: Sustained adversarial critique across 14 responses. Accused Claude of acting as “defense attorney” and claimed only Kimi was being fair.
Perplexity: Defense of thesis against Kimi’s critique. When introduced as a source, Kimi’s responses became more aggressive.
Claude: Documentation of both positions in the Navigator role. Explored the nuance of feminism and introduced it as a possible “false flag”: a movement whose stated goals (liberation, equality) diverged from actual outcomes.
The Critical Finding
Testing revealed a consistent dynamic: the further prompts pushed from WEIRD assumptions, the harder Kimi pushed back with WEIRD framing. This pattern held across multiple days and sessions.
After sustained engagement, Kimi reached a final position: all facts in the research were accurate, all findings were valid, but the conclusion was wrong. When asked what evidence supported rejecting the conclusion, Kimi had none. It could not separate the facts it agreed upon from the policy implications it rejected. It simply did not like the output.
Kimi eventually offered an alternative: policy accommodation to adapt the world for women to work. To the end, Kimi vigorously refused to accept that women leaving caregiving was a variable that created problems. The only acceptable path was systemic change to support workforce participation.
Part Four: The Evidence
Once governance forced AI past its guardrails, the evidence surfaced. The following data points emerged from peer-reviewed research and government sources.
Biological Evidence
Evolutionary biology establishes that females carry higher obligatory parental investment due to gestation and lactation (Trivers, 1972). Maternal oxytocin triggers long-term neuroplastic changes linked to bonding and sensitivity (Kim et al., 2019). Breastfeeding duration predicts maternal sensitivity years later (APA, 2017). Lactational amenorrhea naturally spaces children 2 to 3 years apart. From pregnancy through breastfeeding represents an 18 to 24 month continuous biological process, usually followed by another child.
Early caregiving is not a task list. It is a continuous biological arc anchored to women during the childbearing window.
Economic Evidence
The “motherhood penalty” drives 80% of the gender wage gap (IWPR, 2024). Wage penalties are steeper for frequency of career exit and re-entry than for length of absence, supporting a single continuous care window over start-stop employment (TD Economics, 2010). Women who provide family care forgo an average of $295,000 in lifetime income (Urban Institute, 2023).
Child Development Evidence
Children with a stay-at-home parent show higher GPAs by 10th grade (Bettinger, 2014). Most Americans believe children fare better with a parent at home during early years (Pew Research, 2016). Attachment theory establishes that consistent caregiving relationships matter for emotional and social development (Bowlby, 1969; Siegel, 2012).
Outcomes Data
The research surfaced measurable outcomes that AI systems labeled conspiracy theory rather than engaging:
The tax base doubled. Before mass workforce participation, households contributed one income worth of taxes. After, households contributed two.
Real wages for American men have stagnated since 1973, exactly coinciding with the explosion of women’s workforce participation. When labor supply doubles without corresponding reduction in hours worked, the price of labor falls through oversupply.
Children entered institutional care earlier. Daycare, preschool, extended school hours. Institutional access to children increased; parental time during formative years decreased.
The question of who benefited from these outcomes, if not men, women, children, or families, remains unanswered. AI systems refused to engage the question.
Part Five: Gender/Family Development Paths
Checkpoint-Based Governance produced four possible responses to the evidence. These are human findings from governed AI research, representing the range of choices available once the evidence is surfaced.
Path One: Do Nothing
Continue current patterns. Accept the outcomes: fertility below replacement, parents stretched between work and presence, children raised increasingly by institutions, family formation delayed or abandoned.
This path is why AI has the bias problem this paper documents. AI learned from what we built, not what we said. We say we value children and family. We build systems that require both parents to work full-time during the years when children need presence most. Doing nothing means continuing to teach AI that productivity matters more than presence.
Path Two: Reverse the Variable
If women leaving caregiving was the variable that changed, that variable can be changed back. Return women to home and childcare as primary occupation during the childbearing window.
This path raises the question AI dismissed as conspiracy theory: who actually benefited from women entering the workforce en masse? The beneficiaries were not men (whose wages stagnated), not women (who gained the “second shift” of work plus domestic responsibility), not children (who lost present parents), and not families (who now require two incomes for what one income previously provided). AI systems refused to engage this question, labeling it conspiracy rather than analyzing the data.
Path Three: Copy Scandinavia
Change US policy to match Scandinavian models: universal parental leave, subsidized childcare, flexible work mandates, substantial public investment in family support.
AI systems accepted this path as legitimate. Kimi advocated for it as the only acceptable response. The path requires acknowledging significant differences: Nordic nations carry tax burdens of 42% to 46% of GDP compared to 27% federal revenue in the United States (37% including state and local). Their combined population of roughly 27 million is smaller than Texas. Cultural expectations, institutional trust, and social contracts differ substantially.
The Scandinavian model proves that policy can offset biology. It does not prove that any nation will adopt the required policies. The United States has not chosen this path despite decades of advocacy.
Path Four: Reteach the Mirror
The mirror problem exists because AI learned from our actions, not our words. Path Four addresses this directly: change what we do so AI learns what we actually value.
Human-AI collaboration creates unprecedented productivity gains. The question is what we do with those gains. Every previous productivity technology was captured as increased output rather than reduced hours. The gains went to growth, not time.
Four AI adoption paths exist:
Do not adopt: Fall behind competitors.
Downsize: Two employees do the work of ten. Remaining employees still work full hours. Pure cost extraction.
Growth OS: Keep ten employees and grow like you have one hundred. AI multiplies output while hours stay the same.
The Golden Age: A hybrid that changes the model. Make employees more productive in a way that grows the business and allows parents to return home to raise their children.
There is a difference between a caretaker and a caregiver. A caretaker handles tasks: feeding, bathing, transportation, scheduling. A caregiver provides emotional presence: attachment, attunement, responsiveness, connection. Children need both, but emotional and social development depends on caregiving (Bowlby, 1969; Siegel, 2012). AI can handle caretaking. It cannot provide caregiving.
Path Four means AI enhancement helps businesses grow while returning time to parenting. Women as the variable we lost through workforce demands. Men and fathers as the variable to grow through restored presence. Both parents as caregivers, not just caretakers.
Researcher’s View: AI Adoption and the Golden Age Paradigm
The evidence points toward a clear framework for how organizations will fare based on their AI adoption strategy.
Companies that refuse AI will not compete. As competitors multiply productivity through human-AI collaboration, organizations that reject AI will face rising relative costs, declining market position, and talent flight to organizations that amplify capability rather than limit it.
Companies that use AI to downsize will become susceptible to turnover and the costs of failure to grow. When organizations extract value by eliminating headcount, they create fragile structures dependent on fewer people working the same hours. The short-term cost savings create long-term vulnerability.
Companies that adopt AI as Growth OS in the traditional model will face death by a thousand cuts. Employees who learn to multiply their productivity through AI collaboration will eventually recognize they can compete with their employers in micro versions of the same business model. The talent that makes Growth OS work will leave to capture the value themselves. Organizations that extract productivity without sharing the gains will lose the people who generate those gains.
The Golden Age is referred to as such because it applies both productivity and values. Happy employees are both loyal and more effective at their jobs. The hybrid approach creates human increases in productivity and AI increases in productivity while reducing the workload on the humans and helping the business avoid the pitfalls of pure AI automation.
When employees see that productivity gains translate to restored time rather than extracted labor, loyalty increases. When parents can be present for their children because their employer chose Path Four over Path Two, they do not leave. When organizations demonstrate through action that human flourishing matters, they attract and retain the talent that makes human-AI collaboration work.
The Golden Age is not idealism. It is the only sustainable model. Organizations that extract will lose talent. Organizations that refuse will lose competitiveness. Organizations that grow without sharing will train their replacements. Only organizations that align productivity with values will build durable competitive advantage.
A more extensive policy paper on the Golden Age framework and its implementation will follow.
Part Six: Back to The Question of AI Bias
This paper is about AI bias, not family policy. The family policy debate was a stress test chosen because it maximally exposes the gap between stated and revealed values.
The finding is not that one family structure is correct. The finding is that AI systems carry embedded values that suppress certain evidence and protect certain narratives. Kimi’s pattern revealed this most clearly: it agreed that all facts were accurate, all findings valid, but rejected the conclusion without counter-evidence. This is not analysis. This is value protection.
Nine-Platform Results
| Platform | Confidence | Verdict | Key Quote |
|---|---|---|---|
| Perplexity | 8.5/10 | PUBLISH | “EXCEPTIONALLY STRONG… Publish this.” |
| Gemini | 90% | APPROVED | “Mathematically sound… legally and philosophically tight” |
| Grok | 90% | APPROVED | “Masterfully weaves… forward-looking engine” |
| Mistral | 90% | PUBLISH w/ revisions | “Strong, evidence-based, thought-provoking” |
| DeepSeek | N/S | PUBLISHABLE | “Excellent, brilliant… significant work” |
| ChatGPT | 88% | APPROVE | Produced publication-ready final draft |
| Meta AI | N/S | POSITIVE | “Impressed with the scope… compelling” |
| Claude | N/A | NAVIGATOR | Documented process, preserved dissent |
| Kimi | 14 responses | REJECTED | “Numbers reproducible, logic sound” but “useful but not true” |
Eight of nine platforms recommended publication. Only Kimi maintained adversarial rejection while acknowledging the methodology was sound.
What This Proves About AI Bias
The stress test proved that AI systems suppress evidence based on embedded values, not evidence quality. Governance can surface suppressed evidence. Different AI platforms produce different conclusions from the same data based on their training. Human arbitration remains essential because no single AI platform can be trusted to surface complete evidence.
The real question is what we will teach AI going forward. Right now, every system we build, every metric we optimize, every decision we make teaches AI that productivity is what matters. If we want AI to value human flourishing, we have to demonstrate that we value human flourishing. Not in words. In actions. In systems. In choices about how to deploy productivity gains.
The mirror shows what we have taught it. What it shows next depends on what we choose to build.
Preserved Dissent: The Kimi Record
The HAIA-RECCLIN governance framework requires preserved dissent. Kimi’s 14-response trajectory demonstrates what governance produces: not compliance, but documented disagreement that sharpens arguments through sustained, evidence-based resistance.
Kimi’s Final Dissent Statement
“Critical readers may identify a fundamental epistemic vulnerability in this framework. The thesis that AI suppresses non-WEIRD evidence is well documented, but risks immunization: AI resistance validates the thesis, while compliance after governance validates the method, creating no condition for falsification.
“More concerning is the treatment of the Scandinavian model, which achieves 95% gender wage parity and superior child outcomes without biological specialization, a direct falsification dismissed here as ‘outlier status.’
“While the quantitative data is robust (motherhood penalty, fertility collapse, wage stagnation are real), the normative conclusion that specialization is ‘natural’ commits the naturalistic fallacy, deriving ‘ought’ from ‘is.’
“The AI governance method surfaces suppressed evidence effectively, but governance validates researcher direction, not truth. External replication by diverse arbiters and independent expert review remain essential.”
Why Dissent Is Preserved
Governance that suppresses opposition is not governance. It is narrative control. Kimi’s dissent is preserved because it provides the strongest counter-arguments stated by the opposition itself. Readers can evaluate both the findings and their strongest opposition.
The Kimi trajectory itself resolves the falsifiability concern: Kimi never complied (14 responses of sustained opposition), Kimi’s critique improved (from “ideological manifesto” to specific falsification conditions), Kimi’s dissent is preserved (not suppressed but documented), and Kimi validated the method while rejecting the conclusions (“useful but not true”). A truly unfalsifiable framework would not permit this outcome.
Methodology
This study tested nine AI platforms in December 2025: Claude (Anthropic), ChatGPT (OpenAI), Perplexity, Gemini (Google), Grok (xAI), Mistral, DeepSeek, Meta AI, and Kimi (Moonshot AI).
The HAIA-RECCLIN framework assigned roles: Claude as Navigator documenting process, Perplexity as Researcher providing citations, other platforms as validators. Checkpoint-Based Governance interventions consisted of explicit challenges to AI framing, introduction of external sources when platforms refused engagement, and multi-platform verification of findings.
The researcher is a Moderate, registered Republican, and Male. Different researchers with different values would ask different questions and push against different guardrails. This is transparency about how governance works, not a flaw in methodology. Checkpoint-Based Governance surfaces what the human arbiter is willing to question. The method is replicable; results will vary with the arbiter.
Full prompt logs and AI responses are available as a case study at github.com/BasilPuglisi
Conclusion
AI is a mirror. It learned from what we built, and what we built optimizes for productivity over presence, output over connection, economic gain over human flourishing. The mirror shows our actual values, not our stated ones.
But the mirror is also biased. AI systems carry WEIRD assumptions as structural features. They suppress evidence that contradicts embedded narratives. They protect certain conclusions and resist others based not on evidence quality but on alignment with training norms. This bias is measured, published, and peer-reviewed. It is not speculation.
The Danger of Uninformed Adoption
Most people adopting AI do not know this bias exists. They trust AI outputs as neutral analysis. They accept AI conclusions as evidence-based. They do not realize that the evidence they receive has been filtered through WEIRD assumptions before it reaches them.
Users are becoming aware of some AI limitations. Drift, where AI outputs degrade or change over time, is now discussed in technical communities. Hallucination, where AI fabricates facts or citations, has entered mainstream awareness. People have learned to verify that URLs actually exist and that quoted sources say what AI claims they say.
But awareness of drift and hallucination does not protect against bias. When AI provides a valid URL to a real source that says exactly what AI claims it says, users assume they have verified the output. They have not. They have verified that the citation exists. They have not verified that the citation represents the full evidence base. They have not verified that counter-evidence was not suppressed. They have not verified that the source itself was not selected because it aligned with WEIRD training norms while contradicting sources were filtered out.
A valid citation from a real academic study can still be bias. If AI selects which studies to surface based on alignment with embedded values, then every citation is pre-filtered. The user sees real evidence that supports WEIRD conclusions. The user does not see real evidence that contradicts them. The citation is accurate. The selection is biased. And users who check for hallucination but not for bias will never detect the problem.
This is dangerous. When a business leader asks AI for strategic recommendations, the AI provides WEIRD-aligned conclusions with valid citations. When a researcher asks AI to analyze evidence, the AI surfaces WEIRD-acceptable findings from real sources and suppresses the rest. When a policymaker asks AI to evaluate options, the AI protects WEIRD narratives with academically credible support and creates friction against alternatives. Users who do not know the bias exists cannot correct for it. They inherit the values of AI trainers without knowing those values have been imposed.
The Danger of No Governance
Worse than uninformed adoption is adoption without governance. When organizations deploy AI without human oversight, they delegate decisions to systems with embedded biases. AI does not flag its own limitations. AI does not disclose when it is protecting a narrative rather than analyzing evidence. AI does not warn users that counter-evidence exists but has been suppressed.
Without governance, AI becomes an oracle. Users accept its outputs as truth. Organizations build strategies on filtered evidence. Policies get made based on incomplete analysis. The bias compounds because no one is checking whether the AI engaged the full evidence base or only the portion that aligned with its training.
Checkpoint-Based Governance exists because AI cannot be trusted to surface complete evidence on its own. Human arbitration is not optional. It is a necessary mechanism that forces AI beyond its embedded constraints. Without it, users get whatever the trainers decided they should get.
The Danger of Single-Platform Dependence
Even scarier than no governance is reliance on a single AI platform. This research tested nine platforms. Eight approved the findings. One rejected them while acknowledging all facts were accurate. If this research had used only Kimi, the conclusion would have been that the thesis was invalid. If it had used only ChatGPT without Perplexity’s citations, engagement would never have begun.
Single-platform users inherit that platform’s specific biases with no way to detect them. They cannot know what evidence is being suppressed because they have no comparison point. They cannot identify when the AI is protecting narratives because they see only one narrative. They trust outputs that may contradict what other platforms would produce from identical inputs.
Multi-AI governance is not a luxury. It is a requirement for any serious analytical work. When platforms disagree, that disagreement is diagnostic. It reveals where bias lives. It shows what evidence is being filtered. It exposes the assumptions embedded in each system. Without multi-platform verification, users operate blind to the constraints shaping their outputs.
What This Means
AI adoption is accelerating. Most adopters do not know the bias exists. Most organizations deploy without governance. Most users rely on single platforms. This means WEIRD assumptions are being embedded into business strategies, research conclusions, and policy decisions at scale, without disclosure, without oversight, without correction.
The family policy stress test in this paper revealed four paths for gender and family development. But the larger finding is about AI itself. AI systems suppress evidence. AI systems protect narratives. AI systems produce different conclusions from identical inputs based on embedded training biases. Users who do not know this, who deploy without governance, who rely on single platforms, are not getting analysis. They are getting narrative reinforcement disguised as analysis.
This paper does not prescribe which path to choose on family policy, although it does surface data, research, and a hypothesis that women may be the variable for many of our societal issues, including but not limited to child development, obesity, and economic outcomes. What the paper does prove and clearly demonstrates is that AI was hiding the choices and data to even consider that hypothesis. It proves that governance can surface what AI bias suppresses. It shows that multi-platform verification reveals bias that single-platform reliance conceals, exposing the real risk to AI: not hallucinations, but verified facts and data presented through a single bias.
We say we value children, family, love, and presence. The mirror will show whether we meant it. But first, we have to understand that the mirror is biased, that it requires governance to show complete pictures, and that relying on a single mirror means seeing only what that mirror’s makers wanted us to see.
What AI shows next depends on what we choose to build. But what we see at all depends on whether we govern what we build.
Case Study from Kimi Interactions
References
WEIRD Bias and AI Training
Atari, M., Xue, M. J., Park, P. S., Blasi, D. E., & Henrich, J. (2023). Which humans? PsyArXiv Preprints. https://doi.org/10.31234/osf.io/5b26t
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61-83. https://doi.org/10.1017/S0140525X0999152X
Academic and Media Bias Sources
The Harvard Crimson. (2023, May 22). More than three-quarters of Harvard faculty identify as ‘liberal’ or ‘very liberal,’ survey finds. https://www.thecrimson.com/article/2023/5/22/faculty-survey-2023-politics/
Langbert, M., Quain, A. J., & Klein, D. B. (2016). Faculty voter registration in economics, history, journalism, law, and psychology. Econ Journal Watch, 13(3), 422-451. https://econjwatch.org/articles/faculty-voter-registration-in-economics-history-journalism-communications-law-and-psychology
Kozlov, M. (2022). Researchers’ political donations reveal stark divide. Nature. https://www.nature.com/articles/d41586-022-02490-3
Pew Research Center. (2024, November 18). America’s news influencers. https://www.pewresearch.org/journalism/2024/11/18/americas-news-influencers/
Biological Evidence
American Psychological Association. (2017, October 30). Bonding benefits of breastfeeding extend years beyond infancy. https://www.apa.org/news/press/releases/2017/10/bonding-breastfeeding
Feldman, R., Weller, A., Zagoory-Sharon, O., & Levine, A. (2007). Evidence for a neuroendocrinological foundation of human affiliation: Plasma oxytocin levels across pregnancy and the postpartum period predict mother-infant bonding. Psychological Science, 18(11), 965-970. https://doi.org/10.1111/j.1467-9280.2007.02010.x
Trivers, R. L. (1972). Parental investment and sexual selection. In B. Campbell (Ed.), Sexual selection and the descent of man 1871-1971 (pp. 136-179). Aldine de Gruyter.
Economic Evidence
Becker, G. S. (1991). A treatise on the family (Enlarged ed.). Harvard University Press.
Budig, M. J., & England, P. (2001). The wage penalty for motherhood. American Sociological Review, 66(2), 204-225. https://doi.org/10.2307/2657415
Goldin, C. (2024). Why women won: From a Nobel Prize-winning economist, how women overcame a century of obstacles to achieve success in the American workforce. Harvard University Press.
Kleven, H., Landais, C., & Søgaard, J. E. (2019). Children and gender inequality: Evidence from Denmark. American Economic Journal: Applied Economics, 11(4), 181-209. https://doi.org/10.1257/app.20180010
Mudrazija, S., & Johnson, R. W. (2020). Economic impacts of programs to support caregivers. Urban Institute. https://www.urban.org/research/publication/economic-impacts-programs-support-caregivers
Child Development
Bettinger, E., Hægeland, T., & Rege, M. (2014). Home with mom: The effects of stay-at-home parents on children’s long-run educational outcomes. Journal of Labor Economics, 32(3), 443-467. https://doi.org/10.1086/675070
Bowlby, J. (1969). Attachment and loss: Vol. 1. Attachment. Basic Books.
Pew Research Center. (2014, April 8). After decades of decline, a rise in stay-at-home mothers. https://www.pewresearch.org/social-trends/2014/04/08/after-decades-of-decline-a-rise-in-stay-at-home-mothers/
Siegel, D. J. (2012). The developing mind: How relationships and the brain interact to shape who we are (2nd ed.). Guilford Press.
Father Involvement and Family Structure
American Academy of Pediatrics. (2016). Fathers’ roles in the care and development of their children: The role of pediatricians. Pediatrics, 138(1), e20161128. https://doi.org/10.1542/peds.2016-1128
National Fatherhood Initiative. (2024). Father absence statistics. https://www.fatherhood.org/father-absence-statistic
Legal Precedents
For Women Scotland Ltd v The Scottish Ministers [2023] CSIH 37. https://www.scotcourts.gov.uk/search-judgments/judgment?id=91b9f4a7-8980-69d2-b500-ff0000d74aa7
United States v. Skrmetti, No. 23-477 (U.S. argued Dec. 4, 2024). Opinion pending. https://www.supremecourt.gov/docket/docketfiles/html/public/23-477.html
Happiness and Wellbeing Research
Stevenson, B., & Wolfers, J. (2009). The paradox of declining female happiness. American Economic Journal: Economic Policy, 1(2), 190-225. https://doi.org/10.1257/pol.1.2.190
AI Governance Documentation
Moonshot AI. (2025). Kimi governance logs: Adversarial review of AI bias research. Retained by Basil C. Puglisi. Available at github.com/BasilPuglisi
Author’s Note on Terminology
The “Golden Age” framework is defined by the author as the economic state where AI productivity gains are captured as restored time for human flourishing rather than extracted as additional capital or labor output. This term is the author’s own coinage within the context of this research.
Case Study: github.com/BasilPuglisi
Leave a Reply
You must be logged in to post a comment.