Pangea Social - Socratic Dialogue

How Pangea Social A.I. Detects Bias in Text

Written by Eric Forst | Aug 12, 2023 10:12:14 PM

At Pangea Social, our mission is to create inclusive, safe online communities that drive a positive impact on our physical communities, both global and local.

To achieve this, we invented a new kind of artificial intelligence that can detect bias in text. We built it through a combination of carefully chosen training text and a unique mathematical framework, inspired by the theory of general relativity, that we call Cognitive Relativity.

For our training data, we used a random 5% sample of all the text on Wikipedia. This is an excellent data source because of the wide range of human editors who comb through Wikipedia entries to ensure that opposing viewpoints are expressed. Even so, we still had to revise this corpus of text. For example, almost every small town in Europe has a Wikipedia entry, and those entries regularly feature food specialties and favored restaurants. This meant we had to remove a lot of data related to food.

In addition to this unbiased training data, we designed our algorithm to analyze text in the context of other text in a document. This works by analyzing each phrase, paragraph, or longer post on two dimensions: Generality and Diversity.

Generality looks at whether the words used in a phrase connect to many other words in our dictionary, which is a language search tree that arranges all words in the English language based on their causal relationships. Words with lots of connectivity get high Generality scores, while words with fewer connections get lower scores. Examples of high-Generality words include “go,” “be,” and “and.”
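One way to picture this is as a graph of word-to-word causal links, where a word's Generality is simply how many other words it connects to. This is a minimal sketch under that assumption; the word lists, edges, and scoring here are illustrative placeholders, not Pangea Social's actual dictionary.

```python
# Sketch: Generality as connectivity in a hypothetical word graph.
from collections import defaultdict

# Hypothetical causal-relationship edges between words.
causal_links = defaultdict(set)
edges = [
    ("go", "move"), ("go", "leave"), ("go", "travel"), ("go", "start"),
    ("be", "exist"), ("be", "become"), ("be", "remain"),
    ("quinoa", "grain"),  # a narrow, low-connectivity word
]
for a, b in edges:
    causal_links[a].add(b)
    causal_links[b].add(a)

def generality(word: str) -> int:
    """Score a word by how many other words it connects to."""
    return len(causal_links[word])

print(generality("go"))      # highly connected -> high Generality (4)
print(generality("quinoa"))  # few connections -> low Generality (1)
```

In a real system the scores would presumably be normalized against the size of the dictionary rather than left as raw counts.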

Diversity examines the variety of themes in any phrase. Statements with low Generality and low Diversity have high Clarity, and statements with high Generality and high Diversity have high Framing. Examples of words or word-pairs with high Diversity are “America,” “league,” or “near future.”
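The two dimensions described above can be combined into the Clarity/Framing labels roughly as follows. The theme tags and thresholds in this sketch are hypothetical stand-ins for whatever the real model uses.

```python
# Sketch: Diversity as a count of distinct themes, combined with
# Generality to label a statement as Clarity or Framing.

# Hypothetical theme tags for a few words.
themes = {
    "america": {"geography", "politics", "culture"},
    "league": {"sports", "organizations", "measurement"},
    "ball": {"sports"},
}

def diversity(words):
    """Count the distinct themes touched by a phrase."""
    return len(set().union(*(themes.get(w, set()) for w in words)))

def classify(generality_score, diversity_score, threshold=2):
    # Low Generality + low Diversity -> high Clarity;
    # high Generality + high Diversity -> high Framing.
    if generality_score < threshold and diversity_score < threshold:
        return "Clarity"
    if generality_score >= threshold and diversity_score >= threshold:
        return "Framing"
    return "Mixed"

print(classify(1, diversity(["ball"])))               # Clarity
print(classify(5, diversity(["america", "league"])))  # Framing
```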

After this analysis determines the main themes in the text and how well connected those themes are, the algorithm looks for words and phrases at the ends of branches in the language dictionary. These phrases are closed-minded, in the sense that they do not connect to a wide range of other ideas, and this is how the A.I. highlights and flags biased areas of text.
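The end-of-branch check amounts to finding leaves of the language tree, entries with no further connections. Here is a toy sketch of that idea; the tree below is a hypothetical stand-in for the real dictionary, and the flagging rule is simplified to a bare leaf test.

```python
# Sketch: flag entries at the ends of branches (leaves) of the tree,
# since they connect to no further ideas.

# Hypothetical tree: each entry maps to the entries it branches into.
tree = {
    "idea": ["policy", "claim"],
    "policy": ["regulation"],
    "claim": [],        # leaf: connects to nothing further
    "regulation": [],   # leaf
}

def flag_leaves(tree):
    """Return entries at the ends of branches -- candidates for bias flags."""
    return sorted(word for word, children in tree.items() if not children)

print(flag_leaves(tree))  # ['claim', 'regulation']
```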

(Illustration of language trees from Pangea Social patent, “Relativistic field effect methods to optimize creative awareness”)

Here’s an example of a post made on Facebook, which is written from a right-wing political perspective and gets a moderately high overall quality score, or “Big Mind,” in the Pangea Social rating system:

"That's all fine and dandy but we know now that many lies were told about covid, and not because new research data came in but because it was politically expedient to do so. Fauci lied about gain of function and kept changing his story about masks. Biden and Rachel Maddow lied when they said that the jab would stop the spread - there was no research data at the time supporting that." +18.25% Pangea Score 

The heat map diagram of this post looks like this…

The map shows that all five of the phrases, or “nodes,” are well connected by causal relationships (indicated by white lines). There is a strong Thesis statement (1) (“not because new research data came in but because it was politically expedient to do so”) that is supported with evidence (Clarity nodes (0) and (4)) and has very low Bias and Digression.

For comparison, here’s a Facebook post made from a left-wing perspective, which gets a very low overall quality or “Big Mind” score because its statements are both very Biased and highly Digressed.

“Maybe they pushed Pizzagate so hard because they knew this Epstein thing would lead back to Trump, and they always attack the other side for the things of which they’re guilty, 100% of the time.” -25.06% Pangea Score

Here’s the heat map of this post:

In this post, there are no white lines connecting nodes, because the two phrases have no causal relationship. The first statement is digressed because it has a wide variety of concepts. The second statement (1) has a high bias rating due to an accusation that is not backed up by evidence.

One of the main factors in evaluating whether or not an A.I. algorithm exhibits bias is an analysis of the political orientations of the founders and inventors. We’ll admit it: our team is composed of registered Democrats who are biased toward left-wing points of view. But we set out to create an algorithm that would be trusted by anyone, anywhere, anytime, with confidence that it rates text objectively, based purely on mathematical models.

Our hope is that Pangea Social can help weave American conversations back together again, with citizens of all political and cultural identities participating in meaningful dialogue with each other, rather than the shouting matches and put-down contests we typically see online.

We hope that we’re beginning to achieve this goal, and we welcome you to sign up as a beta tester and give us feedback on our A.I.-powered writing editor.