It has been suggested for decades that Sanskrit could be the ideal language for representing knowledge in artificial intelligence (AI). The argument rests on Sanskrit's rule-heavy, formula-based grammar, which is said to make it a logical choice for writing algorithms. But is this true? Let's investigate.
Origin
Claims that Sanskrit is one of the best languages for AI are based on a research paper, “Knowledge Representation in Sanskrit and Artificial Intelligence,” published by NASA scientist Rick Briggs in AI Magazine in 1985.
However, even if Sanskrit is a good fit for AI, this does not mean it should be used as a programming language. Instead, the suggestion is to use it for knowledge representation, for example in the datasets that AI models draw on. There are several reasons for this.
Sanskrit is one of the oldest languages, dating back to Vedic times in India. The language is often called Devabhasha, meaning the language of the gods. Most ancient Hindu and many Buddhist and Jain religious manuscripts are written in Sanskrit.
For this and other reasons, Sanskrit is considered a pure language. It remained independent and authentic, never mixing with other Indo-European languages. In part, this is because it was never a language for commoners or common conversation.
In ancient times, Sanskrit was used exclusively for writing and preserving knowledge. Other languages, including Prakrit Bhasha or vernacular languages, were used to converse. Over the centuries, many different languages and dialects have emerged, and some have roots in Sanskrit.
Currently, the Indian constitution recognizes 22 scheduled languages, and the 2011 census of India classified 121 languages. The census recorded 19,569 raw mother-tongue names, which were rationalized into 1,369 mother tongues. This does not include the languages and dialects of other South Asian countries.
When scientists looked for pure, unadulterated languages (not mixed with other languages), they turned to ancient languages like Sanskrit, Latin, and Greek. Sanskrit has remained true to form, unadulterated throughout history.
Comprehensive phonetics
Sanskrit is a phonetic language: each written symbol corresponds directly to a sound. Proponents say that nearly any sound used in speech can be written in Sanskrit letters. As such, once you know the script, Sanskrit is easy to transcribe and record.
No fixed word order
In most languages, the order of words in a sentence is essential for comprehension. This is not true in Sanskrit.
In his research paper (cited above), Briggs compares Sanskrit to English. In English, the word order, or syntax, of a sentence is critical for comprehension. For example, “He will” and “Will he” have different meanings.
In Sanskrit, words carry suffixes that indicate their role and context within a sentence. So even if the words of a Sanskrit sentence are rearranged, the meaning largely remains the same.
The phrase “He will” in Sanskrit can be written either way:
The same applies to any sentence in the Sanskrit language.
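To make the idea concrete, here is a minimal Python sketch, not taken from Briggs' paper. It assumes a tiny hand-written lexicon (`LEXICON`) of three inflected forms in simplified ASCII transliteration, ignores sandhi, and uses the hypothetical role names `agent`, `object`, and `action`. Because each form carries its own role, every ordering of the words produces the same analysis.

```python
from itertools import permutations

# Hypothetical, simplified lexicon for illustration only.
# Each inflected form maps to (stem, role); the role is read
# from the form itself, never from its position.
LEXICON = {
    "ramah": ("rama", "agent"),        # nominative singular
    "grantham": ("grantha", "object"),  # accusative singular
    "pathati": ("path", "action"),      # third person singular, present
}

def parse(sentence: str) -> dict:
    """Assign a role to each word using only the word's own form."""
    roles = {}
    for word in sentence.split():
        stem, role = LEXICON[word]
        roles[role] = stem
    return roles

words = ["ramah", "grantham", "pathati"]  # "Rama reads a book"
parses = [parse(" ".join(order)) for order in permutations(words)]

# Every one of the six word orders yields the same role assignment.
all_same = all(p == parses[0] for p in parses)
print(parses[0], all_same)
```

An English parser built the same way would fail, because “Rama reads a book” and “A book reads Rama” use identical word forms and differ only in order.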
The meaning of words
According to Briggs, each word of a Sanskrit sentence contains more information than its English counterpart. In English, each word in a sentence typically fills a single role: noun, pronoun, verb, adverb, or adjective. For example, an English verb only provides information about the action and a hint about the tense (past, present, or future). The time of the action (when the action was performed) is interpreted from the syntax of the sentence.
However, in Sanskrit, the verb contains information about the action, its tense, and the number of its subject (singular, dual, or plural).
Let’s take the phrase: “Students will read”. In English, the word “students” only tells us about the noun. The “will” tells us about the verb tense (in the future), and the word “read” tells us the verb or action.
This is the same phrase in Sanskrit:
The second word of the Sanskrit sentence above tells us, on its own, that the subject is plural, the tense is future, and the action is reading.
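As a rough illustration of how a single verb form can carry action, tense, and number, the sketch below decodes a few third-person forms of the root paṭh (“to read”), written in ASCII transliteration. The `decode` function, the `VerbInfo` type, and the small ending table are assumptions made for this example; real Sanskrit morphology covers far more forms and sound changes.

```python
from typing import NamedTuple

class VerbInfo(NamedTuple):
    action: str
    tense: str
    number: str

ROOT = "path"    # root "to read" (ASCII transliteration, an assumption)
FUTURE = "isy"   # future-tense marker for this root
# Third-person endings only; a toy subset of the full paradigm.
ENDINGS = {"ati": "singular", "atah": "dual", "anti": "plural"}

def decode(verb: str) -> VerbInfo:
    """Recover action, tense, and number from one verb form."""
    if not verb.startswith(ROOT):
        raise ValueError(f"unknown root in {verb!r}")
    rest = verb[len(ROOT):]
    tense = "present"
    if rest.startswith(FUTURE):
        tense = "future"
        rest = rest[len(FUTURE):]
    if rest not in ENDINGS:
        raise ValueError(f"unknown ending in {verb!r}")
    return VerbInfo("read", tense, ENDINGS[rest])

print(decode("pathisyanti"))  # "they will read"
print(decode("pathati"))      # "he/she reads"
```

The English equivalent needs three separate words (“they will read”) to convey what the single form encodes here.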
Understanding the cases
“Cases” indicate the grammatical functions of nouns and pronouns according to their relationship to other words in a sentence. Sanskrit has more cases for nouns and pronouns than English and many other languages.
Consider the English word “beautiful,” which is an adjective but the root can be used as a noun (“She is a beauty”). It can also be an adverb (“She sang beautifully.”) or a verb (“Let’s beautify this room.”).
In Sanskrit, each noun has eight cases, and each case form also marks whether the noun is singular, dual, or plural. The following table represents the cases for the noun Beat in Sanskrit.
With multiple “cases,” each Sanskrit word is self-explanatory and minimally dependent on other words in a sentence. Each word by itself indicates its role and context.
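One way to picture this is as a grid of grammatical slots. The sketch below is only an illustration: the `Case` and `Number` enums are hypothetical names, the English glosses are rough approximations, and no actual noun forms are filled in. It shows that eight cases combined with three numbers give 24 slots for each noun.

```python
from enum import Enum
from itertools import product

class Case(Enum):
    # Values are rough English glosses, for illustration only.
    NOMINATIVE = "subject"
    ACCUSATIVE = "direct object"
    INSTRUMENTAL = "by / with"
    DATIVE = "to / for"
    ABLATIVE = "from"
    GENITIVE = "of"
    LOCATIVE = "in / on"
    VOCATIVE = "direct address"

class Number(Enum):
    SINGULAR = 1
    DUAL = 2
    PLURAL = 3

# Every (case, number) pair is one slot in a noun's declension table.
slots = list(product(Case, Number))
print(len(slots))  # 24
```

A program that knows which slot a word form occupies already knows the word's role in the sentence, which is the property Briggs highlights.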
So what could this mean for AI? According to Briggs, the power of Sanskrit cases can provide more information to a computer in more precise sentences than any other language.
Going to the root
One thing that sets Sanskrit apart from most other languages is that its grammar comes with its own metalanguage, including “metarules”.
Around 500 BC, Pāṇini, a sage and master of the Sanskrit language, wrote a text on its grammar called Astadhyayi, or Eight Chapters. The text lays out a system of Sanskrit grammar and vocabulary, showing how each word in the language is derived from a root word. A set of about 4,000 rules (or Sutras) is applied to these root words. The Sutras work much like mathematical formulas.
Essentially, Pāṇini taught a “metarule,” which is typically interpreted by scholars as follows: “In the case of a conflict between two rules of equal strength, the rule that comes later in the serial order of the grammar wins.”
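The metarule as quoted above can be sketched as a tie-breaking function. This is a minimal illustration, not Pāṇini's actual rule system: the `Rule` class, the numeric `strength` field, and the sample rules are all hypothetical. Among the applicable rules, the strongest wins, and when strengths are equal the rule that comes later in the serial order wins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass(frozen=True)
class Rule:
    position: int                   # serial order in the grammar
    strength: int                   # hypothetical priority level
    name: str
    applies: Callable[[str], bool]  # does the rule match this form?

def resolve(rules: List[Rule], form: str) -> Optional[Rule]:
    """Pick the winning rule: highest strength, then latest position."""
    candidates = [r for r in rules if r.applies(form)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.strength, r.position))

rules = [
    Rule(1, 1, "earlier rule", lambda form: form.endswith("a")),
    Rule(2, 1, "later rule", lambda form: form.endswith("a")),
    Rule(3, 0, "weaker rule", lambda form: True),
]

winner = resolve(rules, "rama")
print(winner.name)  # later rule
```

The first two rules tie on strength, so the one with the higher position is chosen; the third rule applies too but loses because it is weaker.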
Because words are built from roots by rule, Sanskrit words do not need to be traced back to another language, the way some English words are traced back to Latin. Instead, all Sanskrit words are derived from within the language itself.
All languages have structures and rules for forming words and sentences. But sometimes there are variations in the spelling of words. This is not the case with Sanskrit.
In Sanskrit, there are no random words. The language is based on a grammatical derivational system. The Astadhyayi explains how all Sanskrit words are derived from the fundamental letters given in the Maheshwar Sutras, a set of 14 rules that form the basis of the language.
A research paper, “On the Architecture of Pāṇini's Grammar,” by Stanford University linguistics expert Paul Kiparsky sheds more light on this.
Pāṇini's Astadhyayi (with its Sutras) uses constructs similar to those of modern programming languages.
For example, there is:
- Sangyak Varna – similar to keywords
- Pratyaya – similar to operators
- Vidhi – similar to functions
- Anuvrati – similar to libraries or packages
Since Sanskrit can be algorithmically derived from a metalanguage or “metarules,” as described in Pāṇini's Astadhyayi, it could be easier to create a generative model for the language, at least compared to modern languages. This rule-based grammatical foundation is one of the reasons why some consider Sanskrit an ideal language for knowledge representation in artificial intelligence.