• Project
  • Data
  • e-Text
  • Draw
  • Prove
  • Lounge
  • Introduction
  • GeoText

Geometry Description Language Demo

Geometry Description Language (GDL) is designed to formalize definitions, configurations, propositions, and problems stated in geometry literature (such as textbooks and papers). Through a series of transformation (shown in the following figure), formalized propositions can be automatically proved by calling provers from GEOTHER and dynamic diagrams for their configurations can be automatically constructed and drawn by GeoGebra.

Formalization of concepts

Let $V$ be a variable.\textbf{Concepts} are formulated as follows.

  • A formulation in the form of $V::T$ is called an (\textbf{abstract})\textbf{instance}, where $V$ is called the \textbf{reference} or\textbf{pointer} of the abstract instance and $T$ is called an (\textbf{abstract} or\textbf{basic})\textbf{type}.
  • A formulation in the form of $f(a_1,\cdots,a_n)$ ($0\leq n$) (when $n=0$, the form is $f()$) is called a\textbf{concept}, where $f$ represents the name of the concept, called a\textbf{vocabulary}, and $a_i~(1\leq i\leq n)$ is an abstract instance or another concept, called an\textbf{argument} of the concept.

The number of "$(~)$" existing in a concept $C$ is called the \textbf{number of layers} of $C$, denoted as \texttt{L}($C$). The number of layers and the number of arguments of a concept are finite. Abstract type $T$ can be Point, Line, Segment, Halfline, Angle, Circle, Arc, Triangle, Quadrilateral, Polygon, Length, Area, Degree, GeometricQuantity, AlgebraicQuantity, Quantity, equal, neq, ObjectRelation, QuantityRelation, and Boolean.

  • $A$::Point represents an abstract instance and means a point $A$ in space;
  • collinear($A$::Point,$B$::Point,$C$::Point) represents a geometric relation and means that that $A$, $B$, $C$ are collinear;
  • circumcenter(triangle($A$::Point,$B$::Point,$C$::Point)) represents a geometric object and means the circum center of triangle $ABC$;
  • distance($A$::Point,$l$::Line) represents a geometric quantity and means the distance from point $A$ to line $l$;
  • distance($A$::Point,$B$::Point) represents a geometric quantity and means the distance between $A$ and $B$.

Geometric concepts can be classified as follows according to their meanings.

\begin{itemize} \item \emph{geometric objects}: geometric entities in spaces, such as an arbitrary point, an intersection point, a line, a perpendicular line, a circle, and the orthocenter of a triangle; \item \emph{geometric quantities}: measurements of geometric objects, such as the length of a segment and the area of a circle; \item \emph{geometric object relations}: relations amongst geometric objects, such as two lines are parallel, two circles intersect, and two triangles are similar; \item \emph{geometric quantity relations}: relations between geometric quantities, such as the length of a segment equals to that of the other segment and the area of a circle is less than that of a square. \end{itemize}

Algebraic concepts include

\begin{itemize} \item \emph{algebraic quantities}: abstract quantities, such as integer and real number; \item \emph{quantity functions}: operations on geometric quantities and algebraic quantities, such as plus, times, tan, and sin; \item \emph{algebraic quantity relations}: relations between algebraic quantities, such as equal, less than, and greater than. \end{itemize}

All the concepts can be classified as follows according to their syntax features.

\begin{itemize} \item \emph{Entity concepts}, representing entity objects, include geometric objects, geometric quantities, algebraic quantities, and quantity functions; \item \emph{Boolean concepts}, representing boolean values, include geometric object relations, geometric quantity relations, and algebraic quantity relations. \end{itemize}
Formalization of clauses

A concept can be instantiated by assigning exact or concrete arguments. The resulting formulation is called an instance of the concept and we can also say that the instance applies this concept.

\textbf{Instances} are formulated as follows.

\begin{enumerate} \item A variable, such as $A$ and $l$, is a\textbf{term}; \item A constant, such as $0$ and $\pi$, is a\textbf{term}; \item A formulation in the form of $f(a_1,\cdots,a_n)$ ($0\leq n$) is an\textbf{instance} of a concept, where $f$ is the vocabulary of the concept and each $a_i~(1\leq i \leq n)$ is a term, an abstract instance, or an instance of another concept (called a subinstance of the instance), called an\textbf{argument} of the instance. \end{enumerate}

The number of "$(~)$" existing in an instance $I$ is called the \textbf{number of layers} of $I$, denoted as \texttt{L}($I$). The number of layers and the number of arguments of an instance are finite.

\textbf{Clauses} are formulated as follows.

\begin{enumerate} \item declare$(\alpha_1,\cdots,\alpha_n)$ ($0\leq n$) is a (\textbf{declarative})~\textbf{clause}, where $\alpha_i~(1\leq i\leq n)$ is an abstract instance, called an\textbf{argument} of the clause, and $n$ is a finite number; \item $V:=I$ is a (\textbf{reference})~\textbf{clause}, where $V$ is a variable and $I$ is an instance of an entity concept. $V$ is called the\textbf{reference} or\textbf{pointer} of $I$; \item Both and$(\alpha_1,\cdots,\alpha_n)$ and or$(\alpha_1,\cdots,\alpha_n)$ are (\textbf{reference})~\textbf{clauses} (\textbf{compound reference clauses}), where each $\alpha_i~(1\leq i\leq n)$ is a reference clause, called an\textbf{argument} of the clause, and $n$ is a finite number; \item give$(\alpha_1,\cdots,\alpha_n)$ is a (\textbf{hypothesis})~\textbf{clause}, where each $\alpha_i~(1\leq i\leq n)$ is an instance of an entity concept, called an\textbf{argument} of the clause, and $n$ is a finite number; \item An instance of a boolean concept is also called a (\textbf{boolean})~\textbf{clause}; \item and$(\alpha_1,\cdots,\alpha_n)$, or$(\alpha_1,\cdots,\alpha_n)$, and not$(\alpha)$ are all (\textbf{boolean})~\textbf{clauses} (\textbf{compound boolean clauses}), where $\alpha,\alpha_1,\cdots,\alpha_n$ $(1\leq n)$ are all boolean clauses, called\textbf{arguments} of the clause, and $n$ is a finite number. \end{enumerate}

The clauses in (2), (3), and (4) are also called \textbf{construction clauses}.

  • declare($A$::Point,$l$::Line) represents a declarative clause and means that declare a point $A$ and a line $l$;
  • $l:=$line($A,B$) represents a reference clause and means that $l$ is a line passing $A$ and $B$;
  • give(triangle($A,B,C$)) represents a hypothesis clause and means give a triangle $ABC$;
  • perpendicular($l,m$) represents a boolean clause and means that $l$ is perpendicular to $m$.
Formalization of configurations

configuration$(\alpha_1,\cdots,\alpha_n)$ ($0\leq n$) is called a\textbf{configuration}, where each $\alpha_i~(1\leq i\leq n)$ is a clause or a configuration, called an \textbf{argument} of the configuration. $n$ is finite and all variables are the references of either abstract instances or reference clauses existing in the configuration. The same variable can not be used as two different references.

\begin{enumerate} \item A configuration is called a \textbf{construction configuration}, if each $\alpha_i~(1\leq i\leq n)$ is a declarative clause, or a construction clause, or a construction configuration; \item A configuration is called a \textbf{constraint configuration}, if each $\alpha_i~(1\leq i\leq n)$ is a a declarative clause, or a boolean clause, or a constraint configuration; \item Other configurations are called \textbf{mix configurations}.~ \end{enumerate}
  • configuration($A$:=point(),$B$:=point(),$C$:=point(),give(ninepointcircle(triangle($A,B,C$))) represents $A$, $B$, $C$ are arbitrary points, and give the nine point circle of triangle $ABC$.

Two configurations or clauses can be merged. If $S_1$ is in the form of configuration$(\alpha_1,\cdots,\alpha_n)$ and $S_2$ is in the form of configuration$(\beta_1,\cdots,\beta_m)$, then $S_1$ and $S_2$ are merged into one configuration configuration$(\alpha_1,\cdots,\alpha_n,\beta_1,\cdots,\beta_m)$, denoted as $S_1+S_2$, and the other configuration configuration$(\beta_1,\cdots,\beta_m,\alpha_1,\cdots,\alpha_n)$, denoted as $S_2+S_1$.

Formalization of definitions

\textbf{Definitions} are formulated as follows.

\begin{enumerate} \item A formulation $S$ in the form of $[I~\mbox{where}~Y]$ is a\textbf{constraint binding}, where $I$ is an abstract instance, or an instance of a concept, or a boolean clause, or a configuration, called the\textbf{center instance} of $S$, denoted as \texttt{CI}($S$); $Y$ is a boolean clause, or a reference clause, or null ("null" means nothing and any operation on null will terminate), called the\textbf{constraint} of $S$, denoted as \texttt{Constraint}($S$); \item A formulation $S$ in the form of $[V]$ is an\textbf{inverse binding}, where $V$ is a variable, called the\textbf{center instance} of $S$, denoted as \texttt{CI}($S$); \item A formulation $D$ in the form of Definition$(C, R, N)$ is a \textbf{definition}, where $C$ is a concept, called the \textbf{target concept} of $D$, denoted as \texttt{TC}($D$); $R$ is a constraint binding, or an inverse binding, or an abstract instance, called the\textbf{return body} of $D$, denoted as \texttt{RB}($D$); $N$ is a boolean clause or null, called the \textbf{nondegeneracy condition} of $C$, denoted as \texttt{NC}($C$); the following conditions should also be satisfied. \begin{itemize} \item The variables existing in $Y$ should exist in $C$ or $I$; \item The center instance of an inverse binding is the reference of an abstract instance existing in $C$; \item The variables existing in $N$ should exist in $C$; \item The variables existing in $N$ should exist in $C$; \item The same variable can not be used as two different references; \item If there is no argument of $C$, then $R$ should be an abstract instance. \end{itemize} \end{enumerate}

If the return body of a definition is an abstract instance, then the target concept of the definition is called a\textbf{simple} (or\textbf{basic})\textbf{concept} (algebraic concepts used in geometric statements are all viewed as simple concepts); otherwise it is called a\textbf{complex} (or\textbf{derived})\textbf{concept}.

When the return body of a definition is a constraint binding, if \texttt{Constraint}($R$) is a boolean clause or null, the definition is called a \textbf{constraint definition}; if \texttt{Constraint}($R$) is a reference clause or null, the definition is called a \textbf{construction definition}.

  • The formulation Definition(intersection($l$::Line, $m$::Line), [$A$::Point where and(incident($A$, $l$), incident($A$, $m$))], intersect($l$, $m$)) means that the intersection point of two lines $l$ and $m$ is defined as a point $A$ such that $A$ is on $l$ and $A$ is on $m$, and the nondegeneracy condition is that the two lines $l$ and $m$ intersect.
  • The formulation Definition(completequadrilateral($A$::Point, $B$::Point, $C$::Point, $D$::Point, $E$::Point, $F$::Point), [configuration($E$ := intersection(line($A$, $B$), line($C$, $D$)), $F$ := intersection(line($A$, $C$), line($B$, $D$)))], null) means a complete quadrilateral is defined as a configuration such that $E$ is the intersection point of line $AB$ and line $CD$ and $F$ is the intersection point of line $AC$ and line $BD$, and there is no nondegeneracy condition.

A\textbf{script} is a collection of definitions, formulated in the form of Script$(D_1,\cdots,D_n)$ ($0\leq n$), where each $D_i~(1\leq i\leq n)$ is a definition and $n$ is finite. We have created a script with formalized constraint definitions and a script with formalized construction definitions.

Formalization of propositions

A\textbf{proposition} is formulated in the form of Proposition$(N, T, H, G)$, where $N$ is the name of the proposition; $T$ is the type of the proposition, i.e., "Assertion", "Lemma", "Theorem", "Corollary", or "Conjecture"; $H$ is a configuration, called the \textbf{hypothesis} of the proposition; and $G$ is a boolean clause or in the form of $\alpha_1<=>\alpha_2$ (where $\alpha_1$ and $\alpha_2$ are both boolean clauses), called the \textbf{conclusion} of the proposition. All variables existing in $G$ should exist in $H$.

  • Simson's theorem is stated as "the feet of the perpendiculars from a point to the sides of a triangle are collinear if and only if the point lies on the circumcircle" in the textbook. It is formalized to Proposition(Simson's theorem, Theorem, configuration($A$ := point(), $B$ := point(), $C$ := point(), $D$ := point()), incident($D$, circumcircle(triangle($A$, $B$, $C$))) $<=>$ collinear(foot($D$, line($A$, $B$)), foot($D$, line($A$, $C$)), foot($D$, line($B$, $C$))));
  • Pappus's theorem is formalized to Proposition(Pappus, Theorem, configuration(declare($C$::Point, $F$::Point, $P$::Point, $Q$::Point, $R$::Point), $A$ := point(), $B$ := point(), $D$ := point(), $E$ := point(), give(Pappus($A$, $B$, $C$, $D$, $E$, $F$, $P$, $Q$, $R$))), collinear($P$, $Q$, $R$)).
Formalization of problems

A\textbf{problem} is formulated in the form of Problem$(N, T, H, O)$, where $N$ is the name of the problem; $T$ is the type of the problem, i.e., "Compute", "Prove", "Locus", or "Construct"; $H$ is a configuration, called the\textbf{hypothesis} of the problem; and $O$ is called the\textbf{objective} of the problem.

\begin{enumerate} \item If $T$ is "Compute", "Locus", or "Construct", then $O$ is an instance of an entity concept or the reference of the instance; \item If $T$ is Prove", then $O$ is a boolean clause or in the form of $\alpha_1<=>\alpha_2$ (where $\alpha_1$ and $\alpha_2$ are both boolean clauses). \end{enumerate}

All variables existing in $O$ should exist in $H$.

  • A computation problem "given a triangle, let the length of one side be $a$ and the length of its height be $h$, calculate the area of the triangle" is formalized to Problem(Area, Compute, configuration(declare($A$::Point, $B$::Point, $C$::Point), give(triangle($A$, $B$, $C$)), $a$ := length(side($A$, $B$)), $h$ := height($C$, side($A$, $B$))), area(triangle($A$, $B$, $C$)));
  • A proving problem "the two diagonals of a parallelogram bisect each other" is formalized to Problem(Parallelogram, Prove, configuration(declare($D$::Point), $A$ := point(), $B$ := point(), $C$ := point(), give(parallelogram($A$, $B$, $C$, $D$)), $O$ := intersection(line($A$, $C$), line($B$, $D$))), and(equal(length(segment($O$, $A$)), length(segment($O$, $C$))), equal(length(segment($O$, $B$)), length(segment($O$, $D$))))).
Formalization of statements with "set"

To extend the expressibility of GDL, we introduce the following formulations.

\begin{enumerate} \item $\{V_1;\cdots;V_n\}:=\{\alpha_1;\cdots;\alpha_n\}$ ($1\leq n$) is a (\textbf{set})~\textbf{clause}, if each $V_i:=\alpha_i~(1\leq i\leq n)$ is a reference clause and $n$ is a finite number; \item $\{V_1;\cdots;V_n\}:=I$ ($1\leq n$) is a (\textbf{set})~\textbf{clause}, where $V_i~(1\leq i\leq n)$ is a variable, $n$ is a limited number, and $I$ is an instance of an entity concept; \item $f(\{\alpha_1;\cdots;\alpha_n\},\cdots,\{\beta_1;\cdots;\beta_n\})$ ($1\leq n$) is a (\textbf{set})~\textbf{instance} of a concept, if $f(\alpha_i,\cdots,\beta_i)$ $(1\leq i\leq n)$ is an instance of the concept; \item $\{R_1;\cdots;R_n\}$ ($2\leq n$) is a (\textbf{set})~\textbf{return body}, where each $R_i~(1\leq i\leq n)$ is a return body and $n$ is a finite number. \end{enumerate}
  • {$A;B;C$} := {point(); point(); point()} is a (set) clause, meaning $A$, $B$, and $C$ are arbitrary points;
  • {$D;E;F$} := vertex(Napoleontriangle(triangle($A,B,C$))) is a (set) clause, meaning $D$, $E$, $F$ are the three vertices of Napoleon triangle of triangle $ABC$;
  • collinear(midpoint(diagonal(completequadrilateral($A,B,C,D,E,F$)))) is a (set) instance, meaning the midpoints of the three diagonals of a complete quadrilateral are collinear;
  • Definition(diagonal(completequadrilateral($A$::Point, $B$::Point, $C$::Point, $D$::Point, $E$::Point, $F$::Point)), {[segment($A$, $D$)]; [segment($B$, $C$)]; [segment($E$, $F$)]}, null) means that a complete quadrilateral $ABCDEF$ has three diagonals, which are segment $AD$, segment $BC$, and segment $EF$.

If a formulation is in the form of $\{\alpha_1;\cdots;\alpha_n\}$ ($1\leq n$) where $n$ is a finite number, then each $\alpha_i~(1\leq i\leq n)$ is called a\textbf{component}.

In addition, a formulation in the form of choosediff$(\{A_1;\cdots;A_n\},m)$ ($1\leq n$, $1\leq m\leq n$) is an (\textbf{operation})\textbf{instance}, meaning to arbitrarily select $m$ different components from $\{A_1;\cdots;A_n\}$. For instance, line(choosediff{$A;B;C$}, 2)) means an arbitrary line of line $AB$, line $AC$, and line $BC$.