The query and representation languages of WebKB-2


Table of contents                               (Click here to see the example file we refer to)

  1. The FS (For Structuration) language of commands
  2. The FC (For Control) sub-language
  3. The FL (For-Links) or FT (For Taxonomies/Terms) sub-language
  4. The FCG (For Conceptual Graphs / Frame-CG) sub-language
  5. The FE (Formalized English) sub-language



The FS (For Structuration) language of commands

In FS, commands are separated by semicolons; more precisely, each command should end with a semicolon.
In EBNF: Commands := (Command ";")+         // "+" means "at least once"
(In the "run" mode, the semicolon after the last command may be omitted, but relying on this is not recommended.)

If commands are stored in a file (instead of being sent through the interfaces or as GET/POST parameters to the WebKB-2 CGI servers), they should be enclosed within the HTML-style tags <KR> and </KR> to isolate them from the rest of the file. Here is an example.

<html><head><title>An HTML document with text and commands in it</title></head>
<body>Hello
<!-- this text is not displayed by an HTML browser
<KR>print "Execution of a 1st print command"; /*multi-line kind of comment*/
print "Execution of a 2nd print command"; //single line kind of comment
</KR>
-->
<p>Below is a command
<p><KR>print "Execution of a 3rd print command";</KR>
</body></html>

Thus, statements or queries can be mixed with text or other document elements. To make WebKB-2 access the file and execute/interpret the commands in it, use the command "run", "load" or "display" with the absolute URL of the file as argument. WebKB-2 generates a file to present the results. With "load", only the results are presented (this is the "load" execution mode). With "run", the text of each command is displayed before its result (this is the "run" execution mode). With "display", the results (not the commands) and the text outside the command parts are presented; in other words, the file is copied to the output with each query replaced by its result.
Given these three modes and the possibility of calling the WebKB-2 server from a program or hyperlink, it is easy to create complex "virtual documents" (documents with variable content).
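
As a rough illustration of how the command parts of such a file can be isolated, here is a minimal Python sketch that extracts the text between <KR> and </KR>. It is not WebKB-2's own parser: the name extract_kr_blocks is hypothetical, and the sketch simply scans the whole text, so blocks placed inside HTML comments are found too.

```python
import re

# Hypothetical helper, not part of WebKB-2: returns the command text found
# between <KR> and </KR>, including blocks hidden inside HTML comments.
KR_BLOCK = re.compile(r"<KR>(.*?)</KR>", re.DOTALL)

def extract_kr_blocks(document: str) -> list[str]:
    return [block.strip() for block in KR_BLOCK.findall(document)]

SAMPLE = """<body>Hello
<!-- this text is not displayed by an HTML browser
<KR>print "Execution of a 1st print command";
print "Execution of a 2nd print command";
</KR>
-->
<p><KR>print "Execution of a 3rd print command";</KR>
</body>"""

blocks = extract_kr_blocks(SAMPLE)
print(len(blocks))  # number of command parts found in SAMPLE
```

A real implementation would also have to skip FS comments and strings that happen to contain "</KR>"; this sketch ignores those cases.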

From within a file, in addition to "run", "load" and "display", you can use "include" which accesses and executes a file without changing the current execution mode. The execution mode can also be modified via the commands "run mode", "load mode" and "display mode".

These file loading commands are part of the "control commands" (FC sub-language). Here are the various kinds of commands.

Command := Control_command | Term_command | FCG_command

These commands are discussed in the next sections. In this document, "term" means formal term, i.e. a category identifier.



The FC (For Control) sub-language

Here is the EBNF grammar of FC, i.e. the control commands of FS ("//" introduces an in-line comment).

Control_command:= "load" URL | "run" URL | "display" URL | "include" URL
                | "load mode" | "run mode" | "display mode"
                | "no storage" | "storage" | "use names" | "no names"
                | "unprefixed variables" | "prefixed variables"
                | "default creators:" UserId* | "print" (Term|String|Number)*
                | "trace" | "no trace"
                | Variable ":=" Expression
                | "if" "(" Expression ")" Command_block "else" Command_block
                | "while" "(" Expression ")" Command_block

Command_block  := "{" Commands "}"
Expression     := "(" Expression ")" | "-" Expression
                | Expression "+" Expression  | Expression "-" Expression
                | Expression "*" Expression  | Expression "/" Expression
                | Expression "<" Expression  | Expression "<=" Expression
                | Expression ">" Expression  | Expression ">=" Expression
                | Expression "!=" Expression | Expression "==" Expression
                | Variable ":=" Expression   | Value
Value          := String | Number | Date | Time
Variable       := ("*"|"?"|"@")(Identifier|Number)
UserId         := Identifier


//Lexical rules (except for the self-explanatory "...", the notation of Lex
//  is adopted: no space between successive components of a same token,
//  {} to delimit non-terminals, [] for alternatives;
//  white spaces and the HTML non-breaking space encoding "&nbsp;" are ignored;
//  Java/C++ comments ("/* ... */" and "//...") are ignored;
//  HTML tags are ignored but the content of HTML comments is parsed):

Annotation   := "$(" ...")$"
String       := '"' ... '"'  |  "'" ..."'"  |  "$(" ... ")$"
Number       := ("+"|"-")?{Digit}+("."{Digit}*)?
Digit        := [0-9]
Date         := [0-9][0-9]"/"[0-9][0-9]"/"[0-9][0-9][0-9][0-9]
CreationDate := [0-9][0-9]?"/"[0-9][0-9]?"/"[0-9][0-9][0-9][0-9]
Uri          := "://"[A-Za-z0-9_\-/.~#%$@?&+=]+
Term         := {Uri} | {TermLetter1}{TermLetter}*
TermLetter1  := [a-z0-9_] | "#"[a-z0-9_.]
TermLetter   := [A-Za-z0-9_\-/.~#%$@?&|\'] | "\\"{ANY}
//The rules Number and Term overlap but the rule Number has precedence

Most of these commands are classic and self-explanatory. Click here to understand the use of the commands "default creators:", "use names" and "no names" (these three commands only affect the interpretation of graphs (FCG/FE)).
Variables can occur anywhere in a command: they are expanded before that command is parsed. The command "prefixed variables" specifies that variables begin with '$' (except on the left side of the operator ":="). With the opposite command, "unprefixed variables", parsing is slower since WebKB-2 has to check whether each identifier is the name of a variable.
The command "no storage" tells WebKB-2 not to commit changes when it has finished executing commands for a user. Changes are also not committed when an error has been detected during parsing or execution.
Click here for examples.
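
The expansion of variables before parsing can be pictured as a textual substitution. The Python sketch below assumes the "prefixed variables" mode and assumes that a variable use is written as '$' followed by an identifier; the function expand_variables and the way values are stored are hypothetical, not WebKB-2's implementation.

```python
import re

# Assumption: in "prefixed variables" mode a variable use is written "$name".
# "$(" (the start of a "$(" ... ")$" string) is not matched by this pattern.
VARIABLE_USE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def expand_variables(command: str, values: dict[str, str]) -> str:
    """Replace each known $name by its value; leave unknown names untouched."""
    def substitute(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))
    return VARIABLE_USE.sub(substitute, command)

expanded = expand_variables("print $greeting;", {"greeting": '"Hello"'})
print(expanded)
```

In "unprefixed variables" mode the same substitution would have to test every identifier against the table of variables, which is consistent with the slower parsing mentioned above.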



The FL (For-Links) or FT (For Taxonomies/Terms) sub-language

FT is an older name for FL: FT was extended and renamed, but the grammar below has not yet been updated and is still only the grammar of FT. Reminder: "term" means formal term, i.e. a category identifier; "taxonomy" refers to the set of all links between categories. The main links with their reserved characters in FT are: subtype (<), instance (:), simple_exclusion (!), closed_exclusion (/), inverse (-), equal (=), closely similar (~), location (l), member (m), substance (s), spatial_part/subprocess (p), object (o), url (u) and tool/technique (t). The characters for their inverse links are respectively  >, ^, !, /, -, =, ~, L, M, S, U and T.
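
For illustration, the link characters listed above can be kept in a lookup table and used to assemble a declaration of the shape given by the Term_declaration rule of the grammar in this section (a term followed by comma-separated groups of links of one kind). This Python sketch is only a string builder under that assumption: declare_term and the category names "cat", "feline" and "house" are hypothetical, and the output has not been checked against a WebKB-2 server.

```python
# Reserved characters of the main FT links, as listed in the text above.
LINK_CHARS = {
    "subtype": "<", "instance": ":", "simple_exclusion": "!",
    "closed_exclusion": "/", "inverse": "-", "equal": "=",
    "closely_similar": "~", "location": "l", "member": "m",
    "substance": "s", "spatial_part/subprocess": "p", "object": "o",
    "url": "u", "tool/technique": "t",
}

def declare_term(term: str, links: dict[str, list[str]]) -> str:
    """Build 'Term Link_kind Dest+ ("," Link_kind Dest+)*' ended by ';'."""
    groups = [f"{LINK_CHARS[kind]} {' '.join(dests)}"
              for kind, dests in links.items()]
    return f"{term} " + ", ".join(groups) + ";"

command = declare_term("cat", {"subtype": ["feline"], "location": ["house"]})
print(command)
```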

The following commands permit the addition, removal and querying of categories, category names and links between categories.
Click here for examples. You may also see the Bison/Yacc part and the Flex/Lex version part of our FT (and FS) parser.

Term_command      := Term_declaration | Term_query | Term_removal
                   | Link_removal | Link_modification


Term_declaration  := Term Relation_signature? Annotation? CreationDate?
                          (Links_of_1_kind ("," Links_of_1_kind)*)?
Relation_signature:= "(" (Term Cardinality1? ("," Term Cardinality1?)*)?
                                            ("->" Term Cardinality1?)?   ")"
                   | "(" "->" ")"
Cardinality1      := "[" Cardinality "]"
Cardinality       :=  Number (".." NumOrStar)?
NumOrStar         := Number | '*'

Links_of_1_kind   := Link_kind (Destination_term | Term_name | Term_partition)+
Link_kind         := ">"|"<"|"^"|":"|"="|"~"|"!"|"/"|"-"|"_"| AlphaNumChar
                   | Identifier (":" | "=>")
Term_partition    := ClosedPartition | OpenPartition
ClosedPartition   := "{(" Destination_term+ ")}"
OpenPartition     := "{"  Destination_term+ "}"
Destination_term  := Term Cardinality2? Link_context?
                   | Cardinality Categ Link_context?
Cardinality2      := "[" (Number "..")?NumOrStar ',' (Number "..")?NumOrStar "]"
Link_context      := "(" Creator Community? Date? ")"


Term_query        := "?" Term ("("UserId")")?
                   | "? userID" UserId  | "e-mail" (UserId | Link_kind Term)
Term_removal      := "del" Term
Link_removal      := "del" Term Link_kind Identifier
Link_modification := "mod" "(" Term Link_kind Identifier ")"
                           "(" Term Link_kind Identifier ")"


Term         := Identifier
Term_name    := Identifier

//Additional lexical rules:
Annotation   := "(^" ... "^)"
AlphaNumChar := [a-zA-Z0-9]
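
The Cardinality1 rule of the grammar above ("[" Number (".." NumOrStar)? "]") can be read with a short Python sketch. Two assumptions are made here and are not stated by the grammar: only unsigned integers are handled (Number is broader), and a single number such as "[2]" is interpreted as both the minimum and the maximum. The function name parse_cardinality1 is hypothetical.

```python
import re

# Assumption: only unsigned integers are handled, although Number is broader.
CARDINALITY1 = re.compile(r"\[\s*([0-9]+)\s*(?:\.\.\s*([0-9]+|\*))?\s*\]")

def parse_cardinality1(text: str):
    """Return (minimum, maximum); maximum is None for '*' (unbounded)."""
    match = CARDINALITY1.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a Cardinality1: {text!r}")
    low = int(match.group(1))
    high = match.group(2)
    if high is None:
        return (low, low)  # assumed reading of a single number
    return (low, None if high == "*" else int(high))

print(parse_cardinality1("[0..*]"))
print(parse_cardinality1("[1..3]"))
```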



The FCG (For Conceptual Graphs / Frame-CG) sub-language

FCG and FE are notations for writing Conceptual Graphs (i.e. statements) and commands using them as parameters. In our opinion, they are more expressive/readable alternatives to the Conceptual Graph Linear Format (CGLF) and the Conceptual Graph Interchange Format (CGIF). For example, FCG and FE accept various forms of quantification (such as "at least 3% of") that are difficult to represent in CGLF and CGIF; the order of concept nodes in a graph may be used to resolve quantifier scope ambiguities; and names may be used instead of category identifiers when WebKB-2 can exploit the semantic constraints in the graph to retrieve the relevant category identifiers.
Click here to read why we have chosen and extended the Conceptual Graph formalism.
Query retrieval may be done using query graphs expressed in FCG. The command "spec" permits WebKB-2 to retrieve specializations of the graph given in the parameter. Click here for details on the "?" command and planned extensions to the current implementation. Click here for examples. You may also see the Bison/Yacc part and the Flex/Lex version part of our FCG parser.

FCG_command  :=  FCG | "spec" FCG  | "?" FCG  | "??" FCG 
              | "delGraph" Identifier  | "delGraphs"

FCG          := ("!"|"~")? "[" Node Branches? "]" Context?   // "!" = "~" = "not"
              | "[if" FCG "then" FCG ("else" FCG)? "]"

Context      := "(" UserId CreationDate? ")" | "(" CreationDate ")"

Branches     := ("," Branch  | Comp_rel Node)+
Branch       := Path_specif? Relation Node
Comp_rel     := "=>" | "<=>" | "<=" | "=" | "!=" | "<" | "=<" | ">" | ">=" 

Path_specif  := Path_term ("|" Path_term)*
Path_term    := Path_factor Path_factor*
Path_factor  :=  "(" Relation Node? ")" Count
              |  "(" Path_specif ")" Count
Count        := "?" | "*" | Number? "+" | Number

Relation     := RelModality? HasFor_or_Is? (RelationType|Variable)
                         "of"? Annotation? Context?  (":" | "<=")
              | "?" ":"
RelModality  := "may" | "can" | "able to"
HasFor_or_Is := "has for" | "have for" | "is" | "are" | "be"  //syntactic sugar
RelationType := Term
Variable     := ("*"|"?"|"@")(Term|Number)

Node         := NodeExpr Annotation? 
              | "(" NodeExpr Annotation? Branches? ")"

NodeExpr     := "(" NodeExpr ")"   |  "-" NodeExpr 
              | NodeExpr ("+"|"-"|"*"|"/"|"mod") NodeExpr
              | NodeCore

NodeCore     := VAR/Indiv/Fct (Quantifier  Restrictors)? FCQ?
              |                Quantifier  Restrictors   FCQ?
              | GroupOf        Quantifier? Restrictors   Collection? "?"?
              | GroupOf        Quantifier?               Collection  "?"?
              | TermDef | Numbers | Dates | Periods |    FCQ
FCQ          := FCG | Collection | "?"
VAR/Indiv/Fct:= Variable | String | Indiv_or_fct TypeGuess? TypeConstraint?
TypeGuess     := "\\=" NodeType
TypeConstraint:= "\\" NodeType
NodeType      := Term
Indiv_or_fct := Term                     //individual, e.g. #London
              | Term "(" FctParams* ")"  //function call or non-binary relation
FctParams    := NodeCore (","NodeCore)?
TermDef      := ("type"|"relation") Term DefParams   (":="|":=>"|":<=") FCG
              | "function"          Term DefParams ":->" Term? Var ":=" FCG
DefParams    := "(" (Term? Var ("," Term? Var)+)? ")"

Quantifier   := ExistQuantif | UnivQuantif | NumericQuant "of"? "the"?
ExistQuantif := "some" | "the" | "there is"? ("a"|"an")
UnivQuantif  := "any" | "every" | "all" | "a typical" | "most" "of"? "the"? 
NumericQuant := "about"? Number "%"? 
              | "at" "least" Number "%"? | "at" "most"  Number "%"? 
              | "between" Number "%"? "and" Number "%"? 
              | "from"?   Number "to"  Number "%"? 
              | "mostly"|"several"|"a few"|"few"
              | "dozens"|"hundreds"|"thousands"|"millions"|"billions"

Numbers      :="about"? Number  |  "at" "least" Number  |  "at" "most" Number
              | "between" Number "and" Number  |  "from"? Number "to" Number
Dates        :="about"? Date    |  "at" "least" Date    |  "at" "most" Date
              | "between" Date "and" Date      |  "from"? Date "to" Date
Periods      :="about"? Period  |  "at" "least" Period  |  "at" "most" Period
              | "between" Period "and" Period  |  "from"? Period "to" Period

Restrictors  := (Order? | Order "occurence of") Qualifier? NodeTypes
                                               VarOrIndiv? NodeTypes*
Qualifier    := "good"|"bad" | "important"|"small"|"big"|"great"
NodeTypes    := "!"? Term | '(' NodeTypes (Branches | '|' NodeTypes)? ')'

GroupOf      := ExistQuantif ("group of" | "bag of" |
                              "set of"|"sequence of"|"alternatives")
              | "together"
Collection   := ("BAG"|"SET"|"LIST"|"SEQ"|"XOR"|"OR"|"AND")? (OpenColl|ClosedColl)
OpenColl     := "{"  Elements  "}" CollSize?
ClosedColl   := "{(" Elements ")}" CollSize?
Elements     := Element (","  Element)*
OR_Elems     := Element ("or" Element)*
AND_Elems    := Element ("and" Element)*
Element      := Node | "*"
CollSize     := "@" Number


//Additional lexical rule:
Order        := {Digit}+("st"|"nd"|"rd"|"th")
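
To show the simplest shape allowed by the FCG rule above (a bracketed node followed by comma-separated branches, each made of a relation, a colon and a node), here is a small Python string builder. It is a sketch derived from the grammar only: fcg_statement is a hypothetical helper, the terms "cat", "place", "mat", "agent" and "hunt" are invented for the example, and the result has not been checked against a WebKB-2 server.

```python
def fcg_statement(node: str, branches: list[tuple[str, str]]) -> str:
    """Assemble '[' Node (',' Relation ':' Node)* ']' as a string."""
    parts = [node] + [f"{relation}: {target}" for relation, target in branches]
    return "[" + ", ".join(parts) + "]"

# Each node is "Quantifier Term"; "of" after a relation type is optional.
graph = fcg_statement("some cat", [("place", "a mat"), ("agent of", "a hunt")])

# FCG_command := "spec" FCG, and every FS command ends with a semicolon.
query = "spec " + graph + ";"
print(graph)
print(query)
```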

Some keywords are equivalent: 1) the existential quantifier keywords (see ExistQuantif), 2) "any", "every" and "all", 3) "a typical" and "most" "of"? "the"?, 4) "able to" and "can". In each case, the first keyword listed is to be preferred ("some", "any" and "a typical" because they allow singular nouns to be used instead of plural nouns, and "able to" because its meaning is more obvious than that of "can"). Other equivalences are: 1) "several", "many" and "at least 1", 2) "most" and "at least 50%", 3) "group of" and "bag of".
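
The first four equivalence groups above, with their preferred keyword, can be summarized as a lookup table. This Python sketch only maps a single keyword to the preferred member of its group; the name prefer is hypothetical, the optional "there is" prefix and the "of"/"the" suffixes are not handled, and the remaining equivalences are left out because the text states no preference for them.

```python
# Preferred keyword for each equivalence group stated in the text above.
PREFERRED = {
    "the": "some", "a": "some", "an": "some",   # existential quantifiers
    "every": "any", "all": "any",               # universal quantifiers
    "most": "a typical",
    "can": "able to",
}

def prefer(keyword: str) -> str:
    """Return the preferred equivalent of a keyword, or the keyword itself."""
    return PREFERRED.get(keyword, keyword)

for keyword in ("every", "can", "some"):
    print(keyword, "->", prefer(keyword))
```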



The FE (Formalized English) sub-language

Apart from more verbose syntactic sugar for connecting concepts (nodes) to relations, FE is identical to FCG. Below is the grammar for asserting and retrieving an FE statement. This grammar is not yet implemented in WebKB-2 (it is in WebKB-1; click here for examples, queries and Bison/Flex grammars).

FE_command   :=  FE | "Is there" FE

FE           := (Tree ("."|"?"))+

Tree         := Concept Branches*
QuotedTree   := "~"? "`" Tree "'" Context?
Context      := "(" Branches2 ")"

Branches     := With  Relation1 Tree (And With? Relation Tree)*
              | With? Relation2 Tree (And With? Relation Tree)*
              | "is"("a"|"an")Tree (And "is"("a"|"an")Tree)*
Branches2    := With? Relation Tree (And With? Relation Tree)*
              | "is"("a"|"an") Tree (And "is"("a"|"an") Tree)*
With         := ("with"|"at"|"has for"|"have for"|"for"|"is"|"are"|
                 ("can"|"may")("be"|"have for")) "the"?
And          := "and" | ","

Relation     := Relation1 | Relation2
Relation1    := (RelationType|Variable) "of"? Annotation? Context? "<="?
Relation2    := ("=>" | "<=>" | "<=" ) Concept
              | ("=" | "!=" | "<" | "=<" | ">" | ">=" | "or") Concept
RelationType := Term
Variable     := ("*"|"?"|"@"|"^")(Term|Number)

Node         := NodeCore Annotation?

NodeCore     := VARorIndiv (Quantifier  Restrictor)?            FCQ?
              |              Quantifier  Restrictor VARorIndiv? FCQ?
              | GroupOf      Quantifier? Restrictor Variable?  Collection?
              | GroupOf      Quantifier?            Variable?  Collection
              | (Number | Date |                    VARorIndiv  FCQ? | FCQ)
VARorIndiv   := Variable | Individual | String
FCQ          :=  QuotedTree | Collection

Restrictor   := Qualifier? NodeType 
              | Qualifier? "[" NodeType Branches "]"
NodeType     := Term
Qualifier    := "good"|"bad" | "important"|"small"|"big"|"great" | "certain"

Quantifier   := ExistQuantif | UnivQuantif
              | "about" Number "%"? "of"? "the"?
              | "at least" Number "%"? "of"? "the"?
              | "at most"  Number "%"? "of"? "the"?
              | "between" Number "%"? "and" Number "%"? "of"? "the"?
              |           Number "to"  Number "%"? "of"? "the"?
              | "from"    Number "to"  Number "%"? "of"? "the"?
              | "mostly" | "several" "of"? "the"? | "a few" "of"? "the"?
              | Number "%"? "of"? "the"?
              | ("many"|"few"|"dozens"|"hundreds"
                       |"thousands"|"millions"|"billions") "of"? "the"?

ExistQuantif := "some" | "the" | "there is"? ("a"|"an")
UnivQuantif  := "any" | "every" | "all" | "a typical" | "most" "of"? "the"? 

GroupOf      := ExistQuantif ("group of" | "bag of" |
                              "set of"|"sequence of"|"alternatives")
              | "together"
Collection   := ("BAG"|"SET"|"LIST"|"SEQ"|"XOR"|"OR"|"AND")? (OpenColl|ClosedColl)
OpenColl     := "{"  Elements  "}" CollSize?
ClosedColl   := "{(" Elements ")}" CollSize?
Elements     := Element ("," Element)*
Element      := Node | "*"
CollSize     := "@" Number



Philippe A. MARTIN