Indexation of Web documents

Dr. Philippe A. MARTIN


Below is a set of Conceptual Graphs (CGs), each of which links a document to concepts representing its content. This kind of indexation differs from the two other forms illustrated in the files interviewIndexation.html and images/clubMed/index.html: here, the URLs of the indexed documents are used in the CGs (as referents of concepts of type File).

This document is loadable (executable) and searchable by WebKB. For example, open the WebKB Knowledge-based Information Retrieval/Handling Tool, paste the following three lines in the query area, and click the Submit button.

load ../../kb/webkb1/documentsIndexation.html;
spec [Conference]->(S_Location)->[City]->(In)->[Country:Australia];
spec [%Java];

Section 1 shows the ontology used in the CGs. However, not all the terms in the CGs need to be declared: WebKB exploits the signatures of relations to type undeclared terms. The command "no decl" (see Section 2) tells WebKB to expect undeclared terms; without it, WebKB generates an error message for each of them. It is wise to use this command only after making sure that all WebKB error messages are related to undeclared terms.
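
The typing of undeclared terms from relation signatures can be pictured with the following Python sketch. It is only an illustration of the idea, not WebKB's implementation: the SIGNATURES table copies three signatures from Section 1.2, while the function name infer_types, the flat triple encoding of a CG, and the declared set are assumptions made for the example. The terms are treated here as if they carried no explicit type.

```python
# Illustrative sketch only; not WebKB's actual algorithm.
# Signatures copied from Section 1.2: relation -> (source type, target type).
SIGNATURES = {
    "Theme": ("Conference", "Process"),
    "S_Location": ("Thing", "Spatial_entity"),
    "DescrIn": ("Thing", "Description_container"),
}

def infer_types(triples, declared):
    """Give each undeclared term the type required by the relation using it.

    `triples` is a hypothetical flat encoding of a CG: (source, relation, target).
    `declared` is the set of terms that already have a declaration.
    """
    inferred = {}
    for source, relation, target in triples:
        source_type, target_type = SIGNATURES[relation]
        for term, expected in ((source, source_type), (target, target_type)):
            if term not in declared:
                # Keep the first type found; a real system would have to
                # reconcile several candidate types.
                inferred.setdefault(term, expected)
    return inferred

triples = [
    ("ICPR_98", "Theme", "Pattern_recognition"),
    ("ICPR_98", "S_Location", "Brisbane"),
]
print(infer_types(triples, declared=set()))
```

In this sketch, a term that is already in the declared set is simply skipped, which mirrors the idea that signatures are only consulted for terms lacking a declaration.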


1  Ontology

1.1  Declaration of top-level concept types

Thing > (Entity Situation);
  Entity > (Temporal_entity Spatial_entity Information_entity);
    Temporal_entity > Date;
    Spatial_entity > (Space_location Physical_entity Imaginary_spatial_entity)
                     Collection;
      Space_location > City Country;
      Physical_entity > Animal Hardware;
        Animal > Person;
          Person > Microsoft_bogan;
        Hardware > Network SGI_hardware;
      Collection > Organization;
        Organization > Bank Firm;
    Information_entity > (Description Description_container Property Property_measure);
      Description > Description_medium Procedure;
        Description_medium > (Symbol Syntax Language Abstract_data_type);
          Abstract_data_type > Structured_ADT;
            Structured_ADT > Graph_ADT;
              Graph_ADT > Conceptual_Graph_ADT;
          Language > (C Lisp Javascript Conceptual_Graph_language HTML);
            Conceptual_Graph_language > CG_linear_notation;
        Procedure > Protocol;
          Protocol > Network_protocol;
            Network_protocol > HTTP_protocol FTP_protocol Telnet_protocol;
              HTTP_protocol > CGI_protocol;
      Description_container > File Image Hologram;
        File > File_in_special_format Special_file Unretrievable_file;
          File_in_special_format > (Textual_file Binary_file);
            Textual_file > Plain_text_file Structured_text_file Encoded_text_file;
              Structured_text_file > HTML_file;
              Encoded_text_file > Postscript_file;
            Binary_file > GIF_file;
          Special_file > Software Documentation Repository;
            Software > Freeware Web_search_engine;
            Documentation > Tutorial Lecture Reference Model;
            Repository > Web_index Yellow_pages Home_page Cookie;
               Web_index > Web_index_and_search_engine;
  Situation > (State Process) Problem;
    Process > Event Problem_solving_process;
      Event > Conference;
      Problem_solving_process > Diagnostic Modification Design;
        Design > Software_development;
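
A query such as "spec [Conference]" depends on the subtype links declared above. The Python sketch below shows one way to test whether a type specializes another. It copies only a fragment of the hierarchy, and the SUPERTYPE table and the is_subtype function are names assumed for the illustration; they are not part of WebKB.

```python
# Fragment of the hierarchy in Section 1.1, stored as child -> parent.
# A single parent per type is a simplification made for this sketch.
SUPERTYPE = {
    "Entity": "Thing",
    "Situation": "Thing",
    "Process": "Situation",
    "Event": "Process",
    "Conference": "Event",
    "Information_entity": "Entity",
    "Description_container": "Information_entity",
    "File": "Description_container",
}

def is_subtype(candidate, ancestor):
    """Return True if `candidate` equals `ancestor` or lies below it."""
    current = candidate
    while current is not None:
        if current == ancestor:
            return True
        # Climb one level; .get() yields None once the root is passed.
        current = SUPERTYPE.get(current)
    return False

print(is_subtype("Conference", "Situation"))  # True
print(is_subtype("File", "Situation"))        # False
```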


/* Here are N-N relations between some of these concepts:
[Thing]-
  { (Descr)->[Description];                //Anything may be abstracted/represented
    (DescrIn)->[Description_container];
  }
[Situation]-
  { (S_Succ)<-[Situation];                 //a situation follows a situation
    (S_Succ)->[Situation];                 //and is followed by a situation
    (Descr)->[Description];                //it may be abstracted/represented
    (DescrIn)->[Description_container]; 
  }
[Process]-
  { (S_Succ)<-[Situation];                 //a process ends a situation
    (S_Succ)->[Situation];                 //and creates a new one
    (Descr)->[Description];                //it may be abstracted/represented/described
    (Agent)->[Entity];                     //it may have agents
    (Method)->[Procedure];                 //an agent may apply a procedure
    (SubProcess)->[Process];               //a procedure may generate sub-processes
    (Result)->[Thing];                 //an entity or a process may be the result
  }
[Description]-
  { (Descr)<-[Thing];                  //a description abstracts a situation
    (Statement)->[Description_medium];     //use symbols or a language
    (Author)->[Entity];                    //it has at least one author
    (InD)->[Description_container]->(For)->[Entity];
                               //it is presented in a container for some entities
    (Material_support)->[Physical_entity]; //it is stored on a physical support
  }
[Software]-
  { (DescrIn)<-[Situation];           //software contains state/process descriptions
    (Norm)->[Language];               //it follows at least one norm: a language
    (Material_support)->[Hardware];   //and it is stored and executed on hardware
  }
*/

1.2  Declaration of top-level relation types

DescrIn (Thing, Description_container);
S_Location (Thing, Spatial_entity);
T_Location (Situation, Temporal_entity);
In (Spatial_entity, Spatial_entity);
Theme (Conference, Process);
Part (Entity, Entity);
Agent (Process, Entity);
Result (Process, Thing);
Material_support (Information_entity, Physical_entity);
For (Description_container, Entity);
InD (Description, Description_container);
Norm (Description, Description_medium);



2  Indexations

no decl;  //WebKB is warned that the following CGs may contain undeclared terms

[File: http://www.cssip.elec.uq.edu.au/~icpr98];

// http://www.cssip.elec.uq.edu.au/~icpr98/ :
[Conference: ICPR_98]- { (Theme)->[Process: Pattern_recognition];
                         (S_Location)->[City:Brisbane]->(In)->[Country:Australia];
                         (T_Location)->[Date:"16-20/8/1998"];
                         (DescrIn)->[File: http://www.cssip.elec.uq.edu.au/~icpr98];
                       };

// http://web.archive.org/web/20011201230540/http://www.cit.gu.edu.au/conferences/ai98/ :
[Conference: AI_98]- { (Theme)->[Process: Artificial_intelligence];
                       (S_Location)->[City:Brisbane]->(In)->[Country:Australia];
                       (T_Location)->[Date:"13-17/7/1998"];
                       (DescrIn)->[File: http://web.archive.org/web/20011201230540/http://www.cit.gu.edu.au/conferences/ai98/];
                     };

// http://www.hic.org.au/ :
[Conference: AI_98]- { (T_Location)->[Date:"1999"];
                       (DescrIn)->[File: http://www.hic.org.au/];
                     };

// http://cwis.kub.nl/~fdl/research/ti/Docs/CMC/ :
[Conference: CMC_98]- { (Theme)->[Process: Cooperative_multimodal_communication];
                        (S_Location)->[City:Eindhoven]->(In)->[Country:Netherlands];
                        (T_Location)->[Date:"28-30/1/1998"];
                        (DescrIn)->[File: http://cwis.kub.nl/~fdl/research/ti/Docs/CMC/];
                      };

// http://www.sd.monash.edu.au/pakdd-98/ :
[Conference: PAKDD_98]- { (Theme)->[Process: Knowledge_discovery_and_data_mining];
                          (S_Location)->[City:Melbourne]->(In)->[Country:Australia];
                          (T_Location)->[Date:"15-17/4/1998"];
                          (DescrIn)->[File: http://www.sd.monash.edu.au/pakdd-98/];
                        };

// http://www.interpac.net/~eingang/CI/default.html :
[Thing]->(DescrIn)->[Unretrievable_file: http://www.interpac.net/~eingang/CI/default.html];

// http://meganesia.int.gu.edu.au/~rjcole/DataComms :
[Process: Data_communications]->(DescrIn)->[Lecture: http://meganesia.int.gu.edu.au/~rjcole/DataComms];

// http://meganesia.int.gu.edu.au/~rjcole/SoftwareDevel :
[Process: Software_development]->(DescrIn)->[Lecture: http://meganesia.int.gu.edu.au/~rjcole/SoftwareDevel];

// http://www.yahoo.com.au/ :
[Web_index_and_search_engine: http://www.yahoo.com.au/];

// http://www.yellowpages.com.au/ :
[Country:Australia]<-(S_Location)<-[Organization]->(DescrIn)->[Yellow_pages:http://www.yellowpages.com.au/];

// https://www.unicu.org.au/netteller/ :
Bank > Uni_Credit_Union;
[Country:Australia]<-(S_Location)<-[Uni_Credit_Union]->(DescrIn)->[File:https://www.unicu.org.au/netteller/];

// http://www.w3.org/TR/REC-html32 :
[HTML:HTML_3.2]->(DescrIn)->[Reference:http://www.w3.org/TR/REC-html32];

// http://www.chhs.niu.edu/graphics/ :
[Repository:http://www.chhs.niu.edu/graphics/]->(Part)->[GIF_file:{*}];

// http://reddgc.ins.gu.edu.au/ :
[HTML]->(DescrIn)->[Model: http://reddgc.ins.gu.edu.au/];

// http://freeware.sgi.com/ :
[SGI_hardware]<-(Material_support)<-[Freeware]<-(Part)<-[Repository: http://freeware.sgi.com/];

// http://www.tnt.uni-hannover.de/soft/imgproc/khoros/ :
[Repository: Khoros]-
  { (Part)->[Software: *s]<-(DescrIn)<-[Process: Information_processing_and_visualization];
    (Part)->[File]<-(DescrIn)<-[Language]<-(Norm)<-[Procedure]->(InD)->[*s];
    (DescrIn)->[Home_page: http://www.tnt.uni-hannover.de/soft/imgproc/khoros/];
  };

// http://www.khoral.com/ :
[Firm: Khoral]-
  { (Agent)<-[Software_development]->(Result)->[Repository: Khoros];
    (DescrIn)->[Home_page: http://www.khoral.com/];
  };

// http://www.amdahl.com/ext/CARP/SBCON/SBCON.html#Standards :
[Network_protocol: SBCON]->(DescrIn)->[Home_page: http://www.amdahl.com/ext/CARP/SBCON/SBCON.html#Standards];

// http://www.iaee.tuwien.ac.at/agcad/gnu_docs/elisp-intro/emacs-lisp-intro_toc.html :
[Lisp]->(DescrIn)->[Unretrievable_file: http://www.iaee.tuwien.ac.at/agcad/gnu_docs/elisp-intro/emacs-lisp-intro_toc.html];

// http://www.gustavo.net/programming/cgi.shtml :
[Repository: http://www.gustavo.net/programming/cgi.shtml]-
  { (DescrIn)<-[CGI_protocol];
    (Part)->[Tutorial: {*}];
    (Part)->[Software];
    (For)->[Microsoft_bogan: {*}];
  };

// http://www.javascriptguide.com/ :
[Javascript: JavaScript_1.2]->(DescrIn)->[Tutorial: http://www.javascriptguide.com/];

// http://meganesia.int.gu.edu.au/~rjcole/DataComms/tutorial1/cookie.cgi :
[Cookie]->(DescrIn)->[Unretrievable_file: http://meganesia.int.gu.edu.au/~rjcole/DataComms/tutorial1/cookie.cgi];

// http://meganesia.int.gu.edu.au/~rjcole/DataComms/tutorial1/ :
[Network_protocol]->(DescrIn)->[Tutorial: http://meganesia.int.gu.edu.au/~rjcole/DataComms/tutorial1/];
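
To see how the first query from the introduction relates to the graphs above, the following Python sketch re-encodes four of the indexed conferences as plain records and filters them by country. The record layout and the function conferences_in are assumptions made for this illustration; WebKB itself matches graphs rather than dictionaries.

```python
# Four conference entries from Section 2, flattened into records.
# The field names are hypothetical; only the values come from the CGs.
CONFERENCES = [
    {"name": "ICPR_98", "city": "Brisbane", "country": "Australia"},
    {"name": "AI_98", "city": "Brisbane", "country": "Australia"},
    {"name": "CMC_98", "city": "Eindhoven", "country": "Netherlands"},
    {"name": "PAKDD_98", "city": "Melbourne", "country": "Australia"},
]

def conferences_in(country, records=CONFERENCES):
    """Return the names of the conferences located in `country`."""
    return [r["name"] for r in records if r["country"] == country]

print(conferences_in("Australia"))  # ['ICPR_98', 'AI_98', 'PAKDD_98']
```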