OCDM_Collection Class Reference

The OCDM_Collection class groups every function related to the manipulation of Collections. Before reading the documentation of the individual functions, recall that the Collection is the central element for storing data in Ellogon. A Collection is nothing more than a finite set of Documents. In other words, the Collection can be thought of as the corpus we want to process, and the Documents of that Collection represent the actual documents of the corpus, bundled together.

#include <OCDM.h>

Public Member Functions

 OCDM_Collection ()
 This is the null constructor of an OCDM_Collection object.
 OCDM_Collection (const char *name)
 This is an overloaded version of the above constructor that takes the name of the Collection to open or load.
 OCDM_Collection (const class OCDM_Collection &obj)
 This is the default Copy constructor.
 OCDM_Collection (const char *name, const OCDM_AttributeSet &AttrSet, const char *encoding)
 This is a constructor of the class OCDM_Collection.
 OCDM_Collection (const CDM_Collection)
 A constructor that maps a CDM_Collection to an OCDM_Collection object.
 ~OCDM_Collection ()
 This is the destructor of the OCDM_Collection Class.
OCDM_Collection & operator= (const class OCDM_Collection &obj)
 This is the default Assignment operator.
void storeObject (const class OCDM_Object *objPtr) const
void storeObject (const class OCDM_Document *objPtr) const
const class OCDM_Object * getStoredObject (void) const
void releaseStoredObject (void) const
OCDM_BOOL AttributeExists (const char *name) const
 This function will return true if an Attribute with the specified name exists in the Collection object.
void CreateDocument (const char *XID, const OCDM_ByteSequence &RawData, const OCDM_AnnotationSet &Annotations, const OCDM_AttributeSet &Attributes, const char *encoding)
 This function creates a new Document object in an existing Collection object.
void CreateDocument (const char *XID, const OCDM_ByteSequence &RawData, const OCDM_AnnotationSet &Annotations, const OCDM_AttributeSet &Attributes)
int Length (void) const
 This function will return the Length (the number of all the Documents) of the specified Collection.
 OCDM_REF (OCDM_Attribute) GetAttribute(const char *name) const
 OCDM_REF (OCDM_AttributeSet) GetAttributes(void) const
int PutAttribute (const OCDM_Attribute &Attr)
const char * GetName (void) const
 This function will return the Name of the specified Collection.
int RemoveAttribute (const char *name)
const char * GetEncoding (void) const
const char * SetEncoding (const char *encoding)
 OCDM_REF (OCDM_ByteSequence) Status(void) const
int Sync (void) const
int AnnotateColection (void)
int Destroy (const char *name)
 OCDM_REF (OCDM_Document) FirstDocument(void) const
 OCDM_REF (OCDM_Document) GetByExternalId(const char *XID) const
 OCDM_REF (OCDM_Document) GetDocument(const char *ID) const
 OCDM_REF (OCDM_Document) NextDocument(void) const
const char * GetOwner (void) const
int RemoveDocument (const char *Id)
const char * SetName (const char *Name)
const char * SetOwner (const char *Owner)
int SetAssociatedInfo (const OCDM_ByteSequence &Info)
void Log (const char *str,...) const
 This method logs information. It is equivalent to OCDM_Utilities::Log().
long size (void) const
 This function returns the number of Documents contained in the Collection object.
OCDM_BOOL Valid (void) const
const char * toString (void) const
const char * objectType (void) const


Constructor & Destructor Documentation

OCDM_Collection ()
 

Description
The purpose of this constructor is to create a null Collection object, intended only for programming convenience.

OCDM_Collection (const char *name)
 

Description
Arguments
  • name: The Collection's name. If name is of the form "Col*", it refers to an already open Collection. Otherwise name is assumed to be the absolute path of the Collection on disk, and the Collection is loaded from the disk...

OCDM_Collection (const class OCDM_Collection &obj)
 

OCDM_Collection (const char *name, const OCDM_AttributeSet &AttrSet, const char *encoding)
 

Description
Its main purpose is to create a new Collection object. It accepts three arguments: the Collection's name, which must be the absolute path of the Collection on disk; an object of type OCDM_AttributeSet, representing a set of Attributes that will be inserted into the new Collection; and the encoding of the Collection. The Collection's encoding is the default encoding used when a new Document is created in this Collection. Note that each Document can have a different encoding from that of its container Collection. The Collection's encoding is used only when a new Document of unspecified encoding is created. If the Collection encoding argument is omitted, it defaults to "CDM_DefaultEncoding". "CDM_DefaultEncoding" is a global variable (of type char *) whose initial value is the system encoding.
This variable is also exported at the Tcl level under the same name, so its value can easily be changed by the end user.

Arguments
  • name: The Collection's name, which must be the absolute path of the Collection on disk
  • AttrSet: An object of type OCDM_AttributeSet which represents a set of Attributes that will be inserted into the new Collection
  • encoding: The encoding of the Collection
Note:
The new Collection will be created only in memory. Nothing will be saved on the disk. In order for the Collection to be saved on disk, the function CDM_Sync must be called after the new Collection is created in memory. This is usually done after all of the desired Documents have been added to the newly created Collection.
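
The encoding fallback described above (a Document without an encoding inherits the Collection's encoding, and a Collection without one falls back to CDM_DefaultEncoding) can be sketched as follows. This is only an illustrative model, not OCDM code: the helper names isGiven, collectionEncoding and documentEncoding are hypothetical, an omitted argument is modelled as a null or empty string, and "utf-8" merely stands in for whatever the system encoding happens to be.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of OCDM): true when an optional C-string
// argument was actually supplied by the caller.
static bool isGiven(const char *s) { return s != nullptr && *s != '\0'; }

// Encoding a new Collection ends up with: the requested one, or the
// system-wide default (the role played by CDM_DefaultEncoding).
std::string collectionEncoding(const char *requested,
                               const std::string &systemDefault) {
    return isGiven(requested) ? std::string(requested) : systemDefault;
}

// Encoding a new Document ends up with: its own, or the one inherited
// from the Collection that contains it.
std::string documentEncoding(const char *requested,
                             const std::string &inheritedFromCollection) {
    return isGiven(requested) ? std::string(requested) : inheritedFromCollection;
}

// Usage example; "utf-8" is an assumed system encoding.
void exampleEncodingFallback() {
    const std::string col = collectionEncoding("iso8859-7", "utf-8");
    assert(documentEncoding(nullptr, col) == "iso8859-7");   // inherited
    assert(documentEncoding("cp1253", col) == "cp1253");     // overridden
    assert(collectionEncoding(nullptr, "utf-8") == "utf-8"); // system default
}
```

The point of the sketch is that the Collection's encoding is consulted only when a Document does not name its own.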

OCDM_Collection (const CDM_Collection)
 

~OCDM_Collection ()
 

Description
It closes the given Collection and frees all the memory occupied by the Collection and its Documents. All the deleted objects will also be unregistered from the current Tcl interpreter (CDM_Interp).


Member Function Documentation

int AnnotateColection (void)
 

This function will run the specified annotator over the given Collection... [Not Implemented]

OCDM_BOOL AttributeExists (const char *name) const
 

Arguments:
  • name: The Attribute name to be found.
Note:
In case of an error, an exception of type OCDM_Exception will be thrown.

void CreateDocument (const char *XID, const OCDM_ByteSequence &RawData, const OCDM_AnnotationSet &Annotations, const OCDM_AttributeSet &Attributes)
 

This is an overloaded version of CreateDocument. The only difference is that it does not take encoding information.

void CreateDocument (const char *XID, const OCDM_ByteSequence &RawData, const OCDM_AnnotationSet &Annotations, const OCDM_AttributeSet &Attributes, const char *encoding)
 

Description
This function creates a new Document object in an existing Collection object.
Arguments
It accepts five arguments:
  • XID: The External Id of the Document (which represents the full path on disk of the original file that contains the Document's text)
  • RawData: The text (RawData or ByteSequence) of the new Document
  • Annotations: An initial Annotation set
  • Attributes: An initial Attribute set
  • encoding: The encoding of the Document
The External Id must be a valid (absolute) path. A local copy of this argument will be created for use by this function. The existence of this path will not be checked. This path value will be converted automatically to a valid platform-dependent path. As a result, the notation used for paths under the Unix operating system can safely be used for this value. That means that the path
/Users/petasis/Collections will be converted internally to the following path, assuming Windows as the operating system:

C:\Users\petasis\Collections

The RawData parameter must contain the text desired for the new Document, in UTF-8 format. A local copy of this variable will also be created. If the text is not in UTF-8 (e.g. it was just read from a file using the standard C library routines and is in the ISO 8859-7 encoding), then it must be converted to UTF using CDM_ExternalToUtf. If the origin of this string is Tcl, then it is already in UTF format, as Tcl uses ONLY UTF for encoding strings internally.
The Annotations parameter must be of type OCDM_AnnotationSet. Note that a reference to this object will also be kept. The Attributes parameter must hold a valid Attribute set; a reference to this object will also be kept.
Finally, an encoding can be specified. A local copy of this string will be created. The value of this parameter must be a standard Tcl encoding value (like iso8859-7 or cp1253). For all available Tcl encodings please refer to the Tcl manuals. If this parameter is omitted, then a default value will be used, inherited from the parent Collection object. It is important for the encoding to correctly describe the text. If the given UTF string cannot be converted to the Document's encoding, then the character "?" will be substituted in places where the conversion fails.
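
Two behaviours described above lend themselves to a small illustration: the conversion of a Unix-style External Id to a Windows path, and the substitution of "?" where text cannot be represented in the target encoding. The sketch below is not the CDM implementation. Both function names are hypothetical, the drive letter is an assumed parameter (the text does not say how the drive is chosen), and ISO 8859-1 is used as the target encoding only because its mapping is trivial, whereas the CDM relies on Tcl encodings such as iso8859-7.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite an absolute Unix-style path in Windows
// notation. The drive letter is an assumption of this sketch.
std::string toWindowsPath(const std::string &unixPath, char drive = 'C') {
    std::string out;
    if (!unixPath.empty() && unixPath[0] == '/') { out += drive; out += ':'; }
    for (char c : unixPath) out += (c == '/') ? '\\' : c;
    return out;
}

// Hypothetical helper: convert UTF-8 to ISO 8859-1, writing '?' for every
// character (or malformed byte) that cannot be represented.
std::string utf8ToLatin1Lossy(const std::string &utf8) {
    std::string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else { out += '?'; ++i; continue; }            // invalid lead byte
        if (i + len > utf8.size()) { out += '?'; break; } // truncated sequence
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out += '?'; ++i; continue; }        // bad continuation byte
        out += (cp <= 0xFF) ? static_cast<char>(cp) : '?';
        i += len;
    }
    return out;
}

// Usage example: the path from the text, and a Greek alpha (U+03B1), which
// has no ISO 8859-1 equivalent and therefore becomes '?'.
void examplePathAndEncoding() {
    assert(toWindowsPath("/Users/petasis/Collections") ==
           "C:\\Users\\petasis\\Collections");
    assert(utf8ToLatin1Lossy("a\xCE\xB1" "b") == "a?b");
}
```

The second helper shows why the encoding should describe the text correctly: every "?" in the result marks information that cannot be recovered later.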

int Destroy (const char *name)
 

Description
This function will delete the disk representation of the Collection object in use. The name parameter must be the (absolute) path to the directory that holds the Collection. Given a Collection object, this information can be obtained through the function OCDM_GetName.
Arguments
  • name: The name of the Collection to be destroyed
Note:
This function will only delete the current disk representation of a Collection. If the Collection is loaded in memory and the loaded Collection is saved (with the help of OCDM_Sync), then the representation of the Collection on disk will be re-created.
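
The interplay between the in-memory object and its disk representation noted above can be modelled with a toy class. This is a sketch under stated assumptions, not the OCDM implementation: FakeDisk and CollectionModel are hypothetical names, and the "disk" is just a set of directory names, so no real files are touched.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the file system: the set of Collection
// directories that currently exist.
using FakeDisk = std::set<std::string>;

// Toy model of a Collection that lives in memory and is written to
// "disk" only on request.
class CollectionModel {
public:
    CollectionModel(const std::string &name, FakeDisk &disk)
        : name_(name), disk_(disk) {}

    // Models Sync(): creates (or overwrites) the directory named by the
    // Collection's Name.
    void Sync() const { disk_.insert(name_); }

    // Models Destroy(name): removes only the disk representation; the
    // in-memory object is left untouched.
    void Destroy(const std::string &name) { disk_.erase(name); }

    const std::string &GetName() const { return name_; }

private:
    std::string name_;
    FakeDisk &disk_;
};

// Usage example: create in memory, save, destroy, save again.
void exampleLifecycle() {
    FakeDisk disk;
    CollectionModel col("/Users/petasis/Collections/demo", disk);
    assert(disk.count(col.GetName()) == 0);  // nothing on disk yet
    col.Sync();
    assert(disk.count(col.GetName()) == 1);  // saved
    col.Destroy(col.GetName());
    assert(disk.count(col.GetName()) == 0);  // disk copy removed
    col.Sync();
    assert(disk.count(col.GetName()) == 1);  // re-created from memory
}
```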

const char * GetEncoding (void) const
 

Description
This function will return the encoding of the specified Collection. The return value will be of type char* and will be owned by CDM. Its value will be a standard Tcl encoding value (like iso8859-7 or cp1253). For all available Tcl encodings please refer to the Tcl manuals.

const char * GetName (void) const
 

Description
Returns:
This function will return the Name of the specified Collection. This value will be the (absolute) full path to the directory that contains the disk representation of the specified Collection. The returned value will be encoded using the UTF-8 encoding (thus allowing non-Latin characters in the value).

const char * GetOwner (void) const
 

Description
This function will return the Owner of the specified Collection. Its value is not of great importance, and usually defaults to the value "CDM". The returned value will be encoded using the UTF-8 encoding (thus allowing non-Latin characters in the value).
Returns:
The return value will be of type char* and will be owned by CDM.

const class OCDM_Object * getStoredObject (void) const
 

int Length (void) const
 

void Log (const char *str,...) const
 

const char * objectType (void) const [inline]
 

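OCDM_REF (OCDM_Document) FirstDocument(void) const
 

OCDM_REF (OCDM_Document) GetByExternalId(const char *XID) const
 

OCDM_REF (OCDM_Document) GetDocument(const char *ID) const
 

OCDM_REF (OCDM_Document) NextDocument(void) const
 

OCDM_REF (OCDM_ByteSequence) Status(void) const
 

OCDM_REF (OCDM_AttributeSet) GetAttributes(void) const
 

OCDM_REF (OCDM_Attribute) GetAttribute(const char *name) const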
 

class OCDM_Collection & operator= (const class OCDM_Collection &obj)
 

int PutAttribute (const OCDM_Attribute &Attr)
 

Description
This function will add a given Attribute to the specified Collection object. If an Attribute with the same name already exists in the Collection, then it will be overwritten by the new Attribute. Else, the new Attribute will be appended to the existing Attribute set of the specified Collection.
Arguments
  • Attr: The Attribute to be inserted in the Collection.
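
The overwrite-or-append rule described above can be sketched with standard containers. This is an illustration only: Attribute is modelled here as a hypothetical (name, value) pair and the Attribute set as a vector, which says nothing about how OCDM_Attribute or OCDM_AttributeSet are actually implemented.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: an Attribute is a (name, value) pair and an
// Attribute set is an ordered list of such pairs.
using Attribute = std::pair<std::string, std::string>;
using AttributeSet = std::vector<Attribute>;

// Overwrite the Attribute with the same name if one exists,
// otherwise append the new Attribute to the set.
void putAttribute(AttributeSet &set, const Attribute &attr) {
    for (Attribute &existing : set) {
        if (existing.first == attr.first) {
            existing = attr;   // same name: replace in place
            return;
        }
    }
    set.push_back(attr);       // new name: append
}

// Usage example with made-up attribute names and values.
void examplePutAttribute() {
    AttributeSet set;
    putAttribute(set, {"language", "el"});
    putAttribute(set, {"source", "web"});
    putAttribute(set, {"language", "en"});   // overwrites the first entry
    assert(set.size() == 2);
    assert(set[0].second == "en");
}
```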

void releaseStoredObject (void) const
 

int RemoveAttribute (const char *name)
 

Description
This function will remove the Attribute whose name exactly matches the "name" parameter. If the requested Attribute does not exist, an exception of type OCDM_Exception will be thrown.
Arguments
  • name: The name of the attribute to be removed
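
Because removing a missing Attribute raises an exception, a caller may prefer to test for existence first, in the spirit of AttributeExists. The sketch below models that pattern with standard types only. The names AttributeMap, removeAttribute and tryRemoveAttribute are hypothetical, and std::out_of_range merely stands in for OCDM_Exception.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical model of an Attribute set keyed by Attribute name.
using AttributeMap = std::map<std::string, std::string>;

// Remove the Attribute with exactly this name; throw if there is none
// (std::out_of_range plays the role of OCDM_Exception in this sketch).
void removeAttribute(AttributeMap &attrs, const std::string &name) {
    if (attrs.erase(name) == 0) {
        throw std::out_of_range("no Attribute named '" + name + "'");
    }
}

// Check-then-remove variant: reports whether anything was removed
// instead of throwing.
bool tryRemoveAttribute(AttributeMap &attrs, const std::string &name) {
    if (attrs.count(name) == 0) return false;
    removeAttribute(attrs, name);
    return true;
}

// Usage example with a made-up attribute.
void exampleRemoveAttribute() {
    AttributeMap attrs{{"language", "el"}};
    assert(tryRemoveAttribute(attrs, "language"));
    assert(!tryRemoveAttribute(attrs, "language"));  // already gone
    bool thrown = false;
    try { removeAttribute(attrs, "language"); }
    catch (const std::out_of_range &) { thrown = true; }
    assert(thrown);
}
```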

int RemoveDocument (const char *Id)
 

Description
This function will remove the Document object whose Id equals the value of the "Id" parameter. If a Document with the requested Id does not exist, then an exception of type OCDM_Exception will be thrown.
Arguments
  • Id: The Id parameter that specifies the document to be removed

int SetAssociatedInfo (const OCDM_ByteSequence &Info)
 

Description
This function sets the provided Tcl object (of type Tcl_Obj*) as associated information to the Collection object currently in use. This "associated information" (Tcl object) can contain anything that the user wants to store ("associate") with this Collection. CDM does not modify or use this information in any way.
If an error occurs, an exception of type OCDM_Exception will be thrown.

const char * SetEncoding (const char *encoding)
 

Description
This function will change the encoding of a Collection object to the encoding specified by the "encoding" parameter. The function will return a pointer (of type char*) to the string buffer that holds the new encoding value. This pointer will be at a different memory location than the given parameter value, as the CDM will create and manipulate a local copy. In case of an error, an exception of type OCDM_Exception will be thrown. Note that the returned pointer is the property of the CDM and should never be freed or modified in any way by the caller.
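
The ownership rule described above (the setter copies the argument and returns a pointer to its own buffer) can be illustrated with a small stand-in class. This is a sketch, not the CDM implementation: EncodingOwner is a hypothetical name, std::string provides the local copy, and std::invalid_argument stands in for OCDM_Exception.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the object that owns the encoding string.
class EncodingOwner {
public:
    // Copies the argument and returns a pointer to the internal copy.
    // The caller must not free or modify the returned buffer; in this
    // sketch it stays valid only until the next call to SetEncoding.
    const char *SetEncoding(const char *encoding) {
        if (encoding == nullptr) {
            throw std::invalid_argument("encoding must not be null");
        }
        encoding_ = encoding;          // character data is copied here
        return encoding_.c_str();
    }

    const char *GetEncoding() const { return encoding_.c_str(); }

private:
    std::string encoding_;
};

// Usage example: the stored value is independent of the caller's buffer.
void exampleSetEncoding() {
    char callerBuffer[] = "iso8859-7";
    EncodingOwner owner;
    const char *stored = owner.SetEncoding(callerBuffer);
    assert(stored != callerBuffer);                    // different location
    assert(std::strcmp(stored, "iso8859-7") == 0);     // same contents
    callerBuffer[0] = 'X';                             // caller reuses buffer
    assert(std::strcmp(owner.GetEncoding(), "iso8859-7") == 0);
}
```

The same pattern applies to the other setters in this class that are documented as returning a pointer to a local copy.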

const char * SetName (const char *Name)
 

Description
This function will change the Name of the Collection object in use. The Name of the Collection represents the (absolute) path of the Collection's disk representation. Changing the Name of a Collection will result in a new Collection representation on disk if the Collection is saved (through the use of the function CDM_Sync). Note that this function will not perform any checks on the value of the "Name" parameter.
Arguments
  • Name: The new name of the Collection
Note:
If an invalid path is given, or a path that corresponds to an existing file, no error will be returned by this function. (However, OCDM_Sync will return an error if the caller tries to save a Collection with an invalid Name.) The value of the "Name" parameter will not be modified in any way by CDM, as a local copy will be created and modified internally by the CDM.

const char * SetOwner (const char *Owner)
 

Description
This function will change the Owner of the Collection object in use. The value of the "Owner" parameter will not be modified in any way by CDM, as a local copy will be created and modified internally by the CDM.
Arguments
  • Owner: The new owner of the Collection
Note:
This function will return a pointer (of type char*) to the string buffer that holds the new Collection Owner. This pointer will be at a different memory location than the given parameter value, as the CDM will create and manipulate a local copy.

long size (void) const
 

void storeObject (const class OCDM_Document *objPtr) const
 

void storeObject (const class OCDM_Object *objPtr) const
 

int Sync (void) const
 

Description
This function will save a Collection object on disk. If the Collection object has never been saved before, then this function will create the directory that the Name of the Collection specifies. A representation of the Collection will be saved in this directory. If the directory already exists it will be overwritten. If the directory cannot be created, an exception of type OCDM_Exception will be thrown.
Note:
The return value from this function will be a standard Tcl completion code (of type int) with one of the values TCL_OK and TCL_ERROR. If the requested Collection is successfully saved, TCL_OK will be returned. In case of an error, an exception of type OCDM_Exception will be thrown.

const char * toString (void) const
 

Description
Return object as a formatted string.

OCDM_BOOL Valid (void) const
 

Description
As the name implies, this function checks for a valid Collection. In other words, if a Document exists it returns true (as an OCDM_BOOL value). Otherwise it returns false.


Generated on Wed Aug 16 22:32:02 2006 for PythonCDM by  doxygen 1.4.6