pdftron::PDF::ElementReader Class Reference

ElementReader can be used to parse and process content streams.

#include <ElementReader.h>


Public Member Functions

 ElementReader ()
 ~ElementReader ()
void Begin (const Page &page, OCG::Context *ocg_context=0)
 Begin processing a page.
void Begin (SDF::Obj content_stream, SDF::Obj resource_dict=0, OCG::Context *ocg_context=0)
 Begin processing given content stream.
Element Next ()
Element Current ()
void FormBegin ()
 When the current element is a form XObject you have the option to skip form processing (by not calling FormBegin()) or to open the form stream and continue Element traversal into the form.
GState PatternBegin (SDF::Obj pattern, bool inherit_gs=true)
 A method used to spawn a sub-display list representing a tiling pattern.
void Type3FontBegin (SDF::Obj glyph_stream, SDF::Obj font, SDF::Obj resource_dict=0)
 A method used to spawn a sub-display list representing a Type3 Font glyph.
bool End ()
 Close the current display list.
GSChangesIterator GetChangesIterator ()
bool IsChanged (GState::GStateAttribute attrib)
void ClearChangeList ()
 Clear the list containing identifiers of modified graphics state attributes.
SDF::Obj GetFont (const char *name)
SDF::Obj GetXObject (const char *name)
SDF::Obj GetShading (const char *name)
SDF::Obj GetColorSpace (const char *name)
SDF::Obj GetPattern (const char *name)
SDF::Obj GetExtGState (const char *name)


Detailed Description

ElementReader can be used to parse and process content streams.

ElementReader provides a convenient interface for traversing the Element display list of a page. The display list, which represents graphical elements (such as text runs, paths, images, shadings, forms, etc.), is accessed through the reader's intrinsic iterator. ElementReader automatically concatenates page contents that span multiple streams and provides a mechanism for parsing the contents of sub-display lists (e.g. form XObjects and Type3 fonts).

A sample use case for ElementReader is given below:

 ...
 ElementReader reader;
 reader.Begin(page);
 for (Element element=reader.Next(); element; element = reader.Next()) // Read page contents
 {
   switch (element.GetType())   {
     case Element::e_path: { // Process path data...
         const double* data = element.GetPathPoints();
         int sz = element.GetPointCount();
     }
     break; 
     case Element::e_text: 
         // ...
     break;
   }
 }
 reader.End();

For a full sample, please refer to the ElementReader and ElementReaderAdv sample projects.


Constructor & Destructor Documentation

pdftron::PDF::ElementReader::ElementReader (  ) 

pdftron::PDF::ElementReader::~ElementReader (  ) 


Member Function Documentation

void pdftron::PDF::ElementReader::Begin ( const Page &  page,
OCG::Context *  ocg_context = 0 
)

Begin processing a page.

Parameters:
page A page to start processing.
ocg_context An optional parameter used to specify the Optional Content (OC) Context that should be used when processing the page. When the OCG::Context is specified, Element::IsOCVisible() will return 'true' or 'false' depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context.
Note:
When page processing is completed, make sure to call ElementReader.End().

void pdftron::PDF::ElementReader::Begin ( SDF::Obj  content_stream,
SDF::Obj  resource_dict = 0,
OCG::Context *  ocg_context = 0 
)

Begin processing given content stream.

The content stream may be a Form XObject, Type3 glyph stream, pattern stream or any other content stream.

Parameters:
content_stream A stream object representing the content stream (usually a Form XObject).
resource_dict An optional '/Resources' dictionary. If the content stream refers to named resources that are not present in its local resource dictionary, the names are looked up in the supplied resource dictionary.
ocg_context An optional parameter used to specify the Optional Content (OC) Context that should be used when processing the content stream. When the OCG::Context is specified, Element::IsOCVisible() will return 'true' or 'false' depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context.
Note:
When content stream processing is completed, make sure to call ElementReader.End().

Element pdftron::PDF::ElementReader::Next (  ) 

Returns:
a page Element, or a 'NULL' Element if the end of the current display list was reached. Use Element::GetType() to determine the type of the returned Element.
Note:
Every call to ElementReader::Next() destroys the current Element. An Element therefore becomes invalid as soon as ElementReader::Next() is called again.
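
The lifetime rule in the note above can be illustrated with a toy model. The sketch below does not use PDFNet: ToyReader and ToyElement are hypothetical stand-ins, and treating an Element as a handle into reader-owned storage is an assumption made only for illustration. It shows why any data needed later should be copied out before Next() is called again.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Element: a handle into storage owned by the reader.
struct ToyElement {
    const std::string* payload;  // null plays the role of the 'NULL' Element
    explicit operator bool() const { return payload != nullptr; }
};

// Hypothetical stand-in for ElementReader: one slot, overwritten by every Next().
class ToyReader {
public:
    explicit ToyReader(std::vector<std::string> items) : items_(std::move(items)) {}

    ToyElement Next() {
        if (pos_ >= items_.size()) return ToyElement{nullptr};
        current_ = items_[pos_++];  // the previous element's data is overwritten here
        return ToyElement{&current_};
    }

private:
    std::vector<std::string> items_;
    std::size_t pos_ = 0;
    std::string current_;
};

// Safe pattern: copy what is needed while the element is still current.
std::vector<std::string> CollectCopies(ToyReader& reader) {
    std::vector<std::string> out;
    for (ToyElement e = reader.Next(); e; e = reader.Next()) {
        out.push_back(*e.payload);  // copy taken before the next call to Next()
    }
    return out;
}
```

Storing the ToyElement handles themselves instead of the copied strings would leave every stored handle pointing at the same, repeatedly overwritten slot.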

Element pdftron::PDF::ElementReader::Current (  ) 

Returns:
the current Element or a 'NULL' Element. The current element is the one returned in the last call to Next().
Note:
Every call to ElementReader::Next() destroys the current Element. An Element therefore becomes invalid as soon as ElementReader::Next() is called again.

void pdftron::PDF::ElementReader::FormBegin (  ) 

When the current element is a form XObject you have the option to skip form processing (by not calling FormBegin()) or to open the form stream and continue Element traversal into the form.

To open a form XObject display list, use the FormBegin() method. The Element returned by the following call to Next() will be the first Element in the form XObject display list. Subsequent calls to Next() will traverse the form's display list until NULL is returned. At any point you can close the form sub-list using the ElementReader::End() method. After the form display list is closed (using End()), processing returns to the parent display list at the point where it left off before entering the form XObject.
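
The enter-or-skip behavior described above can be modeled as a stack of display lists. The following is a minimal sketch that does not use PDFNet: ToyNode, ToyReader, and Walk are hypothetical names, and the stack-of-frames design is an assumption chosen to mirror the documented behavior, not a description of the library's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an Element; a non-empty 'form' marks a form XObject.
struct ToyNode {
    std::string name;
    std::vector<ToyNode> form;
};

// Hypothetical stand-in for ElementReader: a stack of (list, position) frames.
class ToyReader {
public:
    void Begin(const std::vector<ToyNode>& root) {
        frames_.clear();
        frames_.push_back(Frame{&root, 0});
        current_ = nullptr;
    }

    // Returns the next node of the innermost open list, or null at its end.
    const ToyNode* Next() {
        if (frames_.empty()) return nullptr;
        Frame& f = frames_.back();
        current_ = (f.pos < f.list->size()) ? &(*f.list)[f.pos++] : nullptr;
        return current_;
    }

    // Opens the current node's form list; traversal continues inside it.
    void FormBegin() {
        if (current_ == nullptr) return;
        frames_.push_back(Frame{&current_->form, 0});
        current_ = nullptr;
    }

    // Closes the innermost list; true if it was a sub-list, false if the root.
    bool End() {
        if (frames_.empty()) return false;
        frames_.pop_back();
        current_ = nullptr;
        return !frames_.empty();
    }

private:
    struct Frame {
        const std::vector<ToyNode>* list;
        std::size_t pos;
    };
    std::vector<Frame> frames_;
    const ToyNode* current_ = nullptr;
};

// Visits every node, descending into forms only when 'enter_forms' is true.
std::vector<std::string> Walk(const std::vector<ToyNode>& page, bool enter_forms) {
    std::vector<std::string> seen;
    ToyReader reader;
    reader.Begin(page);
    int depth = 0;
    for (;;) {
        const ToyNode* n = reader.Next();
        if (n == nullptr) {
            if (depth == 0) break;  // end of the root list
            reader.End();           // end of a form: resume in the parent list
            --depth;
            continue;
        }
        seen.push_back(n->name);
        if (enter_forms && !n->form.empty()) {
            reader.FormBegin();
            ++depth;
        }
    }
    reader.End();
    return seen;
}
```

With enter_forms set to false, the form node is reported once and its contents are never visited, which corresponds to not calling FormBegin().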

GState pdftron::PDF::ElementReader::PatternBegin ( SDF::Obj  pattern,
bool  inherit_gs = true 
)

A method used to spawn a sub-display list representing a tiling pattern.

You can call this method at any point during processing of the parent display list.

Parameters:
pattern The pattern SDF/Cos stream from the resource dictionary (it can be obtained, for example, through the GState interface of the current Element).
inherit_gs An optional parameter used to indicate whether the pattern should inherit the graphics state from the parent content stream (the content stream in which the pattern is defined as a resource). By default, inherit_gs is 'true'.

To open a tiling pattern sub-display list, use the PatternBegin(pattern) method. The Element returned by the following call to Next() will be the first Element in the pattern display list. Subsequent calls to Next() will traverse the pattern's display list until NULL is returned. At any point you can close the pattern sub-list using the ElementReader::End() method. After the pattern display list is closed, processing returns to the parent display list at the point where the pattern display list was spawned.

Returns:
The graphics state that was in effect at the beginning of the pattern’s parent content stream.

void pdftron::PDF::ElementReader::Type3FontBegin ( SDF::Obj  glyph_stream,
SDF::Obj  font,
SDF::Obj  resource_dict = 0 
)

A method used to spawn a sub-display list representing a Type3 Font glyph.

You can call this method at any point during processing of the parent display list.

Parameters:
glyph_stream The SDF/Cos stream containing the glyph description (e.g. returned by Font::GetType3GlyphStream()).
font The SDF/Cos object representing the Type3 font (e.g. returned by Font::GetSDFObj()).
resource_dict An optional '/Resources' dictionary. If any glyph descriptions refer to named resources but the font's resource dictionary is absent, the names are looked up in the supplied resource dictionary.

To open a Type3 font sub-display list, use the Type3FontBegin() method. The Element returned by the following call to Next() will be the first Element in the glyph's display list. Subsequent calls to Next() will traverse the glyph's display list until NULL is returned. At any point you can close the glyph sub-list using the ElementReader::End() method. After the glyph display list is closed, processing returns to the parent display list at the point where the glyph display list was spawned.

bool pdftron::PDF::ElementReader::End (  ) 

Close the current display list.

If the current display list is a sub-list created with FormBegin(), PatternBegin(), or Type3FontBegin(), End() closes the sub-list and returns processing to the parent display list at the point where it left off before entering the sub-list.

Returns:
true if the closed display list is a sub-list or false if it is a root display list.

GSChangesIterator pdftron::PDF::ElementReader::GetChangesIterator (  ) 

Returns:
an iterator to the beginning of the list containing identifiers of the graphics state attributes modified since the last call to ClearChangeList(). The list can be consulted to determine which graphics state attributes were modified between two Elements. Attributes are ordered in the same way as they are set in the content stream. Duplicate attributes are eliminated.

bool pdftron::PDF::ElementReader::IsChanged ( GState::GStateAttribute  attrib  ) 

Returns:
true if the given GState attribute was changed since the last call to ClearChangeList().

void pdftron::PDF::ElementReader::ClearChangeList (  ) 

Clear the list containing identifiers of modified graphics state attributes.

After the list is cleared, modified attributes accumulate again during subsequent calls to ElementReader::Next().
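
The interplay of GetChangesIterator(), IsChanged(), and ClearChangeList() can be pictured as an ordered list without duplicates. The sketch below is a toy model, not PDFNet code: ToyChangeList and the ToyAttribute values are hypothetical, and keeping the first occurrence of a repeated attribute is an assumption, since the description above only states that duplicates are eliminated.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical attribute identifiers; the real ones are GState::GStateAttribute values.
enum ToyAttribute { kToyLineWidth, kToyFillColor, kToyFont, kToyTransform };

// Hypothetical stand-in for the reader's list of modified attributes.
class ToyChangeList {
public:
    // Records a change in the order it is set, skipping attributes already listed.
    void Record(ToyAttribute a) {
        if (!IsChanged(a)) changes_.push_back(a);
    }

    // True if the attribute was recorded since the last Clear().
    bool IsChanged(ToyAttribute a) const {
        return std::find(changes_.begin(), changes_.end(), a) != changes_.end();
    }

    // Empties the list so that later changes accumulate from scratch.
    void Clear() { changes_.clear(); }

    const std::vector<ToyAttribute>& Changes() const { return changes_; }

private:
    std::vector<ToyAttribute> changes_;
};
```

A typical usage pattern under this model is to clear the list after handling one Element, advance, and then query only the attributes that changed on the way to the next Element.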

SDF::Obj pdftron::PDF::ElementReader::GetFont ( const char *  name  ) 

Returns:
SDF/Cos object matching the specified name in the current resource dictionary. For a page, the name is looked up in the page's /Resources/<Class> dictionary. For Form XObjects, Patterns, and Type3 fonts whose content stream is nested within the page content stream, the resource is first looked up in the resource dictionary of the inner stream. If it is not found there, the name is looked up in the outer content stream’s resource dictionary. The function returns NULL if the resource was not found.
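
The inner-then-outer lookup order described above can be sketched as a search through nested scopes. This is a toy model rather than PDFNet code: ToyResources and LookupResource are hypothetical names, and representing each resource dictionary as a map from name to object number is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one resource class (for example /Font) of a
// /Resources dictionary: resource name -> object number.
using ToyResources = std::map<std::string, int>;

// 'scopes' is ordered from outermost (page) to innermost (current sub-stream).
// Returns a pointer to the matching entry, or null if no scope defines the name.
const int* LookupResource(const std::vector<ToyResources>& scopes,
                          const std::string& name) {
    // Search the innermost dictionary first, then fall back outward.
    for (auto it = scopes.rbegin(); it != scopes.rend(); ++it) {
        auto found = it->find(name);
        if (found != it->end()) return &found->second;
    }
    return nullptr;
}
```

In this model a name defined in both a form and its page resolves to the form's entry, while a name defined only on the page is still found through the fallback.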

SDF::Obj pdftron::PDF::ElementReader::GetXObject ( const char *  name  ) 

SDF::Obj pdftron::PDF::ElementReader::GetShading ( const char *  name  ) 

SDF::Obj pdftron::PDF::ElementReader::GetColorSpace ( const char *  name  ) 

SDF::Obj pdftron::PDF::ElementReader::GetPattern ( const char *  name  ) 

SDF::Obj pdftron::PDF::ElementReader::GetExtGState ( const char *  name  ) 


© 2002-2010 PDFTron Systems Inc.