XML Parsing in a non-RocketSled Application using XmlUtility (29 May 2007)

by Iain Dooley

explore : RocketSled, XML, SAX Parsing, Integration

Introduction

This tutorial is about two things:

  • Utilising RocketSled classes from a non-RocketSled application
  • Using the XmlUtility to parse XML documents using SAX

Benefits of SAX

SAX (Simple API for XML) parsing has the benefit that you do not need to load the entire file into memory before parsing it. With the advent of the SimpleXML interface in PHP5, XML parsing in PHP got a real shot in the arm. However, because SimpleXML, like a DOM parser, builds the whole document tree in memory, it can quickly exhaust your memory limit on larger than normal XML files.
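To make the memory argument concrete, here is a minimal sketch of event-driven parsing using PHP's built-in xml extension (xml_parser_create and friends), not the RocketSled XmlUtility. The function name countElements and the tiny chunk size are made up for illustration, and whether XmlUtility is built on this extension is an assumption. The document is handed to the parser a few bytes at a time, so only the current chunk and the running counts need to be held by your code. In a real application you would fread() the chunks from a file instead of slicing a string.

```php
<?php
// Count elements with PHP's built-in SAX-style parser, feeding the
// document to the parser in small chunks.
// countElements() is a hypothetical helper, not part of RocketSled.
function countElements($xml_string, $chunk_size = 16)
{
    $counts = array();
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Called once for every start tag.
        function ($p, $name, $attrs) use (&$counts) {
            $counts[$name] = isset($counts[$name]) ? $counts[$name] + 1 : 1;
        },
        // Called once for every end tag; nothing to do here.
        function ($p, $name) {
        }
    );

    $length = strlen($xml_string);
    for ($offset = 0; $offset < $length; $offset += $chunk_size) {
        $chunk    = substr($xml_string, $offset, $chunk_size);
        $is_final = ($offset + $chunk_size >= $length);
        if (!xml_parse($parser, $chunk, $is_final)) {
            return false; // not well-formed
        }
    }
    return $counts;
}
```

Calling countElements() on the sample.xml content used later in this tutorial should report one root-level-node and two sub-node elements.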

How to Access RocketSled from a non-RocketSled Application

Because RocketSled is a "package centric" application, you can check out the core of RocketSled without needing to set up a database or anything like that. The first step is to either download the core from here or, if you have a CVS account with Working Software, check it out of CVS (this is the recommended method) by doing:

 
cvs -d :ext:user@socata.iaindooley.com:/home/iain/cvsroot \
co -r v0-4 RocketSled

Not many people have a CVS account with Working Software yet :) We are working on anonymous CVS access.

Once you've done that, do the following to prepare your installation:

 
#NB: if you already have a file index.php, rename it
mv RocketSled/* WEBROOT
cd WEBROOT
cd managed_code
cvs update -d
cd ..
sudo chown -R www:www managed_code
cd ext
cvs release -d hn_captcha tcpdf phpmailer
cd ../packages
cvs release -d antenna/ blueprint/ decal/ logbook/ nozzle/ \
pilot/ remote/ sample/

None of the 'cvs release' and 'cvs update -d' steps above would be necessary if you had downloaded the tarball instead, but the tarball makes it harder to update your version when needed.

Now that you've got the sys package, you need to access the classes from within your application. RocketSled defines an __autoload function. If your application defines an __autoload function as well then you will need to modify the file:

 
WEBROOT/managed_code/rs_functions.php

In that file, change the name of __autoload to rsAutoload, then call rsAutoload from within your own __autoload function. The rest of this tutorial assumes you do not already have an __autoload function defined.
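As a rough sketch of that chaining, the snippet below registers an application loader that falls back to rsAutoload. It uses spl_autoload_register rather than a function literally named __autoload, because __autoload was removed in PHP 8.0; that is a substitution on my part, not what the tutorial originally describes. The rsAutoload body here is only a logging stand-in so the sketch runs on its own (in a real installation it comes from managed_code/rs_functions.php), and appAutoload and the classes/ directory are hypothetical names.

```php
<?php
// Log of which loader was asked for which class (illustration only).
$GLOBALS['autoload_log'] = array();

// Stand-in for the function renamed in managed_code/rs_functions.php.
// In a real installation RocketSled defines it; this stub only logs.
if (!function_exists('rsAutoload')) {
    function rsAutoload($class_name)
    {
        $GLOBALS['autoload_log'][] = 'rs:' . $class_name;
    }
}

// Hypothetical application loader: try the application's own class
// directory first, then hand the request over to RocketSled.
function appAutoload($class_name)
{
    $file = dirname(__FILE__) . '/classes/' . $class_name . '.php';
    if (is_file($file)) {
        require_once $file;
        return;
    }
    rsAutoload($class_name);
}

spl_autoload_register('appAutoload');

// Asking about a class that is not loaded yet triggers the chain.
class_exists('XmlNode');
```

With this in place, any class your application cannot find itself is passed on to RocketSled's loader.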

Now in your WEBROOT create a file rs.php with the following content:

 
<?php
    define('BASE_DIR',dirname(__FILE__));
    chdir(constant('BASE_DIR'));
    require_once('master.config.php');
    /*some system functions*/
    require_once('managed_code/rs_functions.php');

    if(file_exists('managed_code/rs_config.php')&&
      !constant('DEVELOPMENT'))
        require_once('managed_code/rs_config.php');
    else
        rsGetConfig();

    $test = new XmlNode('my test');
    echo('Got: '.(string)$test);
?>

Then point your web browser at rs.php. You should see something like:

 
Got: Object id 

You now have carte blanche access to all the classes in RocketSled! Remove the lines:

 
$test = new XmlNode('my test');
echo('Got: '.(string)$test);

and then you can just do:

 
require_once('rs.php');

from any file in which you need access to RocketSled classes.

How to use the XmlUtility for SAX Parsing

First, a brief description of how it works. You create an XmlUtility object, which parses a file and sends messages to an event handler class you have defined. You keep track of your position in the document in this event handler class. Here is a sample XmlEventHandler class (a description of each part follows below). NB: long lines have been wrapped for readability; each wrap is marked by a trailing \, which is not part of the PHP code and should be removed if you copy the listing:

 
<?php
    class MyXmlEventHandler extends XmlEventHandler
    {
        private $root_level_name;
        private $root_level_version;
        private $nodes;

        public function __construct()
        {
            $this->nodes = array();
            $xml_file_path = 'sample.xml';
            $util = new XmlUtility($xml_file_path,$this);
            while($util->parseDoc());
        }

        public function elementStarted($util)
        {
            switch($util->currentElementName())
            {
                case 'root-level-node':
                    $this->startRootLevelNode($util);
                break;

                case 'sub-node':
                    $this->startSubNode($util);
                break;
            }
        }

        public function elementStopped($util)
        {
            switch($util->stoppedElementName())
            {
                case 'root-level-node':
                    $this->stopRootLevelNode($util);
                break;

                case 'sub-node':
                    $this->stopSubNode($util);
                break;
            }
        }

        public function charDataEncountered($util,$data)
        {
            echo('encountered char data: '.$data.'<br />');
        }

        public function startRootLevelNode($util)
        {
            $this->root_level_name    = $util-> \
                                        attributeValue('name');
            $this->root_level_version = $util-> \
                                     attributeValue('version', \
                                     XmlUtility::OPTIONAL);
        }

        public function stopRootLevelNode($util)
        {
            echo('Done parsing '.$this->root_level_name. \
           ' version: '.$this->root_level_version.'<br />');
        }

        public function startSubNode($util)
        {
            $this->nodes[] = array('name' => \
                            $util->attributeValue('name'));
        }

        public function stopSubNode($util)
        {
            $popped = array_pop($this->nodes);
            echo('Stopped: '.$popped['name'].' now in: '. \
                $util->currentElementName().' <br />');
        }
    }
?>

For more details on the XmlEventHandler class, see the file:

 
WEBROOT/packages/sys/xml/xml_event_handler.class.php

You can see all the available methods of the XmlUtility at (click on the 'Method Detail' tab at the top):

http://rocketsled.workingsoftware.com.au/sys-xml/XmlUtility.html

Of particular interest to you while developing your event handler are:

 
void attributeList ()
void attributes ()
void attributeValue ($att_name, [$optional = ''])
void cease ()
void currentElementName ()
void decrementElementCount (mixed $name)
void getCharData ()
void incrementElementCount (mixed $name)
void lastEvent ()
void parentElementName ()
void stoppedElementName ()

Save the handler class above as my_xml_event_handler.class.php in your WEBROOT, then create a file called test.php in the same directory with the following contents:

 
<?php
    require_once('rs.php');
    require_once('my_xml_event_handler.class.php');
    $handler = new MyXmlEventHandler();
?>

Then create a file called sample.xml, also in your WEBROOT, with the following contents:

 
<root-level-node name="My Name" version="1.0">
    <sub-node name="Sub One" />
    <sub-node name="Sub Two" />
</root-level-node>

Now point your browser at WEBROOT/test.php. You should see something like:

 
Stopped: Sub One now in: root-level-node
Stopped: Sub Two now in: root-level-node
Done parsing My Name version: 1.0
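The order of those lines follows from the order in which a SAX parser fires its events: each sub-node starts and stops before the root element stops. The sketch below reproduces that order with PHP's built-in xml extension rather than XmlUtility, keeping an explicit stack of open elements so that it can report which element you are back "in" after each end tag. The function name traceEvents and the stack approach are illustrative assumptions, not a description of how XmlUtility tracks this internally.

```php
<?php
// Trace SAX events for a document and report the enclosing element
// after each end tag, using an explicit stack of open elements.
// traceEvents() is a hypothetical helper, not part of RocketSled.
function traceEvents($xml_string)
{
    $stack  = array();
    $log    = array();
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Start tag: push the element onto the stack.
        function ($p, $name, $attrs) use (&$stack, &$log) {
            $stack[] = $name;
            $log[]   = 'start ' . $name;
        },
        // End tag: pop it, then look at what is now on top.
        function ($p, $name) use (&$stack, &$log) {
            array_pop($stack);
            $parent = $stack ? end($stack) : '(none)';
            $log[]  = 'stop ' . $name . ' now in ' . $parent;
        }
    );

    if (!xml_parse($parser, $xml_string, true)) {
        return false; // not well-formed
    }
    return $log;
}
```

Run against the sample.xml content above, the log shows both sub-node elements starting and stopping inside root-level-node before root-level-node itself stops.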

So What's Happening?

TO BE CONTINUED ...

Comments

This could actually spawn a more general blog entry about how to use any RocketSled classes from a non-RocketSled application
- Iain Dooley (2007-06-21)