Cheap Web Hosting for Developers

PHP, MySQL, Java, Unix Cheap Web Hosting

152 CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND

Filed under: PHP and XML — webmaster @ 17:41

parsing. It occurs when instructed by the user of the document, which can happen while parsing, after the fact, or not at all. External entities must also be defined in a DTD. XInclude does not require a DTD to work within a document, which allows it to work independently of validation. Failure to load an external entity normally results in a failure to load the base document. XInclude, on the other hand, can provide alternatives in the event the remote data cannot be loaded. Using a fallback mechanism allows the base document to load successfully even though a remote source may be unavailable.

The following sections explain the syntax used to employ XInclude as well as how you can use it within an XML document.

XInclude defines the namespace http://www.w3.org/2001/XInclude. Although you can associate any prefix with this namespace, the typical prefix is xi. This namespace contains two elements, include and fallback. Within the following sections, the xi prefix refers to the http://www.w3.org/2001/XInclude namespace, so the elements appear as xi:include and xi:fallback.

Listing 4-7 is a small portion of the courses XML document. This document resides in the file courses.xml, and I will use it in the following sections for illustration.

Listing 4-7. Small XML Course Document for the File courses.xml

    Introduction to Languages
    Introduction to French

xi:include

The xi:include element defines the location of the entity to include as well as any additional information that may be needed to parse the entity when including. This element takes the following form:

    xi:include attributes

Although the attributes are optional, many of the requirements for attributes are dependent upon each other.

href

The value of the href attribute specifies the URI of the resource to include. This is an optional attribute. When omitted or set to an empty string (href=""), the location references the same document.
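The fallback behaviour described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (courses, course, title), the in-memory RESOURCES table, and the resolve_includes helper are all hypothetical, and the sketch handles only the href attribute with element children inside xi:fallback. It is not how libxml or the PHP extensions implement XInclude.

```python
import copy
import xml.etree.ElementTree as ET

XI = "{http://www.w3.org/2001/XInclude}"

# Hypothetical in-memory "remote" resources, keyed by href.
RESOURCES = {
    "french.xml": "<course><title>Introduction to French</title></course>",
}

def loader(href):
    """Return the parsed resource, or raise OSError if it is unavailable."""
    if href not in RESOURCES:
        raise FileNotFoundError(href)
    return ET.fromstring(RESOURCES[href])

def resolve_includes(parent, loader):
    """Replace each xi:include under parent with the loaded element,
    or with the children of its xi:fallback when loading fails."""
    i = 0
    while i < len(parent):
        child = parent[i]
        if child.tag != XI + "include":
            resolve_includes(child, loader)
            i += 1
            continue
        try:
            replacement = [loader(child.get("href"))]
        except OSError:
            fallback = child.find(XI + "fallback")
            if fallback is None:
                raise  # no alternative supplied, so the whole load fails
            replacement = [copy.deepcopy(e) for e in fallback]
        parent.remove(child)
        for offset, node in enumerate(replacement):
            parent.insert(i + offset, node)
        i += len(replacement)

BASE = """<courses xmlns:xi="http://www.w3.org/2001/XInclude">
  <course><title>Introduction to Languages</title></course>
  <xi:include href="french.xml"/>
  <xi:include href="missing.xml">
    <xi:fallback><course><title>Unavailable</title></course></xi:fallback>
  </xi:include>
</courses>"""

root = ET.fromstring(BASE)
resolve_includes(root, loader)
titles = [t.text for t in root.iter("title")]
remaining = root.findall(".//" + XI + "include")
print(titles)
```

The base document still loads when missing.xml cannot be found, because the xi:fallback content takes the place of the failed include. Without a fallback the error propagates, which mirrors the all-or-nothing behaviour of external entities.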

Note: If you are looking for good and high quality web space to host and run your application check Lunarwebhost Cheap Web Hosting services


CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND THE

Filed under: PHP and XML — webmaster @ 06:34

start-point

This is the syntax for start-point:

    location-set start-point(location-set)

This function returns a location set composed of the starting points for each location of the location-set input parameter. For example, start-point(//chapter) would return a set of points immediately following the opening tag of each chapter element, and start-point(chapter[1]) would return a single point located after the opening tag of the first chapter element.

end-point

This is the syntax for end-point:

    location-set end-point(location-set)

This function returns a location set composed of the ending points for each location of the location-set input parameter.

here

This is the syntax for here:

    location-set here()

This function is valid only when being interpreted within an XML document or external parsed entity. It returns a location set composed of a single member, which is the node that contains the expression being evaluated. For a text node within an element node, the element node is returned.

origin

This is the syntax for origin:

    location-set origin()

This function is applicable only when using XLink. It returns a location set that locates the element from where the traversal began.

XPointer Summary

XPointer has not yet achieved recommendation status from the W3C. It has actually been broken up into several specifications. Using XPath syntax should be safe without having to anticipate any changes. This syntax is fully supported in libxml and the PHP 5 extensions where XPointer is applicable. The extended functionality presented here may change over time, and currently the extended functionality is not fully supported in libxml.

Introducing XInclude

XInclude is a W3C specification for including external documents, fragments, and other content within an XML document. This technology differs from the use of external entities in many ways. External entities are processed while a document is parsing. XInclude is independent of


150 CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND

Filed under: PHP and XML — webmaster @ 20:34

Given the following document: everything between the opening chapter tag with the xml:id="chap1" and the closing chapter tag with the xml:id="chap2" would be selected.

string-range

This is the syntax for string-range:

    location-set string-range(location-set, string, position?, length?)

This function returns a set of ranges where the string value of the location-set matches the string parameter. The position parameter is optional and indicates the starting point of the range being returned relative to the matched string. The default value, when not specified, is 1, meaning that the starting point of the range will be the point preceding the first character of the matched string. The length parameter is also optional and gives the number of characters in the range; when it is omitted, the range extends to the end of the matched string.

This finds all occurrences of the string Joe in name elements:

    xpointer(string-range(//name,"Joe"))

This selects the second character, o, from the first occurrence of the string Joe:

    xpointer(string-range(/,"Joe",2,1)[position()=1])

range

This is the syntax for range:

    location-set range(location-set)

This function returns a location set composed of the ranges for each location of the location-set input parameter.

range-inside

This is the syntax for range-inside:

    location-set range-inside(location-set)

This function returns a location set composed of the ranges contained within each location of the location-set input parameter. A location that is already a range returns the range itself. Other locations use the location as the container node and return the range within the container.
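To make the position and length arguments of string-range concrete, here is a minimal sketch that works on a plain Python string rather than on a location set. The string_range function and the sample text are hypothetical, and the returned pairs are zero-based (start, end) character offsets, not real XPointer ranges.

```python
def string_range(text, needle, position=1, length=None):
    """Return (start, end) offsets for every occurrence of needle in text.

    position is 1-based and relative to the start of each match, as in
    XPointer. When length is omitted, the range runs to the end of the match.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    ranges = []
    match = text.find(needle)
    while match != -1:
        begin = match + position - 1
        end = match + len(needle) if length is None else begin + length
        ranges.append((begin, end))
        match = text.find(needle, match + 1)
    return ranges

text = "Joe met Joey"
all_matches = string_range(text, "Joe")        # every occurrence of Joe
second_char = string_range(text, "Joe", 2, 1)  # position 2, length 1
first = second_char[0]
print(all_matches, text[first[0]:first[1]])
```

With position 2 and length 1, the range starts just before the second character of the match and spans a single character, so the first range in this sample covers the letter o.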


CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND THE

Filed under: PHP and XML — webmaster @ 08:38

XPointer Extending XPath

At first glance, it may seem that XPointer is just an xpointer() function taking an XPath expression as an argument. For the most part it is, but it also extends XPath to offer some additional functionality. XPointer introduces some additional concepts such as locations, location types, location sets, points, and ranges. It also adds some functions that can be used under XPointer. The following sections are not a complete, in-depth examination of XPointer and its extended functionality. At the current time, XPointer is still a working draft, and not all functionality is implemented in libxml. All XPath topics covered to this point are fully supported, however.

Location, Location Types, and Location Sets

The basic unit within XPath is the node, and a document is a tree of nodes. XPointer generalizes this and uses the concept of a location. A location includes not only nodes, from the XPath point of view, but also points and ranges, which I will explain shortly. A location type is a node type, point type, or range type. Location sets are generalized node sets. They include not only nodes but also points and ranges.

Points and Ranges

Points and ranges represent non-node locations, but they are considered to be two additional node types that can be used when writing expressions. A point can represent the position preceding or following an element node as well as a location preceding any individual character within a text node, comment, attribute value, or PI. It is defined by a container node and an index, which is a non-negative integer. The index, unlike an XPath position, is zero-based. Points do not have expanded names and have empty string values.

A range, defined by starting and ending points, contains all the XML structure in between. Because a point can be any position within a document, a range can contain partial pieces of nodes. Ranges for nodes other than element, text, and root nodes must have the same container node for the starting and ending points. For example, a range with a starting point inside a comment node must have an ending point within the same comment node. The ending point cannot extend past the comment node.

Functions

XPointer adds some new functions to those already available from XPath. You can use these functions to deal with ranges, location sets, and points, which are not part of XPath.

range-to

This is the syntax for range-to:

    location-set range-to(location-set)

This function returns a range consisting of a starting point from the context and an ending point determined from the location set passed in as the parameter. The following example would return a range from the starting point for the element identified by the ID chap1 to the ending point of the element identified by the ID chap2:

    xpointer(id("chap1")/range-to(id("chap2")))
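The range-to example can be approximated at element granularity with Python's standard library. This is a rough sketch, not an XPointer implementation: the book document, the range_to helper, and the decision to return whole elements (rather than points and partial nodes) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical document with three chapters identified by xml:id.
BOOK = """<book>
  <chapter xml:id="chap1"><title>One</title></chapter>
  <chapter xml:id="chap2"><title>Two</title></chapter>
  <chapter xml:id="chap3"><title>Three</title></chapter>
</book>"""

def range_to(root, start_id, end_id):
    """Return the elements, in document order, from the start of the element
    with start_id through the end of the element with end_id."""
    order = list(root.iter())  # document order
    by_id = {e.get(XML_ID): e for e in order if e.get(XML_ID)}
    start = order.index(by_id[start_id])
    end_elem = by_id[end_id]
    # The range ends after the last descendant of the end element.
    end = order.index(end_elem) + len(list(end_elem.iter())) - 1
    return order[start:end + 1]

root = ET.fromstring(BOOK)
selected = range_to(root, "chap1", "chap2")
selected_titles = [e.text for e in selected if e.tag == "title"]
print(selected_titles)
```

The third chapter falls outside the range because it starts after the ending point of chap2.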


148 CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND

Filed under: PHP and XML — webmaster @ 21:53

    xpointer(//*[@specials="BADVALUE"])xpointer(//*[@specials])

The results of this will be all elements containing the attribute specials because of the failure of the first expression. The following example returns the same results as the previous example:

    xpointer(//*[@specials="BADVALUE"])xpointer(//*[@specials])xpointer(//*)

The expression xpointer(//*[@specials]) returned data, so the last expression, xpointer(//*), is never executed.

XPointer and Namespaces

When I discussed namespaces with regard to XPath, one of the problems encountered was dealing with default namespaces in documents. I mentioned that some technologies offer ways to register namespaces and prefixes to be used within XPath queries. XPointer is one of the technologies providing this functionality. For example:

    tomato lettuce apple

Given this document containing a default namespace of http://www.example.com/produce, all vegetable elements need to be retrieved. Using XPath, you would need to test either the local names of the elements or the namespace URI for the elements:

    /*/*[local-name()="vegetable"]

XPointer adds the ability to register namespaces to be used for the XPointer expressions in the following form:

    xmlns(prefix=URI)

prefix is the prefix to associate with the namespace URI identified by URI. Using this notation, the XPointer expression would be as follows:

    xmlns(veg=http://www.example.com/produce)xpointer(//veg:vegetable)

Just as the XPointer expressions can be stacked, so can the namespace registrations (the following code has been split over two lines because of length):

    xmlns(veg=http://www.example.com)xmlns(fr=http://www.example.com/fruit)
    xpointer(//veg:vegetable)

In the event the same prefix is defined multiple times, the rightmost definition is the one used. An example of this is when you define the prefix veg multiple times. For example:

    xmlns(veg=http://www.example.com)xmlns(veg=http://www.example.com/fruit)

This causes veg to be associated with the namespace http://www.example.com/fruit.
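The xmlns(prefix=URI) registration maps naturally onto the prefix dictionary that Python's ElementTree accepts. The sketch below is illustrative only: the produce document is a hypothetical reconstruction, parse_xmlns_parts is an invented helper with a deliberately simple regular expression (it ignores escaped parentheses), and ElementTree evaluates only a small XPath subset rather than XPointer itself.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document using a default namespace.
DOC = """<produce xmlns="http://www.example.com/produce">
  <vegetable>tomato</vegetable>
  <vegetable>lettuce</vegetable>
  <fruit>apple</fruit>
</produce>"""

def parse_xmlns_parts(pointer):
    """Collect prefix -> URI bindings from xmlns(prefix=URI) parts.

    Parts are read left to right, so the rightmost binding for a prefix wins.
    """
    bindings = {}
    for prefix, uri in re.findall(r"xmlns\(\s*([\w.-]+)\s*=\s*([^)]*)\)", pointer):
        bindings[prefix] = uri
    return bindings

pointer = (
    "xmlns(veg=http://www.example.com)"
    "xmlns(veg=http://www.example.com/produce)"
    "xpointer(//veg:vegetable)"
)
namespaces = parse_xmlns_parts(pointer)

root = ET.fromstring(DOC)
# ".//veg:vegetable" plays the role of //veg:vegetable here.
vegetables = [e.text for e in root.findall(".//veg:vegetable", namespaces)]
print(namespaces, vegetables)
```

Because the second veg binding overrides the first, the query matches the vegetable elements in the document's default namespace even though the document itself never declares a veg prefix.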


CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND THE

Filed under: PHP and XML — webmaster @ 11:35

XPointer and XPath Expressions

XPointer, being an extension of XPath, uses the XPath syntax. This section will not attempt to cover the full XPath syntax, because I explained this earlier. I will use the document in Listing 4-4 to show how to reimplement the XPath expressions here using XPointer. There is little new information in this section because writing XPointer expressions is as simple as this:

    xpointer(xpath_expression)

Taking a few of the example XPath expressions, the equivalent versions in XPointer are as follows:

    /* Select all elements containing the attribute named specials */
    xpointer(//*[@specials])

    /* Select all time elements having a parent named fruit */
    xpointer(//time[../self::fruit])

    /* Select all elements with a child element named price having a value > 1.99 */
    xpointer(//*[price > 1.99])

XPointer is really as easy as that. When used with a URI, the xpointer part is the document fragment portion of the URI. For example, suppose the produce document from Listing 4-4 was a file located at http://www.example.com/produce.xml. The desired result is to retrieve all elements that contain the specials attribute, which was the first example listed previously. For example:

    http://www.example.com/produce.xml#xpointer(//*[@specials])

The URL is broken down into two components: the base URL, which is http://www.example.com/produce.xml, and the document fragment, xpointer(//*[@specials]). In essence, the full URL is equivalent to saying, "Using the produce.xml file located at http://www.example.com, return all elements containing a specials attribute from the document."

As you will see in later sections of this chapter, you don't always need full URLs because you can imply them by other means; therefore, simply using the xpointer(xpath_expression) syntax may be enough. It is also worth noting that XPointer is most often used when employing XInclude, which will be covered in the "Introducing XInclude" section, and XSL, which will be covered in Chapter 10. You will also see XPointer used in conjunction with XLink. I have included a brief introduction to XLink, but this technology is really out of the scope of this book. Currently, XLink is not supported by libxml2, the underlying XML library used within PHP 5, and no future plans exist to support it.

Stacking XPointer Expressions

Another nice feature of XPointer is the ability to stack expressions. If the first expression fails, then the following expression runs. You can add expressions to be processed only if the preceding expression has failed. Continuing to use the data from Listing 4-4, XPointer will first attempt to retrieve all elements with the attribute specials having the value BADVALUE. This document doesn't have any attributes with that value, so the expression fails, and the second expression is processed:
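The stacking rule, in which each expression is tried in order and the first one that returns something wins, can be sketched with ElementTree's limited path syntax. Everything here is an assumption for illustration: the produce document is a hypothetical stand-in for Listing 4-4, first_match is an invented helper, and ".//*" starts below the root element, whereas the XPath //* also selects the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the produce document in Listing 4-4.
PRODUCE = """<produce>
  <fruit specials="yes"><name>apple</name></fruit>
  <vegetable><name>lettuce</name></vegetable>
  <vegetable specials="no"><name>tomato</name></vegetable>
</produce>"""

def first_match(root, paths):
    """Evaluate the paths left to right and stop at the first non-empty result."""
    for path in paths:
        found = root.findall(path)
        if found:
            return path, found
    return None, []

root = ET.fromstring(PRODUCE)
stack = [
    ".//*[@specials='BADVALUE']",  # fails: no attribute has this value
    ".//*[@specials]",             # succeeds, so evaluation stops here
    ".//*",                        # never evaluated
]
used_path, nodes = first_match(root, stack)
tags = [e.tag for e in nodes]
print(used_path, tags)
```

Only the elements carrying a specials attribute come back, and the final catch-all path is never run.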


146 CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND

Filed under: PHP and XML — webmaster @ 01:42

Calculations

Using functions within XPath allows some calculations to be performed. Calculations and functions are typically reserved for use in a predicate. It is possible, though, for XPath to return results other than node sets. Using Listing 4-6, you can obtain the sum of all price elements. For these examples, brevity takes priority over optimization when writing the expressions. For example:

    sum(//*[local-name()="price"])

This will return the value 100439.95. The following retrieves the total number of price elements, indicating the number of items in the store:

    count(//*[local-name()="price"])

This returns the value 7. Using these two results, you can obtain the average item price, which will be rounded:

    round(sum(//*[local-name()="price"]) / count(//*[local-name()="price"]))

The resulting value for the rounded average price is 14349.

Using calculations to return non-node sets in XPath is pretty limited. For example, you simply cannot calculate the worth of inventory on hand. This involves taking the sum of (price * qty) for each item. The sum function takes a node set as an argument, so you have no way to perform this mathematically.

You can also perform calculations within the predicate. Suppose that, for some strange reason, your workflow requires that every other book element be selected for processing:

    //*[local-name()="book" and position() mod 2 = 1]

The position of the book element is tested to find out whether it is odd or even. You can do this through the position() mod 2 piece of the predicate. The operator mod returns the remainder from a truncating division, so the value 1 means the position is odd. This query returns every other book element in the document starting with the first one encountered.

XPath Summary

You can use XPath to locate and retrieve information from a document. As you have seen, it is simple to use yet offers the ability for advanced and complex querying.
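The calculations covered in this section, including the inventory total that sum() alone cannot express, are easy to reproduce in a host language. The following Python sketch uses a small hypothetical store document (it is not Listing 4-6, so the numbers differ from those quoted in the text) and mirrors XPath's round(), which rounds halves toward positive infinity, with math.floor.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical store document; element names are assumptions.
STORE = """<store>
  <book><price>10.50</price><qty>2</qty></book>
  <book><price>4.25</price><qty>10</qty></book>
  <cd><price>15.00</price><qty>1</qty></cd>
</store>"""

root = ET.fromstring(STORE)
prices = [float(p.text) for p in root.iter("price")]

total = sum(prices)                         # like sum(//price)
count = len(prices)                         # like count(//price)
average = math.floor(total / count + 0.5)   # like round(sum div count)

# The per-item product that the XPath 1.0 sum() function cannot express:
inventory_worth = sum(
    float(item.findtext("price")) * int(item.findtext("qty")) for item in root
)
print(total, count, average, inventory_worth)
```

The last calculation is the kind of work that has to happen in the calling language, or in XSLT, because sum() accepts only a node set and cannot multiply values pairwise.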
In Chapters 6, 7, and 10, which cover the PHP 5 XML extensions, you will be exposed to more XPath techniques. You will use it not only through the extensions but also as the foundation of XSLT.

Introducing XPointer

XPointer is a W3C specification, though still a working draft, used for fragment identification in URI references. It is an extension of XPath, so it uses the same syntax to address the internal structure of an XML document using a URI. You must perform character escaping for XPointer expressions depending upon the context in which the expression appears. If XPointer is used within a URI, it must follow the same escaping rules a URI follows; for instance, you must escape a space as %20. When used within an XML document, it must follow the escaping rules for XML. For example, XPath uses quotes around string values. XPointer, when embedded within a document, must have the quotes escaped, such as by using &quot;.
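The two escaping contexts can be illustrated with Python's standard library. This is a sketch under assumptions: the expression and its item and name elements are invented, and the set of characters left unescaped in the URI form is a choice made here for readability rather than something the XPointer draft prescribes.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

# Hypothetical XPointer expression containing quotes and a space.
expression = 'xpointer(//item[name="red apple"])'

# URI context: percent-encode characters that may not appear in a URI.
# The safe set keeps the XPath punctuation readable (an assumption of this sketch).
as_fragment = quote(expression, safe="()/[]=@*")
url = "http://www.example.com/produce.xml#" + as_fragment

# XML context: escape markup characters, including the double quotes,
# so the expression can sit inside a double-quoted attribute value.
as_attribute_value = escape(expression, {'"': "&quot;"})

print(url)
print(as_attribute_value)
```

The same expression therefore has two different serialized forms, depending on whether it travels in a URI or sits inside an XML attribute.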


CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND THE

Filed under: PHP and XML — webmaster @ 13:43

This still hasn't selected all the nodes you originally wanted. The node set is supposed to contain both book and magazine elements. Right now, you have two distinct queries. One selects the book elements, and the other selects the magazine elements. An easy way at this point to get the desired results is to use the union operator. This operator joins node sets together. If you thought the previous queries were overwhelming, take a look at how to use a union with the two queries:

    /*/*[1]/*/*[*[local-name()="pubdate" and substring(., 1, 4)="2002"]] |
    /*/*[2]/*[*[local-name()="issue" and substring(., 1, 4)="2002"]]

This query is actually a single line. It joins the first query, selecting the book elements, with the second query, selecting the magazine elements, using |, which is the union operator.

If you're using XML, you probably tend to be more on the daring side. You must be able to write a query without using the union operator that will select all the elements in one shot, right? A simplified way is to write this:

    //*[*[(local-name()="pubdate" or local-name()="issue")
    and substring(., 1, 4)="2002"]]

This again doesn't fit on a single line, but in the XML world you can ignore insignificant whitespace. This query checks every element in the document to see whether it has a child element with the local name pubdate or issue. If either of these is TRUE, then it checks the substring of the string value for that child element:

    /*/*[local-name()="books" or local-name()="magazines"]
    //*[*[(local-name()="pubdate" or local-name()="issue")
    and substring(., 1, 4)="2002"]]

This is another one-liner broken into multiple lines. This is an optimized version of the previous query. The previous query checked every element in the document. This revised version selects only from the books or magazines subtree. The document in Listing 4-6 has a cds tree, which could contain any number of cd elements. Rather than checking those, because only book and magazine elements are to be returned, the two subtrees are explicitly set in the path. Within those subtrees, on the other hand, every element is checked. You will notice the use of // after the predicate for the books and magazines elements. That again is the abbreviation for descendant-or-self::node(), where node() is the element because of the axis.

The following queries are alternative ways to write this query. Each is specific to the document in Listing 4-6. If you added types, such as dvds elements, they may not work.

    /* Using position of element */
    /*/*[position() < 3]//*[*[(local-name()="pubdate" or local-name()="issue")
    and substring(., 1, 4)="2002"]]

    /* Checking for != cds */
    /*/*[local-name() != "cds"]//*[*[(local-name()="pubdate" or local-name()="issue")
    and substring(., 1, 4)="2002"]]

These queries all select the same node sets. Since I've already covered everything you need to break these queries down, I will leave it up to you to figure out how they work.
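To check that the union form and the single-pass form select the same elements, the logic of both queries can be written out by hand. Python's standard library does not evaluate full XPath 1.0 (there is no local-name() or substring()), so this sketch reproduces the predicates in plain Python. The store document is a hypothetical stand-in for Listing 4-6, and the helper names are invented.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.example.com/store}"

# Hypothetical stand-in for Listing 4-6 (the real listing is not reproduced here).
STORE = """<store xmlns="http://www.example.com/store">
  <books>
    <fiction>
      <book><title>A</title><pubdate>2002-05-01</pubdate></book>
      <book><title>B</title><pubdate>2004-11-15</pubdate></book>
    </fiction>
  </books>
  <magazines>
    <magazine><title>M</title><issue>2002-10</issue></magazine>
  </magazines>
  <cds>
    <cd><title>C</title><pubdate>2002-01-01</pubdate></cd>
  </cds>
</store>"""

def local_name(elem):
    """Mimic local-name(): drop the '{namespace}' part of an ElementTree tag."""
    return elem.tag.rsplit("}", 1)[-1]

def has_2002_child(elem, names):
    """True if a child named in names has a string value starting with 2002."""
    return any(
        local_name(child) in names and "".join(child.itertext())[:4] == "2002"
        for child in elem
    )

root = ET.fromstring(STORE)
books_tree, magazines_tree = root[0], root[1]

# Two separate queries joined together, as with the | operator.
books = [e for e in books_tree.findall("*/*") if has_2002_child(e, {"pubdate"})]
magazines = [e for e in magazines_tree.findall("*") if has_2002_child(e, {"issue"})]
union = books + magazines  # already disjoint and in document order here

# One pass restricted to the books and magazines subtrees.
single_pass = [
    e
    for subtree in root
    if local_name(subtree) in {"books", "magazines"}
    for e in subtree.iterfind(".//*")
    if has_2002_child(e, {"pubdate", "issue"})
]
titles = [e.findtext(NS + "title") for e in single_pass]
print(titles)
```

Note that a real XPath union also removes duplicates and returns nodes in document order; simple list concatenation is enough here only because the two result lists cannot overlap. The cd element dated 2002 is skipped because its subtree is never visited.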


144 CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND

Filed under: PHP and XML — webmaster @ 03:38

Comparisons

I demonstrated simple comparisons earlier in this chapter, but that was before the introduction of functions. This section will provide a more in-depth look at expressions performing comparisons as well as calculations. I'll continue to use the document in Listing 4-6 as the document being queried.

Performing a search based on a date may seem like a daunting task. Within the document, the element pubdate uses the format YYYY-MM-DD, which also conforms to the XML Schema date type. Unfortunately, XPath does not offer any date functions, so these values are treated as strings. However, string functions are available that can be used to accomplish the task at hand. So, how do you go about selecting all books and magazines published in 2002? You will need substring functions to split the date apart. It is a given, because the dates conform to the XML Schema date type, that the first four characters are the year, so using the substring function, the starting position is 1 and the length is 4:

    /*/*[1]/*/*[*[local-name()="pubdate" and substring(., 1, 4)="2002"]]

No, you are not going cross-eyed. This is really a valid XPath query. The initial path should look familiar to you. The path /*/*[1]/*/* is within the books subtree because you are using the first position, and it selects all element nodes on the level at which the book elements reside. Within the document, this selects all book elements, because no other types of elements are on this level within the books subtree.

The predicate is where you may get a little bug-eyed. Breaking the predicate, [*[local-name()="pubdate" and substring(., 1, 4)="2002"]], into pieces, the first * indicates that the filter takes place on all child elements of the current node set. The current node set, in this case, consists of all the book elements. That leaves another predicate: [local-name()="pubdate" and substring(., 1, 4)="2002"]. This predicate is performed on all the child elements of the current node set. The first test is to see whether the local name matches pubdate. If this returns TRUE, then you know the current node being run against this filter is a pubdate element. You can then check the string value of this element using the substring function to see whether the first four characters match 2002. The reason the first parameter is . (a period) is that the context node, that is, the current node, is being passed as an argument to the function. You can also write the substring function as substring(self::*, 1, 4) or substring(child::text(), 1, 4). An element has a string value that consists of all text nodes within its contents and the contents of its children. Passing in the context node, which must be a pubdate element since it passed the first check, will effectively pass in the text containing the date being searched. This query may have looked complicated but, once broken out, should be easy to understand.

Well, you have selected all the book elements, but the query is supposed to also return all the magazine elements published in 2002. You face a few problems: the elements do not live on the same level within the document, the names of the elements being returned are not the same, the element names containing the dates are not the same, and they also live in different subtrees. For starters, the magazine elements that have an issue date in 2002 will be selected:

    /*/*[2]/*[*[local-name()="issue" and substring(., 1, 4)="2002"]]

This query is almost the same as the query for the book elements. The differences here are that the magazine subtree is being traversed (indicated by the /*/*[2] portion of the path), the steps are not as deep (notice there is a /* removed from the path), and the local name test is now performed against the string issue. The query is broken down the same way the previous book selection was broken down.
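The pieces of the book query (the string value of an element, the 1-based substring function, and the nested predicate) can be mirrored step by step in Python. This is a hand-written imitation rather than an XPath engine: the store document is a hypothetical stand-in for Listing 4-6, and xpath_substring handles only integer arguments.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the books subtree of Listing 4-6.
STORE = """<store xmlns="http://www.example.com/store">
  <books>
    <fiction>
      <book><title>A</title><pubdate>2002-05-01</pubdate></book>
      <book><title>B</title><pubdate>2004-11-15</pubdate></book>
    </fiction>
  </books>
</store>"""

def local_name(elem):
    """Mimic local-name(): drop the '{namespace}' part of the tag."""
    return elem.tag.rsplit("}", 1)[-1]

def string_value(elem):
    """Mimic the XPath string value: all descendant text, concatenated."""
    return "".join(elem.itertext())

def xpath_substring(value, start, length):
    """Mimic substring() for integer arguments; positions are 1-based."""
    begin = max(start - 1, 0)
    end = max(start - 1 + length, 0)
    return value[begin:end]

root = ET.fromstring(STORE)
candidates = root[0].findall("*/*")  # /*/*[1]/*/* : the level where book elements live

selected = [
    book
    for book in candidates
    # [*[local-name()="pubdate" and substring(., 1, 4)="2002"]]
    if any(
        local_name(child) == "pubdate"
        and xpath_substring(string_value(child), 1, 4) == "2002"
        for child in book
    )
]
selected_dates = [string_value(b.find("{http://www.example.com/store}pubdate")) for b in selected]
print(selected_dates)
```

The point to notice is the off-by-one between the two worlds: XPath's substring(., 1, 4) corresponds to the Python slice [0:4].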


CHAPTER 4 XPATH, XPOINTER, XINCLUDE, AND THE

Filed under: PHP and XML — webmaster @ 14:49

All these queries are equivalent. The predicate is filtering based on the position of the node within the node set returned from /*/*. If you can be certain, usually from a DTD or schema, that the first child element of the store element is the books element, then each of the expressions filters for the node that is the first node in document order in the node set. The last expression uses the single numeric 1. A single numeric as an expression is the abbreviation for writing position()=[number].

Tip: You can abbreviate the expression [position()=x] as simply [x]. A number alone as a predicate is equivalent to comparing position() against that number.

Within the books element, the books are contained within parent elements that describe the types. At this point, the types are of no concern, so this step will take the form of *. The last step is to select the book elements. I have already presented the expression for this; you use a check on the local name. Combining all the steps, you could write queries of the following forms:

    /*/*[local-name()="books"]/*/*[local-name()="book"]
    /*/*[position()=1]/*/*[local-name()="book"]
    /*/*[position() < 2]/*/*[local-name()="book"]
    /*/*[1]/*/*[local-name()="book"]

Each of these queries will result in the selection of the five book elements.

This raises an interesting question. You may know the structure of the document, but how could you select only book elements within the http://www.example.com/classicbook namespace? In Listing 4-6, the book element within this namespace has redefined the bk prefix, so using the QName with a prefix of bk is not an option. The prefix bk will be associated with the http://www.example.com/book namespace because of scoping. You aren't using any technologies at this point that allow you to register a namespace and prefix, so that is also not viable.
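The equivalence of the four position forms is easy to confirm outside XPath. The sketch below uses plain Python indexing instead of ElementTree's own position predicates, and the store document with its fiction and reference groupings is a hypothetical stand-in for Listing 4-6.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 4-6 with two book groupings.
STORE = """<store>
  <books>
    <fiction><book>A</book><book>B</book></fiction>
    <reference><book>C</book></reference>
  </books>
  <magazines><magazine>M</magazine></magazines>
</store>"""

def local_name(elem):
    """Mimic local-name(): drop any '{namespace}' part of the tag."""
    return elem.tag.rsplit("}", 1)[-1]

root = ET.fromstring(STORE)
children = list(root)  # the node set returned by /*/*

by_name = [c for c in children if local_name(c) == "books"]              # [local-name()="books"]
by_position = [c for p, c in enumerate(children, start=1) if p == 1]     # [position()=1]
by_less_than = [c for p, c in enumerate(children, start=1) if p < 2]     # [position() < 2]
by_number = children[:1]                                                 # [1]

# Remaining steps: any grouping element (*), then elements named book.
books = [
    b.text
    for subtree in by_number
    for group in subtree
    for b in group
    if local_name(b) == "book"
]
print(books)
```

XPath positions start at 1 while Python indexes start at 0, which is why the [1] predicate corresponds to the first slot of the list.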
One way to accomplish this is to test the actual namespace on the element:

    /*/*[1]/*/*[namespace-uri()="http://www.example.com/classicbook"]

Rather than testing for the local name of the element, you can test the actual URI of the namespace. This example assumes no other elements on the same document level as the book elements exist and reside in the same namespace. If this is a possibility, the predicate can include the check of the local name:

    [local-name()="book" and namespace-uri()="http://www.example.com/classicbook"]

In this case, it first makes sure the element has a local name of book and, if that is TRUE, checks whether the namespace URI is http://www.example.com/classicbook.

You can also optimize this expression. Once an expression returns FALSE, no further filtering takes place for the current node. In the case of the books element, you can safely assume that the majority of the child elements are book elements. Most of them, however, would not be in the namespace being searched. Checking the namespace URI first would eliminate almost every check for the local name of the node. So, an optimized predicate would be as follows:

    [namespace-uri()="http://www.example.com/classicbook" and local-name()="book"]
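ElementTree stores each tag as {namespace-uri}local-name, which makes the namespace test easy to imitate. The following sketch is illustrative only: the store document, with its redefined bk prefix, is a hypothetical stand-in for Listing 4-6, and the two helpers are invented equivalents of namespace-uri() and local-name().

```python
import xml.etree.ElementTree as ET

CLASSIC = "http://www.example.com/classicbook"

# Hypothetical document in which the bk prefix is redefined on one element.
STORE = """<store xmlns:bk="http://www.example.com/book">
  <books>
    <fiction>
      <bk:book><title>New</title></bk:book>
      <bk:book xmlns:bk="http://www.example.com/classicbook"><title>Old</title></bk:book>
      <note>not a book</note>
    </fiction>
  </books>
</store>"""

def namespace_uri(elem):
    """Mimic namespace-uri(): the part of the tag inside the braces, or ''."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def local_name(elem):
    """Mimic local-name(): the part of the tag after the braces."""
    return elem.tag.rsplit("}", 1)[-1]

root = ET.fromstring(STORE)
candidates = root[0].findall("*/*")  # /*/*[1]/*/*

# Python's `and` short-circuits, so the cheaper or more selective test goes
# first, just like the optimized predicate.
classic = [
    e for e in candidates
    if namespace_uri(e) == CLASSIC and local_name(e) == "book"
]
classic_titles = [e.findtext("title") for e in classic]
print(classic_titles)
```

The prefix never enters the comparison; only the resolved namespace URI does, which is why redefining bk on the classic book element causes no trouble.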



Powered by Cheap Web Hosting