Using the XPath API to figure out my stats

One major problem I have with base WordPress is that while it does a good job of giving you stats on views (the number of hits you get per post), it does not do a good job with post counts. For the previous post I had to create a chart showing how many blog posts I wrote in past months and years. I am sure there is a plugin out there for this, so if you know of one please mention it in the comments!

My first attempt was to export my posts in XML format and then use Excel to parse out the post dates and do the reporting that way. Unfortunately my export XML was over 5MB, and Excel choked on it when attempting to import the data set.

My next attempt was to write a small Java program using Eclipse and the XPath APIs to get the dates of the posts and pump them out to a comma-delimited file. Believe it or not, that was actually very easy and only took me about 20 minutes. The source in its entirety is below. The result was a single-column list of dates in a file, which I then imported into Excel to make the charts.

The key line that makes this so easy is the expression line:

XPathExpression expr = xpath.compile("/rss/channel/item/post_date");

Given the path to a “post_date” element, the XPath API returns a NodeList (an array-like list) of every element in the document that matches that path, not just the first one. I then just got the text content of each element and wrote it out to a file:

			for (int x=0; x<result.getLength(); x++){
				Node n = result.item(x);

				String d = n.getTextContent();

				out.println(d);
			}
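To see the "one path, many nodes" behaviour in isolation, here is a minimal sketch that runs the same expression against a small in-memory document. The class name XPathDemo, the select helper, and the sample XML are made up for illustration; only the path itself comes from this post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathDemo {

    // Hypothetical sample data, shaped to match the path used in this post.
    static final String SAMPLE =
        "<rss><channel>"
        + "<item><post_date>2012-11-03 08:15:00</post_date></item>"
        + "<item><post_date>2012-12-17 21:40:00</post_date></item>"
        + "</channel></rss>";

    /** Returns the text content of every node matched by the expression. */
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // NODESET asks for all matching nodes rather than a single value.
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep the sketch short.
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Prints one line per matching post_date element.
        for (String date : select(SAMPLE, "/rss/channel/item/post_date")) {
            System.out.println(date);
        }
    }
}
```

With the two-item sample above, the loop prints both dates, one per line, even though the expression names the element only once.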

 

Full Source Code:

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class WordPress {

	/**
	 * @param args
	 */
	public static void main(String[] args) {
		Document document;
		try {
			DocumentBuilder dom;
			DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
			factory.setNamespaceAware(false);
			factory.setIgnoringElementContentWhitespace(true);
			factory.setValidating(false);
			factory.setXIncludeAware(false);
			factory.setExpandEntityReferences(false);
			factory.setIgnoringComments(true);

			XPathFactory f = XPathFactory.newInstance();

			XPath xpath = f.newXPath();

			dom = factory.newDocumentBuilder();

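			// Backslashes must be escaped in Java strings ("\b" alone is a backspace).
			// Passing a File avoids the path being treated as a URI.
			document = dom.parse(new File("c:\\bob039sblog.wordpress.2012-12-17.xml"));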

			XPathExpression expr = xpath.compile("/rss/channel/item/post_date");

			NodeList result = (NodeList)expr.evaluate(document, XPathConstants.NODESET);		

			// Nothing to write if the expression matched no nodes.
			if (result.getLength() == 0) return;

			File outFile;
			PrintWriter out;
			outFile = new File("wordpress.csv");

			out = new PrintWriter(new BufferedOutputStream(new FileOutputStream(outFile)));
			out.println("Pub_Date");

			for (int x=0; x<result.getLength(); x++){
				Node n = result.item(x);

				String d = n.getTextContent();

				out.println(d);
			}

			out.flush();

			out.close();

		} catch (SAXException | IOException | XPathExpressionException | ParserConfigurationException e) {
			// Report any parse, I/O, or XPath failure.
			e.printStackTrace();
		}
	}

}
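I did the per-month counting in Excel, but the same tally could be done in Java. The following is only a sketch of that alternative, not what I actually ran: the class name PostsPerMonth and the sample dates are hypothetical, it needs Java 8 or later for java.time, and it assumes each date looks like "2012-12-17 21:40:00".

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PostsPerMonth {

    // Assumed timestamp layout; adjust the pattern if your export differs.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    /** Counts how many date strings fall in each year-month, sorted by month. */
    static Map<YearMonth, Integer> countByMonth(List<String> dates) {
        Map<YearMonth, Integer> counts = new TreeMap<>();
        for (String raw : dates) {
            YearMonth month = YearMonth.from(LocalDateTime.parse(raw.trim(), FORMAT));
            // Add 1 to the running total for this month, starting from 1.
            counts.merge(month, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical dates standing in for the lines of wordpress.csv.
        List<String> sample = Arrays.asList(
                "2012-11-03 08:15:00",
                "2012-11-20 09:00:00",
                "2012-12-17 21:40:00");
        // Prints one "month,count" line per month, oldest first.
        countByMonth(sample).forEach(
                (month, count) -> System.out.println(month + "," + count));
    }
}
```

A TreeMap keeps the months in chronological order, so the output can go straight into a chart without sorting.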
