Scrape XML Data Using PHP

In this how-to, we detail the steps needed to pull remote XML data into a web application using Curl and PHP. The information we are interested in is job listing data from Clear Company. Clear Company provides a “Complete Talent Management Software” solution to aid Human Resources departments in hiring and onboarding new employees. We support multiple clients using this system, so it was to our benefit to write a PHP function that would pull the job listing data directly into our customers' websites.

For this task we had two primary routes to choose from: SimpleXML (built into PHP) or Curl. To load a remote URL directly with SimpleXML, our PHP environment would need to be configured with allow_url_fopen = On. For most of our customers, this setting was not available, so our next option was Curl. In order to use Curl, our host must have php5-curl installed. This can be accomplished on Debian or Ubuntu systems with the following command:

sudo apt-get install php5-curl

It is important to note that most hosting companies support PHP Curl, whereas allow_url_fopen is almost always disabled for security reasons. For that reason, we chose php5-curl as our solution. Below is the function we created to pull in job listing data and format it as an unordered list for use in our web application.
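The original listing is not reproduced in this copy of the article, so what follows is only a minimal sketch of the approach described above, not the function we actually shipped. The function names (fetch_xml, render_job_list), the feed URL, and the element names job, title, and url are all assumptions made for illustration; the real Clear Company feed may be structured differently. The sketch fetches the XML with Curl and then parses the returned string with simplexml_load_string, which works on in-memory data and so does not depend on allow_url_fopen.

```php
<?php
// Fetch a remote document with Curl and return its body, or false on failure.
// Hypothetical helper: the name and the timeout values are illustrative.
function fetch_xml($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);   // seconds to wait for a connection
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // seconds allowed for the whole request
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($body === false || $status !== 200) {
        return false;
    }
    return $body;
}

// Turn an XML string of job listings into an HTML unordered list.
// Assumed (hypothetical) feed shape:
//   <jobs><job><title>...</title><url>...</url></job>...</jobs>
function render_job_list($xml)
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false) {
        return ''; // not well-formed XML
    }

    $items = '';
    foreach ($feed->job as $job) {
        // Escape feed values before placing them in HTML.
        $title = htmlspecialchars((string) $job->title, ENT_QUOTES, 'UTF-8');
        $link  = htmlspecialchars((string) $job->url, ENT_QUOTES, 'UTF-8');
        $items .= '<li><a href="' . $link . '">' . $title . "</a></li>\n";
    }

    return $items === '' ? '' : "<ul>\n" . $items . "</ul>\n";
}
```

In a page template this could be used as follows, with a placeholder URL standing in for the real feed address: call fetch_xml('https://example.com/jobs.xml'), and if the result is not false, echo the output of render_job_list on it. Keeping the fetch and the rendering in separate functions makes it possible to test the list markup against a saved XML sample without touching the network.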

Below you will see a screenshot of the finished product.

[Screenshot: job listings]
