By Jason Stirk

Rolling your own Ruby SOAP client with Typhoeus and Nokogiri

Overview

As our applications become better connected with the net in general, we find we need to communicate with third-party services.

If we are lucky, these are clean, well documented, RESTful APIs which make our heart sing. If we are super lucky, they even conform to what ActiveResource expects, and will be trivial to implement.

But often luck leaves us wanting, and we have to integrate with legacy applications offering only SOAP or XML-RPC APIs. Often, these come saddled with little or lacklustre documentation.

Today we are going to look at consuming SOAP interfaces using some lightweight Ruby with the help of Typhoeus and Nokogiri.

You can apply the techniques in this post for XML-RPC too - it's a much simpler XML format - but we won't be documenting it here.

The full code for this post is available on GitHub.

What is SOAP?

SOAP is a web service protocol using XML over HTTP.

In most cases it is an RPC-style protocol, meaning that you are dispatching requests to run a particular method on the server, as opposed to a RESTful API where you deal primarily with documents.

There is also a document style approach for SOAP, but the RPC approach seems to be far more common.

Strictly formatted SOAP "Envelopes" are crafted and sent to the server, where the request parameters are parsed, the relevant server side code is run, the response is serialised into XML and returned in another strictly structured envelope.

Whilst this is a very simplified explanation, we can think of a SOAP Client as just sending and receiving structured XML.

Whilst not strictly part of SOAP, it can also be useful to be aware of WSDL, which is an XML format for describing a web service.

Many servers will provide machine-readable documentation for their API using this format, and this can be a handy substitute if the human-readable documentation is lacking.

Check your vendor's documentation for details, or try adding ?wsdl to the end of the service URL.

Tools like SOAP Client can parse these, and can help you to get an idea of what a valid request might look like.

Alternatives

Rolling your own SOAP client isn't always the best way to go, and there are alternatives.

Typically, SOAP Clients in strongly typed languages tend to be in the form of code automatically generated from a WSDL definition. In Ruby this can tend to be overkill unless you are dealing with a particularly complex API.

We have also found drawbacks with this approach, including the generated code being brittle and hard to maintain. Generated code can also force the server's naming standards upon the client, which may not be desirable.

Typhoeus

Typhoeus is a Ruby HTTP client that uses curl under the hood. It's fast, but most importantly it gives us the option to perform parallel connections to the server.

If you are making requests in the background using rake tasks or Delayed Job (DJ), this won't be so exciting, but if you have cause to make multiple SOAP requests during the request cycle, this can dramatically speed up the process.

Typhoeus is also structured in a way that makes testing our SOAP client much easier - we can easily stub out requests to return fixture data without needing to stub out more than is necessary.

To use it, just add gem 'typhoeus' to your Gemfile.

We will focus on sequential requests to start with, with some notes at the end about making parallel requests using Typhoeus::Hydra.

Nokogiri

Nokogiri is a fast, feature-rich XML library. We're going to be producing and consuming a fair bit of XML using this technique, so we need a library that can handle XPath and namespaces, and do it quickly.

We also find that Nokogiri is so heavily used elsewhere that it's a tool that we're already comfortable with, reducing the learning curve, and ensuring that the code is easy to maintain.

Add gem 'nokogiri' to your Gemfile if you haven't already.

XPath and Namespaces

If you don't know much about how XML namespaces work, I suggest you read up a little about them first - A Gentle Introduction to Namespaces.

The SOAP envelope is structured from elements in several namespaces, and in most cases the actual request and response will belong to a vendor-specific namespace.

In the end, we really only care about the vendor-specific namespace, but we need to be aware of the namespaces to ensure that our XPath queries will work correctly.
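A minimal sketch of why the namespace mapping matters. It uses REXML from Ruby's standard library purely so that it runs without extra gems (Nokogiri's xpath accepts a similar prefix-to-URI hash); the response document and the GetCustomerResponse element are made up for illustration.

```ruby
require 'rexml/document'

# A made-up response: SOAP elements use the "soap" prefix, while the
# vendor elements sit in a default namespace declared on the result.
RESPONSE_XML = <<~XML
  <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
    <soap:Body>
      <GetCustomerResponse xmlns="http://mysoapservice.com/">
        <Name>Jane Example</Name>
      </GetCustomerResponse>
    </soap:Body>
  </soap:Envelope>
XML

# The prefixes here are ours to choose; only the URIs must match the document.
NAMESPACES = {
  'env' => 'http://www.w3.org/2003/05/soap-envelope',
  'mss' => 'http://mysoapservice.com/'
}
NAME_PATH = '/env:Envelope/env:Body/mss:GetCustomerResponse/mss:Name'

# Returns the customer name, or nil when the path matches nothing
def customer_name(xml_text, namespaces)
  doc  = REXML::Document.new(xml_text)
  node = REXML::XPath.first(doc, NAME_PATH, namespaces)
  node && node.text
end
```

Mapping the mss prefix to any other URI makes the same query come back empty, which is the usual symptom of a namespace mismatch.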

Structure

We are going to structure our SOAP client as several classes, all namespaced into a module to keep things tidy.

Today we'll call our client MySoapService.

Our Base class will contain all of our core methods, like the HTTP transport, and the logic for generating the SOAP Envelope boilerplate. We can then inject our specific method calls within that framework.

We will subclass our Base class for the specific domain objects that we are going to work with. A SOAP service's methods are all in the same, flat namespace, so you will often find them named according to the domain object they work against (e.g. "GetCustomerDetails", "CreateOrder", etc.). For neatness, we find it easiest to break these into separate classes - in this case, Customer and Order classes.

This has some major benefits:

  • Because we are manually serialising and deserialising the records to XML, we have complete control over any transformations or calculations that need to be done.

    For example, our SOAP API may always return timestamps in a particular timezone - we have the flexibility to shift those into UTC before it even leaves our SOAP Client.

  • We can also perform any validations before we make the SOAP request, saving the round trip to the server to have the request rejected.

    This can be particularly handy for requests where certain fields are mandatory depending on other data.
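As a rough sketch of both points, assume a server that reports timestamps at a fixed +10:00 offset without saying so, and an order that needs a phone number only for courier delivery. The helper names, the offset and the field names are all hypothetical.

```ruby
require 'time'

# Assumption: the server reports local times at a fixed +10:00 offset
# and omits the offset from the XML.
SERVER_UTC_OFFSET = '+10:00'

# Converts a server timestamp such as "2011-03-01T09:30:00" into UTC
def parse_server_time(text)
  Time.iso8601("#{text}#{SERVER_UTC_OFFSET}").utc
end

# Raises ArgumentError before any request is made if the (hypothetical)
# order attributes are incomplete; returns true otherwise.
def validate_order!(attrs)
  raise ArgumentError, 'email is required' if attrs[:email].to_s.empty?
  if attrs[:delivery] == :courier && attrs[:phone].to_s.empty?
    raise ArgumentError, 'phone is required for courier delivery'
  end
  true
end
```

Both helpers would be called inside the relevant subclass: the first while parsing a response, the second just before calling soap_it!.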

Finally, we need to handle any SOAP Fault objects that may be returned. For this purpose we define a SoapError class so that we can use the standard Ruby exception handling functionality.

    module MySoapService
      class Base; end
      class Customer < Base; end
      class Order < Base; end
      class SoapError < StandardError; end
    end

The Base Class

The Base class has the responsibility to handle:

  • all the SOAP Envelope boilerplate, and help abstract away dealing with the different namespaces;
  • any common part of the request, like authentication;
  • the HTTP communication with the server;
  • returning the XML response, or turning any SOAP Fault objects into Ruby exceptions.

First off, we need to build up the SOAP envelope using Nokogiri.

    require 'nokogiri'

    module MySoapService
      NAMESPACE    = "http://mysoapservice.com/"
      CLIENT_TOKEN = "ABCDEABCDE"

      class Base
        def self.construct_envelope(&block)
          Nokogiri::XML::Builder.new do |xml|
            xml.Envelope("xmlns:soap12" => "http://www.w3.org/2003/05/soap-envelope",
                         "xmlns:xsi" => "http://www.w3.org/2001/XMLSchema-instance",
                         "xmlns:xsd" => "http://www.w3.org/2001/XMLSchema") do
              xml.parent.namespace = xml.parent.namespace_definitions.first
              xml['soap12'].Header do
                # Header information goes here
                xml.ClientHeader("xmlns" => NAMESPACE) do
                  xml.ClientToken CLIENT_TOKEN
                end
              end
              xml['soap12'].Body(&block)
            end
          end
        end
      end
    end

This gives us a method we can call with a block to generate our request Envelope.

You will notice there's quite a bit of SOAP boilerplate there - we have a "soap12:Envelope", with "soap12:Header" and "soap12:Body" elements inside. SOAP utilises several separate XML schemas, hence the plethora of namespaces on the "Envelope" element.

In the case of our server, we need to include a "ClientHeader" in every request with the relevant "ClientToken". Other servers may expect authentication data in the Envelope Body, may not require authentication, or may authenticate at the HTTP level instead.

Note that we make sure that our "ClientHeader" element belongs to our vendor-specific namespace. This will also need to be done for the contents of the body, otherwise they will inherit the "soap12" namespace and be invalid.

We will detail exactly how the subclasses perform a request later, but for the moment we can see that we can generate an Envelope like so:

    envelope = MySoapService::Base.construct_envelope do |xml|
      # Outside the module, the constant needs its full name
      xml.GetCustomer('xmlns' => MySoapService::NAMESPACE) do
        xml.CustomerID('12345')
      end
    end

    envelope.to_xml # =>
    # <?xml version="1.0"?>
    # <soap12:Envelope xmlns:soap12="http://www.w3.org/2003/05/soap-envelope" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    #   <soap12:Header>
    #     <ClientHeader xmlns="http://mysoapservice.com/">
    #       <ClientToken>ABCDEABCDE</ClientToken>
    #     </ClientHeader>
    #   </soap12:Header>
    #   <soap12:Body>
    #     <GetCustomer xmlns="http://mysoapservice.com/">
    #       <CustomerID>12345</CustomerID>
    #     </GetCustomer>
    #   </soap12:Body>
    # </soap12:Envelope>

Now, we need to send this request off to the server using Typhoeus. Something like this will suffice:

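    require 'typhoeus'

    module MySoapService
      SERVICE_URL  = "http://soap.example.com/"

      class Base
        # cattr_reader is provided by ActiveSupport, so it is available in any Rails app
        cattr_reader :last_request, :last_response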

        # Performs the SOAP request
        def self.soap_it!(&block)
          data = construct_envelope(&block)
          @@last_request = data
          response = Typhoeus::Request.post(SERVICE_URL,
                                  :body    => data.to_xml,
                                  :headers => {'Content-Type' => "text/xml; charset=utf-8"})

          process_response(response)
        end
      end
    end

Here we can see Typhoeus making the POST request to the server. Nothing particularly exciting.

One thing to note is that the Content-Type used above isn't a certainty: SOAP 1.2 formally specifies application/soap+xml, some servers will insist on accepting only application/xml, and others will be more liberal.

At this stage, Typhoeus will make the request immediately, as soon as soap_it! is called. Whilst this is suitable in some cases, we will look at how to make Typhoeus perform parallel requests a bit later.

Finally, let's implement that process_response method and turn any SOAP Fault responses into an Exception.

    # Processes the response and decides whether to handle an error or
    # whether to return the content
    def self.process_response(response)
      @@last_response = response

      if response.body =~ /soap:Fault/ then
        handle_error(response)
      else
        return response
      end
    end

    # Parses a soap:Fault error and raises it as a MySoapService::SoapError
    def self.handle_error(response)
      xml   = Nokogiri::XML(response.body)
      xpath = '/soap:Envelope/soap:Body/soap:Fault//ExceptionMessage'
      msg   = xml.xpath(xpath).text

      # TODO: Capture any app-specific exception messages here.
      #       For example, if the server returns a Fault when a search
      #       has no results, you might rather return an empty array.

      raise MySoapService::SoapError.new("Error from server: #{msg}", @@last_request, @@last_response)
    end
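One way to approach that TODO, sketched under the assumption that the server signals an empty search with a recognisable fault message. The pattern and the helper name are invented for illustration; check what your server actually returns.

```ruby
module MySoapService
  # Simplified stand-in for the SoapError class used by the client
  class SoapError < StandardError; end

  # Hypothetical wording of the "no results" fault from the server
  NO_RESULTS_FAULT = /no matching records/i

  # Returns [] for the benign "no results" fault, raises for anything else
  def self.result_for_fault(message)
    return [] if message =~ NO_RESULTS_FAULT
    raise SoapError, "Error from server: #{message}"
  end
end
```

handle_error could call a helper like this with the extracted message, so that callers of a search method receive an empty array instead of having to rescue an exception.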

Putting it all together in a neat bundle, we have something like this.

Our SoapError class doesn't need to be too exciting - it just adds some helpers to let us easily see the request and response that caused the exception.

    class SoapError < StandardError
      attr_reader :last_request, :last_response

      def initialize(message, last_request, last_response)
        @last_request  = last_request
        @last_response = last_response
        super(message)
      end
    end
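Calling code can then rescue the error like any other exception and inspect what was sent and received. A small sketch: describe_soap_failure is a made-up helper, and the strings standing in for the request and response are placeholders, where the real client would pass the Nokogiri builder and the Typhoeus response.

```ruby
module MySoapService
  class SoapError < StandardError
    attr_reader :last_request, :last_response

    def initialize(message, last_request, last_response)
      @last_request  = last_request
      @last_response = last_response
      super(message)
    end
  end
end

# Runs the block and returns its value, or a short description of the
# failure if a SoapError was raised.
def describe_soap_failure
  yield
rescue MySoapService::SoapError => e
  "#{e.message} (request: #{e.last_request}, response: #{e.last_response})"
end

summary = describe_soap_failure do
  raise MySoapService::SoapError.new('Error from server: Unknown customer',
                                     '<request/>', '<response/>')
end
puts summary
```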

Domain Object Subclasses

Now that we have our base class handling the communication, we can start making real SOAP requests.

We do this by inheriting from the Base class, and having our methods call soap_it! like so:

    module MySoapService
      class Customer < Base
        def self.find_by_email(email)
          response = soap_it! do |xml|
            xml.GetCustomerDetail('xmlns' => NAMESPACE) do
              xml.EmailAddress email
            end
          end

          parse_customer_response(response)
        end
      end
    end

This will make our "GetCustomerDetail" request to the server, sending the "EmailAddress" parameter. Because it's just a builder, you can easily handle whatever structure the server expects.

Some servers insist on you providing the data types, and you can do this like so:

    # The attributes hash has to come last, after the element's text content
    xml.EmailAddress(email, "xsi:type" => "xsd:string")

XSD defines a variety of Data Types, and you will want to check the documentation from your API vendor for the exact requirements.

Finally, we need to implement our parse_customer_response method, which might look like this:

    # Map XML elements to our local fields
    CUSTOMER_FIELD_MAPPING = { :Name => :name, :EmailAddress => :email }

    def self.parse_customer_response(response)
      if response then
        xml    = Nokogiri::XML(response.body)
        xpath  = '/soap:Envelope/soap:Body/mss:GetCustomerDetailResponse/mss:GetCustomerDetailResult/mss:Customers/mss:Customer'
        result = xml.xpath(xpath, namespaces(xml)).first
        data   = {}
        CUSTOMER_FIELD_MAPPING.each do |soap_element, field|
          data[field] = result.xpath("./mss:#{soap_element}", namespaces(xml)).first.text
        end

        ::Customer.new(data)
      else
        # Nothing to be done, no data returned
        false
      end
    end

You will notice that our XPath expression uses a "mss:" namespace prefix - this relates to our namespaces() method, which we will define next.

The other thing to note is that we use ::Customer to force it to instantiate the top level Customer class defined in our application, rather than MySoapService::Customer. In this case it's likely to be an ActiveRecord model.

(The astute reader will also notice that these calls to .xpath(...) could be neatened up into a helper on the Base class)

The namespaces() method lives in the Base class, and is a simple helper to tell XPath the namespace alias for the vendor-specific elements, like so:

    # Merges in the MySoapService namespace
    def self.namespaces(xml)
      { 'mss' => NAMESPACE }.merge(xml.document.namespaces)
    end

Putting it all together, we have a Customer class something like this.

Making a request to the SOAP API is now as simple as:

    customer = MySoapService::Customer.find_by_email('foo@example.com')
    customer.email # => "foo@example.com"

Testing your Client

You will obviously want to heavily test your SOAP client. To do this, we need to stub the calls to the server - this ensures that your tests are consistent, fast, and can't change when someone changes data on the server.

Thankfully, Typhoeus makes stubbing the HTTP requests quite easy, and we use XML fixture files containing the response. We use these helpers in our RSpec tests to do this:

    def typhoeus_stub(name)
      response = Typhoeus::Response.new(:code => 200, :headers => "", :body => xml_fixture(name), :time => 0.3)
      Typhoeus::Request.should_receive(:post).with(any_args()).at_least(:once).and_return(response)
    end

    def xml_fixture(name)
      File.read(File.join(Rails.root, 'spec', 'fixtures', 'soap', name.to_s + '.xml'))
    end

    def assert_in_soap_request(regex)
      MySoapService::Base.last_request.to_xml.should =~ regex
    end

We place XML files in spec/fixtures/soap/ - it's easy to add debugging statements to your Base class so that you can write the XML request and response to the logs. It's fairly simple to make the queries from the console by hand using the soap_it! call, and you can tinker with the parameters until you receive the response you are expecting.

Finally, we run the response XML through xmllint --format before saving it as a fixture so that we can easily refer to the fixture data.

Another approach is to catch the XML content directly from the wire using tools like sudo tcpdump -i en1 -vvvA host soap.example.com or Wireshark; however, these can be unreasonably difficult if accessing the API over HTTPS.

An example spec might look like this:

    it "should request a specific email address, returning the Customer" do
      typhoeus_stub(:customer)
      result = MySoapService::Customer.find_by_email('foo@example.com')

      # Ensure that the request XML includes the expected elements
      assert_in_soap_request(/<EmailAddress>foo@example\.com<\/EmailAddress>/)

      # Ensure that our response has populated the class
      result.should be_a(Customer)
      result.email.should == 'foo@example.com'
    end

The important bit here is that we only stub out the actual HTTP request - soap_it! still receives a real Typhoeus::Response object, and everything else works as it actually will. This is essential to ensure that our tests fail if we upgrade to a Typhoeus version that has major API changes.

Parallel Requests using Typhoeus::Hydra

One of the main benefits of using Typhoeus is making parallel requests to the server, and thankfully it's fairly easy.

First off, we revise our Base.soap_it! method like so:

    # Performs the SOAP request
    def self.soap_it!(for_hydra=false, &block)
      data = construct_envelope(&block)
      @@last_request = data
      if for_hydra then
        Typhoeus::Request.new(SERVICE_URL,
                                :method => :post,
                                :body => data.to_xml,
                                :headers => {'Content-Type' => "text/xml; charset=utf-8"})
      else
        response = Typhoeus::Request.post(SERVICE_URL,
                                :body    => data.to_xml,
                                :headers => {'Content-Type' => "text/xml; charset=utf-8"})

        process_response(response)
      end
    end

What is happening is that if we call soap_it!(true) we return a Typhoeus::Request object rather than running the request immediately.

Then, we can create a request in our subclass like this:

    def self.update_all_by_emails(emails)
      hydra = Typhoeus::Hydra.new(:max_concurrency => 10)
      emails.each do |email|
        request = soap_it!(true) do |xml|
          xml.GetCustomerDetail('xmlns' => NAMESPACE) do
            xml.EmailAddress email
          end
        end

        request.on_complete do |r|
          update_customer_from_response(r)
        end

        hydra.queue request
      end

      hydra.run
      true
    end

In this example we are making a request to the server to get multiple Customers by their email addresses, and calling the update_customer_from_response() method for each result.

Typhoeus::Hydra will make all the requests at once (up to the max_concurrency setting) and will process them as soon as they complete.

Because you are simply returning a Typhoeus::Request object, you could even feasibly collect these objects from a range of subclasses, add them to a single Hydra queue and have the different requests processed in parallel.

The best way to structure this is an exercise left to the reader.
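One possible shape for that, offered as a sketch rather than a recommendation: a small collector that pairs each request with its handler and queues the lot on one Hydra. RequestBatch is a made-up name; the hydra argument is expected to be a Typhoeus::Hydra, and the requests are expected to be the objects returned by soap_it!(true).

```ruby
module MySoapService
  # Collects requests from any number of subclasses and runs them together
  class RequestBatch
    def initialize(hydra)
      @hydra   = hydra   # expected to be a Typhoeus::Hydra
      @pending = []
    end

    # request: as returned by soap_it!(true); handler: called with the response
    def add(request, &handler)
      @pending << [request, handler]
      self
    end

    # Attaches the handlers, queues every request and blocks until all finish
    def run
      @pending.each do |request, handler|
        request.on_complete { |response| handler.call(response) } if handler
        @hydra.queue(request)
      end
      @pending.clear
      @hydra.run
    end
  end
end
```

Each subclass would then expose a method that returns its request, and the caller decides which handler to attach before running the batch. Note that responses handled this way do not pass through process_response unless the handler calls it.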

Testing the Hydra calls is a little more complex, but not too much so.

We use the following RSpec helper to stub the HTTP requests for these connections:

    def typhoeus_hydra_stub(name)
      response = Typhoeus::Response.new(:code => 200, :headers => "", :body => xml_fixture(name), :time => 0.3)
      hydra = Typhoeus::Hydra.hydra
      hydra.stub(:post, /http/).and_return(response)
      Typhoeus::Hydra.should_receive(:new).and_return(hydra)
    end

GOTCHA: Methods with Underscores

Be alert if you are working with methods with "unusual" names; some servers allow extended characters in the method names, and these are often escaped in creative ways.

The particularly problematic ones are underscores, as Nokogiri has some magic helpers in the builder that use underscores to create elements with the same name as reserved words. For example, xml.class would call Object#class, rather than create the element, so Nokogiri allows you to use xml.class_ and the underscore will be silently trimmed off.

In one case, we ran into a method named CreateCustomerOrder_x0028_OrderXML_x0029_. Not only was the documentation a bit lacking on this method (we eventually checked the WSDL by hand to find the method name!) but Nokogiri's magic "underscore" names kept removing the trailing underscore when we were creating the element.

Instead, we had to do this:

    response = soap_it! do |xml|
      # GOTCHA!!!! Nokogiri has some magic with trailing underscores. As part of that, it trims the
      #            underscore off. So, we have to put TWO underscores at the end - one for stripping
      #            by Nokogiri, and one for keeps.
      xml.send('CreateCustomerOrder_x0028_OrderXML_x0029__', 'xmlns' => NAMESPACE) do
        # ... method parameters go here ...
      end
    end
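If you hit this in more than one place, a tiny helper can hide the doubling. This is a sketch based only on the trimming behaviour described above; builder_method_name is a made-up name.

```ruby
# Returns the name to pass to xml.send so that the element keeps its
# trailing underscore after the builder has trimmed one off.
def builder_method_name(element_name)
  element_name.end_with?('_') ? "#{element_name}_" : element_name
end

puts builder_method_name('CreateCustomerOrder_x0028_OrderXML_x0029_')
puts builder_method_name('GetCustomerDetail')
```

The call above then becomes xml.send(builder_method_name('CreateCustomerOrder_x0028_OrderXML_x0029_'), 'xmlns' => NAMESPACE).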

Conclusion

Using a combination of fast, feature-rich libraries like Typhoeus and Nokogiri, it is possible to quickly and easily construct a Ruby SOAP Client that is fast, can support parallel connections, and is easy for developers to maintain.
