Rolling your own Ruby SOAP client with Typhoeus and Nokogiri
Overview
As our applications become better connected with the net in general, we find we need to communicate with 3rd party services.
If we are lucky, these are clean, well documented, RESTful APIs which make our hearts sing. If we are super lucky, they even conform to what ActiveResource expects, and will be trivial to implement.
But often luck leaves us wanting, and we have to integrate with legacy applications offering only SOAP or XML-RPC APIs. Often, these come saddled with little or lacklustre documentation.
Today we are going to look at consuming SOAP interfaces using some lightweight Ruby with the help of Typhoeus and Nokogiri.
You can apply the techniques in this post for XML-RPC too - it's a much simpler XML format - but we won't be documenting it here.
The full code for this post is available on GitHub.
What is SOAP?
SOAP is a web service protocol using XML over HTTP.
In most cases it is an RPC-style protocol, meaning that you are dispatching requests to run a particular method on the server, as opposed to a RESTful API where you deal primarily with documents.
There is also a document style approach for SOAP, but the RPC approach seems to be far more common.
Strictly formatted SOAP "Envelopes" are crafted and sent to the server, where the request parameters are parsed, the relevant server side code is run, the response is serialised into XML and returned in another strictly structured envelope.
Whilst this is a very simplified explanation, we can think of a SOAP Client as just sending and receiving structured XML.
Whilst not strictly part of SOAP, it can also be useful to be aware of WSDL, which is an XML format for describing a web service.
Many servers will provide machine-readable documentation for their API using this format, and this can provide handy documentation if the human-documentation is lacking.
Check your vendor's documentation for details, or try adding ?wsdl to the end of the service URL.
Tools like SOAP Client can parse these, and can help you to get an idea of what a valid request might look like.
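As a small illustration of the ?wsdl convention mentioned above, the sketch below builds the WSDL URL from a service URL. The helper name wsdl_url and the example URLs are made up for this post, and not every server exposes its WSDL this way.

```ruby
require 'uri'

# Hypothetical helper: append the conventional "wsdl" query flag to a
# service URL, keeping any query string that is already present.
def wsdl_url(service_url)
  uri = URI.parse(service_url)
  uri.query = [uri.query, 'wsdl'].compact.join('&')
  uri.to_s
end

puts wsdl_url('http://soap.example.com/service')
# => http://soap.example.com/service?wsdl
```

The resulting URL can be opened in a browser or handed to a WSDL-aware tool.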
Alternatives
Rolling your own SOAP client isn't always the best way to go, and there are alternatives.
Typically, SOAP Clients in strongly typed languages take the form of code automatically generated from a WSDL definition. In Ruby this tends to be overkill unless you are dealing with a particularly complex API.
We have also found drawbacks with this approach, including the generated code being brittle and hard to maintain. Generated code can also force the server's naming standards upon the client, which may not be desirable.
Typhoeus
Typhoeus is a Ruby HTTP client that uses curl under the hood. It's fast, but most importantly it gives us the option to perform parallel connections to the server.
If you are making requests in the background using rake tasks or DJ this won't be so exciting, but if you have cause to make multiple SOAP requests during the request cycle, this can dramatically speed up the process.
Typhoeus is also structured in a way that makes testing our SOAP client much easier - we can easily stub out requests to return fixture data without needing to stub out more than is necessary.
To use it, just add gem 'typhoeus' to your Gemfile.
We will focus on sequential requests to start with, with some notes at the end about making parallel requests using Typhoeus::Hydra.
Nokogiri
Nokogiri is a fast, feature rich XML library. We're going to be producing and consuming a fair bit of XML using this technique, so we need a library that can handle XPath and namespaces, and do it quickly.
We also find that Nokogiri is so heavily used elsewhere that it's a tool that we're already comfortable with, reducing the learning curve, and ensuring that the code is easy to maintain.
Add gem 'nokogiri' to your Gemfile if you haven't already.
XPath and Namespaces
If you don't know much about how XML namespaces work, I suggest you read up a little about them first - A Gentle Introduction to Namespaces.
The SOAP envelope is structured from elements in several namespaces, and in most cases the actual request and response will belong to a vendor-specific namespace.
In the end, we really only care about the vendor-specific namespace, but we need to be aware of the namespaces to ensure that our XPath queries will work correctly.
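To make the point about namespaces concrete, here is a minimal sketch of a namespace-aware XPath lookup. It uses REXML, which ships with Ruby, purely so that it runs without any gems; with Nokogiri the same path and prefix hash would be passed to xml.xpath(path, namespaces). The response document and the helper name customer_name are invented for illustration.

```ruby
require 'rexml/document'

# A made-up response. The envelope uses the prefix "s" and the vendor
# elements use a default namespace; both choices are up to the server.
SAMPLE_RESPONSE = <<~XML
  <?xml version="1.0"?>
  <s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope">
    <s:Body>
      <GetCustomerResponse xmlns="http://mysoapservice.com/">
        <Name>Jane Example</Name>
      </GetCustomerResponse>
    </s:Body>
  </s:Envelope>
XML

# Prefixes used in our XPath. They are local aliases: only the URIs
# have to match the ones declared in the document.
XPATH_NAMESPACES = {
  'soap' => 'http://www.w3.org/2003/05/soap-envelope',
  'mss'  => 'http://mysoapservice.com/'
}

# Hypothetical helper: returns the customer name, or nil if not found.
def customer_name(body)
  doc  = REXML::Document.new(body)
  path = '/soap:Envelope/soap:Body/mss:GetCustomerResponse/mss:Name'
  node = REXML::XPath.first(doc, path, XPATH_NAMESPACES)
  node && node.text
end

puts customer_name(SAMPLE_RESPONSE)
# => Jane Example
```

Note that the query says "soap:" while the document says "s:". The lookup still succeeds because XPath compares namespace URIs, not prefixes.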
Structure
We are going to structure our SOAP client as several classes, all namespaced into a module to keep things tidy.
Today we'll call our client MySoapService.
Our Base class will contain all of our core methods, like the HTTP transport, and the logic for generating the SOAP Envelope boilerplate.
We can then inject our specific method calls within that framework.
We will subclass our Base class for the specific domain objects that we are going to work with. A SOAP service's methods are all in the same, flat namespace, so you will often find them named according to the domain object they work against (eg. "GetCustomerDetails", "CreateOrder", etc.). For neatness, we find it easiest to break these into separate classes - in this case, Customer and Order classes.
This has some major benefits:
- Because we are manually serialising and deserialising the records to XML, we have complete control over any transformations or calculations that need to be done. For example, our SOAP API may always return timestamps in a particular timezone - we have the flexibility to shift those into UTC before they even leave our SOAP Client.
- We can also perform any validations before we make the SOAP request, saving the round trip to the server to have the request rejected. This can be particularly handy for requests where certain fields are mandatory depending on other data.
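The two benefits above can be sketched in a few lines of plain Ruby. Everything here is hypothetical: the fixed +10:00 server offset, the helper names server_time_to_utc and order_errors, and the rule that a street is mandatory only for courier delivery. Your own API will have its own rules.

```ruby
require 'time'

# Assumption for this sketch: the server sends local timestamps with no
# zone information, always in a fixed UTC+10:00 offset.
SERVER_UTC_OFFSET = '+10:00'

# Hypothetical helper: parse a server timestamp and shift it to UTC.
def server_time_to_utc(text)
  Time.iso8601("#{text}#{SERVER_UTC_OFFSET}").utc
end

# Hypothetical validation: a street address is only mandatory when the
# order is delivered by courier. Returns a list of error messages.
def order_errors(order)
  errors = []
  errors << 'customer_id is required' if order[:customer_id].to_s.empty?
  if order[:delivery_method] == :courier && order[:street].to_s.empty?
    errors << 'street is required for courier delivery'
  end
  errors
end

puts server_time_to_utc('2011-03-01T09:30:00').iso8601
# => 2011-02-28T23:30:00Z

p order_errors(:customer_id => '12345', :delivery_method => :courier)
# => ["street is required for courier delivery"]
```

A subclass method could call a check like order_errors before soap_it! and skip the request entirely when the list is not empty.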
Finally, we need to handle any SOAP Fault objects that may be returned. For this purpose we define a SoapError class so that we can use the standard Ruby exception handling functionality.

    module MySoapService
      class Base; end
      class Customer < Base; end
      class Order < Base; end
      class SoapError < StandardError; end
    end
The Base Class
The Base class is responsible for handling:
- all the SOAP Envelope boilerplate, and help abstract away dealing with the different namespaces;
- any common part of the request, like authentication;
- the HTTP communication with the server;
- returning the XML response, or turning any SOAP Fault objects into Ruby exceptions.
First off, we need to build up the SOAP envelope using Nokogiri.
    module MySoapService
      NAMESPACE    = "http://mysoapservice.com/"
      CLIENT_TOKEN = "ABCDEABCDE"

      class Base
        def self.construct_envelope(&block)
          Nokogiri::XML::Builder.new do |xml|
            xml.Envelope("xmlns:soap12" => "http://www.w3.org/2003/05/soap-envelope",
                         "xmlns:xsi"    => "http://www.w3.org/2001/XMLSchema-instance",
                         "xmlns:xsd"    => "http://www.w3.org/2001/XMLSchema") do
              xml.parent.namespace = xml.parent.namespace_definitions.first
              xml['soap12'].Header do
                # Header information goes here
                xml.ClientHeader("xmlns" => NAMESPACE) do
                  xml.ClientToken CLIENT_TOKEN
                end
              end
              xml['soap12'].Body(&block)
            end
          end
        end
      end
    end
This gives us a method we can call with a block to generate our request Envelope.
You will notice there's quite a bit of SOAP boilerplate there - we have a "soap12:Envelope", with "soap12:Header" and "soap12:Body" elements inside. SOAP utilises several separate XML schemas, hence the plethora of namespaces on the "Envelope" element.
In the case of our server, we need to include a "ClientHeader" in every request with the relevant "ClientToken". Other servers may expect authentication data in the Envelope Body, may not require authentication, or may authenticate at the HTTP level instead.
Note that we make sure that our "ClientHeader" element belongs to our vendor-specific namespace. This will also need to be done for the contents of the body, otherwise they will inherit the "soap12" namespace and be invalid.
We will detail exactly how the subclasses perform a request later, but for the moment we can see that we can generate an Envelope like so:

    envelope = MySoapService::Base.construct_envelope do |xml|
      # Outside the module we have to spell out the constant in full
      xml.GetCustomer('xmlns' => MySoapService::NAMESPACE) do
        xml.CustomerID('12345')
      end
    end

    envelope.to_xml
    # =>
    # <?xml version="1.0"?>
    # <soap12:Envelope xmlns:soap12="http://www.w3.org/2003/05/soap-envelope" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    #   <soap12:Header>
    #     <ClientHeader xmlns="http://mysoapservice.com/">
    #       <ClientToken>ABCDEABCDE</ClientToken>
    #     </ClientHeader>
    #   </soap12:Header>
    #   <soap12:Body>
    #     <GetCustomer xmlns="http://mysoapservice.com/">
    #       <CustomerID>12345</CustomerID>
    #     </GetCustomer>
    #   </soap12:Body>
    # </soap12:Envelope>

Now, we need to send this request off to the server using Typhoeus. Something like this will suffice:
    module MySoapService
      SERVICE_URL = "http://soap.example.com/"

      class Base
        cattr_reader :last_request, :last_response

        # Performs the SOAP request
        def self.soap_it!(&block)
          data = construct_envelope(&block)
          @@last_request = data
          response = Typhoeus::Request.post(SERVICE_URL,
                                            :body    => data.to_xml,
                                            :headers => {'Content-Type' => "text/xml; charset=utf-8"})
          process_response(response)
        end
      end
    end
Here we can see Typhoeus making the POST request to the server. Nothing particularly exciting.
One thing to note is that the Content-Type used above isn't a certainty - some servers will insist on accepting only application/xml, others will be more liberal. SOAP 1.2 also has its own registered media type, application/soap+xml, which is worth trying if the server rejects the others.
At this stage, Typhoeus will make the request immediately, as soon as soap_it! is called. Whilst this is suitable in some cases, we will look at how to make Typhoeus perform parallel requests a bit later.
Finally, let's implement that process_response method and turn any SOAP Fault responses into an Exception.
    # Processes the response and decides whether to handle an error or
    # whether to return the content
    def self.process_response(response)
      @@last_response = response
      if response.body =~ /soap:Fault/
        handle_error(response)
      else
        response
      end
    end

    # Parses a soap:Fault error and raises it as a MySoapService::SoapError
    def self.handle_error(response)
      xml   = Nokogiri::XML(response.body)
      xpath = '/soap:Envelope/soap:Body/soap:Fault//ExceptionMessage'
      msg   = xml.xpath(xpath).text
      # TODO: Capture any app-specific exception messages here.
      # For example, if the server returns a Fault when a search
      # has no results, you might rather return an empty array.
      raise MySoapService::SoapError.new("Error from server: #{msg}", @@last_request, @@last_response)
    end
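One caveat with matching the text "soap:Fault" is that it relies on the server choosing "soap" as its envelope prefix, which it is free not to do. The sketch below shows a namespace-aware alternative that reads the standard SOAP 1.2 Reason/Text element. It is written with REXML from Ruby's standard distribution so that it runs without gems; the helper name fault_reason and the sample body are made up, and your server may put its message in a vendor-specific element (such as the "ExceptionMessage" used above) instead.

```ruby
require 'rexml/document'

# Our own alias for the SOAP 1.2 envelope namespace. Only the URI
# has to match what the server sends.
SOAP12_NS = { 'env' => 'http://www.w3.org/2003/05/soap-envelope' }

# A made-up fault response that uses "S" as its envelope prefix.
FAULT_BODY = <<~XML
  <?xml version="1.0"?>
  <S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope">
    <S:Body>
      <S:Fault>
        <S:Code><S:Value>S:Sender</S:Value></S:Code>
        <S:Reason><S:Text xml:lang="en">Unknown customer</S:Text></S:Reason>
      </S:Fault>
    </S:Body>
  </S:Envelope>
XML

# Hypothetical helper: returns the reason text of a SOAP 1.2 Fault,
# an empty string if the Fault has no reason, or nil if there is no Fault.
def fault_reason(body)
  doc   = REXML::Document.new(body)
  fault = REXML::XPath.first(doc, '//env:Fault', SOAP12_NS)
  return nil unless fault

  text = REXML::XPath.first(fault, 'env:Reason/env:Text', SOAP12_NS)
  text ? text.text : ''
end

p(FAULT_BODY =~ /soap:Fault/)   # the plain-text check misses this fault
# => nil
puts fault_reason(FAULT_BODY)
# => Unknown customer
```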
Putting it all together in a neat bundle, we have something like this.
Our SoapError class doesn't need to be too exciting - it just adds some helpers to let us easily see the request and responses that caused the exception. See here.
    class SoapError < StandardError
      attr_reader :last_request, :last_response

      def initialize(message, last_request, last_response)
        @last_request  = last_request
        @last_response = last_response
        super(message)
      end
    end
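Because SoapError is an ordinary exception, callers can rescue it like any other. The following is a minimal, self-contained sketch: the class is repeated from above so that the snippet runs on its own, while failing_lookup and safe_lookup are hypothetical stand-ins for a real client call and its caller.

```ruby
module MySoapService
  # Same shape as the SoapError class in this article, repeated so the
  # sketch runs on its own.
  class SoapError < StandardError
    attr_reader :last_request, :last_response

    def initialize(message, last_request, last_response)
      @last_request  = last_request
      @last_response = last_response
      super(message)
    end
  end
end

# Hypothetical stand-in for a client call that hits a SOAP Fault.
def failing_lookup
  raise MySoapService::SoapError.new('Error from server: Unknown customer',
                                     '<request/>', '<response/>')
end

# Hypothetical caller: treat the fault as "no result" and log the XML
# that caused it.
def safe_lookup
  failing_lookup
rescue MySoapService::SoapError => e
  warn "#{e.message} (request was #{e.last_request})"
  nil
end

p safe_lookup
# => nil
```

Whether a fault should become nil, an empty array or a re-raised error depends on the method; the point is only that the request and response travel with the exception.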
Domain Object Subclasses
Now that we have our base class handling the communication, we can start making real SOAP requests.
We do this by inheriting from the Base class, and having our methods call soap_it! like so:
    module MySoapService
      class Customer < Base
        def self.find_by_email(email)
          response = soap_it! do |xml|
            xml.GetCustomerDetail('xmlns' => NAMESPACE) do
              xml.EmailAddress email
            end
          end
          parse_customer_response(response)
        end
      end
    end
This will make our "GetCustomerDetail" request to the server, sending the "EmailAddress" parameter. Because it's just a builder, you can easily handle whatever structure the server expects.
Some servers insist on you providing the data types. You can do this by passing the attributes as a hash after the element's text, like so:

    xml.EmailAddress(email, "xsi:type" => "xsd:string")
XSD defines a variety of Data Types, and you will want to check the documentation from your API vendor for the exact requirements.
Finally, we need to implement our parse_customer_response method, which might look like this:
    # Map XML elements to our local fields
    CUSTOMER_FIELD_MAPPING = {
      :Name         => :name,
      :EmailAddress => :email
    }

    def self.parse_customer_response(response)
      if response
        xml    = Nokogiri::XML(response.body)
        xpath  = '/soap:Envelope/soap:Body/mss:GetCustomerDetailResponse/mss:GetCustomerDetailResult/mss:Customers/mss:Customer'
        result = xml.xpath(xpath, namespaces(xml)).first
        data   = {}
        CUSTOMER_FIELD_MAPPING.each do |soap_element, field|
          data[field] = result.xpath("./mss:#{soap_element}", namespaces(xml)).first.text
        end
        ::Customer.new(data)
      else
        # Nothing to be done, no data returned
        false
      end
    end
You will notice that our XPath expression uses a "mss:" namespace prefix - this relates to our namespaces() method, which we will define next.
The other thing to note is that we use ::Customer to force it to instantiate the top level Customer class defined in our application, rather than MySoapService::Customer. In this case it's likely to be an ActiveRecord model.
(The astute reader will also notice that these calls to .xpath(...) could be neatened up into a helper on the Base class.)
The namespaces() method lives in the Base class, and is a simple helper to tell XPath the namespace alias for the vendor-specific elements, like so:
    # Merges in the MySoapService namespace
    def self.namespaces(xml)
      { 'mss' => NAMESPACE }.merge(xml.document.namespaces)
    end
Putting it all together, we have a Customer class something like this.
Making a request to the SOAP API is now as simple as:

    customer = MySoapService::Customer.find_by_email('foo@example.com')
    customer.email # => "foo@example.com"
Testing your Client
You will obviously want to heavily test your SOAP client. To do this, we need to stub the calls to the server - this ensures that your tests are consistent, fast, and can't change when someone changes data on the server.
Thankfully, Typhoeus makes stubbing the HTTP requests quite easy, and we use XML fixture files containing the response. We use these helpers in our RSpec tests to do this:

    def typhoeus_stub(name)
      response = Typhoeus::Response.new(:code => 200, :headers => "", :body => xml_fixture(name), :time => 0.3)
      Typhoeus::Request.should_receive(:post).with(any_args()).at_least(:once).and_return(response)
    end

    def xml_fixture(name)
      File.read(File.join(Rails.root, 'spec', 'fixtures', 'soap', name.to_s + '.xml'))
    end

    def assert_in_soap_request(regex)
      MySoapService::Base.last_request.to_xml.should =~ regex
    end
We place XML files in spec/fixtures/soap/ - it's easy to add debugging statements to your Base class so that you can write the XML request and response to the logs.
It's fairly simple to make the queries from the console by hand using the soap_it! call, and you can tinker with the parameters until you receive the response you are expecting.
Finally, we run the response XML through xmllint --format before saving it as a fixture so that we can easily refer to the fixture data.
Another approach is to catch the XML content directly from the wire using tools like sudo tcpdump -i en1 -vvvA host soap.example.com or Wireshark, however these can be unreasonably difficult if accessing the API over HTTPS.
An example spec might look like this:

    it "should request a specific email address, returning the Customer" do
      typhoeus_stub(:customer)
      result = MySoapService::Customer.find_by_email('foo@example.com')

      # Ensure that the request XML includes the expected elements
      assert_in_soap_request(/<EmailAddress>foo@example.com<\/EmailAddress>/)

      # Ensure that our response has populated the class
      result.should be_a(Customer)
      result.email.should == 'foo@example.com'
    end
The important bit here is that we only stub out the actual HTTP request - soap_it! still receives a real Typhoeus::Response object, and everything else works as it actually will. This is essential to ensure that our tests fail if we upgrade to a Typhoeus version that has major API changes.
Parallel Requests using Typhoeus::Hydra
One of the main benefits of using Typhoeus is making parallel requests to the server, and thankfully it's fairly easy.
First off, we revise our Base.soap_it! class method like so:
    # Performs the SOAP request
    def self.soap_it!(for_hydra = false, &block)
      data = construct_envelope(&block)
      @@last_request = data
      if for_hydra
        Typhoeus::Request.new(SERVICE_URL,
                              :method  => :post,
                              :body    => data.to_xml,
                              :headers => {'Content-Type' => "text/xml; charset=utf-8"})
      else
        response = Typhoeus::Request.post(SERVICE_URL,
                                          :body    => data.to_xml,
                                          :headers => {'Content-Type' => "text/xml; charset=utf-8"})
        process_response(response)
      end
    end
What is happening is that if we call soap_it!(true) we return a Typhoeus::Request object rather than running the request immediately.
Then, we can create a request in our subclass like this:

    def self.update_all_by_emails(emails)
      hydra = Typhoeus::Hydra.new(:max_concurrency => 10)
      emails.each do |email|
        request = soap_it!(true) do |xml|
          xml.GetCustomerDetail('xmlns' => NAMESPACE) do
            xml.EmailAddress email
          end
        end
        request.on_complete do |r|
          update_customer_from_response(r)
        end
        hydra.queue request
      end
      hydra.run
      true
    end
In this example we are making a request to the server to get multiple Customers by their email addresses, and calling the update_customer_from_response() method for each result.
Typhoeus::Hydra will make all the requests at once (up to the max_concurrency setting) and will process them as soon as they complete.
Because you are simply returning a Typhoeus::Request object, you could even feasibly collect these objects from a range of subclasses, add them to a single Hydra queue and have the different requests processed in parallel.
The best way to structure this is an exercise left to the reader.
Testing the Hydra calls is a little more complex, but not too much so.
We use the following RSpec helper to stub the HTTP requests for these connections:

    def typhoeus_hydra_stub(name)
      response = Typhoeus::Response.new(:code => 200, :headers => "", :body => xml_fixture(name), :time => 0.3)
      hydra = Typhoeus::Hydra.hydra
      hydra.stub(:post, /http/).and_return(response)
      Typhoeus::Hydra.should_receive(:new).and_return(hydra)
    end
GOTCHA: Methods with Underscores
Be alert if you are working with methods with "unusual" names; some servers allow extended characters in the method names, and these are often escaped in creative ways.
The particularly problematic ones are underscores, as Nokogiri's builder has some magic helpers that use a trailing underscore to create elements with the same name as reserved words or existing methods. For example, xml.class would call Object#class rather than create a "class" element, so you write xml.class_ instead and the trailing underscore is silently trimmed off.
In one case, we ran into a method named CreateCustomerOrder_x0028_OrderXML_x0029_. Not only was the documentation a bit lacking on this method (we eventually checked the WSDL by hand to find the method name!) but Nokogiri's magic "underscore" names kept removing the trailing underscore when we were creating the element.
Instead, we had to do this:

    response = soap_it! do |xml|
      # GOTCHA!!!! Nokogiri has some magic with trailing underscores. As part of that, it trims the
      # underscore off. So, we have to put TWO underscores at the end - one for stripping
      # by Nokogiri, and one for keeps.
      xml.send('CreateCustomerOrder_x0028_OrderXML_x0029__', 'xmlns' => NAMESPACE) do
        # ... method parameters go here ...
      end
    end
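As an aside on that method name, the _x0028_ and _x0029_ fragments look like the _xHHHH_ style of escaping that some XML toolkits apply to characters which are not legal in element names (.NET's XmlConvert.EncodeName is one example). That is an assumption about this particular server rather than something its documentation confirmed. Under that assumption, a small hypothetical helper like decode_escaped_name makes such names easier to read when you are matching them against the vendor's documentation:

```ruby
# Hypothetical helper: turn _xHHHH_ escapes (four hex digits) back into
# the characters they stand for. Only useful for reading names; the
# escaped form is still what has to be sent to the server.
def decode_escaped_name(name)
  name.gsub(/_x([0-9A-Fa-f]{4})_/) { [$1.hex].pack('U') }
end

puts decode_escaped_name('CreateCustomerOrder_x0028_OrderXML_x0029_')
# => CreateCustomerOrder(OrderXML)
```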
Conclusion
Using a combination of fast, feature rich libraries like Typhoeus and Nokogiri it is possible to quickly and easily construct a Ruby SOAP Client that is fast, can support parallel connections, and is easy for developers to maintain.