Introduction
To put this in context: I'm currently writing my master's thesis on methods and strategies for implementing database-independent, distributed transactions involving multiple web services. My goal was to build small, useful components that can be used to achieve transactional safety for otherwise independent web services.
A lot of my test scenarios involved one or more services failing or delaying their request/response. One specific scenario required delaying requests in such a way that a lost update (as known from the database world) would occur on my non-transactional test system.
So one of my API endpoints, /events, aggregated the following requests to update the publishingDate of both an event and a job:
PUT /event/:id
PUT /job/:id
A lost update would occur if the PUT /job/:id part of a request A to /events was delayed (because of some latency issue) and a request B slipped in and finished before request A. The job update from request B is then lost, overwritten by the delayed request A.
As a result, the database is in a dirty state: the event now contains the data of request B, while the job contains the data of request A.
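To make the interleaving concrete, here is a small self-contained simulation of that scenario. The `db` object, the `updateEvents` function, and the payloads are hypothetical stand-ins for my actual services, not real code from them:

```javascript
// Hypothetical stand-in for the two services: a shared "database" object
// and an update function whose job write can be artificially delayed.
const db = { event: null, job: null };

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Mimics PUT /events/:id -> update the event, then update the job.
async function updateEvents(payload, jobDelayMs = 0) {
  db.event = payload;       // the event update succeeds immediately
  await sleep(jobDelayMs);  // simulate latency before the job update
  db.job = payload;         // the delayed job update lands last
}

async function main() {
  // Request A's job update is delayed; request B starts later but finishes first.
  await Promise.all([
    updateEvents('data-from-A', 100),                     // request A
    sleep(10).then(() => updateEvents('data-from-B', 0)), // request B
  ]);
  console.log(db); // { event: 'data-from-B', job: 'data-from-A' } -> inconsistent
}

main();
```

Request B's job write is silently overwritten: the event reflects B while the job reflects A, exactly the dirty state described above.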
This (among other database consistency problems) is illustrated very well in this talk and speaker deck by Martin Kleppmann.
API-Monkey solution
So I actually needed a way of stating "this part of my request chain should be delayed for a bit", so that the other part is executed first and a lost update occurs. This is when api-monkey was born.
I decided to implement a small middleware on top of my Express APIs, which checks for a route-specific request header and then reacts accordingly. This is well illustrated in the documentation at https://www.npmjs.com/package/api-monkey
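To illustrate the idea (this is a minimal sketch, not the actual api-monkey source; the `monkeySketch` name and the way the route segment is derived from the path are my assumptions), such a middleware could look for a header like `monkey_PUT_job` and delay the request by the number of milliseconds in the first part of its value:

```javascript
// Minimal sketch of a delay middleware in the spirit of api-monkey.
// The header naming scheme 'monkey_<METHOD>_<route>' and the '750/none'
// value format follow the example used later in this post.
function monkeySketch() {
  return function (req, res, next) {
    // e.g. PUT /job/123 -> check for the 'monkey_put_job' header
    // (Node lowercases incoming header names)
    const route = req.path.split('/')[1] || '';
    const headerName = `monkey_${req.method}_${route}`.toLowerCase();
    const value = req.headers[headerName];
    if (!value) return next();

    const delayPart = value.split('/')[0];
    const delayMs = delayPart === 'none' ? 0 : parseInt(delayPart, 10);
    setTimeout(next, delayMs); // hold the request, then continue
  };
}

// Quick demonstration with a fake request object (no Express needed):
const mw = monkeySketch();
const start = Date.now();
mw(
  { path: '/job/42', method: 'PUT', headers: { monkey_put_job: '100/none' } },
  {},
  () => console.log(`next() called after ${Date.now() - start}ms`)
);
```

Requests without the header pass through untouched, so the middleware is a no-op in normal operation.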
For the example I illustrated before, I'd set my test cases up like this:
/**
* 2 requests from user interactions produce a "lost update"
* request a -> Update Event
* request b -> Update Event
* request b -> Update Job
* request a -> Update Job
* the updated data from request b gets lost, despite originally starting later
* Event and Job data are corrupted (not matching anymore)
*/
describe('When issuing 2 conflicting Event PUT requests...', () => {
  let eventId = null;

  before('Prepare: Create an Event', (done) => {
    // create the event which we want to update
  });

  it('the requests should succeed', done => {
    function requestA() {
      return new Promise((resolve, reject) => {
        supertest(app)
          .put(`/events/${eventId}`)
          .send({ publishingDate: publishingDate2 })
          // Here we delay the update on the job for 750ms
          .set('monkey_PUT_job', '750/none')
          .expect(200)
          .end((err, res) => {
            if (err) {
              return reject(err);
            }
            resolve(res);
          });
      });
    }

    function requestB() {
      // Request B without the monkey header
    }

    // actually perform those 2 requests with Promise.all()
  });

  it('The Event in MongoDB should contain data from Request B', done => {});
  it('The Job in Redis should contain data from Request A', done => {});
});
In my application, I basically just have to forward those monkey headers in the requests to the /event and /job APIs.
app.use(apiMonkey());

app.put('/events/:id', (req, res) => {
  const eventId = req.params.id;
  ...
  // Access the monkey headers on the request object
  const headers = req.monkeyHeaders;
  // Use the monkey headers in the follow-up requests
  request.put({ uri: `${eventApiUrl}/${eventId}`, json: req.body, headers })
  ...
});
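For illustration, collecting the monkey headers from an incoming request could look like the following. This is a hypothetical helper, not the actual api-monkey code (which exposes them as `req.monkeyHeaders`); it simply keeps every header whose name starts with `monkey_`:

```javascript
// Hypothetical helper: collect all 'monkey_*' headers from an incoming
// request so they can be forwarded to downstream services.
function pickMonkeyHeaders(incomingHeaders) {
  const forwarded = {};
  for (const [name, value] of Object.entries(incomingHeaders)) {
    if (name.toLowerCase().startsWith('monkey_')) {
      forwarded[name] = value;
    }
  }
  return forwarded;
}

// Example: only the monkey headers survive the filtering.
const headers = pickMonkeyHeaders({
  'content-type': 'application/json',
  monkey_put_job: '750/none',
});
console.log(headers); // { monkey_put_job: '750/none' }
```

Forwarding the headers unchanged means each downstream service can decide for itself whether a monkey header targets one of its own routes.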
Features
API-Monkey currently supports:
- Delaying requests to specific API endpoints by a given number of milliseconds
- Generating custom error codes
- Wildcards for API endpoints
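The `750/none` value shown earlier suggests a `<delayMs>/<errorCode>` format, where either part can be `none`. Assuming the error part works symmetrically (an assumption of mine; the exact format, including the wildcard syntax, is documented in the package itself), a parser for that value could be sketched like this:

```javascript
// Hypothetical parser for the header value format shown above ('750/none'):
// '<delayMs>/<errorCode>', where either part can be 'none'.
function parseMonkeyValue(value) {
  const [delayPart, errorPart] = value.split('/');
  return {
    delayMs: delayPart === 'none' ? 0 : Number(delayPart),
    errorCode: errorPart === 'none' ? null : Number(errorPart),
  };
}

console.log(parseMonkeyValue('750/none')); // { delayMs: 750, errorCode: null }
console.log(parseMonkeyValue('none/500')); // { delayMs: 0, errorCode: 500 }
```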
Feel free to report bugs or feature requests on GitHub, or contact me about anything else here on Steemit!
References
Martin Kleppmann on transactions at strange loop
API-Monkey on NPM
API-Monkey on Github
Well done! I want to start learning Node.js too. Give us newbies some links on how to get started.
Thanks! I plan on publishing more about Node.js regularly, since I'm already quite into it and have lots of entry-level material from earlier on.