How NOT to Communicate a Technical Problem

So the secret Eventbrite RSS feeds I have been using to populate Watcamp recently broke. This was frustrating, but not as frustrating as my experiences trying to figure out what was wrong.

The trouble started when I called their search endpoint and received a "403 Forbidden" error. I was at the university at the time, so I hoped that the problem was because I was on a foreign network, and not because I was in trouble for abusing their API somehow. But no: when I tried again at home I got the same error.

I saw there was an Eventbrite Status page. On Oct 18 (the day I noticed the outage) it reported that there were "increased error rates" for the search endpoint. I was using that search endpoint, so I presumed that this was an API problem they would fix. Sure enough, the page reported the issue as fixed the next day, so I tried my script again, and again it gave me a 403 error.

Now I was scared. I navigated the (confusing) Eventbrite support page and sent an email. I wanted to know whether I had done something wrong. A few days later I received the following response:

Thank you for your patience. We recently made some changes to our APIs in an effort to improve platform functionality and performance. Some of these adjustments necessitated the unforeseen deprecation of one of our public Event Search APIs ("/events/search/" endpoint). Since this change was made, the team has been working hard on potential solutions for providing access to our event feed for API users that are not part of our official distribution partner program. We wanted to reach out to acknowledge the frustration you’re feeling and let you know that we will update you with more details over the coming week as we determine the viability of a potential replacement solution.

Okay, fair enough. This told me two important things: (a) they broke something and (b) this probably was not my fault. So I was patient. I did reply to ask whether there was a website I could follow to keep up to date with progress on this issue, but I got radio silence back.

On a whim, I tried the API call again today. It seemed to work, for four requests! Then I got a "429 Client Error" message. Looking up this status code in the API Documentation I saw that this meant I was exceeding the API rate limit, which was supposed to be 2000 requests per hour. I had made four! Was there another rate limit I was supposed to know about? I tried making the API call again, but this time I got a scarier "406 Not Acceptable" response, which made me think I was in trouble again.
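For readers whose scripts hit the same wall, here is a minimal sketch of how a client might react to the three status codes above instead of blindly retrying. Everything in it is an assumption for illustration: the names ADVICE, retry_delay and describe are hypothetical, the suggested reactions are not Eventbrite guidance, and nothing here confirms that Eventbrite sends the standard Retry-After header with its 429 responses.

```python
from typing import Mapping, Optional

# Status codes mentioned in the post, mapped to a suggested reaction.
# The reactions are this sketch's assumption, not Eventbrite guidance.
ADVICE = {
    403: "stop: access is forbidden; retrying will not help",
    406: "stop: the server refused the request as not acceptable",
    429: "wait: a rate limit was hit; back off before retrying",
}


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 60.0, cap: float = 3600.0) -> float:
    """Seconds to wait before retrying after a 429.

    Uses a numeric Retry-After header when one is present; otherwise
    falls back to capped exponential backoff (base * 2**attempt).
    The header lookup is case-sensitive when given a plain dict.
    """
    value = headers.get("Retry-After", "").strip()
    if value.isdigit():
        return min(float(value), cap)
    return min(base * (2 ** attempt), cap)


def describe(status: int, headers: Optional[Mapping[str, str]] = None,
             attempt: int = 0) -> str:
    """Turn a status code into a log line a human can act on."""
    headers = headers or {}
    advice = ADVICE.get(status, "no special handling in this sketch")
    if status == 429:
        return f"{status}: {advice} ({retry_delay(headers, attempt):.0f}s)"
    return f"{status}: {advice}"
```

Calling describe(429, {}, attempt=1) would suggest a 120 second wait under these default values. The point of the design is that a script stops on 403 and 406 and slows down on 429, so it does not keep hammering an endpoint that is already refusing it.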

The API documentation pointed to a Google Group for the API. I had seen this months earlier but did not remember the URL or where to find it. This forum was (and currently is) filled with threads similar to my complaint. The same form response has been posted to a few of the threads. Some people had also noticed that the API seemed to be back up, but with only four API calls allowed before rate limiting/blocking kicked in.

What's Going On?

I feel that Eventbrite broke their API sometime around October 18, and that they are trying to develop a solution. I am guessing they are testing out this solution, and in the process they (temporarily?) enabled API access, but only for four calls.

What Eventbrite Did Right

The Google Group exists, so I can see what is happening without having to re-contact support myself.

The form letter was somewhat helpful in explaining that they messed something up.

Acknowledging my frustration helped calm me down.

What Eventbrite Did Wrong

Eventbrite did a LOT of things wrong, but one thing I don't blame them much for is breaking their API. From the Google Group it seems that some people depend upon this API for their livelihoods, but I am not in that camp. The problem is irritating and it means Watcamp is more sparse, but it is not a huge deal to me. Furthermore, things break sometimes. In this DevOpsy world I suppose there should never be a service outage this severe, but my guess is that the search endpoint put a lot of strain on the servers in some unexpected way.

I feel bad for the developers who are sweating through this situation now. I have been in similar situations and it has not been fun.

Having said all that, I think that Eventbrite is handling things relatively poorly.

First and most critically, there is no canonical status page to follow. The developer/support team threw a couple of vague statements on the Google Group and has been silent since. So now upset API consumers are hammering the Google Group with unwelcome "Is it fixed yet????" messages, and I bet Eventbrite support is getting a lot of requests along the same lines.

There is the Eventbrite Status page I linked above, but it says that all systems are fine. This is untrue, and saying that everything is fine when it is not is actively harmful. Instead, Eventbrite should have a message saying that the search functionality of the API is down, and preferably a breakdown of what users should expect. There should probably be a FAQ. Even if the answers to a bunch of questions are hazy ("Q: Is the search functionality coming back?" "A: We are not sure yet.") it would at least acknowledge the situation.

Ideally, the Eventbrite API team would publish updates like "We are temporarily enabling the API, but only for four queries. Do not use this yet! We are just testing solutions!" That is a lot to ask, but if the effects of your testing are accessible by customers/consumers, then you should try to keep them updated when things appear to start working but break in different ways.

This page should not only exist, but people like me who ask questions related to the issue should be pointed to it as a place where they can keep up to date about what is going on.

The form letter promised to update me "over the coming week" as they looked into this issue, but nobody has done that.

There is an Eventbrite API Twitter account that has not been updated since 2017. If they are not going to keep it updated then the account should be deactivated.

It is not clear where people should look for status updates. It is not clear that the Google Group is the best place to go. Why isn't this group linked from the main Eventbrite support page?

One reason the Eventbrite API people might be avoiding status updates is lawyers and liability. If this is true then it makes me angry. It is more important to give people clarity than to worry that revealing too much information will subject you to a class action lawsuit.

TL;DR

Eventbrite should have a single, clear, transparent status page telling API consumers about what is happening.

They should not have support channels (their Twitter) that are not updated. They should not have support channels (their status page) that are not truthful.

If your business model depends upon other people's APIs you are in for a rough time.