Access password protected Tumblr programmatically

Last Saturday I wanted to write a Tumblr scraper without using the Tumblr API, just for fun. For this purpose I chose to develop a script in Node.js.

By now I have realized that Node.js perhaps wasn’t the best option and that Python would have served my purposes faster. Nonetheless, it proved to be a good excuse to play around with JavaScript’s Promises.

At first glance, this exercise might not seem interesting or worth doing, given that there are several Tumblr scrapers on GitHub. However, all the ones I’ve found use Tumblr’s API (which, of course, requires an account, an API key, etc.). None of them allow access to password-protected Tumblrs. The interesting bit I came across while looking into this is that Tumblr’s authentication is quite peculiar.

When we open a password-protected Tumblr, we are presented with the authentication form:

If we take a look at the source code, we find it’s a pretty straightforward web form:

If we submit the form with the wrong password, we can expect it to return the corresponding wrong-password page. The unexpected behavior happens when we submit the correct password.

Of course, when accessing it through the browser I didn’t notice anything (my Tumblr loaded correctly). However, when I accessed it programmatically, the form returned the same peculiar login page every time, which I will now describe.
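For reference, here is a minimal sketch of what that programmatic submit might look like. The field name `password` and the helper `buildLoginRequest` are assumptions for illustration; the real field name has to be read from the form’s source.

```javascript
// Hypothetical field name: read the real one from the form's <input name="...">.
const PASSWORD_FIELD = "password";

// Build the options for a form-encoded POST, mimicking a browser form submit.
function buildLoginRequest(password, extraFields = {}) {
  const body = new URLSearchParams({ ...extraFields, [PASSWORD_FIELD]: password });
  return {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
    // Keep redirects visible instead of following them silently.
    redirect: "manual",
  };
}
```

On a Node.js version that ships a global `fetch`, the result can be passed straight to it, e.g. `fetch(loginUrl, buildLoginRequest("s3cret"))`, where `loginUrl` stands for the address of the password form.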

Inspecting the network traffic doesn’t reveal anything odd:

I struggled with that for a while, checking cookies, headers and network traffic without realizing what was going on. Even more intriguing, the browser made an extra request to the login page with this data in the form:

Where did that 1505662833:77OCC…. come from? Perhaps some local JavaScript generated it, but… this is a second submit. The first one (as seen in the previous picture) is sent to the server as plain text.
Finally, I re-examined the login page and… surprise! The second time we get the login page (after we submit the correct password), it contains a few extra things. These are:

Once we make the second request to the form, we finally get the expected cookie for subsequent calls to the Tumblr:

Although this isn’t particularly complicated or hacky, it isn’t straightforward, nor does it seem to make much sense. I’m sure Tumblr’s engineers have their reasons for designing this double-submit form. To me it is still a mystery :)
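The double submit described above can be sketched roughly as follows. This is not my original script, just an illustration under assumptions: the field name `password`, the helpers `extractHiddenFields` and `loginTwice`, and the idea that the extra data arrives as hidden inputs and is sent back together with the password are all guesses. The `fetchImpl` parameter exists only so the flow can be tried without touching the network.

```javascript
// Hypothetical field name: read the real one from the form's source.
const PASSWORD_FIELD = "password";

// Collect name/value pairs from <input type="hidden"> tags.
// A regex is enough for a sketch; a real scraper should use an HTML parser.
function extractHiddenFields(html) {
  const fields = {};
  for (const [tag] of html.matchAll(/<input\b[^>]*>/gi)) {
    if (!/\btype\s*=\s*["']hidden["']/i.test(tag)) continue;
    const name = /\bname\s*=\s*["']([^"']*)["']/i.exec(tag);
    const value = /\bvalue\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (name) fields[name[1]] = value ? value[1] : "";
  }
  return fields;
}

// Submit the form twice and return the Set-Cookie header of the second response.
async function loginTwice(loginUrl, password, fetchImpl = globalThis.fetch) {
  const post = (fields) =>
    fetchImpl(loginUrl, {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: new URLSearchParams(fields).toString(),
      redirect: "manual",
    });

  // First submit: the password only. The response is the login page again,
  // this time carrying the extra fields.
  const first = await post({ [PASSWORD_FIELD]: password });
  const hidden = extractHiddenFields(await first.text());

  // Second submit: assumption: the extra fields go back along with the password.
  const second = await post({ ...hidden, [PASSWORD_FIELD]: password });
  return second.headers.get("set-cookie");
}
```

The cookie returned by `loginTwice` would then be sent in the `Cookie` header of every later request to the Tumblr.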

So, TL;DR:
