Internet, Unix and security

Faulty RSS-feeds

Taking a look at some logs from an RSS collector, two things raised my eyebrows. The first is how many feeds are served by FeedBurner instead of directly by the website itself. What worries me is that a lot of those feeds are about security, privacy and compliance. I think a lot of those people have something to think about in 2012.

The other thing, which worries me even more, is something I discussed with WordPress developers a couple of years ago, and I know others who have raised the same point with other projects. A lot of projects have learned to do input validation, but most of them still need to learn to do output validation. Luckily, the parser I currently use appears to be very strict and drops a feed when it doesn’t parse correctly. Here comes the funny part: other parsers, like the one in Google Reader, seem to be more forgiving.
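To make the "strict parser" behaviour concrete, here is a minimal sketch in Python. It is not the parser my reader uses; it assumes the standard library's xml.etree.ElementTree as a stand-in, and the function name and sample feeds are made up for illustration. Note that it only checks well-formedness, which is not the same as the feed being safe.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_feed_strictly(raw: bytes) -> Optional[ET.Element]:
    """Return the root element of a feed, or None if it is not well-formed XML."""
    try:
        return ET.fromstring(raw)
    except ET.ParseError:
        # A strict reader drops the whole feed instead of guessing what was meant.
        return None


# Hypothetical sample feeds, for illustration only.
good_feed = b"<rss version='2.0'><channel><title>Example</title></channel></rss>"
# The unescaped ampersand makes this feed malformed.
bad_feed = b"<rss version='2.0'><channel><title>Tom & Jerry</title></channel></rss>"

for name, raw in (("good", good_feed), ("bad", bad_feed)):
    root = parse_feed_strictly(raw)
    print(name, "accepted" if root is not None else "dropped")
```

A forgiving parser would try to repair the second feed and carry on, which is exactly where sloppy output from the publishing side goes unnoticed.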

When I search for “libxml exploit” on Google Search I get 1,220,000 results back. I didn’t check which parsers are currently in use, but this doesn’t look very promising. With the current hash issues in mind, how could this be used as an attack vector? Keep in mind that a lot of sites use FeedBurner to take the load off their site. And yes, FeedBurner doesn’t really clean things up, if I may believe my current logs. So the recipe for an exploit looks simple: take a high-profile WordPress-based website with FeedBurner enabled and watch the fireworks.

So maybe it is a good idea for 2012 to check whether the parser I’m currently using is up to standard. This can become nasty very quickly if things go wrong. Also a note to others: output validation matters just as much as input validation. The JavaScript alert is still a funny one to deploy on websites.
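As an illustration of what validating on the way out can look like, here is a small sketch in Python. It assumes the value ends up in an HTML page and uses html.escape from the standard library; the function name and the payload are hypothetical, and a real project would use the escaping routine of its own templating layer.

```python
from html import escape


def render_title(untrusted_title: str) -> str:
    """Escape a value on the way out, just before it is placed in HTML."""
    return "<h2>" + escape(untrusted_title, quote=True) + "</h2>"


# Hypothetical payload: the classic JavaScript alert hidden in a feed title.
payload = "<script>alert('xss')</script>"
safe = render_title(payload)
print(safe)
```

The point is that the escaping happens at output time, for the context the data is written into, regardless of what the input checks already did.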

Life and society

Feeds farewell and thanks for all the fish

As my viewing port on the Internet became more and more an RSS reader during 2011, I also started to pay attention to the content presented. So during my Christmas break I’m going to remove some feeds from my RSS reader. As a side note, the compressed database dump now grows by 1 megabyte every 5 to 8 days.

But the first feeds that have to go are websites or blogs that only present a snippet and hope you come to their website to continue reading the article. The reasons I have read for why people do that are banners, or the hope that you will also read other content. For the first, there are solutions to embed banners in your RSS feed. The latter is just b*llsh*t, as that person is subscribed to your RSS feed, and how much more commitment to reading your content do you want?

What may be a problem is the experience people have reading your RSS feed, as a lot of sites, and yes, I’m also looking at you, WordPress, do not include the right CSS in the feed. This is something that needs to be solved and can be solved. The other remarks are about notification and traffic, and the question is whether those are real issues given the use of ping servers and distribution hubs. FeedBurner, for example, is one that can take the load off your website or blog. That load was also there when they were forcing people towards your website.

I may sound harsh, but I have to spend my time more wisely. With 125+ feeds in my reader, and with a few of those being OPML feeds, it is really time to clean things up. It also makes me wonder how easy it would be to integrate certain features from Google Reader into TT-RSS to get figures on how much you read, what you’re reading and what not. First the Christmas cleaning, as it takes the backend about 30 days to stop fetching a feed after the last user has unsubscribed.

Internet, Unix and security, Privacy & safety

Fooling FeedBurner sites

At Antoon's request, an early post on how to fool RSS feeds into delivering the content anyway, without being measured by FeedBurner. The easiest solution is to use a proxy, and this example assumes Privoxy to rewrite the right headers.

If you look at various plugins, you see that they redirect every request to download the feed. The redirect goes to the right URL at FeedBurner, but the plugins first check the User-Agent string. If this check matches the string “feedburner”, no redirect is done and the actual content is served.
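The decision these plugins make can be sketched roughly as follows. This is not code from any particular plugin: the function name, the feed address, the 302 status code and the case-insensitive match are all assumptions made for illustration.

```python
# Hypothetical feed address; a real plugin would read this from its settings.
FEEDBURNER_URL = "http://feeds.feedburner.com/example"


def handle_feed_request(user_agent: str) -> tuple:
    """Return (status, detail): either serve the feed or redirect to FeedBurner."""
    if "feedburner" in user_agent.lower():
        # FeedBurner's own fetcher must get the real feed, so no redirect.
        return (200, "local feed content")
    # Every other client is sent to the FeedBurner copy of the feed.
    return (302, FEEDBURNER_URL)


print(handle_feed_request("FeedBurner/1.0"))
print(handle_feed_request("Tiny Tiny RSS"))
```

Because the check relies only on a header the client controls, any client that puts “feedburner” in its User-Agent string is treated as FeedBurner.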

This leaves open the possibility of pretending to be FeedBurner. Fortunately, Privoxy can rewrite outgoing HTTP headers as well as incoming content. By adding the following lines to Privoxy's user.action file, in many cases a modified HTTP header will be sent to the website instead of the HTTP header of the RSS reader.

{+crunch-incoming-cookies \
+hide-user-agent{FeedBurner/1.0 (} }

What some people will notice is the option to crunch cookies, but unfortunately many feeds also turn out to send cookies, among other things, to see how many people read their feed. Also a warning right away: I have not yet had enough time to test all of this properly and for long enough, but for now it appears to work correctly, judging by network traces.

Internet, Unix and security, Privacy & safety

Shutting out FeedBurner

After Google Analytics, FeedBurner is now banned as well. Monitoring is fun, but there are limits, especially when all data is exported to the United States of America. A small addition to the hosts file quickly solves this.
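For illustration, a small Python sketch that prints hosts-file lines of the kind meant here. The hostnames and the 0.0.0.0 sink address are assumptions, not the entries from my own hosts file; check your own network traces for the hostnames that matter.

```python
def hosts_entries(hostnames, sink="0.0.0.0"):
    """Format hosts-file lines that map each hostname to the sink address."""
    return [f"{sink} {name}" for name in hostnames]


# Assumed hostnames, for illustration only.
blocked = ["feeds.feedburner.com", "feedproxy.google.com"]
print("\n".join(hosts_entries(blocked)))
```

The printed lines can be appended to the hosts file by hand; the sketch deliberately does not touch the file itself.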

Unfortunately, this made a fair number of RSS feeds stop working, but we quickly said goodbye to those. Soon we will see whether we can write the right filter rules for Privoxy.