
Re: [tor-talk] How to write program that uses Tor network



Asio is only a socket library, which means you would need to build all the
HTTP logic on top of it. That is not much fun, but everything you need to
know is documented in the RFCs if you really want to go down that route.

The easiest way would be to use an HTTP library built specifically for
fetching web pages. Curl is a good one. Integrating Tor support is then
simply a matter of setting a SOCKS proxy, the same way you configure a web
browser to use Tor.

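Make sure that your library has an option to proxy DNS lookups as well. If
fetching bing.com works but an onion site doesn't, then hostnames are
probably being resolved locally instead of through Tor, which is a DNS
leak. Curl provides an option to fix this (--socks5-hostname on the command
line, or a socks5h:// proxy URL), but it is not enabled by default.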
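
As a rough illustration of the two paragraphs above, here is a minimal
Python sketch that shells out to the curl command-line tool with the
hostname-resolving SOCKS option. It assumes curl is installed and that Tor
is listening on its default SOCKS port, 127.0.0.1:9050 (Tor Browser
typically uses 9150 instead). The function names build_curl_command and
fetch_via_tor are made up for this example.

```python
import subprocess

# Assumption: a local Tor client listening on its default SocksPort.
TOR_SOCKS_ADDRESS = "127.0.0.1:9050"


def build_curl_command(url, socks_address=TOR_SOCKS_ADDRESS):
    """Return the curl argument list for fetching `url` through Tor.

    --socks5-hostname sends the hostname to the proxy, so the name is
    resolved by Tor rather than by the local resolver. Plain --socks5
    would resolve the name locally first.
    """
    return [
        "curl",
        "--silent",
        "--show-error",
        "--socks5-hostname",
        socks_address,
        url,
    ]


def fetch_via_tor(url, socks_address=TOR_SOCKS_ADDRESS, timeout=60):
    """Run curl and return the response body as bytes (hypothetical helper)."""
    result = subprocess.run(
        build_curl_command(url, socks_address),
        capture_output=True,
        check=True,  # raise CalledProcessError if curl exits non-zero
        timeout=timeout,
    )
    return result.stdout
```

Calling fetch_via_tor("https://example.com/") should return the page body
if Tor is running. Onion addresses only resolve with the hostname variant
of the option, which makes them a convenient check that DNS is going
through the proxy.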

This is not really related to Tor, but are you sure C++ is the right
language for this? You will quickly discover that web developers have a
very easy life. Not a single one of them is capable of writing valid HTML,
but browsers need to process it anyway (which is why there are so many
bugs in browsers).

You can get fairly far using regular expressions. You can get somewhat
further with libtidy and an XML parser. If you are serious, though, I would
recommend an alternative language such as Ruby + Nokogiri or Python +
Beautiful Soup, at least to do the HTML parsing.
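
To give an idea of what tolerant parsing looks like, here is a small
sketch that pulls links out of deliberately broken markup. It uses Python's
standard-library html.parser as a stand-in so that it runs without
installing anything. It is not Beautiful Soup, which offers a much richer
API. The names LinkCollector and extract_links and the sample markup are
invented for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, ignoring how the tags are nested."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every <a href> value found in `html`, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Invalid on purpose: unclosed <p>, <a> and <div>, and no closing </html>.
BROKEN_HTML = (
    "<html><body><p>Unclosed paragraph"
    "<a href='/one'>one<div><a href=\"/two\">two</body>"
)

if __name__ == "__main__":
    print(extract_links(BROKEN_HTML))
```

The parser reports start tags as it meets them and does not insist on
matching end tags, which is the behaviour you want for real-world pages.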

Of course, you can always embed a parser written in another language into
an existing C++ code base (Python is easy; Ruby is harder, but I have done
it). If you are still at the greenfield stage of the project, you should
think about this early.

I hope this helps.
-- 
tor-talk mailing list - tor-talk@lists.torproject.org
To unsubscribe or change other settings go to
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk