Re: [tor-talk] Funded search engine for onionspace?
Leeroy, to avoid being indexed by Googlebot et al., place the appropriate
/robots.txt at your site's root. It's described in the FAQ.
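For illustration only, here is a minimal sketch of such a file and a quick
way to check what it permits, using Python's standard urllib.robotparser.
The rules and the hostname someonion.onion.city are assumptions made up for
the example, not taken from the FAQ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Googlebot everywhere, allow other crawlers.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "http://someonion.onion.city/index.html"  # placeholder host
    print("Googlebot:", allowed("Googlebot", url))
    print("OtherBot:", allowed("OtherBot", url))
```

Note that robots.txt is only advisory: it keeps out crawlers that choose to
honor it.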
As a historical note, the reason Aaron and I chose Tor2web's URL design was
so search engines would automatically see any /robots.txt an onionsite
On Fri, Feb 13, 2015 at 3:30 PM, l.m <firstname.lastname@example.org> wrote:
> >Alas no. I'm aware this is suboptimal. I see GOOG search engine as
> >temporary-ladder just to get the ball rolling. I am open to using
> >other index. For what it's worth I'm very pleased with GOOG's
> >performance---right now it's searching an index of 650k onion pages and the
> >number grows every day.
> If you instead use a google search appliance couldn't you use google
> engine for indexing without having to use google itself? Wouldn't that
> also avoid the problem of google queries being associated with the
> client making the request?
> >Although we technically could read provided passwords, we don't keep
> >of passed traffic. However, I understand that many users don't
> >the tor2web threat model. But this is the same as all Tor2web nodes,
> >This is not at all unique to OnionCity. As far as I know all Tor2web
> >allow form submissions.
> What is unique to onion.city is that access to someonion.onion.city
> occurs using http and doesn't redirect to the .onion if Tor is in use.
> That the tor2web mirror might snoop is implicit--that the exit (if
> using tor) might also snoop is more of a concern.
> >You mentioned it'd be better to have it randomly pick among the
> >Tor2web nodes instead of everything going through OnionCity. This
> >the GOOG search engine which only wants to return "canonical" URLs.
> >could talk about making OnionCity a DNS round-robin akin to how
> >currently works, but then I'm just replicating Tor2web.
> The ability of tor2web to provide mirrors should be optional. If you
> only know one mirror and that mirror cannot service the request then
> how are you going to get any of the other mirrors? Google engine can
> return related addresses in an order based on the success of loading
> the mirror itself. If onion.city always works it will tend to precede
> tor2web.org. If onion.city goes down (having search front-end separate
> from tor2web mirror) the search engine can reorder the result to
> improve the success of the first click.
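The reordering described in the paragraph above could be sketched roughly as
follows. The class and function names, the success/attempt counters, and the
neutral 0.5 score for untested mirrors are all hypothetical choices for
illustration; this is not a description of how Google's engine or any
Tor2web software actually ranks results.

```python
from dataclasses import dataclass


@dataclass
class MirrorStats:
    """Load statistics for one mirror (hypothetical bookkeeping)."""
    host: str
    successes: int = 0
    attempts: int = 0

    @property
    def success_rate(self) -> float:
        # Untested mirrors get a neutral 0.5 so they are neither
        # favored nor buried.
        return self.successes / self.attempts if self.attempts else 0.5


def order_mirrors(stats):
    """Return hosts sorted by observed success rate, best first.

    The sort is stable, so mirrors with equal rates keep their input order.
    """
    ranked = sorted(stats, key=lambda m: m.success_rate, reverse=True)
    return [m.host for m in ranked]
```

With made-up numbers, a mirror that always loads sorts ahead of one that
sometimes fails, and drops behind it once its own loads start failing.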
> >Right now I aggregate existing lists of onion sites and put them into the
> >site map.
> >* https://ahmia.fi/onions/
> >* http://skunksworkedp2cg.onion.city/sites.txt
> >* http://xlmvhk3rpdux26dz.onion.city/
> >* http://kkkkkku5juzqh33a.onion.city/
> If google is itself handling the indexing won't that cause a problem
> for sites in those lists, which are normally okay with being indexed,
> just not by googlebot? I for one couldn't care less about being
> indexed by ahmia.fi but it'll be a cold day in hell before I let
> googlebot. Precisely because of how easy it is to link the search to
> the requester.
> tor-talk mailing list - email@example.com
> To unsubscribe or change other settings go to