
Re: [tor-talk] Cryptographic social networking project

Dear evervigilant, no, we are not considering running Diaspora
behind Tor, since that is not a good idea in terms of either
anonymity or scalability. Diaspora already has scalability
issues, and they would certainly not improve if each transaction
traveled across six Tor relays. And the fact that each Diaspora
node has a pretty large view of the entire social graph
makes it an easy target for de-anonymization of that graph,
as shown by the 2009 "De-anonymizing Social Networks" paper
I mentioned earlier. See also "Scalability & Paranoia in a
Decentralized Social Network" concerning the safety of having 
social data on federated servers and VPS in the clear.

Back to Ms or Mr Sharebook,

What I wrote earlier:
> >If in a distant future the encryption fails us, attackers would
> >be able to decrypt what they see right there plus how much they have
> >been keeping as a "full take" or "Tor snapshot." That I hope is different
> >from being able to access the entire history of all social network
> >interactions, because they're all in that cloud.

On Thu, Jan 15, 2015 at 10:04:02PM +0000, contact@sharebook.com wrote:
> Criticizing cloud storage in the case of a cryptanalysis breakthrough is
> unrealistic. attackers wiretap communications and pickup cipher-texts in
> transit not just from servers. They can easily detect cipher-texts (e.g
> PGP encrypted emails) from plain-texts to store them forever so assuming
> that we didn't stored cipher-text on a server won't help anyone if
> attackers break ciphers themselves. 

There is no plain text involved in the communication architecture we
are talking about. Not in ours and I hope also not in yours. So making
a reference to the dangers of using PGP is moot here. Also, being the
guy who collected 15 reasons not to start using PGP, I am kind of familiar
with what you say ( http://secushare.org/PGP ).

And no, it is not unrealistic to criticize cloud architectures this way.
I have heard from several sources that the NSA does not archive P2P/file
sharing interactions. Copyright infringement is even less of a concern
to them than pedophilia. Even GCHQ's "full take" is thrown away after a
certain time. What they do is archive each and every e-mail, PGP or not,
indefinitely - because it is such an obvious thing to do. They may be
archiving Tor traffic for as long as it is affordable, although given the
forward-secret nature of Tor that isn't a very worthwhile exercise
even if decryption were feasible. But if we additionally teach Tor to
scale via multicast, then Tor would start hosting a lot of social
networking and maybe even forms of multi-recipient streaming. That
drives the cost of archival pretty high, possibly beyond what is
reasonably affordable.

I think this is an advantage worth considering, whereas a cloud system
already provides the optimal archive for later decryption.

> >Also, who pays for Utah-like storage requirements? What is your business model
> >for financing the sharebook cloud servers?
> it's not Utah-like. We can get donations if people really use the
> application in mass scales (think of wikipedia)

Well, I would like to challenge the likes of Facebook, not make yet
another social network for aware minorities.

> >You are trading in scalability for what you think is the necessary
> >cryptography but researches seem to be of a different opinion as the
> >following papers show.
> >2009, "De-anonymizing Social Networks" by Arvind Narayanan and Vitaly
> >Shmatikov is about correlating Twitter and Flickr users.
> >Is this really what you mean? Sounds pretty off-topic to me.
> why do you think our one-to-many pseudonyms graph would be different
> from Flickr? pubsub attach some metadata to pseudonymous vertices that
> can be used for analyzing them 

I like that you are using "our" while speaking of the secushare model,
I assume you are considering joining forces.  ;)  Since it is necessary
to take over all involved relay nodes for each distribution tree, I
think that it is a very expensive operation trying to obtain even parts
of the social graph. If you only have some parts of the tree, you should
not be able to correlate which parts belong to the same tree. Only after
you have somehow managed to obtain pieces of the social graph can you
attempt a de-anonymization according to the paper you cited - in our
scheme that means you either take over a lot of relay nodes, or - much
easier - you obtain information from the users' laptops and smartphones
as they access the social network. That unfortunately is a weak point
that all social applications on the Internet will always have - and a
much cheaper point of weakness than the threat of being able to figure
out the structure of a multicast distribution happening across the
network. Still, the paper cited above states explicitly that
de-anonymization works best if it has access to large
social graphs. That was easy to do with Flickr vs Twitter, but it
isn't at all if all you have is unnamed nodes in unstructured pieces
of trees. That is how I conclude that the research done in that paper
is not applicable to either of our designs - it would be applicable
only in the scenario of "let's just use Diaspora via Tor."

So, from an anonymity protection point of view it doesn't matter
if we unicast notifications or send data over ratchet multicasts -
in both cases it is much cheaper to access social data from the
end user devices. Not so in the case of Diaspora via Tor - that
is sufficiently insecure that it is probably cheaper to p0wn nodes.
Therefore it seems pretty logical to me that missing out on the
advantages of multicast is only damaging to the end goal of providing
an alternative to the current social networking services.

> anonymity is a different topic. i'm talking about compromising social
> graphs. for instance in netflix attack vertices are already anonymous
> but attackers try match some data from IMDB that gave real identities
> for patterns to deanonymize similar patterns on anonymized netflix
> dataset and they really did!

I'm not familiar with this incident. Do you have some info?

> i guess you searched "social network anonymity" in scholar and just sent
> me the results. but those papers are not protections against link
> prediction algorithms that attackers use for deanonymizing social
> graphs. they talk about cucumbers not apples.

You said I should search for that and I did. If you are alluding
to any other research, then please bring it on.

> >The disadvantages of requiring a storage cloud are more heavy-weight.
> If "disadvantage" means a deanonymization attack that breaks our threat
> model (attacker can't break Tor, majority of exit nodes aren't
> concluding with attacker) then explain it, but if "disadvantage" is

s/concluding/colluding/ ?

> depending on a feudal vendor rather than having fun with a liberal
> distributed network, then we try overthrow any feudal part as soon as
> possible but it's very hard to do that when we can't find a distributed
> alternative. 

I elaborated on the disadvantages, compared to a multicast architecture,
in previous mails. I would accept the cloud thing as a trade-off if our
architecture weren't likely to work, but so far the research
papers suggest that scalable anonymous multicast is something humanity
should try out ASAP, to have a privacy-friendly alternative to cloud
technology in general.

> >I challenge that, at least in the current Tor network. If the attacker
> >applies traffic shaping to the outgoing notification. Only if the
> >notification has a fixed size the third hop can avoid replicating the
> >shaped traffic and thus allow an observer to see which rendez-vous
> >points are being addressed - possibly de-anonymizing many involved
> >hidden services behind them. Probably there is even a chance of
> >de-anonymization if notifications had a fixed size, since the third hop
> >will suddenly be busy sending out all similarly shaped packets to 167 RPs.
> First; as I said in our threat model we assume majority of ORs aren't
> concluding with attacker in same time and we assume anonymity works
> (attacker can't deanonymize Tor). 

Well, what's the point in assuming something that isn't true? Why
don't you state that in order to be able to provide social graph 
privacy you need Tor to implement "alpha mixing" or equivalent, and
assume that it will function?

We are less dependent on something like that since the multicast
system is packet-oriented, not circuit-oriented, so we only send
out complete packets, thus unshaping any traffic shaping that may
have happened on the incoming Tor circuit. There's more to be said
about multicast anonymity, but it's something worth debating on the 
secushare mailing lists rather than here.
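
To illustrate what "only send out complete packets" can mean, here is a
minimal sketch in Python. It is not secushare or GNUnet code: the class
name Repacketizer, the packet size and the zero padding are all
assumptions made for this example. It only shows that whatever chunk
sizes arrive on the incoming side, the outgoing side sees nothing but
uniform packets.

```python
class Repacketizer:
    """Hypothetical relay stage: buffers input, emits fixed-size packets only."""

    def __init__(self, packet_size: int = 1024):
        if packet_size <= 0:
            raise ValueError("packet_size must be positive")
        self.packet_size = packet_size
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Accept an arbitrarily sized chunk, return any complete packets."""
        self._buffer += chunk
        packets = []
        while len(self._buffer) >= self.packet_size:
            packets.append(bytes(self._buffer[:self.packet_size]))
            del self._buffer[:self.packet_size]
        return packets

    def flush(self) -> list[bytes]:
        """Pad the remainder with zero bytes so it also leaves as a full packet."""
        if not self._buffer:
            return []
        padded = bytes(self._buffer).ljust(self.packet_size, b"\x00")
        self._buffer.clear()
        return [padded]


# Irregularly sized input chunks, as an attacker shaping traffic might cause.
relay = Repacketizer(packet_size=8)
out = []
for chunk in (b"abc", b"defghijk", b"l"):
    out += relay.feed(chunk)
out += relay.flush()
print([len(p) for p in out])
```

This only removes size patterns. Timing patterns would additionally
need batching or delays, and a real design would need unambiguous
framing instead of plain zero padding.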

I believe that, at the current state of Tor, a combination of Tor
and GNUnet ultimately achieves more anonymity by combining two
different models of anonymity. We get both onion routing *and*
traffic shaping resistance. It's like using PGP within Pond, or
HTTPS to an .onion.

> Second; an observer can see Alice's third hope sends packet to what RPs.


> But attacker can't determine these 167 packet that third hop OR sends to
> 167 RP, is from Alice to her 167 friends or 167 different person at that
> OR send one packet to one of their friends at each RP which in this
> scenario it becomes connection between pseudonyms in a linear paradigm.


> But what pubsub as far as I understand (maybe I got it wrong) do a
> subscription between root sender and leaf receivers which reveals the
> connection to an observer in between when root multicast a packet to
> leaves subscribers. 

Only if the multicast signaling were unencrypted/not anonymous, which 
should not be the case. Maybe you are thinking of "IP Multicast," the 
1992 standard?

> >I challenge that as well. Given a high latency packet-oriented multicast
> >system being fed from the third hop, distributing the content to a network
> >of reception points, the maximum de-anonymization that can be achieved
> >is by p0wning some nodes, seeing some fragments of somebody's trees,
> >still not being able to tell where the stuff came from and where it
> >will end up.
> First question is how much bandwidth in numbers a high latency
> packet-oriented multicast saves compared to asking Alice instantly
> unicast the packets? 

You mean if Alice sends the complete data events rather than
just notifications - since only that would be a fair comparison?
I haven't done the maths, but it should be similar to comparing
n unicast file transfers versus a single BitTorrent swarm.
I have enough experience not to need maths for that, but I welcome
anyone to pick out suitable research papers.
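
For a rough feeling of the difference, here is a back-of-the-envelope
sketch in Python. It is not a measurement: the fan-out of 4 is an
arbitrary assumption, the function names are made up for this example,
and 167 is just the number of recipients used earlier in this thread.
It counts how many copies the sender has to upload and how deep a
distribution tree would be.

```python
def sender_copies_unicast(recipients: int) -> int:
    """With unicast the sender uploads one copy per recipient."""
    return recipients


def sender_copies_tree(recipients: int, fanout: int) -> int:
    """In a distribution tree the sender only feeds its direct children."""
    return min(recipients, fanout)


def tree_depth(recipients: int, fanout: int) -> int:
    """Levels needed below the sender so a full tree reaches all recipients."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    depth = 0
    reached = 0
    level_width = 1
    while reached < recipients:
        level_width *= fanout   # nodes on the next level
        reached += level_width  # recipients covered so far
        depth += 1
    return depth


recipients, fanout = 167, 4  # fanout is an assumed value
print(sender_copies_unicast(recipients))
print(sender_copies_tree(recipients, fanout))
print(tree_depth(recipients, fanout))
```

Every recipient still receives one copy, so the total number of
transmissions stays roughly the same. What changes is that no single
node uploads more than the fan-out, at the price of a few extra hops
of latency.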

> Second question is if the multicaster sends same packets to recipients
> then how is that possible to tell an observer that don't draw edges
> between anonymous vertices? Even 1 hour delay doesn't change packets
> semantically, while in unicasting random packets the observer can't draw
> edges between root vertex and recipient vertices. 

Did you understand what I said when I mentioned multicast ratchets?
You maintain the advantages of distributing just one packet across
the network while at the same time having a differently encrypted
packet on each node of the tree. If I wasn't clear, please ask rather
than skipping it.
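
A toy illustration of the idea in Python, to make the point concrete.
This is NOT the actual secushare ratchet design, which is not specified
in this thread. The key derivation via HMAC, the labels and the XOR
keystream are assumptions chosen only so the example runs with the
standard library. It shows one payload travelling down a tree while
the bytes on each branch look different.

```python
import hashlib
import hmac


def derive_branch_key(parent_key: bytes, branch_label: bytes) -> bytes:
    """Hypothetical one-way derivation of a per-branch key from the parent's key."""
    return hmac.new(parent_key, b"branch|" + branch_label, hashlib.sha256).digest()


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove a per-branch layer (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Stand-in for an event that is already end-to-end encrypted for the group.
payload = b"one packet for all subscribers"

root_key = hashlib.sha256(b"example tree secret").digest()  # assumed shared setup
key_a = derive_branch_key(root_key, b"branch-a")
key_b = derive_branch_key(root_key, b"branch-b")

# The root sends the same payload down both branches, wrapped differently.
wire_a = xor_layer(key_a, payload)
wire_b = xor_layer(key_b, payload)

# Node A unwraps, then re-wraps for its own child with a further derived key.
key_a1 = derive_branch_key(key_a, b"child-1")
wire_a1 = xor_layer(key_a1, xor_layer(key_a, wire_a))

print(wire_a != wire_b, wire_a != wire_a1)
```

An observer comparing the bytes on two branches cannot match them by
content. A real scheme would need nonces, authentication and keys that
move forward with each packet. Reusing this toy keystream for a second
packet would be insecure.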

> I agree that in real world scenarios multicasting is not that bad but
> it's better choose stronger theories when we have the opportunity,
> despite the fact that nobody attacks those weak parts which we scared
> from. I propose we use the pubsub multicast strategy as a backing plan
> when we get ride of available bandwidth in Tor network, to switch
> unicasting Notifications into multicasting one Notification that is
> encrypted by an epoch forward secure key for all friends. 

Even better if that key changes on each branch of the tree.
Skipping multicast then does not improve anonymity much;
it only introduces other trade-offs.

> >Of course accessing blocks from a third party server is a trade-off
> >in excessive bandwidth, please.
> How much excessive MB/GB/TB it would be in your estimation when we
> download block from server?? 

As I said, it is like having n unicast downloads vs one BitTorrent swarm.
I don't need numbers to know that the multicast architecture will
be more efficient in most use cases, but I am sure research has
plenty of numbers to offer. Please investigate.

We are planning to use multicast even for 1:1 conversations so that 
we can include a person's multiple devices if she likes and employ 
relay nodes for spooling if the recipient is currently offline.

Several use cases that XMPP and SMTP treat as complicated
special cases requiring extra protocol extensions or custom protocols
like POP or IMAP become, in the PSYC/secushare architecture, natural
applications of the same mechanism: You come back online, you reconnect
to your guard nodes, DHT or rendez-vous points and all the events that
you missed in the meantime are spooled down to you. Private or
"public" messages, indistinguishably. Whatever that device is permitted
to access and has therefore subscribed to.
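
The catch-up behaviour can be sketched in a few lines of Python. This
is not PSYC or secushare code: the class SpoolingRelay, its method names
and the use of plain sequence numbers are assumptions for illustration.
A relay keeps events per channel, and a device that comes back online
states the last event it saw and gets the rest.

```python
from collections import defaultdict


class SpoolingRelay:
    """Hypothetical relay that keeps events per channel for offline subscribers."""

    def __init__(self):
        self._events = defaultdict(list)  # channel name -> events in arrival order

    def publish(self, channel: str, event: bytes) -> int:
        """Store an (encrypted) event and return its 1-based sequence number."""
        self._events[channel].append(event)
        return len(self._events[channel])

    def catch_up(self, channel: str, last_seen: int) -> list[bytes]:
        """Return every event after the given sequence number."""
        return list(self._events.get(channel, [])[last_seen:])


relay = SpoolingRelay()
relay.publish("alice/friends", b"event-1")

# The device goes offline after event 1 and misses two more.
relay.publish("alice/friends", b"event-2")
relay.publish("alice/friends", b"event-3")

# Back online: the same call covers private and "public" channels alike.
missed = relay.catch_up("alice/friends", last_seen=1)
print(missed)
```

The relay never needs to know whether a channel is a 1:1 conversation
or a larger pubsub. It only stores opaque events and hands them out to
whoever is subscribed.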

> >Exactly, so your model with the centralized block cloud is doomed,
> >as I see it.
> it is not doomed if someone pay Amazon's bills but it will be flooded
> with blocks and there is no alternative solution for keeping those
> blocks on somewhere else as p2p networks are only good on distributing
> data not keeping the data itself for long terms with 100% reliability. 

Whatever you mean by p2p networks is probably a simplification that
doesn't apply to the scenario I described.

> I love to get ride of PseudonymousServer because i'm the one who is
> responsible for paying off Amazon bills but even if we do multicasting
> above hidden services then we still need it for "future retrievals" when
> requesters can't find a seeder for desired block on Bittorrent network
> in the future and for "asynchronous retrievals" when Bob's hidden
> service is offline and when he becomes online gets Notification from
> public pool to retrieve the block but Alice is offline at that time.
> Also we need PseudonymousServer for backuping PDB and many different
> things. 

As I explained, the multicast network itself does a certain amount
of spooling - and if you *really* need to recover old data, you
can still reach out to the other subscribers of each pubsub. We
also intend to use pubsubs among devices to share configuration
and serve as each other's backup. If you have a relay that is your
friend, then it can keep a copy of your backups - so whenever you
lose your smartphone you can recreate your identity and secushare
experience on a new one. No need for cloud storage for anything.
The definition of "friend" for a relay is complicated, so let's
not discuss that here.

> They systematically won't! you are a good man but most of others out
> there aren't like you. just think of those zombies who sued apple
> [http://www.bloomberg.com/news/2014-12-31/apple-customers-sue-over-shortage-of-storage-space-in-ios-8-1-.htm]
> because of asking them free some space for installing an important
> update. people really want when they replace their smartphone, simply by
> entering a username and password recover everything back. This is not
> subterfuge, it's substantial. Usability is a security parameter because
> if majority of users don't use our secure software then attackers easily
> compromise them. 

So? If you don't have enough disk space on your device, secushare
will only offer limited functionality like real-time chat. So what?
Your free choice.

> We are a mobile application (for various important reasons) and mobile
> phones have a few GB free space which is very valuable, we only can
> cache texture contents for a long time, cached media contents will be
> removed after a short period of time. Without backing up blocks on a
> server, for sure they will lose them during time. 

Is social networking all about hi-res pictures and movie sharing?
I don't think it takes a gig to maintain the current dashboard,
basic social graph info and several messaging threads. If you
don't have enough space on your mobile phone, simply install
secushare on your laptop as well. But seriously, how much is a
storage card for your phone? If this is going to be the next
Facebook, people will want it to work smoothly. And they will
love being able to spend hours on the social network without
needing Internet connectivity. I heard in some places on Earth
people cannot afford being online all the time. For them, this
would be great! It's like UUCP taken to how the Internet should
be today.
